How MySQL Executes JOINs Internally
MySQL uses multiple internal algorithms to execute JOIN operations. The specific strategy depends on the query, available indexes, table sizes, and optimizer decisions.
• MySQL’s default join algorithm.
• For each row in the outer table, MySQL searches matching rows in the inner table.
• Efficient when join columns are indexed.
• Worst-case O(N × M) if no indexes exist.
MySQL may use different optimizations of nested loops:
• The most common and fastest approach when the join condition column is indexed.
• MySQL probes the inner table index for each row of the outer table.
• Used when the inner table lacks a suitable index.
• MySQL loads chunks (blocks) of rows from the outer table into memory.
• Scans the entire inner table for matches.
• Can be expensive for large joins.
• MySQL supports hash joins for equality-based INNER JOINs.
• MySQL builds an in-memory hash table of the smaller table.
• Then probes it for each row in the larger table.
• Much faster than nested loops on large unindexed tables.
• Not used for non-equality joins (>, <, BETWEEN).
• Requires enabling via optimizer switch.
• MySQL gathers multiple lookup keys and performs batched index lookups.
• Reduces random disk I/O by accessing index pages in sorted order.
• Beneficial for large joins and slow disks.
• MySQL may place outer rows into a join buffer (controlled by join_buffer_size).
• Avoids repeatedly scanning outer table rows.
• Used for non-indexed joins or complex ON conditions.
• MySQL may use temp tables for ORDER BY, GROUP BY, DISTINCT, and subqueries.
• Temp tables can be in-memory or on disk.
• Slows JOIN performance when large intermediate results are needed.
In summary: MySQL primarily uses nested loop joins but automatically switches to hash joins, batched key access, or block nested loops depending on indexes and query structure. Understanding these internal methods helps in optimizing JOIN-heavy workloads.