Join Optimization Enhancements in MySQL 8.0
MySQL 8.0 introduced several improvements to join optimization and execution compared with MySQL 5.x. They cover join algorithms, join buffering, cost estimation, and the handling of large datasets.
MySQL 5.x relied almost entirely on nested-loop joins.
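MySQL 8.0.18 added a hash join algorithm for equality joins (equi-joins), and MySQL 8.0.20 extended it to outer joins, semijoins, antijoins, and joins without an equality condition.
MySQL builds an in-memory hash table from the smaller input and probes it with rows from the larger one. If the build input does not fit in the join buffer (join_buffer_size), the hash join spills to disk.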
Hash joins are particularly beneficial when the join columns are not indexed or when joining large analytical datasets.
MySQL typically uses a hash join in the following cases:
For equality joins (ON t1.col = t2.col).
When the optimizer estimates that an index-based nested-loop join would be slower than scanning the table.
When indexes on the join columns are missing or not selective.
For large datasets typical of reporting and analytics workloads.
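You can see hash join usage in EXPLAIN FORMAT=TREE or EXPLAIN ANALYZE output, which contains a line such as "Inner hash join". From MySQL 8.0.20, traditional EXPLAIN output also shows "Using join buffer (hash join)" in the Extra column.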
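The build-and-probe idea behind a hash join can be sketched in a few lines of Python. This is a toy in-memory illustration, not MySQL's implementation: the function name `hash_join` and the sample tables are hypothetical, and spilling to disk is not modeled.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dicts using a build phase and a probe phase."""
    # Build phase: index the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: scan the other input once and look up matches by key.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical sample data: a small table and a larger one.
departments = [
    {"dept_id": 1, "dept_name": "Sales"},
    {"dept_id": 2, "dept_name": "Support"},
]
employees = [
    {"emp_id": 10, "dept_id": 1},
    {"emp_id": 11, "dept_id": 2},
    {"emp_id": 12, "dept_id": 1},
    {"emp_id": 13, "dept_id": 3},  # no matching department
]

result = hash_join(departments, employees, "dept_id", "dept_id")
for row in result:
    print(row)
```

Each input is read once, so the work grows roughly with the sum of the two input sizes rather than their product, which is why no index on the join column is needed.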
Batched Key Access (BKA) has existed since MySQL 5.6, but MySQL 8.0 improved buffering and batching efficiency.
BKA reduces random I/O by collecting join keys in the join buffer and submitting them in batches, so that rows can be fetched in a more sequential order instead of one lookup per row.
This improves performance for joins that use secondary indexes on large tables.
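BKA appears in the Extra column of EXPLAIN as "Using join buffer (Batched Key Access)". It is disabled by default and must be enabled through optimizer_switch (batched_key_access=on, normally together with mrr=on and mrr_cost_based=off).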
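The effect of batching can be illustrated with a small Python simulation. It is only a sketch under stated assumptions: the dictionary `index` maps a join key to a made-up row position, `seek_distance` is a hypothetical stand-in for random I/O cost, and sorting each batch stands in for the reordering that Multi-Range Read performs.

```python
def lookup_one_by_one(outer_keys, index):
    """Plain nested-loop style: one index lookup per outer row, in arrival order."""
    return [index[k] for k in outer_keys if k in index]

def lookup_batched(outer_keys, index, batch_size):
    """BKA-style: buffer keys, then fetch each batch in sorted row-position order."""
    positions = []
    for start in range(0, len(outer_keys), batch_size):
        batch = outer_keys[start:start + batch_size]
        # Sorting by row position stands in for the reordering done by MRR.
        positions.extend(sorted(index[k] for k in batch if k in index))
    return positions

def seek_distance(positions):
    """Toy I/O metric: total distance travelled between consecutive reads."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

# Hypothetical index: join key -> position of the row on disk.
index = {key: key * 10 for key in range(100)}
outer_keys = [93, 4, 57, 12, 88, 31, 70, 2]

naive = seek_distance(lookup_one_by_one(outer_keys, index))
batched = seek_distance(lookup_batched(outer_keys, index, batch_size=8))
print("one-by-one:", naive, "batched:", batched)
```

Both strategies fetch the same rows. The batched version simply visits them in an order that keeps consecutive reads close together.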
MySQL 8.0 improved how join buffers are allocated and reused.
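For hash joins (MySQL 8.0.18 and later), the join buffer is allocated incrementally up to join_buffer_size instead of being allocated in full up front.
Joins without usable indexes, which used the block nested-loop algorithm in earlier versions, are executed as hash joins from MySQL 8.0.20, which avoids rescanning the inner table for every buffer of outer rows.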
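To see why buffering matters for non-indexed joins, consider a toy version of the classic block nested-loop strategy: the inner table is scanned once per buffer-full of outer rows, so a larger buffer means fewer scans. The sketch below is illustrative only and is not MySQL's implementation. The function name and the `rows_per_buffer` parameter are hypothetical.

```python
def block_nested_loop_join(outer, inner, rows_per_buffer, predicate):
    """Join two lists, scanning `inner` once per buffer-full of outer rows."""
    joined, scans = [], 0
    for start in range(0, len(outer), rows_per_buffer):
        block = outer[start:start + rows_per_buffer]  # fill the join buffer
        scans += 1
        for inner_row in inner:       # one full pass over the inner table
            for outer_row in block:   # compare against every buffered row
                if predicate(outer_row, inner_row):
                    joined.append((outer_row, inner_row))
    return joined, scans

# Hypothetical data: 10 outer rows, 5 inner rows, equality predicate.
outer = list(range(10))
inner = [0, 3, 3, 9, 42]

def same(a, b):
    return a == b

joined_small, scans_small = block_nested_loop_join(outer, inner, 4, same)
joined_large, scans_large = block_nested_loop_join(outer, inner, 10, same)
print("buffer of 4 rows:", scans_small, "inner scans")
print("buffer of 10 rows:", scans_large, "inner scans")
```

The join result is the same in both runs. Only the number of passes over the inner table changes with the buffer size.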
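MySQL 8.0 refines the configurable cost model introduced in MySQL 5.7, which the optimizer uses when selecting a join order. For example, cost estimates can now take into account whether data is likely to be in memory or on disk.
The optimizer searches the space of join orders for the cheapest plan, bounded by the optimizer_search_depth and optimizer_prune_level settings.
Cardinality estimates are better when histogram statistics are available.
Together, these changes improve performance for joins involving many tables.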
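A cost-based choice of join order can be sketched as an exhaustive search over left-deep orders. Everything here is an assumption made for illustration: the cost formula (the sum of estimated intermediate result sizes), the table names, the row counts, and the selectivities are hypothetical, and MySQL's real cost model is considerably more detailed.

```python
from itertools import permutations

def plan_cost(order, rows, selectivity):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    current_rows = rows[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Apply every join predicate linking the new table to those already joined.
        factor = 1.0
        for other in joined:
            factor *= selectivity.get(frozenset((table, other)), 1.0)
        current_rows = current_rows * rows[table] * factor
        cost += current_rows
        joined.append(table)
    return cost

def best_join_order(rows, selectivity):
    """Exhaustively pick the cheapest order (feasible only for a few tables)."""
    return min(permutations(rows), key=lambda o: plan_cost(o, rows, selectivity))

# Hypothetical statistics.
rows = {"orders": 1_000_000, "customers": 10_000, "regions": 10}
selectivity = {
    frozenset(("orders", "customers")): 1 / 10_000,
    frozenset(("customers", "regions")): 1 / 10,
}

order = best_join_order(rows, selectivity)
print(order, plan_cost(order, rows, selectivity))
```

With these numbers the cheapest plans join the two small tables first and bring in the large table last. The number of permutations grows factorially with the number of tables, which is why real optimizers prune the search.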
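Histogram statistics were introduced in MySQL 8.0 to give the optimizer more accurate information about value distributions, especially on non-indexed columns. They are created with ANALYZE TABLE ... UPDATE HISTOGRAM ON and stored as singleton or equi-height histograms.
Histograms reduce incorrect join order choices caused by poor cardinality estimates.
The result is fewer slow query plans and more efficient join execution.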
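The following Python sketch shows, in simplified form, how an equi-height histogram can improve a selectivity estimate on skewed data. It is not MySQL's histogram code: the bucket layout, the function names, and the uniform-range baseline are all assumptions chosen for illustration, and this toy version may split equal values across buckets.

```python
def build_equi_height(values, num_buckets):
    """Split sorted values into buckets holding roughly equal numbers of rows."""
    ordered = sorted(values)
    size = -(-len(ordered) // num_buckets)  # ceiling division
    buckets = []
    for start in range(0, len(ordered), size):
        chunk = ordered[start:start + size]
        buckets.append((chunk[0], chunk[-1], len(chunk)))  # (low, high, rows)
    return buckets

def estimate_fraction_below(buckets, x):
    """Estimate the fraction of rows with value < x from the histogram."""
    total = sum(count for _, _, count in buckets)
    below = 0.0
    for low, high, count in buckets:
        if x > high:
            below += count  # the whole bucket qualifies
        elif x > low:
            # Assume values are spread evenly inside the bucket.
            below += count * (x - low) / (high - low)
    return below / total

def uniform_fraction_below(values, x):
    """Naive baseline: assume values are uniform between min and max."""
    low, high = min(values), max(values)
    return min(max((x - low) / (high - low), 0.0), 1.0)

# Hypothetical skewed column: 90 rows hold the value 1, 10 rows hold 100..109.
values = [1] * 90 + list(range(100, 110))
buckets = build_equi_height(values, num_buckets=4)

true_fraction = sum(v < 2 for v in values) / len(values)
histogram_estimate = estimate_fraction_below(buckets, 2)
uniform_estimate = uniform_fraction_below(values, 2)
print(true_fraction, round(histogram_estimate, 3), round(uniform_estimate, 3))
```

On this data the histogram estimate lands much closer to the true fraction than the uniform guess does. That kind of error is what leads an optimizer to misjudge how many rows a table contributes to a join.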
MySQL 8.0 materializes fewer derived tables during join execution.
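More subqueries are merged into outer queries and optimized as joins. For example, from MySQL 8.0.17, NOT IN and NOT EXISTS subqueries can be converted to antijoins.
This reduces temporary table usage and improves join performance.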
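MySQL 8.0 does not execute joins in parallel, but from MySQL 8.0.14 InnoDB can read a clustered index using multiple threads, controlled by innodb_parallel_read_threads.
This mainly speeds up full scans in operations such as CHECK TABLE and SELECT COUNT(*) without a WHERE clause, so any benefit to joins involving large sequential reads is indirect.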
MySQL 8.0 improved join optimization by adding hash joins, enhancing Batched Key Access, improving join buffer handling, introducing histogram-based cardinality estimation, and choosing better join orders. These changes make JOIN operations faster, especially for analytical workloads, large datasets, and joins on columns without usable indexes.