Understanding Index Selectivity and Its Impact on Performance
Index selectivity measures how unique the values in an indexed column are. Higher selectivity means the column contains many distinct values, which helps MySQL filter rows more efficiently.
Selectivity = (Number of distinct values) / (Total number of rows)
A selectivity value close to 1.0 means high uniqueness (very selective).
A lower value means many duplicates (poor selectivity).
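The formula above can be sketched in a few lines of Python. This is only an illustration on small in-memory lists; the `selectivity` helper and the sample values are hypothetical and not part of MySQL.

```python
def selectivity(values):
    """Return distinct-count / row-count for a column's values.

    Returns 0.0 for an empty column to avoid dividing by zero.
    """
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical sample data: four rows of a users table.
emails = ["a@example.com", "b@example.com", "c@example.com", "d@example.com"]
genders = ["M", "F", "M", "F"]

email_selectivity = selectivity(emails)    # every value is distinct
gender_selectivity = selectivity(genders)  # only two distinct values

print(email_selectivity, gender_selectivity)
```

Against a real table the same ratio is usually computed in SQL, for example SELECT COUNT(DISTINCT email) / COUNT(*) FROM users; (the table and column names here are assumed for illustration).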
MySQL's optimizer favors indexes with high selectivity because each lookup narrows the search to a small fraction of the table, which reduces the number of rows read during query execution.
Typical examples by selectivity level:
High selectivity: email, user_id, phone_number
Medium selectivity: last_name, category_id
Low selectivity: gender, status flags, boolean fields
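High-selectivity indexes speed up WHERE lookups considerably, because each value matches only a small share of the rows.
Low-selectivity indexes are often ignored by the optimizer: when a condition matches a large share of the table, a full table scan can be cheaper than following the index to each matching row.
Composite index performance depends heavily on the leftmost column, because MySQL generally uses a composite index for lookups only when the query filters on a leftmost prefix of it. What matters for filtering is the combined selectivity of the prefix columns the query actually uses.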
Low-selectivity columns (like boolean flags) should rarely be indexed alone.
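The effect of column order in a composite index can be explored by computing the selectivity of each leftmost prefix of the candidate column list. The sketch below does this on a handful of in-memory rows; the `prefix_selectivities` helper, the column names, and the sample data are all assumptions made for illustration, and MySQL's optimizer relies on its own index statistics rather than a calculation like this.

```python
def prefix_selectivities(rows, columns):
    """Selectivity of each leftmost prefix of `columns`.

    `rows` is a list of dicts; the result maps each prefix (as a tuple
    of column names) to distinct-combinations / total-rows.
    """
    total = len(rows)
    result = {}
    for width in range(1, len(columns) + 1):
        prefix = tuple(columns[:width])
        # Collect the distinct value combinations seen for this prefix.
        distinct = {tuple(row[c] for c in prefix) for row in rows}
        result[prefix] = len(distinct) / total if total else 0.0
    return result


# Hypothetical sample rows.
rows = [
    {"status": "active", "category_id": 1},
    {"status": "active", "category_id": 2},
    {"status": "inactive", "category_id": 1},
    {"status": "active", "category_id": 1},
]

by_status_first = prefix_selectivities(rows, ["status", "category_id"])
for prefix, value in by_status_first.items():
    print(prefix, value)
```

Adding a column to a prefix can never lower its selectivity, since every distinct combination of the longer prefix still contains a combination of the shorter one. This is why a low-selectivity column may still be acceptable inside a composite index even when it would be a poor index on its own.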
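Example: a table has 1,000,000 rows.
The column gender has only 2 distinct values (M/F).
Selectivity = 2 / 1,000,000 = 0.000002 (very low).
If the values are evenly distributed, a condition such as gender = 'M' matches about 500,000 rows, so MySQL will likely skip the index and perform a full table scan.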
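The arithmetic in this example can be checked with a short script. It also derives the average number of rows per value, which assumes the two values are evenly distributed (an assumption not stated in the example itself); the variable names are illustrative only.

```python
total_rows = 1_000_000
distinct_values = 2  # M and F

# Selectivity as defined earlier: distinct values / total rows.
selectivity_ratio = distinct_values / total_rows

# Assuming an even distribution, each value matches this many rows.
rows_per_value = total_rows / distinct_values

# Share of the table that a single equality condition would return.
fraction_matched = rows_per_value / total_rows

print(f"selectivity      = {selectivity_ratio:.6f}")
print(f"rows per value   = {rows_per_value:,.0f}")
print(f"fraction matched = {fraction_matched:.0%}")
```

A condition that returns about half the table gives the index very little filtering to do, which is consistent with the optimizer preferring a full table scan here.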
In summary, index selectivity is critical for index usefulness. High-selectivity indexes improve performance significantly, while low-selectivity indexes may be ignored because they do not filter rows effectively.