Differences Between REGEXP in MySQL 5.x and MySQL 8.0 (ICU-Based Regex)
MySQL 5.x and MySQL 8.0 use different regex engines. MySQL 5.x (and 8.0 releases before 8.0.4) relies on Henry Spencer's implementation of POSIX extended regular expressions (ERE), while MySQL 8.0.4 and later use the ICU regex engine. The switch changes the supported syntax, Unicode handling, escaping conventions, and how the server limits the cost of a match.
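Regex Engine: MySQL 5.x uses POSIX Extended Regular Expressions (ERE), while MySQL 8.0 uses the ICU (International Components for Unicode) regex engine, which supports a much larger pattern syntax. Alongside the REGEXP and RLIKE operators, MySQL 8.0 also adds the functions REGEXP_LIKE(), REGEXP_INSTR(), REGEXP_REPLACE(), and REGEXP_SUBSTR().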
Unicode Support: MySQL 8.0 matches on characters through ICU and is multibyte safe. MySQL 5.x matches byte by byte, so it is not multibyte safe and can return unexpected results with multibyte character sets such as utf8.
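The character-versus-byte distinction can be seen with a short query. This is a minimal sketch that assumes a utf8mb4 connection; the literal 'é€' is only an illustrative value.

```sql
-- 'é' is 2 bytes and '€' is 3 bytes in utf8mb4, so the string is
-- 2 characters but 5 bytes long.
SELECT CHAR_LENGTH('é€') AS chars,
       LENGTH('é€')      AS bytes;

-- MySQL 8.0 (ICU): '.' consumes one character, so two dots span the string.
-- MySQL 5.x (byte-wise): the same pattern is applied to five bytes.
SELECT 'é€' REGEXP '^..$' AS two_chars;
```

Under these assumptions the second query should return 1 on MySQL 8.0 and 0 on MySQL 5.x, which is the kind of discrepancy to look for when migrating patterns that touch non-ASCII data.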
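Advanced Features: MySQL 8.0 supports constructs such as lookaheads, lookbehinds (ICU requires the lookbehind pattern to have bounded length, so no * or + inside it), shorthand character classes like \d, \s, \w, and Unicode properties like \p{L}. MySQL 5.x supports none of these; it offers POSIX bracket classes such as [[:digit:]] and [[:alpha:]] instead.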
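A few queries illustrate these constructs. This is a sketch with made-up sample strings, assuming the default SQL mode, in which each regex backslash is doubled inside the string literal.

```sql
-- MySQL 8.0 only: lookahead, lookbehind, shorthand class, Unicode property.
SELECT 'price: 42 USD' REGEXP '\\d+(?= USD)'                AS has_usd_amount;
SELECT REGEXP_SUBSTR('price: 42 USD', '(?<=price: )\\d+')   AS amount;      -- '42'
SELECT 'Ünïcode' REGEXP '^\\p{L}+$'                         AS only_letters;

-- Works on MySQL 5.x and 8.0: POSIX bracket class instead of \d.
SELECT 'abc123' REGEXP '[[:digit:]]+'                       AS has_digits;
```

On MySQL 5.x the first three statements are expected to fail or mismatch, because the POSIX ERE engine does not recognize (?=...), (?<=...), \d, or \p{L}.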
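Escaping Rules: ICU in MySQL 8.0 recognizes common escape sequences such as \b, \d, and \t. The POSIX ERE engine in MySQL 5.x does not; for word boundaries it uses the markers [[:<:]] and [[:>:]], which MySQL 8.0 no longer accepts. In both versions the backslash is also the escape character in SQL string literals by default, so a regex backslash must be written as \\ inside the pattern string.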
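The escaping differences matter most for literal metacharacters and word boundaries. The following sketch assumes the default SQL mode (NO_BACKSLASH_ESCAPES not enabled); the sample strings are illustrative.

```sql
-- Literal dot: the SQL literal 'a\\.b' reaches the regex engine as a\.b.
SELECT 'a.b' REGEXP 'a\\.b' AS literal_dot;   -- matches on 5.x and 8.0
SELECT 'axb' REGEXP 'a\\.b' AS literal_dot;   -- does not match

-- A bracket expression avoids backslashes entirely and is portable.
SELECT 'a.b' REGEXP 'a[.]b' AS literal_dot;

-- Word boundaries differ by version.
SELECT 'a word here' REGEXP '[[:<:]]word[[:>:]]' AS whole_word;  -- MySQL 5.x syntax
SELECT 'a word here' REGEXP '\\bword\\b'         AS whole_word;  -- MySQL 8.0 (ICU) syntax
```

Patterns that use [[:<:]] and [[:>:]] are a common source of errors after an upgrade, so they are worth searching for in application code and stored routines.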
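Performance: Relative speed depends on the pattern and the data, so it is safer to benchmark than to assume one engine is faster. ICU is a backtracking engine, and MySQL 8.0 bounds the work a single match may do through the regexp_time_limit and regexp_stack_limit system variables.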
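For complex patterns on MySQL 8.0, it helps to know the limits the server places on the ICU matcher. The sketch below inspects the two relevant system variables; the value 64 is an arbitrary example, not a recommendation, and changing a global variable requires the appropriate privilege.

```sql
-- Show the limits applied to ICU matching (MySQL 8.0.4 and later).
SHOW VARIABLES LIKE 'regexp%';

-- Hypothetical adjustment: raise the step limit for the match engine.
-- regexp_time_limit counts engine steps, so it only indirectly bounds time.
SET GLOBAL regexp_time_limit = 64;
```

A pattern that exceeds either limit fails with an error instead of running indefinitely, which is usually a sign that the pattern has nested or ambiguous quantifiers and should be rewritten.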
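Consistency Across Platforms: ICU gives cross-platform consistency. The POSIX ERE engine in MySQL 5.x is Henry Spencer's library, which ships with the server rather than coming from the operating system, so the larger portability risk is a pattern behaving differently between MySQL versions, not between platforms.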
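Overall, MySQL 8.0 REGEXP behaves much closer to the regex engines found in languages like JavaScript, Java, Python, and PHP. ICU syntax is not identical to any of them, so patterns ported from application code should still be tested on the server.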