MySQL Regular Expressions (Regexp)

Regular expressions or regexes are strings that define a pattern. We can use regexes to check if the data contains a string of digits or strings of alphabets or some pattern.

The main advantage of using regexes is that we can search Strings using more than just Like and % operators.

My SQL supports the regexp, rlike, not regexp operator for supporting regex patterns. The Basic syntax of using a regexp operator is

SELECT <column_list>
FROM <table-names>
WHERE <column of String datatype> REGEXP pattern;

SELECT <column_list>
FROM <table-names>
WHERE <column of String datatype> RLIKE pattern;

SELECT <column_list>
FROM <table-names>
WHERE <column of String datatype> NOT REGEXP pattern;

The regexp needs to be with a column type of String. If the column value matches the pattern, a resultSet is returned. If no matches found then NULL is returned.

To construct regex patterns, we use metacharacters. The metacharacters that can be used are as follows:

Metacharacter Behavior
^ matches the position at the beginning of the searched string
$ matches the position at the end of the searched string
. matches any single character
[…] matches any character specified inside the square brackets
[^…] matches any character not specified inside the square brackets
p1|p2 matches any of the patterns p1 or p2
* matches the preceding character zero or more times
+ matches preceding character one or more times
{n} matches n number of instances of the preceding character
{m,n} matches from m to n number of instances of the preceding character

The Advantage of Regex patterns is that we can expand the searching of various patterns. The disadvantage of the regex is that they are difficult to maintain.  

Some Examples:

Let us look at a few examples of Regex strings we would use in MySQL.

1. If we want to check for all strings that start with the letter 'A', the regex is 

SELECT * FROM FILM WHERE TITLE REGEXP '^A';

2. To check all the strings that do not start with A

SELECT * FROM FILM WHERE TITLE REGEXP '^[^A]';

 

3. To get all such records that have the last updated date other than the year 2006.

SELECT title,last_update FROM FILM WHERE LAST_UPDATE  NOT REGEXP '^2006+';

4. To get all such names which have the pattern <some character between H and M> + one character+ T

For example The name: Matthew  M (one character between H & M) + A (one single character) + T 

SELECT FIRST_NAME FROM ACTOR WHERE FIRST_NAME REGEXP '[H-M].[T]' ;