Regular Expression (RegEx) in Python

A regular expression (RegEx) is a sequence of characters that defines a search pattern. Regular expressions are especially common in the UNIX world, where they are used to check whether a string contains a particular pattern. In Python, the re module provides these operations. The module raises an re.error exception if a pattern cannot be compiled.
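As a minimal sketch of the two behaviours described above, the snippet below checks whether a string contains a pattern and shows what happens when the pattern itself is invalid. The helper name contains_pattern and the sample strings are illustrative assumptions, not part of the re module.

```python
import re

def contains_pattern(pattern, text):
    """Return True if `pattern` is found anywhere in `text` (hypothetical helper)."""
    try:
        return re.search(pattern, text) is not None
    except re.error as exc:
        # re.error is raised when the pattern cannot be compiled.
        print(f"invalid pattern {pattern!r}: {exc}")
        return False

found = contains_pattern(r"wel\w+", "hello and welcome")  # valid pattern
broken = contains_pattern(r"[a-", "hello and welcome")    # unterminated set
print(found, broken)
```

The first call succeeds because "welcome" is present in the text; the second pattern has an unclosed set, so the compile error is caught and reported instead of stopping the program.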

Metacharacters:

The re module recognizes several metacharacters, which are characters that have a special meaning inside a pattern. Some of them are listed below.

Operator Description Sample
\ Introduces a special sequence, or escapes a special character so that it is matched literally “\d”
[] Matches any one character from a set of characters “[a-m]”
^ Matches at the beginning of the string “^welcome”
$ Matches at the end of the string “welcome$”
. Matches any single character except a newline “wel...e”
? Matches zero or one occurrence of the preceding element “ma?n”
| Matches either the expression before it or the expression after it “Left|right”
* Matches zero or more occurrences of the preceding element “all*”
+ Matches one or more occurrences of the preceding element “all+”
{} Matches exactly the given number of occurrences of the preceding element “hello{2}”
() Groups part of a regular expression “(a|m|n)bc”
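The sample patterns from the table can be tried out as in the sketch below. The helper matches and all test strings are assumptions chosen for illustration; the helper uses re.finditer so that the full matched text is returned even when the pattern contains a group.

```python
import re

def matches(pattern, text):
    """Return the full text of every match of `pattern` in `text` (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, text)]

print(matches(r"\d", "room 42"))                   # ['4', '2']
print(matches(r"[a-m]", "python"))                 # ['h']
print(matches(r"^welcome", "welcome home"))        # ['welcome']
print(matches(r"welcome$", "you are welcome"))     # ['welcome']
print(matches(r"wel...e", "welcome"))              # ['welcome']
print(matches(r"ma?n", "man mn maan"))             # ['man', 'mn']
print(matches(r"Left|right", "turn Left then right"))  # ['Left', 'right']
print(matches(r"all*", "al all alll"))             # ['al', 'all', 'alll']
print(matches(r"all+", "al all alll"))             # ['all', 'alll']
print(matches(r"hello{2}", "hello helloo"))        # ['helloo']
print(matches(r"(a|m|n)bc", "abc mbc xbc"))        # ['abc', 'mbc']
```

Note that a quantifier such as {2} applies only to the element immediately before it, so hello{2} requires two o characters rather than two copies of "hello".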

Special Sequences:

Operator Description Sample
\A Matches only if the specified characters are at the start of the string “\Ahello”
\d Matches any digit from 0 to 9 “\d”
\D Matches any character that is not a digit “\D”
\s Matches any whitespace character “\s”
\S Matches any character that is not whitespace “\S”
\w Matches any word character: a letter, a digit, or an underscore “\w”
\W Matches any character that is not a word character “\W”
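A short sketch of the special sequences in action follows. The sample strings are assumptions chosen so that each sequence has something to match and something to skip.

```python
import re

# \A anchors the match to the very start of the string.
starts_with_hello = re.search(r"\Ahello", "hello world") is not None
not_at_start = re.search(r"\Ahello", "say hello") is not None
print(starts_with_hello, not_at_start)  # True False

digits = re.findall(r"\d", "a1b2")          # ['1', '2']
non_digits = re.findall(r"\D", "a1b2")      # ['a', 'b']
spaces = re.findall(r"\s", "a b\tc")        # [' ', '\t']
non_spaces = re.findall(r"\S", "a b")       # ['a', 'b']
word_chars = re.findall(r"\w", "a_1!")      # ['a', '_', '1']
non_word_chars = re.findall(r"\W", "a_1!")  # ['!']

print(digits, non_digits, spaces, non_spaces, word_chars, non_word_chars)
```

The patterns are written as raw strings (r"...") so that Python does not interpret the backslash before the re module sees it.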

Sets:

Operator Description Sample
[arn] Matches any one of the three characters a, r, or n ['a', 's', 'r', 'z', 'n']
[a-n] Matches any lowercase letter from a to n ['a', 's', 'r', 'z', 'n']
[^arn] Matches any character except a, r, and n ['s', 'd', 'z', 'f']
[0123] Matches any one of the four digits 0, 1, 2, or 3 [ ]
[0-9] Matches any digit from 0 to 9 ['2', '3', '8', '4', '1']
[0-5][0-9] Matches any two-digit number from 00 to 59 ['12', '80', '45']
[a-zA-Z] Matches any lowercase letter from a to z or uppercase letter from A to Z ['a', 'S', 'r', 'Z', 'n']
[+] Matches a literal + character; inside a set, + has no special meaning [ ]
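The sets above can be checked with a few findall() calls, as in this sketch. The input strings are assumptions picked for illustration and are not taken from the table's sample column.

```python
import re

only_arn = re.findall(r"[arn]", "rain")          # ['r', 'a', 'n']
a_to_n = re.findall(r"[a-n]", "zebra")           # ['e', 'b', 'a']
not_arn = re.findall(r"[^arn]", "rain")          # ['i']
zero_to_three = re.findall(r"[0123]", "2049")    # ['2', '0']
any_digit = re.findall(r"[0-9]", "a1b22")        # ['1', '2', '2']
minutes = re.findall(r"[0-5][0-9]", "12 80 45")  # ['12', '45']
letters = re.findall(r"[a-zA-Z]", "a1-B")        # ['a', 'B']
plus_signs = re.findall(r"[+]", "1+1=2")         # ['+']

print(only_arn, a_to_n, not_arn, zero_to_three)
print(any_digit, minutes, letters, plus_signs)
```

In the [0-5][0-9] case, 80 is skipped because its first digit falls outside the range 0 to 5.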

RegEx Functions:

  1. The findall() Function: This function returns a list of all non-overlapping matches of the pattern in the given string. If nothing matches, it returns an empty list.
  2. The search() Function: This function returns a Match object for the first place where the pattern matches in the string. It returns None if the pattern does not match anywhere.
  3. The split() Function: This function returns a list in which the string has been split at every match of the pattern, for example at every whitespace character when the pattern is \s.
  4. The sub() Function: This function replaces every match of the pattern with a replacement string supplied by the user and returns the result.
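The four functions can be sketched together on a single sample string. The string and variable names below are assumptions for illustration only.

```python
import re

text = "The rain in Spain"

# findall(): every non-overlapping match, or an empty list.
all_ai = re.findall(r"ai", text)        # ['ai', 'ai']
no_match = re.findall(r"xyz", text)     # []

# search(): a Match object for the first match, or None.
first_space = re.search(r"\s", text)
position = first_space.start()          # 3
missing = re.search(r"xyz", text)       # None

# split(): split the string at every match of the pattern.
words = re.split(r"\s", text)           # ['The', 'rain', 'in', 'Spain']
first_split = re.split(r"\s", text, maxsplit=1)  # ['The', 'rain in Spain']

# sub(): replace every match with the given replacement.
dashed = re.sub(r"\s", "-", text)       # 'The-rain-in-Spain'

print(all_ai, no_match, position, missing)
print(words, first_split, dashed)
```

Because search() may return None, it is usually worth checking the result before calling methods such as start() or group() on it.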