Regular Expression

A regular expression is a special sequence of characters that helps you find or match sets of strings. It is a small, highly specialized language embedded inside Python and made available through the re module.

Python has a built-in module, re, that provides full support for regular expressions. Let's see the syntax for importing it.

Syntax

import re

Regular Expression Function

The re module provides many built-in functions for finding, matching, or searching strings. The most commonly used ones are as follows.

Function   Description
findall    Returns a list containing all matches.
search     Returns a match object for the first match, or None if there is no match.
split      Returns a list where the string has been split at each match.
sub        Replaces one or many matches with a given string.

Metacharacter

Metacharacters are characters with a special meaning.

Character  Description
[ ]        Represents a set of characters
\          Signals a special sequence (also used to escape special characters)
.          Represents any character (except a newline)
^          Starts with
$          Ends with
*          Represents zero or more occurrences
+          Represents one or more occurrences
{ }        Represents exactly the specified number of occurrences
|          Either or
( )        Represents a group

Program of [ ]

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall("[m-v]", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['n', 's', 't', 'o', 'u', 'n', 't', 'r', 'o', 'v', 'r', 's', 't', 'o', 'u', 's', 'o', 'u', 't', 'r', 'n', 'n', 'o', 'o']
Match Found

Program of ‘\’

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\s", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Match Found

Program of ‘.’

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall("In..a", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['India', 'India']
Match Found

Program of ‘^’

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall("^India", text)
print(match)
if (match):
  print("Yes start with",match)
else:
  print("Not match")

Output

['India']
Yes start with ['India']

Program of $

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall("food$", text)

if (match):
  print("Yes ends with",match)
else:
  print("Not match")

Output

Yes ends with ['food']

Program of *

import re
text = "Salesforcedrillers is a educational blog"
match = re.findall("es*", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['es', 'e', 'e', 'e']
Match Found

Program of +

import re
text = "Salesforcedrillers is a educational blog"
match = re.findall("es+", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['es']
Match Found

Program of { }

import re
text = "Salesforcedrillers is a educational blog"
match = re.findall("es{2}", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

[]
Not match
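
The program above finds nothing because the text never contains "ess". As a small sketch of a case that does match (the pattern l{2} is chosen here only for illustration and is not part of the original program), the same text contains two consecutive "l" characters:

```python
import re

text = "Salesforcedrillers is a educational blog"

# l{2} matches exactly two consecutive "l" characters
match = re.findall(r"l{2}", text)
print(match)  # ['ll']
```

Only the "ll" in "drillers" is returned; the single "l" characters elsewhere in the text do not satisfy the {2} count.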

Program of |

import re
text = "Salesforcedrillers is a educational blog"
match = re.findall("Sales|force", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['Sales', 'force']
Match Found

Program of ()

import re
text = "Salesforcedrillers is a educational blog"
match = re.findall("(Salesforce)", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['Salesforce']
Match Found
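
In the program above the whole pattern sits inside the group, so the group and the full match are the same. The following sketch, using patterns picked only for illustration, suggests what a group changes: when a pattern contains one group, findall returns only the grouped part, and a match object from search exposes each group separately.

```python
import re

text = "Salesforcedrillers is a educational blog"

# With one group in the pattern, findall returns only the grouped part
parts = re.findall(r"(Sales)force", text)
print(parts)  # ['Sales']

# search gives access to the whole match and to each group
m = re.search(r"(Sales)(force)", text)
print(m.group(0))  # Salesforce
print(m.group(1))  # Sales
print(m.group(2))  # force
```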

Special Sequence

A special sequence is a \ followed by a character that has a specific meaning.

Character  Description
\A         Returns a match if the specified characters are at the beginning of the string
\b         Returns a match if the specified characters are at the beginning or at the end of a word
\B         Returns a match if the specified characters are present, but not at the beginning (or at the end) of a word
\d         Returns a match for each digit (0 to 9) in the string
\D         Returns a match for each character that is not a digit
\s         Returns a match for each whitespace character in the string
\S         Returns a match for each character that is not whitespace
\w         Returns a match for each word character (letters a to z and A to Z, digits 0 to 9, and the underscore character)
\W         Returns a match for each character that is not a word character
\Z         Returns a match if the specified characters are at the end of the string

Program (\A)

import re
text = "India is the country of  Diversity "
match = re.findall(r"\AIndia", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['India']
Match Found

Program (\b)

import re
text = "India is the country of  Diversity ,you should be try Indian food"
match = re.findall(r"try\b", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['try', 'try']
Match Found
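
The r prefix on the pattern above matters. In a normal Python string, "\b" is the backspace character, so the re module never sees a word boundary. The short sketch below (the shortened text and the pattern are chosen only for illustration) compares the two forms:

```python
import re

text = "India is the country of Diversity"

# Without the r prefix, "\b" is a backspace character, not a word boundary
plain = re.findall("\bis\b", text)
print(plain)  # []

# With the r prefix, \b reaches the re module as a word boundary
raw = re.findall(r"\bis\b", text)
print(raw)  # ['is']
```

For this reason it is usually safer to write every pattern as a raw string, including patterns such as r"\s" and r"\d".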

Program (\B)

import re
text = "India is the country of  Diversity ,you should be try Indian food"
match = re.findall(r"try\B", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

[]
Not match

Program (\d)

import re
text = "23 is lucky number for me "
match = re.findall(r"\d", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['2', '3']
Match Found

Program (\D)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\D", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['I', 'n', 'd', 'i', 'a', ' ', 'i', 's', ' ', 't', 'h', 'e', ' ', 'c', 'o', 'u', 'n', 't', 'r', 'y', ' ', 'o', 'f', ' ', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', ' ', ',', 'y', 'o', 'u', ' ', 's', 'h', 'o', 'u', 'l', 'd', ' ', 'b', 'e', ' ', 't', 'r', 'y', ' ', 'I', 'n', 'd', 'i', 'a', 'n', ' ', 'f', 'o', 'o', 'd']
Match Found

Program (\s)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\s", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Match Found

Program (\S)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\S", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['I', 'n', 'd', 'i', 'a', 'i', 's', 't', 'h', 'e', 'c', 'o', 'u', 'n', 't', 'r', 'y', 'o', 'f', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', ',', 'y', 'o', 'u', 's', 'h', 'o', 'u', 'l', 'd', 'b', 'e', 't', 'r', 'y', 'I', 'n', 'd', 'i', 'a', 'n', 'f', 'o', 'o', 'd']
Match Found

Program (\w)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\w", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['I', 'n', 'd', 'i', 'a', 'i', 's', 't', 'h', 'e', 'c', 'o', 'u', 'n', 't', 'r', 'y', 'o', 'f', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', 'y', 'o', 'u', 's', 'h', 'o', 'u', 'l', 'd', 'b', 'e', 't', 'r', 'y', 'I', 'n', 'd', 'i', 'a', 'n', 'f', 'o', 'o', 'd']
Match Found

Program (\W)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"\W", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

[' ', ' ', ' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', ' ']
Match Found

Program (\Z)

import re
text = "India is the country of Diversity ,you should be try Indian food"
match = re.findall(r"food\Z", text)
print(match)
if (match):
  print("Match Found")
else:
  print("Not match")

Output

['food']
Match Found

SET

A set is a group of characters inside a pair of square brackets [ ] that together have a special meaning.

Set        Description
[bkz]      Returns a match where one of the specified characters (b, k, or z) is present in the string
[a-m]      Returns a match for any lowercase character between a and m
[^cds]     Returns a match for any character except c, d, or s
[245]      Returns a match where one of the specified digits (2, 4, or 5) is present in the string
[0-9]      Returns a match for any digit between 0 and 9
[a-zA-Z]   Returns a match for any lowercase or uppercase character between a and z or A and Z
[$]        Returns a match for every $ character in the string; inside a set, $ has no special meaning
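
As a rough illustration of the sets listed above, the sketch below runs each one through findall. The sample strings "Book 245 costs $9" and "cards" are made up for this example and do not come from the programs in this tutorial.

```python
import re

text = "Book 245 costs $9"

# One of the listed characters (sets are case sensitive, so "B" is skipped)
bkz = re.findall(r"[bkz]", text)
print(bkz)  # ['k']

# Any lowercase character between a and m
a_to_m = re.findall(r"[a-m]", text)
print(a_to_m)  # ['k', 'c']

# One of the listed digits
digits_245 = re.findall(r"[245]", text)
print(digits_245)  # ['2', '4', '5']

# Any digit between 0 and 9
digits = re.findall(r"[0-9]", text)
print(digits)  # ['2', '4', '5', '9']

# Any lowercase or uppercase letter
letters = re.findall(r"[a-zA-Z]", text)
print(letters)  # ['B', 'o', 'o', 'k', 'c', 'o', 's', 't', 's']

# Inside a set, $ is an ordinary character
dollar = re.findall(r"[$]", text)
print(dollar)  # ['$']

# ^ at the start of a set negates it: everything except c, d, or s
not_cds = re.findall(r"[^cds]", "cards")
print(not_cds)  # ['a', 'r']
```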

Here we will see programs that use the findall function.

Program 1

import re 

text='salesforcedillers is a educational blog'

x=re.findall('es',text)
print(x)

Output

['es']

Program 2: finding all ‘s’ characters in the string

import re 

text='salesforcedillers is a educational blog'

x=re.findall('s',text)
print(x)

Output

['s', 's', 's', 's']

Program 3: when no match is found

import re 

text='India is the country of Diversity '

x=re.findall('Japan',text)
print(x)

Output

[]

Search Program in RegExp

Program

import re
text = "Salesforcedrillers is a educational blog"
match = re.search(r"\s", text)
print(match.start())

Output

18

Program 2

import re
text = "Salesforcedrillers is a educational blog"
match = re.search("educational", text)
print(match)

Output

<re.Match object; span=(24, 35), match='educational'>
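
One point the search programs above do not show: search returns None when nothing matches, so calling match.start() directly would raise an AttributeError in that case. A minimal sketch of a safer pattern, reusing the same text with an illustrative search word ("Japan") that does not occur in it:

```python
import re

text = "Salesforcedrillers is a educational blog"

# re.search returns None when the pattern is not found
match = re.search(r"Japan", text)
if match:
    print("Match Found at index", match.start())
else:
    print("Not match")

# When the pattern is found, the match object describes the match
found = re.search(r"educational", text)
if found:
    print(found.group())  # educational
    print(found.span())   # (24, 35)
```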

Split Program in RegExp

Program

import re
text = "Salesforcedrillers is a educational blog"
match = re.split(r"\s", text)
print(match)

Output

['Salesforcedrillers', 'is', 'a', 'educational', 'blog']
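
The split function also accepts an optional maxsplit argument that limits how many splits are made. The following sketch, with a value of 1 chosen only for illustration, splits at the first whitespace character and leaves the rest of the string intact:

```python
import re

text = "Salesforcedrillers is a educational blog"

# maxsplit=1 stops after the first split
parts = re.split(r"\s", text, maxsplit=1)
print(parts)  # ['Salesforcedrillers', 'is a educational blog']
```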

Sub Program in RegExp

Program

import re
text = "Salesforcedrillers is a educational blog"
match = re.sub(r"\s", " Sub ", text)
print(match)

Output

Salesforcedrillers Sub is Sub a Sub educational Sub blog
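
By default sub replaces every match. Its optional count argument limits the number of replacements, which is how "one or many" replacements can be controlled. A small sketch, with count=2 chosen only for illustration:

```python
import re

text = "Salesforcedrillers is a educational blog"

# count=2 replaces only the first two whitespace characters
result = re.sub(r"\s", " Sub ", text, count=2)
print(result)  # Salesforcedrillers Sub is Sub a educational blog
```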