Regular Expression
A regular expression is a special sequence of characters that helps you find or match a set of strings. Regular expressions are a small, highly specialized language embedded inside Python.
Python has a built-in module, re, that provides full support for regular expressions. Let's see the syntax for importing it.
Syntax
import re
Regular Expression Functions
The re module provides many built-in functions for finding, matching, or searching strings. The most commonly used ones are as follows.
Function | Description |
findall | Returns a list containing all matches. |
search | Returns a match object for the first match, or None if there is no match. |
split | Returns a list of the pieces of the string, split at each match. |
sub | Replaces one or more matches with a given string. |
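As a quick orientation, the four functions in the table can be tried on one short string. This is only a minimal sketch; the sample text and the variable names are assumptions made up for this illustration.

```python
import re

# Made-up sample text, used only for this illustration.
text = "red green blue"

all_e = re.findall("e", text)    # list of every match
first_g = re.search("g", text)   # match object for the first match (or None)
words = re.split(" ", text)      # pieces of the string between the matches
dashed = re.sub(" ", "-", text)  # new string with every match replaced

print(all_e)            # ['e', 'e', 'e', 'e']
print(first_g.start())  # 4
print(words)            # ['red', 'green', 'blue']
print(dashed)           # red-green-blue
```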
Metacharacters
Metacharacters are characters with a special meaning.
Character | Description |
[ ] | Represents a set of characters |
\ | Signals a special sequence (also used to escape special characters) |
. | Represents any character (except a newline) |
^ | Starts with |
$ | Ends with |
* | Represents zero or more occurrences |
+ | Represents one or more occurrences |
{ } | Represents exactly the specified number of occurrences |
| | Either or |
( ) | Represents a group |
Program of [ ]
import re
text = "India is the country of Diversity ,you should be try Indian food"
# [m-v] matches any single lowercase letter from m to v
match = re.findall("[m-v]", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['n', 's', 't', 'o', 'u', 'n', 't', 'r', 'o', 'v', 'r', 's', 't', 'o', 'u', 's', 'o', 'u', 't', 'r', 'n', 'n', 'o', 'o']
Match Found
Program of ‘\’
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \s matches any whitespace character (the raw string keeps the backslash intact)
match = re.findall(r"\s", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Match Found
Program of ‘.’
import re
text = "India is the country of Diversity ,you should be try Indian food"
# Each . stands for any one character, so In..a matches India (also inside Indian)
match = re.findall("In..a", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['India', 'India']
Match Found
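Because . matches any character, a literal dot has to be escaped with \ when you want to match it exactly. The following small sketch shows the difference; the version string is a made-up example, not part of the original programs.

```python
import re

version = "3.11.4"  # made-up sample string

any_char = re.findall(r".", version)      # unescaped dot: every character matches
literal_dot = re.findall(r"\.", version)  # escaped dot: only the '.' characters match

print(any_char)     # ['3', '.', '1', '1', '.', '4']
print(literal_dot)  # ['.', '.']
```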
Program of ‘^’
import re
text = "India is the country of Diversity ,you should be try Indian food"
# ^ anchors the pattern to the start of the string
match = re.findall("^India", text)
print(match)
if match:
    print("Yes, starts with", match)
else:
    print("Not match")
Output
['India']
Yes, starts with ['India']
Program of $
import re
text = "India is the country of Diversity ,you should be try Indian food"
# $ anchors the pattern to the end of the string
match = re.findall("food$", text)
if match:
    print("Yes, ends with", match)
else:
    print("Not match")
Output
Yes, ends with ['food']
Program of *
import re
text = "Salesforcedrillers is a educational blog"
# es* matches an e followed by zero or more s characters
match = re.findall("es*", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['es', 'e', 'e', 'e']
Match Found
Program of +
import re
text = "Salesforcedrillers is a educational blog"
# es+ matches an e followed by one or more s characters
match = re.findall("es+", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['es']
Match Found
Program of { }
import re
text = "Salesforcedrillers is a educational blog"
# es{2} matches an e followed by exactly two s characters
match = re.findall("es{2}", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
[]
Not match
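The program above finds nothing because the text never contains an e followed by two s characters. As a hedged sketch of a case that does match, the made-up text below contains "ess"; it also shows the {m,n} range form, which is an addition not covered by the table. Note that the braces apply only to the character just before them (the s).

```python
import re

text = "address these"  # made-up sample text

exactly_two = re.findall(r"es{2}", text)   # e followed by exactly two s
one_or_two = re.findall(r"es{1,2}", text)  # e followed by one or two s

print(exactly_two)  # ['ess']
print(one_or_two)   # ['ess', 'es']
```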
Program of |
import re
text = "Salesforcedrillers is a educational blog"
# Sales|force matches either Sales or force
match = re.findall("Sales|force", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['Sales', 'force']
Match Found
Program of ()
import re
text = "Salesforcedrillers is a educational blog"
# ( ) groups the pattern; findall returns the text captured by the group
match = re.findall("(Salesforce)", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['Salesforce']
Match Found
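With a single group the result looks the same as without one, so the effect of ( ) is easier to see with two groups. The sketch below reuses the same text; the two-group pattern is an assumption chosen for illustration.

```python
import re

text = "Salesforcedrillers is a educational blog"

# With two groups, findall returns one tuple per match, one item per group.
pairs = re.findall(r"(Sales)(force)", text)
print(pairs)  # [('Sales', 'force')]

# search returns a match object; group(n) gives the text captured by group n.
m = re.search(r"(Sales)(force)", text)
print(m.group(0))  # Salesforce (the whole match)
print(m.group(1))  # Sales
print(m.group(2))  # force
```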
Special Sequence
A special sequence is a \ followed by a character, and it has a specific meaning.
Character | Description |
\A | Matches if the specified characters are at the beginning of the string |
\b | Matches if the specified characters are at the beginning or at the end of a word |
\B | Matches if the specified characters are present, but not at the beginning (or at the end) of a word |
\d | Matches any digit character |
\D | Matches any character that is not a digit |
\s | Matches any whitespace character |
\S | Matches any character that is not whitespace |
\w | Matches any word character (letters, digits, and the underscore character) |
\W | Matches any character that is not a word character |
\Z | Matches if the specified characters are at the end of the string |
Programs of Special Sequences
Program (\A)
import re
text = "India is the country of Diversity "
# \A matches only at the beginning of the string
match = re.findall(r"\AIndia", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['India']
Match Found
Program (\b)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# try\b matches try only where a word ends (here: in country and in try)
match = re.findall(r"try\b", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['try', 'try']
Match Found
Program (\B)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# try\B matches try only when another word character follows it
match = re.findall(r"try\B", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
[]
Not match
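The \B program finds nothing because every "try" in that text sits at the end of a word. A minimal sketch with a made-up text that contains "trying" shows \b and \B side by side.

```python
import re

text = "trying to try country roads"  # made-up sample text

at_word_end = re.findall(r"try\b", text)      # try and the end of country
not_at_word_end = re.findall(r"try\B", text)  # the try inside trying

print(at_word_end)      # ['try', 'try']
print(not_at_word_end)  # ['try']
```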
Program (\d)
import re
text = "23 is lucky number for me "
# \d matches any digit character
match = re.findall(r"\d", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['2', '3']
Match Found
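Note that \d returns each digit separately. If you want the whole number, one option is to combine \d with the + metacharacter, as in this short sketch on the same text.

```python
import re

text = "23 is lucky number for me "

single_digits = re.findall(r"\d", text)   # one list item per digit
whole_numbers = re.findall(r"\d+", text)  # one list item per run of digits

print(single_digits)  # ['2', '3']
print(whole_numbers)  # ['23']
```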
Program (\D)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \D matches any character that is not a digit
match = re.findall(r"\D", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['I', 'n', 'd', 'i', 'a', ' ', 'i', 's', ' ', 't', 'h', 'e', ' ', 'c', 'o', 'u', 'n', 't', 'r', 'y', ' ', 'o', 'f', ' ', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', ' ', ',', 'y', 'o', 'u', ' ', 's', 'h', 'o', 'u', 'l', 'd', ' ', 'b', 'e', ' ', 't', 'r', 'y', ' ', 'I', 'n', 'd', 'i', 'a', 'n', ' ', 'f', 'o', 'o', 'd']
Match Found
Program (\s)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \s matches any whitespace character
match = re.findall(r"\s", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ']
Match Found
Program (\S)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \S matches any character that is not whitespace
match = re.findall(r"\S", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['I', 'n', 'd', 'i', 'a', 'i', 's', 't', 'h', 'e', 'c', 'o', 'u', 'n', 't', 'r', 'y', 'o', 'f', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', ',', 'y', 'o', 'u', 's', 'h', 'o', 'u', 'l', 'd', 'b', 'e', 't', 'r', 'y', 'I', 'n', 'd', 'i', 'a', 'n', 'f', 'o', 'o', 'd']
Match Found
Program (\w)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \w matches any word character (letter, digit, or underscore)
match = re.findall(r"\w", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['I', 'n', 'd', 'i', 'a', 'i', 's', 't', 'h', 'e', 'c', 'o', 'u', 'n', 't', 'r', 'y', 'o', 'f', 'D', 'i', 'v', 'e', 'r', 's', 'i', 't', 'y', 'y', 'o', 'u', 's', 'h', 'o', 'u', 'l', 'd', 'b', 'e', 't', 'r', 'y', 'I', 'n', 'd', 'i', 'a', 'n', 'f', 'o', 'o', 'd']
Match Found
Program (\W)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \W matches any character that is not a word character
match = re.findall(r"\W", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
[' ', ' ', ' ', ' ', ' ', ' ', ',', ' ', ' ', ' ', ' ', ' ']
Match Found
Program (\Z)
import re
text = "India is the country of Diversity ,you should be try Indian food"
# \Z matches only at the end of the string
match = re.findall(r"food\Z", text)
print(match)
if match:
    print("Match Found")
else:
    print("Not match")
Output
['food']
Match Found
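On this text \Z and $ give the same result. They differ when the string ends with a newline: $ also matches just before a final newline, while \Z matches only at the very end. A small sketch, using a made-up text with a trailing newline:

```python
import re

text = "Indian food\n"  # made-up sample text; note the trailing newline

with_dollar = re.findall(r"food$", text)  # $ may match just before the final newline
with_big_z = re.findall(r"food\Z", text)  # \Z needs the true end of the string

print(with_dollar)  # ['food']
print(with_big_z)   # []
```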
SET
A set is a group of characters inside a pair of square brackets [ ], and it has a special meaning.
Set | Description |
[bkz] | Matches if any of the specified characters (b, k, or z) is present in the string |
[a-m] | Matches any lowercase character alphabetically between a and m |
[^cds] | Matches any character except c, d, or s |
[245] | Matches if any of the specified digits (2, 4, or 5) is present in the string |
[0-9] | Matches any digit between 0 and 9 |
[a-zA-Z] | Matches any lowercase or uppercase character between a and z or A and Z |
[$] | Matches a literal $; inside a set, special characters such as $ have no special meaning |
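The sets in the table can be tried out with findall. The following is only a sketch: the sample text and variable names are made up for illustration, and the longer results are joined into a string to keep the output short.

```python
import re

text = "kiwi costs $25"  # made-up sample text

bkz = re.findall(r"[bkz]", text)                 # any of b, k, or z
a_to_m = re.findall(r"[a-m]", text)              # lowercase letters a to m
not_cds = "".join(re.findall(r"[^cds]", text))   # everything except c, d, or s
digits_245 = re.findall(r"[245]", text)          # any of the digits 2, 4, or 5
any_digit = re.findall(r"[0-9]", text)           # any digit 0 to 9
letters = "".join(re.findall(r"[a-zA-Z]", text)) # any lowercase or uppercase letter
dollar = re.findall(r"[$]", text)                # a literal $ character

print(bkz)         # ['k']
print(a_to_m)      # ['k', 'i', 'i', 'c']
print(not_cds)     # kiwi ot $25
print(digits_245)  # ['2', '5']
print(any_digit)   # ['2', '5']
print(letters)     # kiwicosts
print(dollar)      # ['$']
```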
Here we will see programs of the findall function.
Program 1
import re
text = 'salesforcedillers is a educational blog'
# findall returns every non-overlapping match of 'es'
x = re.findall('es', text)
print(x)
Output
['es']
Program 2: finding all ‘s’ characters in the string
import re
text = 'salesforcedillers is a educational blog'
# findall returns one list item for each 's' in the text
x = re.findall('s', text)
print(x)
Output
['s', 's', 's', 's']
Program 3: when no match is found
import re
text = 'India is the country of Diversity '
# findall returns an empty list when nothing matches
x = re.findall('Japan', text)
print(x)
Output
[]
Search Program in RegExp
Program
import re
text = "Salesforcedrillers is a educational blog"
# search finds the first whitespace character; start() gives its index
match = re.search(r"\s", text)
print(match.start())
Output
18
Program 2
import re
text = "Salesforcedrillers is a educational blog"
# search returns a match object describing the first match
match = re.search("educational", text)
print(match)
Output
<re.Match object; span=(24, 35), match='educational'>
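When nothing matches, search returns None, so calling start() or group() on the result directly would raise an error. A minimal sketch of checking for that first is shown here; the describe helper is a hypothetical name introduced only for this illustration.

```python
import re

text = "Salesforcedrillers is a educational blog"

def describe(pattern):
    """Hypothetical helper: describe the first match of pattern in text."""
    m = re.search(pattern, text)
    if m is None:  # search returns None when nothing matches
        return "Not match"
    # group() is the matched text, span() is its (start, end) position
    return f"{m.group()} at {m.span()}"

print(describe("educational"))  # educational at (24, 35)
print(describe("Japan"))        # Not match
```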
Split Program in RegExp
Program
import re
text = "Salesforcedrillers is a educational blog"
# split cuts the string at every whitespace character
match = re.split(r"\s", text)
print(match)
Output
['Salesforcedrillers', 'is', 'a', 'educational', 'blog']
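You can also limit the number of splits with the maxsplit parameter of re.split. This sketch on the same text stops after two splits and leaves the rest of the string as the last item.

```python
import re

text = "Salesforcedrillers is a educational blog"

# Split at most twice; the remainder of the string stays in one piece.
first_two = re.split(r"\s", text, maxsplit=2)

print(first_two)  # ['Salesforcedrillers', 'is', 'a educational blog']
```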
Sub Program in RegExp
Program
import re
text = "Salesforcedrillers is a educational blog"
# sub replaces every whitespace character with " Sub "
match = re.sub(r"\s", " Sub ", text)
print(match)
Output
Salesforcedrillers Sub is Sub a Sub educational Sub blog
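By default sub replaces every match. The count parameter of re.sub limits how many matches are replaced, as in this short sketch on the same text.

```python
import re

text = "Salesforcedrillers is a educational blog"

# Replace only the first whitespace character.
first_only = re.sub(r"\s", " Sub ", text, count=1)

print(first_only)  # Salesforcedrillers Sub is a educational blog
```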