python regular expression match, search and findall

Python regular expression match, search and findall

Python beginners may sometimes get confused by this match and search functions in the regular expression module, since they are accepting the same parameters and return the same result in most of the simple use cases.  In this article, let’s discuss about the difference between these two functions.

match vs search in Python regular expression

Let’s start from an example. Let’s say if we want to get the words which ending with “ese” in the languages, both of the below match and search return the same result in match objects.

import re
languages = "Japanese,English"
m = re.match("\w+(?=ese)",languages)
#m returns : <re.Match object; span=(0, 5), match='Japan'>

m = re.search("\w+(?=ese)",languages)
#m returns : <re.Match object; span=(0, 5), match='Japan'>

But if the sequence of your languages changed, e.g. languages = “English, Japanese”, then you will see some different results:

languages = "English,Japanese" 
m = re.match("\w+(?=ese)",languages) 
#m returns empty
m = re.search("\w+(?=ese)",languages) 
#m returns : <re.Match object; span=(8, 13), match='Japan'>

The reason is that match function only starts the matching from the beginning of your string, while search function will start matching from anywhere in your string. Hence if the pattern you want to match may not start from the beginning, you shall always use search function.

In this case, if you want to restrict the matching only start from the beginning, you can also achieve it with search function by specifying “^” in your pattern:

languages = "English,Japanese,Chinese" 
m = re.search("^\w+(?=ese)",languages) 
#m returns empty
m = re.search("\w+(?=ese)",languages)
#m returns: <re.Match object; span=(8, 13), match='Japan'>

findall in Python regular expression

You may also notice when there are multiple occurrences of the pattern, search function only returns the first matched. This sometimes may not be desired when you actually want to see the full list of matched patterns. To return all the occurrences, you can use the findall function:

languages = "English,Japanese,Chinese,Burmese"
m = re.findall("\w+(?=ese)", languages)
#m returns: ['Japan', 'Chin', 'Burm']

 

 

You may also like

0 0 votes
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x