Regular Expressions in Python

Regular expressions let us find content inside strings matching a particular format.

By formulating a regular expression with a special syntax, you can search a string for text in a given format and extract the parts you need.

The re module, part of the Python standard library, gives us a set of tools to work with regular expressions.

In particular, it offers, among others, the re.match() and re.search() functions.

Both take 3 parameters: the pattern, the string to search in, and the optional flags.
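As a minimal sketch of how the two functions differ (the sample text is only for illustration): re.match() looks for a match at the start of the string only, while re.search() scans the whole string. Both return None when nothing matches.

```python
import re

text = 'My dog name is Roger'

# re.match() only succeeds if the pattern matches at the start of the string
at_start = re.match(r'dog', text)

# re.search() scans the whole string and returns the first match it finds
anywhere = re.search(r'dog', text)

print(at_start)          # None
print(anywhere.group())  # dog
```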

Before talking about how to use them, let’s introduce the basics of a regular expression pattern.

The pattern is usually written as a raw string, using the r'' syntax. Inside it, we can use special combinations of characters to capture the values we want.
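Here is a small sketch of why the r prefix helps (the variable names are made up for this example): in a raw string Python leaves backslashes alone, so the pattern reads the same way the regex engine sees it.

```python
import re

# In a raw string, backslashes reach the regex engine untouched
raw_pattern = r'\d\d'

# Without the r prefix, each backslash has to be doubled
plain_pattern = '\\d\\d'

print(raw_pattern == plain_pattern)  # True

found = re.search(raw_pattern, 'Room 42')
print(found.group())                 # 42
```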

For example:

Square brackets define a set of characters to match: [\d\sa] matches a digit, a whitespace character, or the character a. [a-z] matches any character from a to z.
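To see the two sets in action, here is a quick sketch. It uses re.findall(), another function of the re module that returns all matches as a list, and the input strings are only examples.

```python
import re

# [\d\sa] matches one character: a digit, a whitespace character, or the letter a
set_matches = re.findall(r'[\d\sa]', 'a1 b2')
print(set_matches)    # ['a', '1', ' ', '2']

# [a-z] matches one lowercase letter from a to z
range_matches = re.findall(r'[a-z]', 'Roger 7')
print(range_matches)  # ['o', 'g', 'e', 'r']
```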

\ is the escape character. For example, to match a literal dot, use \. in your pattern.
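A short sketch of the difference escaping makes, again using re.findall() on a made-up string:

```python
import re

# An unescaped dot matches any character (except a newline)
any_char = re.findall(r'a.c', 'abc a.c')
print(any_char)     # ['abc', 'a.c']

# An escaped dot matches only a literal dot
literal_dot = re.findall(r'a\.c', 'abc a.c')
print(literal_dot)  # ['a.c']
```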

| means or
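For instance, in this small sketch (the sentence is invented for the example) the pattern matches whichever of the two alternatives appears first in the string:

```python
import re

# dog|cat matches either "dog" or "cat"
animal = re.search(r'dog|cat', 'My cat name is Tom')
print(animal.group())  # cat
```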

Then we have anchors:

Then we have quantity modifiers:
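As a hedged sketch, here are a few common anchors and quantity modifiers from Python's re syntax. This selection is an assumption for illustration, not an exhaustive list, and the sample strings are made up.

```python
import re

# ^ anchors the match at the start of the string, $ at the end
starts_with_my = bool(re.search(r'^My', 'My dog'))
ends_with_dog = bool(re.search(r'dog$', 'My dog'))
starts_with_dog = bool(re.search(r'^dog', 'My dog'))
print(starts_with_my, ends_with_dog, starts_with_dog)  # True True False

# + means one or more of the preceding item
digits = re.search(r'\d+', 'Roger is 8 years old').group()
print(digits)    # 8

# * means zero or more
no_o = re.search(r'wo*f', 'wf').group()
print(no_o)      # wf

# {n} means exactly n repetitions
double_o = re.search(r'o{2}', 'woof').group()
print(double_o)  # oo

# ? means zero or one
optional_s = re.findall(r'dogs?', 'dog dogs')
print(optional_s)  # ['dog', 'dogs']
```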

Parentheses, (<expression>), create a group. Groups are useful because we can capture their content.
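A minimal sketch of capturing with two groups, assuming a made-up string that contains a year and a month:

```python
import re

# Two groups: one for the year, one for the month
result = re.search(r'(\d{4})-(\d{2})', 'Born in 2015-06')

print(result.group())   # 2015-06 (the whole match)
print(result.group(1))  # 2015
print(result.group(2))  # 06
print(result.groups())  # ('2015', '06')
```

Group 0 is always the whole match, and the numbered groups follow the order of the opening parentheses.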

Those 2 examples match the whole string:

import re

re.match('^.*Roger', 'My dog name is Roger')
re.match('.*', 'My dog name is Roger')

Printing the result of either call gives output like this:

<re.Match object; span=(0, 20), match='My dog name is Roger'>

If you assign the result to a result variable and call group() on it, you will see the match:

result = re.match('^.*Roger', 'My dog name is Roger')
print(result.group())
# My dog name is Roger
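One thing to watch for: when the pattern does not match, the function returns None, and calling group() on None raises an AttributeError. A minimal sketch of a safe check, using a made-up string that does not contain "Roger":

```python
import re

result = re.match(r'^.*Roger', 'My cat name is Tom')

# With no match, result is None and result.group() would raise AttributeError
if result:
    print(result.group())
else:
    print('No match')  # this branch runs
```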

Let’s try to get the name of the dog. If you don’t know in advance what the name is going to be, you can look for “name is ” and then add a group, like this:

result = re.search('name is (.*)', 'My dog name is Roger')

result.group() returns the whole match, “name is Roger”, and result.group(1) returns the content of the group, “Roger”:

print(result.group())  # name is Roger
print(result.group(1)) # Roger
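Note that .* is greedy, so the group captures everything up to the end of the string. As a sketch of one possible alternative (the longer sentence and the \w+ pattern are assumptions for this example), you can narrow the group to a single word:

```python
import re

text = 'My dog name is Roger and he is 8'

# .* is greedy: the group runs to the end of the string
greedy = re.search(r'name is (.*)', text)
print(greedy.group(1))  # Roger and he is 8

# \w+ stops at the first character that is not a letter, digit, or underscore
word = re.search(r'name is (\w+)', text)
print(word.group(1))    # Roger
```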

I mentioned that re.search() and re.match() take flags as the 3rd parameter. There are a few possible flags. The most used is re.I, which performs a case-insensitive match.
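A minimal sketch of the flag in use, with an uppercase name made up for the example:

```python
import re

text = 'My dog name is ROGER'

# Without flags the match is case-sensitive, so nothing is found
strict = re.search(r'roger', text)
print(strict)          # None

# re.I (short for re.IGNORECASE) ignores case
relaxed = re.search(r'roger', text, re.I)
print(relaxed.group())  # ROGER
```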

This is just an introduction to regular expressions. Starting from here, there are a lot of rabbit holes you can go into.

I recommend trying your regular expressions on https://regex101.com for correctness. Make sure you choose the Python flavor in the sidebar.
