A regular expression in Python is a sequence of characters that defines a search pattern, which you can use to match or find strings or substrings.
re.sub() is a function in Python's built-in re module. It takes a regular expression pattern as the first argument, the replacement string as the second, and the source string to process as the third.
The re.sub() method accepts at most five arguments and returns the resulting string. To replace one string with another in multiple places in Python, use the re.sub() method.
Why use the re.sub() method?
Use the re.sub() method when you want a new string in which every match of the specified pattern is replaced by the replacement string. The original string is left unchanged.
re.sub(pattern, replacement, string, count=0, flags=0)
- The pattern is a regular expression that you want to match.
- The replacement is the string that replaces each match.
- The string is the input string to search.
- The count argument defines the maximum number of matches the re.sub() method should replace. The default, 0, replaces all matches.
- The flags argument modifies the standard behavior of the pattern.
The re.sub() method searches for the pattern in the string and replaces the matched strings with a replacement string.
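The optional count and flags arguments described above can be sketched as follows. The sample string is made up for illustration, and both arguments are passed by keyword to keep the call readable.

```python
import re

# Hypothetical sample text, chosen only to show the two optional arguments.
text = "Cat cat CAT dog"

# count=1 stops after the first replacement; count=0 (the default) replaces all.
first_only = re.sub(r"cat", "bird", text, count=1)

# flags=re.IGNORECASE makes the pattern match regardless of case.
any_case = re.sub(r"cat", "bird", text, flags=re.IGNORECASE)

print(first_only)  # Cat bird CAT dog
print(any_case)    # bird bird bird dog
```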
How to Use the re.sub() Method?
To use the re.sub() method, import the re module.
# app.py
import re
Now, define a string in which you need to substitute a substring.
text = 'email@example.com firstname.lastname@example.org email@example.com'
In this example, we want to replace the lowercase letters immediately before each @ character with the TONI substring. To do that, write the following code.
# app.py
import re

text = 'email@example.com firstname.lastname@example.org email@example.com'
print(re.sub('[a-z]*@', 'TONI@', text))
Output:
TONI@example.com firstname.TONI@example.org TONI@example.com
Because [a-z] does not match the . character, only lastname is replaced in the second address.
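Note that the class [a-z] matches lowercase letters only, so an address whose name part contains a dot, such as firstname.lastname@example.org, would have only the letters right after the dot replaced. A minimal sketch of one possible adjustment, assuming the name part contains only lowercase letters and dots, is to add the dot to the class:

```python
import re

text = 'email@example.com firstname.lastname@example.org email@example.com'

# Inside [ ], a dot is a literal dot, so [a-z.] also covers names like firstname.lastname.
result = re.sub(r'[a-z.]*@', 'TONI@', text)

print(result)  # TONI@example.com TONI@example.org TONI@example.com
```

Real addresses can contain other characters (digits, +, -), so the class would need to be widened further for them.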
Getting a plain phone number using the re.sub() method
Using the re.sub() method, you can get a plain phone number from a formatted string.
import re

phone_num = "(212)-456-7890"
pattern = r"\D"  # raw string, so the backslash reaches the regex engine unchanged
output = re.sub(pattern, "", phone_num)
print(output)
Output:
2124567890
The re.sub() method takes a pattern, a replacement string, and the input, and returns the result, which is a plain number in our case.
The pattern \D is the negated digit class: it matches any character that is not a digit, so the re.sub() method replaces every non-digit character with the empty string.
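Since \D matches every non-digit, the same call works no matter which separators the number uses. A small sketch with a few made-up formats (the input list is hypothetical):

```python
import re

# Hypothetical inputs: the same number written with different separators.
numbers = ["(212)-456-7890", "212.456.7890", "212 456 7890"]

# Removing every non-digit reduces each format to the same plain number.
plain = [re.sub(r"\D", "", n) for n in numbers]

print(plain)  # ['2124567890', '2124567890', '2124567890']
```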
Replacing multiple substrings with the same string
To replace multiple substrings with the same string in Python, use the re.sub() method.
Even if you are not familiar with regular expressions, you can enclose characters in [ ] to match any one of those characters.
This lets you replace several different characters with the same string.
# app.py
import re

text = 'email@example.com firstname.lastname@example.org email@example.com'
print(re.sub('[ben]', '1', text))
Output:
1mail@1xampl1.com first1am1.last1am1@1xampl1.org 1mail@1xampl1.com
In this example, we replace every b, e, and n with 1. Wherever one of those characters occurs, it is replaced with 1.
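It may help to see how a character class differs from a plain sequence of the same letters. The sketch below uses a made-up string to compare the two patterns:

```python
import re

# Hypothetical sample text for comparing the two patterns.
text = "ben been bean"

# 'ben' matches only the exact three-letter sequence b, e, n.
as_sequence = re.sub(r"ben", "1", text)

# '[ben]' matches each single b, e, or n, wherever it appears.
as_class = re.sub(r"[ben]", "1", text)

print(as_sequence)  # 1 been bean
print(as_class)     # 111 1111 11a1
```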
Replacing using the matched part
If part of the pattern is enclosed in parentheses (), you can reuse the text matched by that part in the replacement string. See the following code.
# app.py
import re

text = 'email@example.com firstname.lastname@example.org email@example.com'
print(re.sub('([a-z]*)@', '\\1-123@', text))
Output:
email-123@example.com firstname.lastname-123@example.org email-123@example.com
\1 corresponds to the part matched by the first (). If there are multiple (), refer to them as \2, \3, and so on. In a regular string surrounded by '' or "", you must escape the backslash and write \\1, but in a raw string with r at the beginning, like r'', you can write \1.
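As a sketch of using several groups at once, the following splits each address into three groups and refers to them as \1, \2, and \3. The pattern and the replacement text are illustrative assumptions, not a general email parser.

```python
import re

text = 'email@example.com firstname.lastname@example.org'

# Three groups: letters before @, letters before the dot, letters after the dot.
pattern = r'([a-z]+)@([a-z]+)\.([a-z]+)'

# Raw replacement string: \1, \2, \3 can be written directly.
with_raw = re.sub(pattern, r'\1 at \2 dot \3', text)

# Regular replacement string: each backslash must be doubled.
with_escaped = re.sub(pattern, '\\1 at \\2 dot \\3', text)

print(with_raw)                  # email at example dot com firstname.lastname at example dot org
print(with_raw == with_escaped)  # True
```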
That is it.