Python re.sub() Method: The Basics


A regular expression is a sequence of characters that defines a search pattern. Python's re module uses such patterns to match or find strings and substrings.

Python re.sub

re.sub() is a function in Python's built-in re module. It takes a regular expression pattern as the first argument, the replacement string as the second argument, and the source string to process as the third argument.

re.sub() accepts up to five arguments and returns a new string with the replacements applied. To replace every match of a pattern in a string, use re.sub().

Why use the re.sub() method?

Use re.sub() when the text to replace is described by a pattern and not by one fixed substring. It returns a string in which every non-overlapping match of the pattern is replaced with the replacement string.

Syntax

re.sub(pattern, replacement, string, count=0, flags=0)

Parameters

  1. The pattern is a regular expression that you want to match.
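  2. The replacement is the string that replaces each match. It can also be a function that takes a match object and returns the replacement string.
  3. The string is the input string to search.
  4. The count argument is the maximum number of matches that re.sub() should replace. The default, 0, replaces all matches.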
  5. The flags argument modifies the standard behavior of the pattern.
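
The optional parameters are easiest to see side by side. The following is a minimal sketch, with a made-up sample string, that passes count and flags as keyword arguments and also uses a function as the replacement.

```python
import re

text = "cat cat CAT dog"  # sample input, chosen only for illustration

# count=1 limits the substitution to the first match.
first_only = re.sub(r"cat", "bird", text, count=1)

# flags=re.IGNORECASE lets the pattern match regardless of letter case.
any_case = re.sub(r"cat", "bird", text, flags=re.IGNORECASE)

# The replacement can be a function: it receives each match object
# and returns the string to put in its place.
shouted = re.sub(r"cat", lambda m: m.group(0).upper(), text)

print(first_only)  # bird cat CAT dog
print(any_case)    # bird bird bird dog
print(shouted)     # CAT CAT CAT dog
```

Passing count and flags by keyword keeps the call readable and avoids mixing up the two integer arguments.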

Return value

The re.sub() method returns a new string in which the matches of the pattern are replaced with the replacement string. If the pattern is not found, the input string is returned unchanged.
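
A short sketch of the return behavior, using made-up sample strings: the input is never modified in place, and a pattern with no match gives back the same text.

```python
import re

source = "room 101"
replaced = re.sub(r"\d+", "#", source)

# Strings are immutable, so source keeps its original value.
print(replaced)  # room #
print(source)    # room 101

# When nothing matches, the result equals the input.
no_match = re.sub(r"\d+", "#", "no digits here")
print(no_match)  # no digits here
```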

How to Use the re.sub() Method?

To use the re.sub() method, import the re module.

# app.py

import re

Now, define a string in which you need to substitute a substring.

# Named text and not str, so the built-in str type is not shadowed
text = 'abel@xxx.com selena@yyy.com benny@zzz.com'

In this example, we need to replace the substring before each @ character with TONI. To do that, write the following code.

# app.py

import re

text = 'abel@xxx.com selena@yyy.com benny@zzz.com'

# Replace the run of lowercase letters before each @ with TONI
print(re.sub(r'[a-z]*@', 'TONI@', text))

Output

TONI@xxx.com TONI@yyy.com TONI@zzz.com

Getting a plain phone number using the re.sub() method

Using the re.sub() method, you can extract a plain phone number from a formatted string.

import re

phone_num = "(212)-456-7890"
pattern = r"\D"  # raw string, so \D reaches the regex engine as written
output = re.sub(pattern, "", phone_num)

print(output)

Output

2124567890

As you can see, re.sub() accepts a pattern, a replacement string, and an input string, and returns the result, which is a plain number in our case.

\D is the negated digit class: it matches any character that is not a digit, so re.sub() replaces every non-digit character with the empty string.
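
The same pattern works for other phone formats too. Here is a small sketch that wraps it in a helper; the name digits_only and the sample numbers are made up for illustration.

```python
import re

def digits_only(raw: str) -> str:
    """Return raw with every non-digit character removed (hypothetical helper)."""
    return re.sub(r"\D", "", raw)

samples = ["(212)-456-7890", "212.456.7890", "+1 212 456 7890"]
cleaned = [digits_only(s) for s in samples]

print(cleaned)  # ['2124567890', '2124567890', '12124567890']
```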

Replacing multiple substrings with the same string

To replace multiple substrings with the same string in Python, use the re.sub() method.

If you are not accustomed to regular expressions, enclose a set of characters in [ ] to match any one of those characters.

This lets you replace several different characters with the same string.

# app.py

import re

text = 'abel@xxx.com selena@yyy.com benny@zzz.com'

# [ben] matches a single character: b, e, or n
print(re.sub(r'[ben]', '1', text))

Output

a11l@xxx.com s1l11a@yyy.com 1111y@zzz.com

In this example, we replace the characters b, e, and n with 1. Every occurrence of any of those characters, wherever it appears, is replaced with 1.
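
A character class such as [ben] only ever matches one character at a time. To replace whole substrings with the same string, the alternation operator | can be used instead. A minimal sketch, reusing the sample addresses with a made-up replacement:

```python
import re

emails = "abel@xxx.com selena@yyy.com benny@zzz.com"

# abel|benny matches either complete word, not individual letters.
masked = re.sub(r"abel|benny", "user", emails)

print(masked)  # user@xxx.com selena@yyy.com user@zzz.com
```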

Replacing using the matched part

If part of the pattern is enclosed in parentheses (), you can reuse the text matched by that part in the replacement string. See the following code.

# app.py

import re

text = 'abel@xxx.com selena@yyy.com benny@zzz.com'

# \\1 inserts the text captured by the first group
print(re.sub(r'([a-z]*)@', '\\1-123@', text))

Output

abel-123@xxx.com selena-123@yyy.com benny-123@zzz.com

\1 refers to the text matched by the first (). If there are multiple groups, refer to them as \2, \3, and so on. In a regular string enclosed in '' or "", the backslash must be escaped, as in '\\1'. In a raw string, prefixed with r as in r'', you can write \1 directly.
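
A brief sketch of these backreference forms with raw strings, reusing the sample addresses. The \g<1> spelling is the re module's alternative to \1, which helps when the reference is followed directly by a digit.

```python
import re

emails = "abel@xxx.com selena@yyy.com benny@zzz.com"

# Raw string: \1 needs no doubled backslash.
tagged = re.sub(r"([a-z]*)@", r"\1-123@", emails)

# Two groups: \1 is the name before @, \2 is the domain name.
swapped = re.sub(r"([a-z]+)@([a-z]+)\.com", r"\2@\1.com", emails)

# \g<1> keeps the group number separate from the digits that follow it.
numbered = re.sub(r"([a-z]+)@", r"\g<1>123@", emails)

print(tagged)    # abel-123@xxx.com selena-123@yyy.com benny-123@zzz.com
print(swapped)   # xxx@abel.com yyy@selena.com zzz@benny.com
print(numbered)  # abel123@xxx.com selena123@yyy.com benny123@zzz.com
```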

That is it.
