How to Check If a String is a Valid Email Address in Python

If you collect users’ email addresses, for example for marketing purposes, you need to make sure that each address is correctly formatted and potentially usable.

Here are some basic rules that a valid email address must follow:

The local part (the text before “@”) should not start with a dot.
The address should not contain consecutive dots.
The local and domain parts should contain only valid characters.
The address should respect length restrictions (total length, local part length, domain length).
The domain may be an internationalized domain name (IDN), which needs special handling.
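Several of these rules can be checked with plain string operations before any regex is involved. The following is a minimal sketch, not a complete validator: the function name check_basic_rules and the three limit constants are assumptions chosen for illustration (64, 255, and 254 are commonly cited figures), and it does not check which characters are allowed.

```python
# Commonly cited limits; treat the exact numbers as assumptions.
MAX_LOCAL_LENGTH = 64
MAX_DOMAIN_LENGTH = 255
MAX_TOTAL_LENGTH = 254


def check_basic_rules(email):
    """Return a list of rule violations; an empty list means none were found."""
    # Split on the last "@" so the domain is everything after it.
    local, at, domain = email.rpartition("@")
    if not at or not local or not domain:
        return ["missing local part, '@', or domain"]

    problems = []
    if local.startswith("."):
        problems.append("local part starts with a dot")
    if ".." in email:
        problems.append("consecutive dots")
    if len(local) > MAX_LOCAL_LENGTH:
        problems.append("local part too long")
    if len(domain) > MAX_DOMAIN_LENGTH:
        problems.append("domain too long")
    if len(email) > MAX_TOTAL_LENGTH:
        problems.append("address too long")
    return problems


print(check_basic_rules("user@example.com"))
print(check_basic_rules(".user..name@example.com"))
```

Returning a list of problems instead of a single boolean makes it easier to tell a user what to fix.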

Here are three ways to check if a string is a valid email address in Python:

Using the “re” module
Using the “email_validator” module
Using the built-in “email.utils.parseaddr()” function

Method 1: Using “re” module

The “re.match()” function tries to match a regular expression pattern at the start of a string. It returns a match object on success and None otherwise. Comparing that result with None using the “is not” operator gives True for a valid email address and False otherwise.

import re


def is_valid_email(email):
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return re.match(pattern, email) is not None


# Usage
print(is_valid_email("user@example.com"))
print(is_valid_email("invalid-email"))

Output

True
False

This approach is efficient for basic email validation and quick to implement. However, it may not catch all edge cases or comply with RFC 5322 standards.
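To make those edge cases concrete, here is a small sketch that feeds a few questionable inputs to the same simple pattern. The helper names matches_simple and matches_whole_string are made up for this illustration. The first three inputs break the dot rules but are accepted by both helpers. The last one ends with a newline, which “re.match()” tolerates because “$” also matches just before a trailing newline; “re.fullmatch()” rejects it.

```python
import re

# The same simple pattern used in is_valid_email() above.
SIMPLE_PATTERN = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'


def matches_simple(email):
    # Mirrors the original function: match from the start of the string.
    return re.match(SIMPLE_PATTERN, email) is not None


def matches_whole_string(email):
    # fullmatch() requires the entire string, including any trailing
    # newline, to be consumed by the pattern.
    return re.fullmatch(SIMPLE_PATTERN, email) is not None


candidates = [
    ".user@example.com",       # leading dot in the local part
    "user..name@example.com",  # consecutive dots in the local part
    "user@example..com",       # consecutive dots in the domain
    "user@example.com\n",      # trailing newline
]

for candidate in candidates:
    print(repr(candidate), matches_simple(candidate), matches_whole_string(candidate))
```

If input can arrive with stray whitespace, stripping it first or switching to “re.fullmatch()” is a cheap safeguard.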

To check more strictly, you can expand the regular expression with additional rules, such as a ban on leading and consecutive dots, length limits, and support for Punycode (“xn--”) top-level domains:

import re

email_regex = re.compile(r"""
    ^(?!\.)((?!.*\.{2})[a-zA-Z0-9\u00C0-\u02FF\u0370-\u1EFF]
    [\u00C0-\u02FF\u0370-\u1EFF\w\-\.!#$%&'*+\/=?^`{|}~\x27]{0,63})
    @(?=.{1,255}$)(?!-)([a-zA-Z0-9][a-zA-Z0-9\-]{0,62}[a-zA-Z0-9])
    \.(?:[a-zA-Z]{2,}|xn--[a-zA-Z0-9]{2,})$
""", re.VERBOSE | re.UNICODE)


def is_valid_email(email):
    return bool(email_regex.match(email))


# Usage
print(is_valid_email("user@example.com"))
print(is_valid_email("invalidemail@invalid.co.org"))
print(is_valid_email("krunal@appdividend.com"))
print(is_valid_email("krunal@appdividend"))

Output

True
False
True
False

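In this code, we wrote a stricter regex pattern that encodes several of the rules listed earlier. If the input string breaks any of them, the function returns False. Note that this pattern allows only a single label between “@” and the top-level domain, which is why “invalidemail@invalid.co.org” is rejected. Well-formed addresses on multi-label domains, such as “kb_smith@company.co.uk”, are rejected for the same reason, so extend the domain part of the pattern if you need to accept them.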
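The “xn--” prefix in the pattern is the ASCII (Punycode) form of an internationalized name. One possible way to bring a Unicode domain into that form before matching is Python’s built-in “idna” codec, as in the sketch below. The helper name domain_to_ascii is an assumption for illustration, and the built-in codec implements the older IDNA 2003 rules, so treat this as a starting point, not a complete IDN solution.

```python
def domain_to_ascii(email):
    """Return the address with its domain in ASCII form, or None on failure."""
    local, at, domain = email.rpartition("@")
    if not at or not domain:
        return None
    try:
        # The standard library "idna" codec converts each label to Punycode.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        # Raised, for example, for empty or overly long labels.
        return None
    return f"{local}@{ascii_domain}"


print(domain_to_ascii("user@bücher.example"))
print(domain_to_ascii("user@example.com"))
```

The converted address can then be passed to whichever validation function you use.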

Validate emails from a text file

If you want to sort the email addresses in a text file into valid and invalid ones and save the result to a new file, combine file operations with the re.match() function.

Here is our “emails.txt” file, which the script will read to find the valid and invalid email addresses:

john.english@example.com
invalid.email@
kb_smith@company.co.uk
not_an_email
user456@subdomain.example.org
missingatsymbol.com
test+alias@gmail.com
spaces are @not.allowed
lastexample@valid.io

Here is our main Python code:

import re
from pathlib import Path


def is_valid_email(email):
    # This is a simple regex pattern.
    # You can replace it with more complex validation if needed.
    pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return re.match(pattern, email) is not None


def validate_emails_from_file(input_file, output_file):
    input_path = Path(input_file)
    output_path = Path(output_file)

    if not input_path.exists():
        print(f"Error: Input file '{input_file}' does not exist.")
        return

    valid_emails = []
    invalid_emails = []

    with input_path.open('r') as file:
        for line in file:
            email = line.strip()
            if is_valid_email(email):
                valid_emails.append(email)
            else:
                invalid_emails.append(email)

    with output_path.open('w') as file:
        file.write("Valid emails:\n")
        for email in valid_emails:
            file.write(f"{email}\n")
        file.write("\nInvalid emails:\n")
        for email in invalid_emails:
            file.write(f"{email}\n")

    print(f"Validation complete. Results written to '{output_file}'.")
    print(f"Valid emails: {len(valid_emails)}")
    print(f"Invalid emails: {len(invalid_emails)}")


# Usage
input_file = "emails.txt"
output_file = "validation_results.txt"
validate_emails_from_file(input_file, output_file)

Output

Validation complete. Results written to 'validation_results.txt'.
Valid emails: 5
Invalid emails: 4

Here is the output “validation_results.txt” file:

Valid emails:
john.english@example.com
kb_smith@company.co.uk
user456@subdomain.example.org
test+alias@gmail.com
lastexample@valid.io

Invalid emails:
invalid.email@
not_an_email
missingatsymbol.com
spaces are @not.allowed

First, we read the text file line by line, checked each email address against the regex pattern, and built two lists: one for valid emails and one for invalid emails. Then we wrote both lists to a new file, whose contents are shown above.
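One detail worth noting is that the script above counts a blank line as an invalid email. The sketch below separates the sorting step into its own function and skips blank lines. The name partition_emails is hypothetical, and io.StringIO stands in for a real file so that the example runs on its own.

```python
import io
import re

# Same simple pattern as in the script above, compiled once for reuse.
PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')


def partition_emails(lines):
    """Split an iterable of lines into (valid, invalid) lists."""
    valid, invalid = [], []
    for line in lines:
        email = line.strip()
        if not email:
            continue  # a blank line is not an address, so skip it
        if PATTERN.match(email):
            valid.append(email)
        else:
            invalid.append(email)
    return valid, invalid


# An in-memory stand-in for an open file; note the blank second line.
sample = io.StringIO(
    "john.english@example.com\n\ninvalid.email@\ntest+alias@gmail.com\n"
)
valid, invalid = partition_emails(sample)
print(valid)
print(invalid)
```

Because the function accepts any iterable of lines, it works equally well with an open file object or a list of strings.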

Method 2: Using “email_validator” module

If you don’t want to use regular expressions, you can use the third-party “email-validator” library, which is imported as “email_validator”. It provides a validate_email() function that raises EmailNotValidError when the address is not valid.

You can install the library using the command below:

pip install email_validator

Here is the code for the email_validator module:

from email_validator import validate_email, EmailNotValidError


def valid_email_using_email_validator(email):
    try:
        validate_email(email)
        return True
    except EmailNotValidError:
        return False


print(valid_email_using_email_validator("john.english@example.com"))
print(valid_email_using_email_validator("invalid.email@"))
print(valid_email_using_email_validator("kb_smith@company.co.uk"))
print(valid_email_using_email_validator("not_an_email"))
print(valid_email_using_email_validator("user456@subdomain.example.org"))
print(valid_email_using_email_validator("user@example.com"))
print(valid_email_using_email_validator("invalid-email"))

Output

True
False
True
False
True
True
False

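This approach is robust, requires less code, and follows email standards such as RFC 5322 far more closely than a hand-written regex. It handles internationalized email addresses and provides detailed error messages when validation fails. Keep in mind that validate_email() also checks by default whether the domain can receive mail, using a DNS lookup, so results can depend on network access; pass check_deliverability=False to validate the syntax only. The trade-off is that this method requires installing a third-party package and may be overkill for simple use cases.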

Method 3: Using built-in “email.utils” module

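Python’s standard library includes the “email.utils” module, whose “parseaddr()” function splits a string into a (display name, email address) tuple. The address part is empty when parsing fails, so checking whether it contains “@” gives a rough validity test.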

from email.utils import parseaddr


def is_valid_email_using_parseaddr(email):
    return '@' in parseaddr(email)[1]


print(is_valid_email_using_parseaddr("john.english@example.com"))
print(is_valid_email_using_parseaddr("invalid.email@"))
print(is_valid_email_using_parseaddr("kb_smith@company.co.uk"))
print(is_valid_email_using_parseaddr("not_an_email"))
print(is_valid_email_using_parseaddr("user456@subdomain.example.org"))
print(is_valid_email_using_parseaddr("user@example.com"))
print(is_valid_email_using_parseaddr("invalid-email"))

Output

True
False
True
False
True
True
False

If you want to use a library, I highly recommend “email_validator” over “email.utils”, because parseaddr() only parses the string and doesn’t validate the structure of the email address.
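To illustrate that limitation, the sketch below shows an address without a top-level domain passing the “@” check. It then combines parseaddr() with the simple regex from Method 1, so that parseaddr() extracts the address and the regex checks its shape. The helper name extract_and_validate is an assumption made for this example.

```python
import re
from email.utils import parseaddr

# The simple pattern from Method 1, without anchors because fullmatch() is used.
ADDRESS_PATTERN = re.compile(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}')


def extract_and_validate(raw):
    """Extract the address from a header-style string, then check its shape."""
    _display_name, address = parseaddr(raw)
    return address if ADDRESS_PATTERN.fullmatch(address) else None


# parseaddr() alone accepts an address with no dot in the domain.
print('@' in parseaddr("user@localhost")[1])

# With the extra regex check, only well-shaped addresses are returned.
print(extract_and_validate("John Doe <john@example.com>"))
print(extract_and_validate("user@localhost"))
```

This pairing is useful when the input may include a display name, which the regex alone would reject.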

Conclusion

For basic email validation, a simple regular expression with the “re” module is enough. For stricter validation, either write a more comprehensive regex pattern or, preferably, use the “email_validator” library, which follows the email RFCs much more closely.


Krunal Lathiya

With a career spanning over eight years in the field of Computer Science, Krunal’s expertise is rooted in a solid foundation of hands-on experience, complemented by a continuous pursuit of knowledge.