Here are four ways to extract a string between quotes in Python:
- Using a regular expression
- Using str.split()
- Using a custom parser
- Using startswith(), endswith(), and replace()
Whether you are parsing structured data, processing user input, extracting meaningful content, or cleaning and normalizing data, you often need a way to pull out the text between quotes.
Method 1: Using a regular expression
Python's "re" module provides re.findall(), which returns every non-overlapping match of a pattern. With the pattern "([^"]*)", the capture group matches zero or more characters other than a double quote, so each match is the text between one pair of quotes.
import re

def quote_extraction(text):
    # Capture every run of non-quote characters between a pair of double quotes
    return re.findall(r'"([^"]*)"', text)

main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(quote_extraction(main_str))
Output
['Paryushana', 'best', 'Holiest']
Because the pattern contains one capture group, findall() returns a list of the captured substrings, without the surrounding double quotes ("").
I would highly recommend this approach because it finds all matching substrings, not just the first one. Furthermore, it is concise and easy to read.
However, this pattern does not handle escaped quotes inside a quoted string (e.g., "like \"this\""): the match ends at the first escaped quote.
Time Complexity: O(n), where n is the length of the input string.
Space Complexity: O(m), where m is the total length of all matched substrings.
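The escaped-quote limitation of Method 1 can be worked around with a slightly longer pattern. The following is a minimal sketch, not part of the original method: the pattern, the name quote_extraction_escaped(), and the choice to strip the escaping backslashes from the result are all assumptions made for illustration.

```python
import re

# Assumed pattern: inside the quotes, accept either a character that is
# neither a quote nor a backslash, or a backslash followed by any character.
ESCAPED_QUOTE_PATTERN = re.compile(r'"((?:[^"\\]|\\.)*)"')

def quote_extraction_escaped(text):
    matches = ESCAPED_QUOTE_PATTERN.findall(text)
    # Drop the escaping backslashes so that \" becomes " in the result
    return [re.sub(r'\\(.)', r'\1', match) for match in matches]

sample = r'He said "like \"this\"" and "plain"'
print(quote_extraction_escaped(sample))  # ['like "this"', 'plain']
```

If the backslashes should be preserved, returning matches directly would likely be enough.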
Method 2: Using str.split()
Calling str.split('"') splits the string at every double quotation mark and returns a list. When the quotes are properly paired, the odd-indexed elements of that list are the quoted substrings, so a list comprehension over those indices collects them.
def quote_extraction_split(text):
    parts = text.split('"')
    # Odd-indexed parts are the ones that sat between a pair of quotes
    return [parts[i] for i in range(1, len(parts), 2)]

main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(quote_extraction_split(main_str))
Output
['Paryushana', 'best', 'Holiest']
If you know that the input strings are well-formed and all quotes are properly paired, this approach works well.
However, it also does not handle escaped quotes, and it returns an empty string for every empty pair of quotes ("").
Time Complexity: O(n), where n is the length of the input string.
Space Complexity: O(n + m), where m is the size of the quoted strings.
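Method 2 assumes paired quotes and returns empty strings for empty pairs. One possible way to guard against both is sketched below. The function name quote_extraction_split_strict() is hypothetical, and ignoring the text after an unmatched final quote is an assumption about the desired behaviour, not something the original method does.

```python
def quote_extraction_split_strict(text):
    parts = text.split('"')
    # With balanced quotes len(parts) is odd; stopping at len(parts) - 1
    # leaves out the text that follows an unmatched final quote.
    quoted = [parts[i] for i in range(1, len(parts) - 1, 2)]
    # Skip empty pairs of quotes
    return [item for item in quoted if item]

print(quote_extraction_split_strict('say "" and "x"'))      # ['x']
print(quote_extraction_split_strict('open "never closed'))  # []
```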
Method 3: Using a custom parser
In this approach, we write a function called quote_extraction_custom() that walks through the string one character at a time and tracks whether it is currently inside quotes. Characters seen inside quotes are accumulated, and each closing quote appends the accumulated text to the result list. It handles multiple quoted strings in the input and, because of the current_word check, skips empty pairs of quotes.
def quote_extraction_custom(s):
    result = []
    current_word = ""
    in_quotes = False
    for char in s:
        if char == '"':
            in_quotes = not in_quotes
            if not in_quotes and current_word:  # End of a quoted section
                result.append(current_word)
                current_word = ""
        elif in_quotes:
            current_word += char
    return result

main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(quote_extraction_custom(main_str))
Output
['Paryushana', 'best', 'Holiest']
Time Complexity: O(n), where n is the length of the input string.
Space Complexity: O(n + m), where m is the total length of all quoted substrings.
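One advantage of a hand-written parser like the one in Method 3 is that it can be extended with extra state. As a rough sketch, the variant below adds an "escaped" flag so that \" inside a quoted section is kept as a literal quote. The function name and the escaping rule (a backslash makes the next character literal, inside or outside quotes) are assumptions for illustration. Unlike the parser above, this version keeps empty quoted strings.

```python
def quote_extraction_custom_escaped(s):
    result = []
    current = []        # characters collected for the current quoted section
    in_quotes = False
    escaped = False     # True when the previous character was a backslash
    for char in s:
        if escaped:
            # Keep the escaped character literally if we are inside quotes
            if in_quotes:
                current.append(char)
            escaped = False
        elif char == "\\":
            escaped = True
        elif char == '"':
            if in_quotes:
                # Closing quote: store the finished section
                result.append("".join(current))
                current = []
            in_quotes = not in_quotes
        elif in_quotes:
            current.append(char)
    return result

sample = r'He said "like \"this\"" and "plain"'
print(quote_extraction_custom_escaped(sample))  # ['like "this"', 'plain']
```

Collecting characters in a list and joining once per section avoids repeated string concatenation, though for short inputs the difference is unlikely to matter.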
Method 4: Using startswith(), endswith(), and replace()
You can combine string methods such as split(), startswith(), endswith(), and replace() with a list comprehension: split the string on whitespace, keep the words that both start and end with a double quote, and strip the quotes from them.
def quote_extraction_string_methods(text):
    words = text.split()  # Split the string into whitespace-separated words
    # Keep only the words wrapped in double quotes, then strip the quotes
    quoted_words = [word.replace('"', '') for word in words
                    if word.startswith('"') and word.endswith('"')]
    return quoted_words

main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(quote_extraction_string_methods(main_str))
Output
['Paryushana', 'best', 'Holiest']
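This approach is easy to read, but it works only when each quoted string is a single word with no adjacent punctuation. Because the string is split on whitespace first, a quoted phrase that contains spaces, such as "very best", is not extracted.
Time Complexity: O(n), where n is the length of the input string.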
Space Complexity: O(n + m), where m is the total size of the quoted strings.
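Because Method 4 splits on whitespace before it looks for quotes, a quoted phrase containing a space is never matched. The short sketch below shows that behaviour next to the regular expression from Method 1. The sample sentence is made up for illustration.

```python
import re

def quote_extraction_string_methods(text):
    words = text.split()
    return [word.replace('"', '') for word in words
            if word.startswith('"') and word.endswith('"')]

phrase = 'It was the "very best" day'
# '"very' and 'best"' are separate words, so neither passes both checks
print(quote_extraction_string_methods(phrase))  # []
# The regular expression from Method 1 captures the whole phrase
print(re.findall(r'"([^"]*)"', phrase))         # ['very best']
```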
That’s all!


