How to Extract String from Between Quotations in Python

  • 19 Sep, 2024

Here are four ways to extract a string between quotes in Python:

  1. Using a regular expression
  2. Using str.split()
  3. Using a custom parser
  4. Using startswith(), endswith(), and replace()

Parsing structured data, processing user input, and cleaning or normalizing text all commonly require pulling out the substrings that sit between quotation marks.

Method 1: Using a regular expression

Python’s re module provides the re.findall() function. Given a pattern whose capture group matches zero or more characters that are not a double quote, findall() returns the text between each pair of double quotes.

import re


def quote_extraction(text):
    # Capture everything between each pair of double quotes
    return re.findall(r'"([^"]*)"', text)


main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(quote_extraction(main_str))

Output

['Paryushana', 'best', 'Holiest']

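Because the pattern contains a single capture group, findall() returns a list of the captured substrings: the text inside each pair of double quotes, without the quotes themselves.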
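To make the pattern easier to read, here is a minimal sketch that spells out the same expression with re.VERBOSE so each part can carry a comment. The name QUOTED is only an illustrative choice, not something from the original function.

```python
import re

# QUOTED is a hypothetical name; the pattern is the same one used above,
# written in verbose mode so each piece can be annotated.
QUOTED = re.compile(
    r'''
    "          # opening double quote
    ( [^"]* )  # capture group: zero or more characters that are not a double quote
    "          # closing double quote
    ''',
    re.VERBOSE,
)

text = 'The festival of "Paryushana" is the "best" and "Holiest"'
print(QUOTED.findall(text))  # ['Paryushana', 'best', 'Holiest']
```

Compiling the pattern once is also convenient when the same extraction runs over many strings.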

I would highly recommend this approach because it finds all matching substrings, not just the first one. Furthermore, it is concise and easy to read.

However, this pattern doesn’t handle escaped quotes within the string (e.g., "like \"this\""): it treats the first escaped quote as the end of the match.

Time Complexity: O(n), where n is the length of the input string.

Space Complexity: O(m), where m is the total length of all matched substrings.
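If the input may contain escaped quotes, one possible adjustment is to let the pattern accept a backslash followed by any character inside the quotes. The sketch below assumes a backslash escapes the next character (as in JSON or C-style strings); the function name quote_extraction_escaped is hypothetical. Note that the backslashes are kept in the result exactly as they appear in the input.

```python
import re

# Assumption: a backslash escapes the character that follows it.
# Inside the quotes, accept either a normal character or a backslash pair.
ESCAPED_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def quote_extraction_escaped(text):
    """Return quoted substrings, leaving backslash escapes untouched."""
    return ESCAPED_QUOTED.findall(text)


sample = r'He said "like \"this\"" and then "left"'
print(quote_extraction_escaped(sample))  # ['like \\"this\\"', 'left']
```

This sketch does not account for a backslash that appears outside a quoted section, so it is only suitable when escapes occur inside quotes.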

Method 2: Using str.split()

We can use str.split() to split the string at every double quotation mark, which returns a list. In that list, the elements at odd indices are the ones that sat between quotes, so a list comprehension over those indices collects all the quoted strings.

def quote_extraction_split(text):
    parts = text.split('"')
    # Odd-indexed parts are the ones that sat between quotes
    return [parts[i] for i in range(1, len(parts), 2)]


main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'

print(quote_extraction_split(main_str))

Output

['Paryushana', 'best', 'Holiest']

If you know that the input strings are well-formed and all quotes are properly paired, you can use this approach.

However, this approach also does not handle escaped quotes, and it will include an empty string in the result wherever two quotes appear back to back (""). If a quote is left unclosed, the text after it is returned as if it had been quoted.

Time Complexity: O(n), where n is the length of the input string.

Space Complexity: O(n + m), where m is the size of the quoted strings.
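The edge cases of the split-based approach can be guarded against with a small check. The following is a sketch under the assumption that unbalanced quotes should be rejected and empty quoted strings dropped; quote_extraction_split_checked is a hypothetical name, and raising ValueError is one choice among several.

```python
def quote_extraction_split_checked(text):
    """Split-based extraction that rejects unbalanced quotes (illustrative)."""
    # An odd number of quote characters means at least one quote is unpaired.
    if text.count('"') % 2 != 0:
        raise ValueError("unbalanced double quotes")
    parts = text.split('"')
    # Odd-indexed parts were between quotes; skip the empty ones.
    return [part for part in parts[1::2] if part]


print(quote_extraction_split_checked('An empty pair "" and a "word"'))  # ['word']
```

Whether empty strings should be dropped or kept depends on the data; removing the `if part` filter keeps them.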

Method 3: Using a custom parser

In this approach, we write a custom function, quote_extraction_custom(), that walks through the string one character at a time. It toggles a flag at every double quote, collects characters while the flag is set, and appends the collected text to the result list when the closing quote is reached. Empty quoted sections are skipped, and any text after an unmatched opening quote is discarded.

def quote_extraction_custom(s):
    result = []
    current_word = ""
    in_quotes = False
    for char in s:
        if char == '"':
            in_quotes = not in_quotes
            if not in_quotes and current_word:  # End of a quoted section
                result.append(current_word)
                current_word = ""
        elif in_quotes:
            current_word += char
    return result


main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'

print(quote_extraction_custom(main_str))

Output

['Paryushana', 'best', 'Holiest']

Time Complexity: O(n), where n is the length of the input string.

Space Complexity: O(n + m), where m is the total length of all quoted substrings.
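One reason to write the parser by hand is that it can be extended with extra state. As a sketch, the variant below adds an escaped flag so that a backslash-escaped quote does not end the quoted section. It assumes a backslash escapes the next character; the function name quote_extraction_escape_aware is hypothetical.

```python
def quote_extraction_escape_aware(text):
    """Illustrative variant of the character parser that honors backslash escapes."""
    result = []
    current = []        # characters of the quoted section being built
    in_quotes = False
    escaped = False     # True when the previous character was a backslash
    for char in text:
        if escaped:
            # The character after a backslash is never treated as a quote;
            # inside a quoted section it is kept literally.
            if in_quotes:
                current.append(char)
            escaped = False
        elif char == "\\":
            escaped = True
        elif char == '"':
            in_quotes = not in_quotes
            if not in_quotes and current:  # end of a non-empty quoted section
                result.append("".join(current))
                current = []
        elif in_quotes:
            current.append(char)
    return result


sample = r'He said "like \"this\"" and then "left"'
print(quote_extraction_escape_aware(sample))  # ['like "this"', 'left']
```

Unlike a regex that only matches around the escapes, this version drops the backslashes from the result. Collecting characters in a list and joining them at the end avoids repeated string concatenation.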

Method 4: Using startswith(), endswith(), and replace()

You can combine string methods such as split(), startswith(), endswith(), and replace() with a list comprehension to build a list of the words that are wrapped in double quotes.

def quote_extraction_string_methods(text):
    words = text.split()  # Split the string into whitespace-separated words
    # Keep words that start and end with a double quote, then strip the quotes
    return [
        word.replace('"', '')
        for word in words
        if word.startswith('"') and word.endswith('"')
    ]


main_str = 'The festival of "Paryushana" is the "best" and "Holiest"'

print(quote_extraction_string_methods(main_str))

Output

['Paryushana', 'best', 'Holiest']

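This approach is efficient and easy to read, but it only works when each quoted item is a single word delimited by whitespace. A multi-word phrase such as "hello world", or a quoted word followed directly by punctuation, is not picked up.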

Time Complexity: O(n), where n is the length of the input string.

Space Complexity: O(n + m), where m is the total size of the quoted strings.
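To see where a word-by-word check falls short, the short comparison below runs it against the regular expression from Method 1 on a sentence with a multi-word quote and trailing punctuation. The sample sentence is made up for illustration.

```python
import re


def quote_extraction_string_methods(text):
    words = text.split()
    return [
        word.replace('"', '')
        for word in words
        if word.startswith('"') and word.endswith('"')
    ]


# Illustrative input: one multi-word quote, and quotes followed by punctuation.
sample = 'She read "War and Peace", then "Emma".'

print(quote_extraction_string_methods(sample))  # []
print(re.findall(r'"([^"]*)"', sample))         # ['War and Peace', 'Emma']
```

No word in the sample both starts and ends with a double quote, so the string-method version returns an empty list, while the regex still finds both quoted strings.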

That’s all!

Krunal Lathiya

With a career spanning over eight years in the field of Computer Science, Krunal’s expertise is rooted in a solid foundation of hands-on experience, complemented by a continuous pursuit of knowledge.