Padding a string with zeros has its advantages, but it can be annoying too. Imagine you are working on a machine learning project and come across a dataset with a column of strings, each padded with leading or trailing zeros.
Real-life example
Picture this: you are working on an inventory management system for an electronics store. The store uses a barcode system to track its products. However, due to some eccentricities in the barcode scanner and database system, the product codes end up with unnecessary leading and trailing zeros like this:
Product Codes:
Digital Camera: "000045678000"
Smartphone: "000078901000"
Laptop Charger: "000012345000"
Let’s clean up these product codes by removing the leading and trailing zeros and see the result.
Product Codes:
Digital Camera: "45678"
Smartphone: "78901"
Laptop Charger: "12345"
Method 1: String strip()
If performance and speed are priorities, the strip() method is well suited to this task. strip() takes an optional string argument and removes every leading and trailing character that appears in that argument. If you don’t pass an argument, it removes whitespace from both ends.
Since this method is built-in, it provides efficient performance and is easy for developers to understand.
Here is the exact Python script demonstrating it:
def remove_zeros(s):
    # strip('0') removes every leading and trailing '0' character
    return s.strip('0')

main_str = "00211900"
print("Before Removing leading and trailing zeros")
print(main_str)
print("After Removing leading and trailing zeros")
print(remove_zeros(main_str))
Output
Before Removing leading and trailing zeros
00211900
After Removing leading and trailing zeros
2119
As the output shows, the strip() method removes the 0s from the start and end of the string and leaves the zeros in the middle untouched.
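To tie this back to the store example, here is a minimal sketch that applies strip('0') to the product codes from the introduction. The dictionary name product_codes and the extra sample values are assumptions made for illustration. It also shows lstrip() and rstrip(), which trim only one side, and one caveat: a code that genuinely ends in zero loses that digit too.

```python
# Hypothetical data: the product codes from the store example above.
product_codes = {
    "Digital Camera": "000045678000",
    "Smartphone": "000078901000",
    "Laptop Charger": "000012345000",
}

# strip('0') trims zeros from both ends of every code.
cleaned = {name: code.strip("0") for name, code in product_codes.items()}
print(cleaned)

# lstrip() and rstrip() trim only the left or the right side.
left_only = "0012300".lstrip("0")    # '12300'
right_only = "0012300".rstrip("0")   # '00123'
print(left_only, right_only)

# Caveats: an all-zero string becomes empty, and a real trailing zero
# (a code that is actually "4560") is removed along with the padding.
all_zeros = "0000".strip("0")        # ''
real_zero = "0004560000".strip("0")  # '456'
print(repr(all_zeros), real_zero)
```

If the codes can legitimately end in zero, trimming only the left side with lstrip('0') may be the safer choice.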
Method 2: Regular expression (regex)
The re.sub() method replaces every match of a pattern in the given text with a replacement string. How does that help us? Here, you can match the runs of 0s at the start and end of the string and replace them with an empty string (''), which leaves an output string stripped of its leading and trailing 0s.
A code example makes this clearer:
import re

def remove_zeros(s):
    # ^0+ matches zeros at the start, 0+$ matches zeros at the end
    return re.sub(r'^0+|0+$', '', s)

main_str = "00211900"
print("Before Removing leading and trailing zeros")
print(main_str)
print("After Removing leading and trailing zeros")
print(remove_zeros(main_str))
Output
Before Removing leading and trailing zeros
00211900
After Removing leading and trailing zeros
2119
In this program, we imported the “re” module to use all the regular expression-related methods in Python. It is a built-in module, so you don’t have to install it separately.
Regular expressions are useful when you face a pattern-matching task like this one, where you need to strip a specific character from the leading or trailing part of a string.
We used the re.sub() method, which here takes three arguments:
- Pattern: ^0+|0+$ matches one or more 0s at the beginning or at the end of the string.
- Replacement: The empty string ('') in our case, so the matched zeros are deleted.
- Input string: The string in which to find and replace.
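The pattern can be read piece by piece. The sketch below writes the same expression with the re.VERBOSE flag so that each part carries its own comment. The names ZERO_EDGES and remove_zeros_verbose are hypothetical, chosen only for this illustration.

```python
import re

# Same pattern as ^0+|0+$, spelled out with re.VERBOSE so that
# whitespace is ignored and each part can be commented.
ZERO_EDGES = re.compile(
    r"""
    ^0+     # one or more zeros anchored at the start of the string
    |       # or
    0+$     # one or more zeros anchored at the end of the string
    """,
    re.VERBOSE,
)

def remove_zeros_verbose(text):
    # Replace every match with the empty string, i.e. delete it.
    return ZERO_EDGES.sub("", text)

print(remove_zeros_verbose("00211900"))  # 2119
print(remove_zeros_verbose("10203"))     # 10203 (inner zeros are kept)
```

Compiling the pattern once is optional; it mainly keeps the function body short when the same pattern is reused.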
If you come across complex patterns, I highly recommend using this approach, but if you have a simple string, go with the strip() method.
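As one example of a pattern that strip() cannot express, suppose the codes carry a text prefix such as "SKU-" and only the zeros right after the dash should go. The prefix format and the function name strip_padded_id are assumptions made for this sketch, not something from the store example.

```python
import re

def strip_padded_id(code):
    # (?<=-) : the zeros must come directly after a dash
    # 0+     : the run of padding zeros to delete
    # (?=\d) : at least one digit must remain after the removed zeros
    return re.sub(r"(?<=-)0+(?=\d)", "", code)

print(strip_padded_id("SKU-000123"))  # SKU-123
print(strip_padded_id("SKU-000"))     # SKU-0 (one zero is kept)
print(strip_padded_id("SKU-120"))     # SKU-120 (nothing to remove)
```

The lookahead is a design choice: it keeps a single 0 when the numeric part is all zeros, so the identifier never ends up empty.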
Method 3: Built-in Functions (For numeric strings)
If you are working with numeric strings and want to remove leading zeros along with unnecessary trailing decimal zeros, you can combine the built-in float() and str() functions with the rstrip() string method.
This approach handles decimal points correctly and removes unnecessary decimal places from the numerical string. It will help you normalize number representations.
Here is a programmatic implementation:
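def remove_zeros(s):
    # float() drops the leading zeros; rstrip('0') then trims trailing
    # decimal zeros, and rstrip('.') removes a dangling decimal point
    return str(float(s)).rstrip('0').rstrip('.')

main_str = "00211.900"
print("Before Removing zeros")
print(main_str)
print("After Removing zeros")
print(remove_zeros(main_str))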
Output
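Before Removing zeros
00211.900
After Removing zeros
211.9
If you look at the output, you will see that the leading zeros have been removed, and so have the trailing zeros after the decimal point. The result is a normalized numeric representation. This approach only works for valid numeric strings: float() raises a ValueError for anything else. Because the value passes through a float, very large or very small numbers come back in scientific notation (for example 1e+16), and very long digit strings can lose precision.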
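Where those limits matter, one possible alternative is the standard-library decimal module. Note that this is a swapped-in technique, not the float() approach shown above, and the function name normalize_numeric is hypothetical. The sketch returns None for invalid input instead of raising an exception.

```python
from decimal import Decimal, InvalidOperation

def normalize_numeric(s):
    """Return s without padding zeros, or None if s is not numeric."""
    try:
        number = Decimal(s)
    except InvalidOperation:
        return None  # not a valid numeric string
    # normalize() drops trailing zeros; the 'f' format avoids
    # scientific notation such as 1E+2 in the result.
    return format(number.normalize(), "f")

print(normalize_numeric("00211.900"))  # 211.9
print(normalize_numeric("100"))        # 100
print(normalize_numeric("abc"))        # None
```

Decimal keeps the digits exactly as written, within its context precision of 28 significant digits by default, so this sketch avoids the rounding that float can introduce.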
Method 4: Manual loop
The least attractive method is writing the loop yourself. The iterative approach is verbose and, in pure Python, slower than the built-in methods, but it will do the job for you.
The main advantage of using a loop is that it is easy to modify for different requirements and can be tuned for specific use cases.
Here is an implementation that uses two while loops, one index moving in from each end:
def remove_zeros(s):
    left = 0
    right = len(s) - 1
    # Move left forward past the leading zeros
    while left < len(s) and s[left] == '0':
        left += 1
    # Move right backward past the trailing zeros
    while right > left and s[right] == '0':
        right -= 1
    return s[left:right+1]

main_str = "00211900"
print("Before Removing zeros")
print(main_str)
print("After Removing zeros")
print(remove_zeros(main_str))
Output
Before Removing zeros
00211900
After Removing zeros
2119
It removed the zeros from both sides of the string as expected, but you can see that a careful implementation is required to handle edge cases such as an empty string or a string made up entirely of zeros.
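One way to gain confidence in a hand-written loop is to compare it with strip('0') on a handful of tricky inputs. The sketch below repeats the loop under the hypothetical name remove_zeros_loop so that it runs on its own. The sample list is an assumption about which edge cases matter.

```python
def remove_zeros_loop(s):
    left = 0
    right = len(s) - 1
    # Skip the leading zeros.
    while left < len(s) and s[left] == "0":
        left += 1
    # Skip the trailing zeros without crossing the left index.
    while right > left and s[right] == "0":
        right -= 1
    return s[left:right + 1]

# Edge cases: empty string, only zeros, no zeros, zeros on one side only.
samples = ["", "0", "0000", "00211900", "2119", "100", "001", "10203"]

for sample in samples:
    # The loop should agree with the built-in method on every input.
    assert remove_zeros_loop(sample) == sample.strip("0"), sample

print("loop version matches strip('0') on all samples")
```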
Final observations
My recommendation is to use the str.strip() method for normal, simple strings.
If you are dealing with a complex string that needs more involved logic, use the re.sub() method, because regular expressions let you describe patterns that strip() cannot express.
If your input string is purely numerical, combine the built-in functions. I discourage you from writing a manual loop, because why would you do that when you have other options?