Removing characters from a string in Python can be done using replace(), translate() or re.sub() among other methods.
During the data cleaning process, you might have encountered situations where you’ve needed to remove specific characters from a string. In this article, I’ll explain how to remove different characters from a string in Python.
Ways to Remove Characters From a Python String
- Remove specific characters from the string.
- Remove all characters except alphabets from a string.
- Remove all characters except the alphabets and the numbers from a string.
- Remove all numbers from a string using a regular expression.
- Remove all characters from the string except numbers.
1. Remove Specific Characters From the String
Using ‘replace()’
Using replace(), we can replace a specific character. If we want to remove that specific character, we can replace that character with an empty string. The replace() method will replace all occurrences of the specific character mentioned.
s="Hello$ Python3$"
s1=s.replace("$","")
print(s1)
#Output: Hello Python3
If we want to remove only the first occurrence of that character, we can pass a count as the third argument:
str.replace(old,new,count)
s="Hello$ Python3$"
s1=s.replace("$","",1)
print(s1)
#Output: Hello Python3$
Using ‘translate()’
Python’s translate() method allows for the removal or replacement of certain characters in a string. Characters can be replaced with nothing or new characters as specified in a dictionary or mapping table.
For example, let’s use translate() to remove “$” from the following string:
s="Hello$ Python3$"
print(s.translate({ord('$'): None}))
#Output: Hello Python3
translate() can also remove multiple specified characters in a string at once, as shown below:
s="Hello$@& Python3$"
print(s.translate({ord(i): None for i in '$@&'}))
#Output: Hello Python3
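As a sketch of an alternative way to build the mapping, the built-in str.maketrans() accepts a third argument listing the characters to delete, which avoids writing the dictionary by hand. The variable names table and cleaned are illustrative only.

```python
s = "Hello$@& Python3$"

# The first two arguments (characters to swap) are left empty;
# every character in the third argument is mapped to None, i.e. deleted.
table = str.maketrans("", "", "$@&")

cleaned = s.translate(table)
print(cleaned)
# Output: Hello Python3
```

The table can be built once and reused across many strings, which may be convenient when cleaning a whole column of data.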
Using ‘re.sub()’
re.sub(pattern, repl, string, count=0, flags=0)
“Return the string obtained by replacing the leftmost nonoverlapping occurrences of pattern in the string by the replacement repl. If the pattern isn’t found, the string is returned unchanged,” according to Python’s documentation.
If we want to remove specific characters, the replacement string is mentioned as an empty string.
s="Hello$@& Python3$"
import re
s1=re.sub("[$@&]","",s)
print(s1)
#Output: Hello Python3
s1=re.sub("[$@&]","",s)
- Pattern to be replaced: "[$@&]". Square brackets [] are used to indicate a set of characters, so [$@&] will match either $ or @ or &.
- The replacement string is given as an empty string.
- If these characters are found in the string, they’ll be replaced with an empty string.
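When the characters to remove are not known in advance, some of them (such as ], ^ or -) may have a special meaning inside a set. One possible approach, sketched here, is to pass them through re.escape() before building the pattern. The helper name remove_chars is hypothetical and not part of the re module.

```python
import re

def remove_chars(text, chars):
    """Hypothetical helper: remove every character in `chars` from `text`."""
    if not chars:
        # An empty set "[]" is not a valid pattern, so return early.
        return text
    # re.escape() backslash-escapes characters that are special in a regex,
    # so they are treated literally inside the [...] set.
    pattern = "[" + re.escape(chars) + "]"
    return re.sub(pattern, "", text)

print(remove_chars("Hello$@& Python3$", "$@&"))
# Output: Hello Python3
print(remove_chars("a-b^c]d", "-^]"))
# Output: abcd
```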
2. Remove All Characters Except Alphabets From a String
Using ‘isalpha()’
isalpha() checks whether a string consists only of letters. It returns True if every character in the string is alphabetic and the string contains at least one character.
The example below iterates through the string, checks each character with isalpha() and keeps it only if it’s a letter.
Example
s="Hello$@ Python3&"
s1="".join(c for c in s if c.isalpha())
print(s1)
#Output: HelloPython
s="Hello$@ Python3&"
(c for c in s if c.isalpha())
This is a generator expression. It returns a generator object that yields each letter in the string in turn: 'H', 'e', 'l', 'l', 'o', 'P', 'y', 't', 'h', 'o', 'n'.
s1="".join(c for c in s if c.isalpha())
"".join() will join all of the elements in the iterable, using an empty string as the separator.
Using ‘filter()’
s="Hello$@ Python3&"
f=filter(str.isalpha,s)
s1="".join(f)
print(s1)
#Output: HelloPython
f=filter(str.isalpha,s)
The filter() function will apply the str.isalpha method to each character in the string, and if it’s True, it’ll return the item. Otherwise, it’ll skip the item.
s1="".join(f)
filter() will return an iterator containing all of the letters in the string, and join() will join all of the elements in the iterator with an empty string.
Using ‘re.sub()’
s="Hello$@ Python3$"
import re
s1=re.sub("[^A-Za-z]","",s)
print(s1)
#Output: HelloPython
s1=re.sub("[^A-Za-z]","",s)
- "[^A-Za-z]": It’ll match all of the characters except the letters A-Z and a-z. If the first character of the set is '^', then all of the characters not in the set will be matched.
- All of the characters matched will be replaced with an empty string.
- All of the characters except the alphabets are removed.
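The two approaches in this section are not always interchangeable. The following sketch, using a made-up sample string, illustrates that isalpha() treats accented letters as alphabetic, while the set [^A-Za-z] keeps only unaccented English letters.

```python
import re

# Sample text written with escapes; it reads "Café déjà vu 2!"
s = "Caf\u00e9 d\u00e9j\u00e0 vu 2!"

# Keeps only A-Z and a-z, so the accented letters are removed too.
ascii_only = re.sub("[^A-Za-z]", "", s)

# isalpha() is Unicode-aware, so accented letters are kept.
unicode_letters = "".join(c for c in s if c.isalpha())

print(ascii_only)
# Output: Cafdjvu
print(unicode_letters)
# Output: Cafédéjàvu
```

Which behavior is preferable depends on the data: the regular expression is stricter, while isalpha() is friendlier to non-English text.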
3. Remove All Characters Except the Alphabets and the Numbers From a String
Using ‘isalnum()’
isalnum() checks whether all of the characters in the string are alphanumeric. Letters such as [A-Z, a-z] and numbers such as [0-9] are alphanumeric.
The example below iterates through the string, checks whether each character is alphanumeric and keeps it if it’s a letter or a number.
s="Hello$@ Python3&"
s1="".join(c for c in s if c.isalnum())
print(s1)
#Output: HelloPython3
Using ‘re.sub()’
s="Hello$@ Python3&_"
import re
s1=re.sub("[^A-Za-z0-9]","",s)
print(s1)
#Output: HelloPython3
s1=re.sub("[^A-Za-z0-9]","",s)
- "[^A-Za-z0-9]": This will match all of the characters except the letters and the numbers. If the first character of the set is '^', then all of the characters not in the set will be matched.
- All of the characters matched will be replaced with an empty string.
- All of the characters except the alphabets and numbers are removed.
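A common shortcut for this pattern is the \W class, which matches any non-word character. As the sketch below suggests, it is not an exact substitute, because the underscore counts as a word character. The variable names are illustrative.

```python
import re

s = "Hello$@ Python3&_"

# \W matches anything that is not a letter, digit or underscore,
# so the trailing underscore survives.
keeps_underscore = re.sub(r"\W", "", s)

# Adding "_" to the set removes the underscore as well.
drops_underscore = re.sub(r"[\W_]", "", s)

print(keeps_underscore)
# Output: HelloPython3_
print(drops_underscore)
# Output: HelloPython3
```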
4. Remove All Numbers From a String Using a Regular Expression
Using ‘re.sub()’
s="Hello347 Python3$"
import re
s1=re.sub("[0-9]","",s)
print(s1)
#Output: Hello Python$
s1=re.sub("[0-9]","",s)
[0-9] will match the digits 0 through 9. With re.sub("[0-9]","",s), any digit found will be replaced with an empty string.
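As a further sketch, the pattern can be compiled once with re.compile() and reused, and the count argument limits how many matches are removed. Note that \d is used here in place of [0-9]; for ordinary ASCII text the two behave the same, although \d also matches decimal digits from other scripts. The variable names are illustrative.

```python
import re

s = "Hello347 Python3$"

# Compile the pattern once so it can be reused on many strings.
digits = re.compile(r"\d")

no_digits = digits.sub("", s)
# count=2 removes only the first two matches, scanning left to right.
first_two_removed = digits.sub("", s, count=2)

print(no_digits)
# Output: Hello Python$
print(first_two_removed)
# Output: Hello7 Python3$
```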
5. Remove All Characters From the String Except Numbers
Using ‘isdecimal()’
“isdecimal() returns True if all characters in the string are decimals and there’s at least one character. Otherwise, it returns False. The decimal numbers are numbers that can be used to form numbers in base-10,” according to Python’s documentation.
Example
s="1-2$3%4 5a"
s1="".join(c for c in s if c.isdecimal())
print(s1)
#Output: 12345
s1="".join(c for c in s if c.isdecimal())
This iterates through the string, checks whether each character is a decimal digit and keeps it if it is.
"".join() will join all of the elements returned with an empty string.
Using ‘re.sub()’
s="1-2$3%4 5a"
import re
s1=re.sub("[^0-9]","",s)
print(s1)
#Output: 12345
s1=re.sub("[^0-9]","",s)
[^0-9] will match all characters except the digits 0-9. With re.sub("[^0-9]","",s), any character that’s not a number will be replaced with an empty string.
Using ‘filter()’
s="1-2$3%4 5a"
f=filter(str.isdecimal,s)
s1="".join(f)
print(s1)
#Output: 12345
f=filter(str.isdecimal,s)
The filter() function will apply the str.isdecimal method to each character in the string, and if it’s True, it’ll return the item. Otherwise, it’ll skip the item.
s1="".join(f)
filter() will return an iterator containing all of the numbers in the string, and join() will join all of the elements in the iterator with an empty string.
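Python also offers isdigit() and isnumeric(), which are progressively more permissive than isdecimal(). The sketch below uses a made-up string containing a superscript two and a one-half fraction to show how the three methods differ.

```python
# Sample text written with escapes; it reads "5² and ½"
s = "5\u00b2 and \u00bd"

only_decimal = "".join(c for c in s if c.isdecimal())  # base-10 digits only
only_digit = "".join(c for c in s if c.isdigit())      # also superscript digits
only_numeric = "".join(c for c in s if c.isnumeric())  # also fractions

print(only_decimal)
# Output: 5
print(only_digit)
# Output: 5²
print(only_numeric)
# Output: 5²½
```

For typical data cleaning, where the goal is a value that int() can parse, isdecimal() is usually the safer of the three.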
Ultimately, Python strings are immutable, so none of the methods mentioned modify the original string. Each one returns a new string with the characters removed.
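A short illustration of this point, with illustrative variable names: the original string is unchanged after the call, and keeping the cleaned result requires assigning it to a name.

```python
original = "Hello$ Python3$"

# replace() builds and returns a new string; `original` is untouched.
cleaned = original.replace("$", "")

print(original)
# Output: Hello$ Python3$
print(cleaned)
# Output: Hello Python3

# To "update" a variable, rebind the name to the returned string.
original = original.replace("$", "")
print(original)
# Output: Hello Python3
```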
Frequently Asked Questions
What is a string in Python?
In Python, a string represents a series of characters. A string is bookended with single quotes or double quotes on each side.
How do you remove characters from a string in Python?
In Python, you can remove specific characters from a string using replace(), translate(), re.sub() or a variety of other methods, depending on whether you want to remove a specific character or remove all characters except alphabetical or numerical ones.
How do you change a character in a string in Python?
You can use the replace() or translate() method to replace a character in a string in Python. For replacements based on a pattern rather than a fixed character, re.sub() also works.