Best way to strip punctuation from a string

It seems like there should be a simpler way than:

import string
s = "string. With. Punctuation?"  # Sample string
out = s.translate(string.maketrans("", ""), string.punctuation)  # Python 2

Is there?

3 answers

Asked by Lawrence Johnston

741

Answer #1

From an efficiency perspective, you're not going to beat this (Python 2):

s.translate(None, string.punctuation)

In Python 3, use the following instead:

s.translate(str.maketrans("", "", string.punctuation))

It performs raw string operations in C with a lookup table; short of writing your own C code, not much will beat that.
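To make the lookup table concrete, here is a minimal Python 3 sketch. The helper name strip_punctuation and the module-level table name are assumptions made for this example; they do not come from the answer.

```python
import string

# Build the table once. The third argument to str.maketrans lists the
# characters that translate() should delete.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def strip_punctuation(text):
    """Return text with ASCII punctuation removed (hypothetical helper)."""
    return text.translate(_PUNCT_TABLE)

# The table is a dict mapping code points to None, meaning "delete".
assert _PUNCT_TABLE[ord(".")] is None

print(strip_punctuation("string. With. Punctuation?"))  # string With Punctuation
```

Note that string.punctuation only covers ASCII punctuation, so typographic quotes and similar non-ASCII characters pass through unchanged.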

If speed isn't a worry, another option is:

exclude = set(string.punctuation)
s = "".join(ch for ch in s if ch not in exclude)

This is faster than calling s.replace for each character, but won't perform as well as non-pure-Python approaches such as regexes or string.translate, as the timings below show. For this type of problem, doing it at as low a level as possible pays off.

Timing code (Python 2):

import re, string, timeit

s = "string. With. Punctuation"
exclude = set(string.punctuation)
table = string.maketrans("", "")  # Identity table; deletions are passed to translate()
regex = re.compile("[%s]" % re.escape(string.punctuation))

def test_set(s):
    return "".join(ch for ch in s if ch not in exclude)

def test_re(s):  # From Vinko's solution, with fix.
    return regex.sub("", s)

def test_trans(s):
    return s.translate(table, string.punctuation)

def test_repl(s):  # From S.Lott's solution
    for c in string.punctuation:
        s = s.replace(c, "")
    return s

print "sets      :",timeit.Timer("f(s)", "from __main__ import s,test_set as f").timeit(1000000)
print "regex     :",timeit.Timer("f(s)", "from __main__ import s,test_re as f").timeit(1000000)
print "translate :",timeit.Timer("f(s)", "from __main__ import s,test_trans as f").timeit(1000000)
print "replace   :",timeit.Timer("f(s)", "from __main__ import s,test_repl as f").timeit(1000000)

This gives the following results (total seconds for 1,000,000 calls; lower is better):

sets      : 19.8566138744
regex     : 6.86155414581
translate : 2.12455511093
replace   : 28.4436721802

Remove all special characters, punctuation and spaces from string

3 answers

I need to remove all special characters, punctuation and spaces from a string so that I only have letters and numbers.

315

Answer #1

This can be done without regex:

>>> string = "Special $#! characters   spaces 888323"
>>> "".join(e for e in string if e.isalnum())
'Specialcharactersspaces888323'

You can use str.isalnum:

S.isalnum() -> bool

Return True if all characters in S are alphanumeric
and there is at least one character in S, False otherwise.
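As a quick illustration of that whole-string behaviour, here is a small sketch. The sample strings are made up for this example.

```python
# str.isalnum() checks every character of the string it is called on.
print("abc123".isalnum())   # True
print("abc 123".isalnum())  # False: the space is not alphanumeric
print("".isalnum())         # False: needs at least one character

# Called once per character, it works as a filter.
sample = "a b!c"
print("".join(ch for ch in sample if ch.isalnum()))  # abc
```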

If you insist on using regex, the other solutions will do fine. Note, however, that if it can be done without a regular expression, that's the best way to go about it.

315

Answer #2

Here is a regex that matches a run of characters that are not letters or numbers:

[^A-Za-z0-9]+

Here is the Python command to do a regex substitution:

import re
re.sub("[^A-Za-z0-9]+", "", mystring)
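One difference between the two answers is worth checking before picking one: str.isalnum() is Unicode-aware, while the character class [^A-Za-z0-9] only keeps ASCII letters and digits. The sketch below shows the effect on an invented sample string; the variable names are assumptions for this example.

```python
import re

# "Café $#! naïve   888323", written with escapes so the accented
# letters are unambiguous single code points.
text = "Caf\u00e9 $#! na\u00efve   888323"

# str.isalnum() treats accented letters as alphanumeric, so they are kept.
kept_unicode = "".join(ch for ch in text if ch.isalnum())

# The ASCII character class treats them as "special" and removes them.
kept_ascii = re.sub(r"[^A-Za-z0-9]+", "", text)

print(kept_unicode)  # Cafénaïve888323
print(kept_ascii)    # Cafnave888323
```

If only ASCII letters and digits should survive, the regex matches that requirement more closely; if accented letters should be kept, the isalnum filter does.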
