I think what I want to do is a fairly common task, but I've found no reference on the web. I have text with punctuation, and I want a list of the words.
"Hey, you - what are you doing here!?"
["hey", "you", "what", "are", "you", "doing", "here"]
str.split() only works with one argument, so I have all words with the punctuation after I split with whitespace. Any ideas?
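One possible approach, not taken from the original thread, is to match the words themselves with a regular expression instead of splitting on every delimiter. The sketch below assumes a "word" is a run of letters, digits, or underscores (what `\w+` matches) and that lowercase output is wanted, as in the example above. The function name `extract_words` is hypothetical.

```python
import re

def extract_words(text):
    """Return the lowercase words in text, ignoring punctuation and spaces."""
    # \w+ matches runs of letters, digits, and underscores, so commas,
    # dashes, and exclamation marks never end up in the result.
    return re.findall(r"\w+", text.lower())

print(extract_words("Hey, you - what are you doing here!?"))
# ['hey', 'you', 'what', 'are', 'you', 'doing', 'here']
```

Under this assumption a contraction such as "don't" would be split into "don" and "t", so the pattern may need adjusting for real text.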
Split Strings into words with multiple word boundary delimiters __del__: Questions
How can I make a time delay in Python?
I would like to know how to put a time delay in a Python script.
    import time
    time.sleep(5)  # Delays for 5 seconds. You can also use a float value.
Here is another example where something is run approximately once a minute:
    import time
    while True:
        print("This prints once a minute.")
        time.sleep(60)  # Delay for 1 minute (60 seconds).
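The loop runs only "approximately" once a minute because the time spent in the loop body is added on top of each 60-second sleep. A minimal sketch of one way to reduce that drift, assuming a monotonic clock is acceptable for the task, is to sleep until a fixed deadline rather than for a fixed duration. The helper `run_periodically` and its parameters are hypothetical, not part of the `time` module.

```python
import time

def run_periodically(task, interval, count):
    """Call task() count times, aiming for one call every interval seconds."""
    next_deadline = time.monotonic()
    for _ in range(count):
        task()
        # Advance the deadline by a fixed step so time spent in task()
        # does not accumulate as drift.
        next_deadline += interval
        remaining = next_deadline - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

# Example: three quick runs, 0.1 seconds apart.
run_periodically(lambda: print("tick"), interval=0.1, count=3)
```

If a single call to `task()` takes longer than `interval`, this sketch skips the sleep and runs the next call immediately.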
You can use the sleep() function in the time module. It can take a float argument for sub-second resolution.
    from time import sleep
    sleep(0.1)  # Time in seconds
How to delete a file or folder in Python?
How do I delete a file or folder in Python?
os.remove() removes a file.
os.rmdir() removes an empty directory.
shutil.rmtree() deletes a directory and all its contents.
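A short sketch of the three calls side by side. It works inside a temporary directory created with `tempfile.mkdtemp()` so nothing real is deleted; the file and folder names are made up for illustration.

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing real is deleted.
base = tempfile.mkdtemp()

file_path = os.path.join(base, "example.txt")
with open(file_path, "w") as f:
    f.write("temporary data")
os.remove(file_path)  # Removes a single file.

empty_dir = os.path.join(base, "empty")
os.mkdir(empty_dir)
os.rmdir(empty_dir)  # Removes a directory, but only if it is empty.

full_dir = os.path.join(base, "full")
os.makedirs(os.path.join(full_dir, "nested"))
shutil.rmtree(full_dir)  # Removes a directory and everything inside it.

shutil.rmtree(base)  # Clean up the temporary directory itself.
```

Calling os.rmdir() on a directory that still has contents raises OSError, which is why shutil.rmtree() is used for the non-empty case.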
Split Strings into words with multiple word boundary delimiters punctuation: Questions
Best way to strip punctuation from a string
It seems like there should be a simpler way than:
    import string
    s = "string. With. Punctuation?"  # Sample string
    out = s.translate(string.maketrans("", ""), string.punctuation)
From an efficiency perspective, you're not going to beat translate().
For Python 3, use the following code:
s.translate(str.maketrans("", "", string.punctuation))
It's performing raw string operations in C with a lookup table. There's not much that will beat that, short of writing your own C code.
If speed isn't a worry, another option is:
    exclude = set(string.punctuation)
    s = "".join(ch for ch in s if ch not in exclude)
This is faster than calling s.replace for each character, but it won't perform as well as non-pure-Python approaches such as regexes or string.translate, as you can see from the timings below. For this type of problem, doing it at as low a level as possible pays off.
    # Python 2 code: uses string.maketrans and the print statement.
    import re, string, timeit

    s = "string. With. Punctuation"
    exclude = set(string.punctuation)
    table = string.maketrans("", "")
    regex = re.compile("[%s]" % re.escape(string.punctuation))

    def test_set(s):
        return "".join(ch for ch in s if ch not in exclude)

    def test_re(s):  # From Vinko's solution, with fix.
        return regex.sub("", s)

    def test_trans(s):
        return s.translate(table, string.punctuation)

    def test_repl(s):  # From S.Lott's solution
        for c in string.punctuation:
            s = s.replace(c, "")
        return s

    print "sets      :", timeit.Timer("f(s)", "from __main__ import s,test_set as f").timeit(1000000)
    print "regex     :", timeit.Timer("f(s)", "from __main__ import s,test_re as f").timeit(1000000)
    print "translate :", timeit.Timer("f(s)", "from __main__ import s,test_trans as f").timeit(1000000)
    print "replace   :", timeit.Timer("f(s)", "from __main__ import s,test_repl as f").timeit(1000000)
This gives the following results:
    sets      : 19.8566138744
    regex     : 6.86155414581
    translate : 2.12455511093
    replace   : 28.4436721802
Remove all special characters, punctuation and spaces from string
I need to remove all special characters, punctuation and spaces from a string so that I only have letters and numbers.
This can be done without regex:
    >>> string = "Special $#! characters spaces 888323"
    >>> "".join(e for e in string if e.isalnum())
    'Specialcharactersspaces888323'
You can use str.isalnum():
    S.isalnum() -> bool

    Return True if all characters in S are alphanumeric and there is at least one character in S, False otherwise.
If you insist on using regex, other solutions will do fine. However, note that if it can be done without using a regular expression, that's the best way to go about it.
Here is a regex to match a string of characters that are not letters or numbers:
    [^A-Za-z0-9]+
Here is the Python command to do a regex substitution:
    re.sub("[^A-Za-z0-9]+", "", mystring)
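The two approaches are not quite equivalent, which may matter when choosing between them. In Python 3, str.isalnum() treats non-ASCII letters and digits as alphanumeric, while the character class [^A-Za-z0-9] keeps only ASCII letters and digits. A small sketch with a made-up sample string:

```python
import re

# "caf\u00e9" is "cafe" with an accented e (U+00E9).
text = "caf\u00e9 123!"

# isalnum() is Unicode-aware, so the accented letter is kept.
by_isalnum = "".join(ch for ch in text if ch.isalnum())

# The ASCII-only character class strips the accented letter.
by_regex = re.sub("[^A-Za-z0-9]+", "", text)

print(by_isalnum)  # café123
print(by_regex)    # caf123
```

If only ASCII letters and digits should survive, the regex (or an explicit ASCII check) is the safer choice; if accented or non-Latin letters should be kept, isalnum() does that without extra work.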