How can I remove non-ASCII characters but leave periods and spaces?

| |

👻 See our latest reviews to choose the best laptop for Machine Learning and Deep learning tasks!

I"m working with a .txt file. I want a string of the text from the file with no non-ASCII characters. However, I want to leave spaces and periods. At present, I"m stripping those too. Here"s the code:

def onlyascii(char):
    if ord(char) < 48 or ord(char) > 127: return ""
    else: return char

def get_my_string(file_path):
    f=open(file_path,"r")
    data=f.read()
    f.close()
    filtered_data=filter(onlyascii, data)
    filtered_data = filtered_data.lower()
    return filtered_data

How should I modify onlyascii() to leave spaces and periods? I imagine it"s not too complicated but I can"t figure it out.

👻 Read also: what is the best laptop for engineering students in 2022?

How can I remove non-ASCII characters but leave periods and spaces? filter: Questions

List comprehension vs. lambda + filter

5 answers

I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.

My code looked like this:

my_list = [x for x in my_list if x.attribute == value]

But then I thought, wouldn"t it be better to write it like this?

my_list = filter(lambda x: x.attribute == value, my_list)

It"s more readable, and if needed for performance the lambda could be taken out to gain something.

Question is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way‚Ñ¢ entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?

957

Answer #1

It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.

There are two things that may slow down your use of filter.

The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn"t think much about performance until you"ve timed your code and found it to be a bottleneck, but the difference will be there.

The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value through a closure and this difference won"t apply.

The other option to consider is to use a generator instead of a list comprehension:

def filterbyvalue(seq, value):
   for el in seq:
       if el.attribute==value: yield el

Then in your main code (which is where readability really matters) you"ve replaced both list comprehension and filter with a hopefully meaningful function name.

957

Answer #2

This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.

Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value] as all the behaviour is on the surface not inside the filter function.

I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.

Also since the BDFL wanted filter gone from the language then surely that automatically makes list comprehensions more Pythonic ;-)

How can I remove non-ASCII characters but leave periods and spaces? filter: Questions

How do I do a not equal in Django queryset filtering?

5 answers

MikeN By MikeN

In Django model QuerySets, I see that there is a __gt and __lt for comparative values, but is there a __ne or != (not equals)? I want to filter out using a not equals. For example, for

Model:
    bool a;
    int x;

I want to do

results = Model.objects.exclude(a=True, x!=5)

The != is not correct syntax. I also tried __ne.

I ended up using:

results = Model.objects.exclude(a=True, x__lt=5).exclude(a=True, x__gt=5)
784

Answer #1

You can use Q objects for this. They can be negated with the ~ operator and combined much like normal Python expressions:

from myapp.models import Entry
from django.db.models import Q

Entry.objects.filter(~Q(id=3))

will return all entries except the one(s) with 3 as their ID:

[<Entry: Entry object>, <Entry: Entry object>, <Entry: Entry object>, ...]

How can I open multiple files using "with open" in Python?

5 answers

I want to change a couple of files at one time, iff I can write to all of them. I"m wondering if I somehow can combine the multiple open calls with the with statement:

try:
  with open("a", "w") as a and open("b", "w") as b:
    do_something()
except IOError as e:
  print "Operation failed: %s" % e.strerror

If that"s not possible, what would an elegant solution to this problem look like?

788

Answer #1

As of Python 2.7 (or 3.1 respectively) you can write

with open("a", "w") as a, open("b", "w") as b:
    do_something()

In earlier versions of Python, you can sometimes use contextlib.nested() to nest context managers. This won"t work as expected for opening multiples files, though -- see the linked documentation for details.


In the rare case that you want to open a variable number of files all at the same time, you can use contextlib.ExitStack, starting from Python version 3.3:

with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
    # Do something with "files"

Most of the time you have a variable set of files, you likely want to open them one after the other, though.

open() in Python does not create a file if it doesn"t exist

5 answers

What is the best way to open a file as read/write if it exists, or if it does not, then create it and open it as read/write? From what I read, file = open("myfile.dat", "rw") should do this, right?

It is not working for me (Python 2.6.2) and I"m wondering if it is a version problem, or not supposed to work like that or what.

The bottom line is, I just need a solution for the problem. I am curious about the other stuff, but all I need is a nice way to do the opening part.

The enclosing directory was writeable by user and group, not other (I"m on a Linux system... so permissions 775 in other words), and the exact error was:

IOError: no such file or directory.

778

Answer #1

You should use open with the w+ mode:

file = open("myfile.dat", "w+")

Difference between modes a, a+, w, w+, and r+ in built-in open function?

5 answers

In the python built-in open function, what is the exact difference between the modes w, a, w+, a+, and r+?

In particular, the documentation implies that all of these will allow writing to the file, and says that it opens the files for "appending", "writing", and "updating" specifically, but does not define what these terms mean.

721

Answer #1

The opening modes are exactly the same as those for the C standard library function fopen().

The BSD fopen manpage defines them as follows:

 The argument mode points to a string beginning with one of the following
 sequences (Additional characters may follow these sequences.):

 ``r""   Open text file for reading.  The stream is positioned at the
         beginning of the file.

 ``r+""  Open for reading and writing.  The stream is positioned at the
         beginning of the file.

 ``w""   Truncate file to zero length or create text file for writing.
         The stream is positioned at the beginning of the file.

 ``w+""  Open for reading and writing.  The file is created if it does not
         exist, otherwise it is truncated.  The stream is positioned at
         the beginning of the file.

 ``a""   Open for writing.  The file is created if it does not exist.  The
         stream is positioned at the end of the file.  Subsequent writes
         to the file will always end up at the then current end of file,
         irrespective of any intervening fseek(3) or similar.

 ``a+""  Open for reading and writing.  The file is created if it does not
         exist.  The stream is positioned at the end of the file.  Subse-
         quent writes to the file will always end up at the then current
         end of file, irrespective of any intervening fseek(3) or similar.

Shop

Learn programming in R: courses

$

Best Python online courses for 2022

$

Best laptop for Fortnite

$

Best laptop for Excel

$

Best laptop for Solidworks

$

Best laptop for Roblox

$

Best computer for crypto mining

$

Best laptop for Sims 4

$

Latest questions

NUMPYNUMPY

psycopg2: insert multiple rows with one query

12 answers

NUMPYNUMPY

How to convert Nonetype to int or string?

12 answers

NUMPYNUMPY

How to specify multiple return types using type-hints

12 answers

NUMPYNUMPY

Javascript Error: IPython is not defined in JupyterLab

12 answers

News


Wiki

Python OpenCV | cv2.putText () method

numpy.arctan2 () in Python

Python | os.path.realpath () method

Python OpenCV | cv2.circle () method

Python OpenCV cv2.cvtColor () method

Python - Move item to the end of the list

time.perf_counter () function in Python

Check if one list is a subset of another in Python

Python os.path.join () method