I"m looking at how to do file input and output in Python. I"ve written the following code to read a list of names (one per line) from a file into another file while checking a name against the names in the file and appending text to the occurrences in the file. The code works. Could it be done better?
I"d wanted to use the with open(...
statement for both input and output files but can"t see how they could be in the same block meaning I"d need to store the names in a temporary location.
def filter(txt, oldfile, newfile):
"""
Read a list of names from a file line by line into an output file.
If a line begins with a particular name, insert a string of text
after the name before appending the line to the output file.
"""
outfile = open(newfile, "w")
with open(oldfile, "r", encoding="utf-8") as infile:
for line in infile:
if line.startswith(txt):
line = line[0:len(txt)] + " - Truly a great person!
"
outfile.write(line)
outfile.close()
return # Do I gain anything by including this?
# input the name you want to check against
text = input("Please enter the name of a great person: ")
letsgo = filter(text,"Spanish", "Spanish2")
How to open a file using the open with statement _files: Questions
How do I list all files of a directory?
5 answers
How can I list all files of a directory in Python and add them to a list
?
Answer #1
os.listdir()
will get you everything that"s in a directory - files and directories.
If you want just files, you could either filter this down using os.path
:
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
or you could use os.walk()
which will yield two lists for each directory it visits - splitting into files and dirs for you. If you only want the top directory you can break the first time it yields
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk(mypath):
f.extend(filenames)
break
or, shorter:
from os import walk
filenames = next(walk(mypath), (None, None, []))[2] # [] if no file
Answer #2
I prefer using the glob
module, as it does pattern matching and expansion.
import glob
print(glob.glob("/home/adam/*"))
It does pattern matching intuitively
import glob
# All files ending with .txt
print(glob.glob("/home/adam/*.txt"))
# All files ending with .txt with depth of 2 folder
print(glob.glob("/home/adam/*/*.txt"))
It will return a list with the queried files:
["/home/adam/file1.txt", "/home/adam/file2.txt", .... ]
Answer #3
os.listdir()
- list in the current directory
With listdir in os module you get the files and the folders in the current dir
import os
arr = os.listdir()
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
Looking in a directory
arr = os.listdir("c:\files")
glob
from glob
with glob you can specify a type of file to list like this
import glob
txtfiles = []
for file in glob.glob("*.txt"):
txtfiles.append(file)
glob
in a list comprehension
mylist = [f for f in glob.glob("*.txt")]
get the full path of only files in the current directory
import os
from os import listdir
from os.path import isfile, join
cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles)
["G:\getfilesname\getfilesname.py", "G:\getfilesname\example.txt"]
Getting the full path name with
os.path.abspath
You get the full path in return
import os
files_path = [os.path.abspath(x) for x in os.listdir()]
print(files_path)
["F:\documentiapplications.txt", "F:\documenticollections.txt"]
Walk: going through sub directories
os.walk returns the root, the directories list and the files list, that is why I unpacked them in r, d, f in the for loop; it, then, looks for other files and directories in the subfolders of the root and so on until there are no subfolders.
import os
# Getting the current work directory (cwd)
thisdir = os.getcwd()
# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
for file in f:
if file.endswith(".docx"):
print(os.path.join(r, file))
os.listdir()
: get files in the current directory (Python 2)
In Python 2, if you want the list of the files in the current directory, you have to give the argument as "." or os.getcwd() in the os.listdir method.
import os
arr = os.listdir(".")
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
To go up in the directory tree
# Method 1
x = os.listdir("..")
# Method 2
x= os.listdir("/")
Get files:
os.listdir()
in a particular directory (Python 2 and 3)
import os
arr = os.listdir("F:\python")
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
Get files of a particular subdirectory with
os.listdir()
import os
x = os.listdir("./content")
os.walk(".")
- current directory
import os
arr = next(os.walk("."))[2]
print(arr)
>>> ["5bs_Turismo1.pdf", "5bs_Turismo1.pptx", "esperienza.txt"]
next(os.walk("."))
andos.path.join("dir", "file")
import os
arr = []
for d,r,f in next(os.walk("F:\_python")):
for file in f:
arr.append(os.path.join(r,file))
for f in arr:
print(files)
>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt
next(os.walk("F:\")
- get the full path - list comprehension
[os.path.join(r,file) for r,d,f in next(os.walk("F:\_python")) for file in f]
>>> ["F:\_python\dict_class.py", "F:\_python\programmi.txt"]
os.walk
- get full path - all files in sub dirs**
x = [os.path.join(r,file) for r,d,f in os.walk("F:\_python") for file in f]
print(x)
>>> ["F:\_python\dict.py", "F:\_python\progr.txt", "F:\_python\readl.py"]
os.listdir()
- get only txt files
arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
print(arr_txt)
>>> ["work.txt", "3ebooks.txt"]
Using
glob
to get the full path of the files
If I should need the absolute path of the files:
from path import path
from glob import glob
x = [path(f).abspath() for f in glob("F:\*.txt")]
for f in x:
print(f)
>>> F:acquistionline.txt
>>> F:acquisti_2018.txt
>>> F:ootstrap_jquery_ecc.txt
Using
os.path.isfile
to avoid directories in the list
import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)
>>> ["a simple game.py", "data.txt", "decorator.py"]
Using
pathlib
from Python 3.4
import pathlib
flist = []
for p in pathlib.Path(".").iterdir():
if p.is_file():
print(p)
flist.append(p)
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speak_gui2.py
>>> thumb.PNG
With list comprehension
:
flist = [p for p in pathlib.Path(".").iterdir() if p.is_file()]
Alternatively, use pathlib.Path()
instead of pathlib.Path(".")
Use glob method in pathlib.Path()
import pathlib
py = pathlib.Path().glob("*.py")
for file in py:
print(file)
>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py
Get all and only files with os.walk
import os
x = [i[2] for i in os.walk(".")]
y=[]
for t in x:
for f in t:
y.append(f)
print(y)
>>> ["append_to_list.py", "data.txt", "data1.txt", "data2.txt", "data_180617", "os_walk.py", "READ2.py", "read_data.py", "somma_defaltdic.py", "substitute_words.py", "sum_data.py", "data.txt", "data1.txt", "data_180617"]
Get only files with next and walk in a directory
import os
x = next(os.walk("F://python"))[2]
print(x)
>>> ["calculator.bat","calculator.py"]
Get only directories with next and walk in a directory
import os
next(os.walk("F://python"))[1] # for the current dir use (".")
>>> ["python3","others"]
Get all the subdir names with
walk
for r,d,f in os.walk("F:\_python"):
for dirs in d:
print(dirs)
>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints
os.scandir()
from Python 3.5 and greater
import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)
>>> ["calculator.bat","calculator.py"]
# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.
import os
with os.scandir() as i:
for entry in i:
if entry.is_file():
print(entry.name)
>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG
Examples:
Ex. 1: How many files are there in the subdirectories?
In this example, we look for the number of files that are included in all the directory and its subdirectories.
import os
def count(dir, counter=0):
"returns number of files in dir and subdirs"
for pack in os.walk(dir):
for f in pack[2]:
counter += 1
return dir + " : " + str(counter) + "files"
print(count("F:\python"))
>>> "F:\python" : 12057 files"
Ex.2: How to copy all files from a directory to another?
A script to make order in your computer finding all files of a type (default: pptx) and copying them in a new folder.
import os
import shutil
from path import path
destination = "F:\file_copied"
# os.makedirs(destination)
def copyfile(dir, filetype="pptx", counter=0):
"Searches for pptx (or other - pptx is the default) files and copies them"
for pack in os.walk(dir):
for f in pack[2]:
if f.endswith(filetype):
fullpath = pack[0] + "\" + f
print(fullpath)
shutil.copy(fullpath, destination)
counter += 1
if counter > 0:
print("-" * 30)
print(" ==> Found in: `" + dir + "` : " + str(counter) + " files
")
for dir in os.listdir():
"searches for folders that starts with `_`"
if dir[0] == "_":
# copyfile(dir, filetype="pdf")
copyfile(dir, filetype="txt")
>>> _compiti18Compito Contabilità 1conti.txt
>>> _compiti18Compito Contabilità 1modula4.txt
>>> _compiti18Compito Contabilità 1moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files
Ex. 3: How to get all the files in a txt file
In case you want to create a txt file with all the file names:
import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
for eachfile in os.listdir():
mylist += eachfile + "
"
file.write(mylist)
Example: txt with all the files of an hard drive
"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""
import os
# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding="utf-8") as testo:
for root, dirs, files in os.walk("D:\"):
for file in files:
listafile.append(file)
percorso.append(root + "\" + file)
testo.write(file + "
")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
for file in listafile:
testo_ordinato.write(file + "
")
with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
for file in percorso:
file_percorso.write(file + "
")
os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")
All the file of C: in one text file
This is a shorter version of the previous code. Change the folder where to start finding the files if you need to start from another position. This code generate a 50 mb on text file on my computer with something less then 500.000 lines with files with the complete path.
import os
with open("file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk("C:\"):
for file in f:
filewrite.write(f"{r + file}
")
How to write a file with all paths in a folder of a type
With this function you can create a txt file that will have the name of a type of file that you look for (ex. pngfile.txt) with all the full path of all the files of that type. It can be useful sometimes, I think.
import os
def searchfiles(extension=".ttf", folder="H:\"):
"Create a txt file with all the file of a type"
with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
filewrite.write(f"{r + file}
")
# looking for png file (fonts) in the hard disk H:
searchfiles(".png", "H:\")
>>> H:4bs_18Dolphins5.png
>>> H:4bs_18Dolphins6.png
>>> H:4bs_18Dolphins7.png
>>> H:5_18marketing htmlassetsimageslogo2.png
>>> H:7z001.png
>>> H:7z002.png
(New) Find all files and open them with tkinter GUI
I just wanted to add in this 2019 a little app to search for all files in a dir and be able to open them by doubleclicking on the name of the file in the list.
import tkinter as tk
import os
def searchfiles(extension=".txt", folder="H:\"):
"insert all files in the listbox"
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
lb.insert(0, r + "\" + file)
def open_file():
os.startfile(lb.get(lb.curselection()[0]))
root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles(".png", "H:\"))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()
How to open a file using the open with statement filter: Questions
List comprehension vs. lambda + filter
5 answers
I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.
My code looked like this:
my_list = [x for x in my_list if x.attribute == value]
But then I thought, wouldn"t it be better to write it like this?
my_list = filter(lambda x: x.attribute == value, my_list)
It"s more readable, and if needed for performance the lambda could be taken out to gain something.
Question is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way‚Ñ¢ entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?
Answer #1
It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter
+lambda
, but use whichever you find easier.
There are two things that may slow down your use of filter
.
The first is the function call overhead: as soon as you use a Python function (whether created by def
or lambda
) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn"t think much about performance until you"ve timed your code and found it to be a bottleneck, but the difference will be there.
The other overhead that might apply is that the lambda is being forced to access a scoped variable (value
). That is slower than accessing a local variable and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function so it will also be accessing value
through a closure and this difference won"t apply.
The other option to consider is to use a generator instead of a list comprehension:
def filterbyvalue(seq, value):
for el in seq:
if el.attribute==value: yield el
Then in your main code (which is where readability really matters) you"ve replaced both list comprehension and filter with a hopefully meaningful function name.
Answer #2
This is a somewhat religious issue in Python. Even though Guido considered removing map
, filter
and reduce
from Python 3, there was enough of a backlash that in the end only reduce
was moved from built-ins to functools.reduce.
Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value]
as all the behaviour is on the surface not inside the filter function.
I would not worry too much about the performance difference between the two approaches as it is marginal. I would really only optimise this if it proved to be the bottleneck in your application which is unlikely.
Also since the BDFL wanted filter
gone from the language then surely that automatically makes list comprehensions more Pythonic ;-)
How do I do a not equal in Django queryset filtering?
5 answers
In Django model QuerySets, I see that there is a __gt
and __lt
for comparative values, but is there a __ne
or !=
(not equals)? I want to filter out using a not equals. For example, for
Model:
bool a;
int x;
I want to do
results = Model.objects.exclude(a=True, x!=5)
The !=
is not correct syntax. I also tried __ne
.
I ended up using:
results = Model.objects.exclude(a=True, x__lt=5).exclude(a=True, x__gt=5)
Answer #1
You can use Q objects for this. They can be negated with the ~
operator and combined much like normal Python expressions:
from myapp.models import Entry
from django.db.models import Q
Entry.objects.filter(~Q(id=3))
will return all entries except the one(s) with 3
as their ID:
[<Entry: Entry object>, <Entry: Entry object>, <Entry: Entry object>, ...]