// Store the file name in a variable
$file
=
’filename.pdf’
;
$filename
=
’filename.pdf ’
;
// Title content type
header (
’Content-type: application / pdf’
);
header (
’ Content-Disposition: inline; filename = "’
.
$filename
.
’ "’
);
header (
’ Content-Transfer-Encoding: binary’
);
header (
’ Accept-Ranges: bytes’
);
// Read the file
@ readfile (
$file
);
?>
Output: Example 2:
This example displays the format and explains each code section
// Location of the PDF file
// on the server
$filename
=
"/ path / to / the / file.pdf"
;
// Title content type
header (
"Content-type: application / pdf"
);
header (
" Content-Length: "
.
filesize
( $filename
));
// Send the file to the browser.
readfile (
$filename
);
?>
Output:
How do I open PDFs in a web browser using PHP?: StackOverflow Questions
How do I list all files of a directory?
How can I list all files of a directory in Python and add them to a list
?
Answer #1:
os.listdir()
will get you everything that"s in a directory - files and directories.
If you want just files, you could either filter this down using os.path
:
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
or you could use os.walk()
which will yield two lists for each directory it visits - splitting into files and dirs for you. If you only want the top directory you can break the first time it yields
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk(mypath):
f.extend(filenames)
break
or, shorter:
from os import walk
filenames = next(walk(mypath), (None, None, []))[2] # [] if no file
Answer #2:
I prefer using the glob
module, as it does pattern matching and expansion.
import glob
print(glob.glob("/home/adam/*"))
It does pattern matching intuitively
import glob
# All files ending with .txt
print(glob.glob("/home/adam/*.txt"))
# All files ending with .txt with depth of 2 folder
print(glob.glob("/home/adam/*/*.txt"))
It will return a list with the queried files:
["/home/adam/file1.txt", "/home/adam/file2.txt", .... ]
Answer #3:
os.listdir()
- list in the current directory
With listdir in os module you get the files and the folders in the current dir
import os
arr = os.listdir()
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
Looking in a directory
arr = os.listdir("c:\files")
glob
from glob
with glob you can specify a type of file to list like this
import glob
txtfiles = []
for file in glob.glob("*.txt"):
txtfiles.append(file)
glob
in a list comprehension
mylist = [f for f in glob.glob("*.txt")]
get the full path of only files in the current directory
import os
from os import listdir
from os.path import isfile, join
cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles)
["G:\getfilesname\getfilesname.py", "G:\getfilesname\example.txt"]
Getting the full path name with os.path.abspath
You get the full path in return
import os
files_path = [os.path.abspath(x) for x in os.listdir()]
print(files_path)
["F:\documentiapplications.txt", "F:\documenticollections.txt"]
Walk: going through sub directories
os.walk returns the root, the directories list and the files list, that is why I unpacked them in r, d, f in the for loop; it, then, looks for other files and directories in the subfolders of the root and so on until there are no subfolders.
import os
# Getting the current work directory (cwd)
thisdir = os.getcwd()
# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
for file in f:
if file.endswith(".docx"):
print(os.path.join(r, file))
os.listdir()
: get files in the current directory (Python 2)
In Python 2, if you want the list of the files in the current directory, you have to give the argument as "." or os.getcwd() in the os.listdir method.
import os
arr = os.listdir(".")
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
To go up in the directory tree
# Method 1
x = os.listdir("..")
# Method 2
x= os.listdir("/")
Get files: os.listdir()
in a particular directory (Python 2 and 3)
import os
arr = os.listdir("F:\python")
print(arr)
>>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]
Get files of a particular subdirectory with os.listdir()
import os
x = os.listdir("./content")
os.walk(".")
- current directory
import os
arr = next(os.walk("."))[2]
print(arr)
>>> ["5bs_Turismo1.pdf", "5bs_Turismo1.pptx", "esperienza.txt"]
next(os.walk("."))
and os.path.join("dir", "file")
import os
arr = []
for d,r,f in next(os.walk("F:\_python")):
for file in f:
arr.append(os.path.join(r,file))
for f in arr:
print(files)
>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt
next(os.walk("F:\")
- get the full path - list comprehension
[os.path.join(r,file) for r,d,f in next(os.walk("F:\_python")) for file in f]
>>> ["F:\_python\dict_class.py", "F:\_python\programmi.txt"]
os.walk
- get full path - all files in sub dirs**
x = [os.path.join(r,file) for r,d,f in os.walk("F:\_python") for file in f]
print(x)
>>> ["F:\_python\dict.py", "F:\_python\progr.txt", "F:\_python\readl.py"]
os.listdir()
- get only txt files
arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
print(arr_txt)
>>> ["work.txt", "3ebooks.txt"]
Using glob
to get the full path of the files
If I should need the absolute path of the files:
from path import path
from glob import glob
x = [path(f).abspath() for f in glob("F:\*.txt")]
for f in x:
print(f)
>>> F:acquistionline.txt
>>> F:acquisti_2018.txt
>>> F:ootstrap_jquery_ecc.txt
Using os.path.isfile
to avoid directories in the list
import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)
>>> ["a simple game.py", "data.txt", "decorator.py"]
Using pathlib
from Python 3.4
import pathlib
flist = []
for p in pathlib.Path(".").iterdir():
if p.is_file():
print(p)
flist.append(p)
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speak_gui2.py
>>> thumb.PNG
With list comprehension
:
flist = [p for p in pathlib.Path(".").iterdir() if p.is_file()]
Alternatively, use pathlib.Path()
instead of pathlib.Path(".")
Use glob method in pathlib.Path()
import pathlib
py = pathlib.Path().glob("*.py")
for file in py:
print(file)
>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py
Get all and only files with os.walk
import os
x = [i[2] for i in os.walk(".")]
y=[]
for t in x:
for f in t:
y.append(f)
print(y)
>>> ["append_to_list.py", "data.txt", "data1.txt", "data2.txt", "data_180617", "os_walk.py", "READ2.py", "read_data.py", "somma_defaltdic.py", "substitute_words.py", "sum_data.py", "data.txt", "data1.txt", "data_180617"]
Get only files with next and walk in a directory
import os
x = next(os.walk("F://python"))[2]
print(x)
>>> ["calculator.bat","calculator.py"]
Get only directories with next and walk in a directory
import os
next(os.walk("F://python"))[1] # for the current dir use (".")
>>> ["python3","others"]
Get all the subdir names with walk
for r,d,f in os.walk("F:\_python"):
for dirs in d:
print(dirs)
>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints
os.scandir()
from Python 3.5 and greater
import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)
>>> ["calculator.bat","calculator.py"]
# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.
import os
with os.scandir() as i:
for entry in i:
if entry.is_file():
print(entry.name)
>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG
Examples:
Ex. 1: How many files are there in the subdirectories?
In this example, we look for the number of files that are included in all the directory and its subdirectories.
import os
def count(dir, counter=0):
"returns number of files in dir and subdirs"
for pack in os.walk(dir):
for f in pack[2]:
counter += 1
return dir + " : " + str(counter) + "files"
print(count("F:\python"))
>>> "F:\python" : 12057 files"
Ex.2: How to copy all files from a directory to another?
A script to make order in your computer finding all files of a type (default: pptx) and copying them in a new folder.
import os
import shutil
from path import path
destination = "F:\file_copied"
# os.makedirs(destination)
def copyfile(dir, filetype="pptx", counter=0):
"Searches for pptx (or other - pptx is the default) files and copies them"
for pack in os.walk(dir):
for f in pack[2]:
if f.endswith(filetype):
fullpath = pack[0] + "\" + f
print(fullpath)
shutil.copy(fullpath, destination)
counter += 1
if counter > 0:
print("-" * 30)
print(" ==> Found in: `" + dir + "` : " + str(counter) + " files
")
for dir in os.listdir():
"searches for folders that starts with `_`"
if dir[0] == "_":
# copyfile(dir, filetype="pdf")
copyfile(dir, filetype="txt")
>>> _compiti18Compito Contabilità 1conti.txt
>>> _compiti18Compito Contabilità 1modula4.txt
>>> _compiti18Compito Contabilità 1moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files
Ex. 3: How to get all the files in a txt file
In case you want to create a txt file with all the file names:
import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
for eachfile in os.listdir():
mylist += eachfile + "
"
file.write(mylist)
Example: txt with all the files of an hard drive
"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""
import os
# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding="utf-8") as testo:
for root, dirs, files in os.walk("D:\"):
for file in files:
listafile.append(file)
percorso.append(root + "\" + file)
testo.write(file + "
")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
for file in listafile:
testo_ordinato.write(file + "
")
with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
for file in percorso:
file_percorso.write(file + "
")
os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")
All the file of C: in one text file
This is a shorter version of the previous code. Change the folder where to start finding the files if you need to start from another position. This code generate a 50 mb on text file on my computer with something less then 500.000 lines with files with the complete path.
import os
with open("file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk("C:\"):
for file in f:
filewrite.write(f"{r + file}
")
How to write a file with all paths in a folder of a type
With this function you can create a txt file that will have the name of a type of file that you look for (ex. pngfile.txt) with all the full path of all the files of that type. It can be useful sometimes, I think.
import os
def searchfiles(extension=".ttf", folder="H:\"):
"Create a txt file with all the file of a type"
with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
filewrite.write(f"{r + file}
")
# looking for png file (fonts) in the hard disk H:
searchfiles(".png", "H:\")
>>> H:4bs_18Dolphins5.png
>>> H:4bs_18Dolphins6.png
>>> H:4bs_18Dolphins7.png
>>> H:5_18marketing htmlassetsimageslogo2.png
>>> H:7z001.png
>>> H:7z002.png
(New) Find all files and open them with tkinter GUI
I just wanted to add in this 2019 a little app to search for all files in a dir and be able to open them by doubleclicking on the name of the file in the list.

import tkinter as tk
import os
def searchfiles(extension=".txt", folder="H:\"):
"insert all files in the listbox"
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
lb.insert(0, r + "\" + file)
def open_file():
os.startfile(lb.get(lb.curselection()[0]))
root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles(".png", "H:\"))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()
Answer #4:
import os
os.listdir("somedirectory")
will return a list of all files and directories in "somedirectory".
Answer #5:
A one-line solution to get only list of files (no subdirectories):
filenames = next(os.walk(path))[2]
or absolute pathnames:
paths = [os.path.join(path, fn) for fn in next(os.walk(path))[2]]
How do I open PDFs in a web browser using PHP?: StackOverflow Questions
How do I merge two dictionaries in a single expression (taking union of dictionaries)?
Question by Carl Meyer
I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update()
method would be what I need, if it returned its result instead of modifying a dictionary in-place.
>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}
How can I get that final merged dictionary in z
, not x
?
(To be extra-clear, the last-one-wins conflict-handling of dict.update()
is what I"m looking for as well.)
Answer #1:
How can I merge two Python dictionaries in a single expression?
For dictionaries x
and y
, z
becomes a shallowly-merged dictionary with values from y
replacing those from x
.
In Python 3.9.0 or greater (released 17 October 2020): PEP-584, discussed here, was implemented and provides the simplest method:
z = x | y # NOTE: 3.9+ ONLY
In Python 3.5 or greater:
z = {**x, **y}
In Python 2, (or 3.4 or lower) write a function:
def merge_two_dicts(x, y):
z = x.copy() # start with keys and values of x
z.update(y) # modifies z with keys and values of y
return z
and now:
z = merge_two_dicts(x, y)
Explanation
Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}
The desired result is to get a new dictionary (z
) with the values merged, and the second dictionary"s values overwriting those from the first.
>>> z
{"a": 1, "b": 3, "c": 4}
A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is
z = {**x, **y}
And it is indeed a single expression.
Note that we can merge in with literal notation as well:
z = {**x, "foo": 1, "bar": 2, **y}
and now:
>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}
It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What"s New in Python 3.5 document.
However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:
z = x.copy()
z.update(y) # which returns None since it mutates z
In both approaches, y
will come second and its values will replace x
"s values, thus b
will point to 3
in our final result.
Not yet on Python 3.5, but want a single expression
If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant while the correct approach is to put it in a function:
def merge_two_dicts(x, y):
"""Given two dictionaries, merge them into a new dict as a shallow copy."""
z = x.copy()
z.update(y)
return z
and then you have a single expression:
z = merge_two_dicts(x, y)
You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:
def merge_dicts(*dict_args):
"""
Given any number of dictionaries, shallow copy and merge into a new dict,
precedence goes to key-value pairs in latter dictionaries.
"""
result = {}
for dictionary in dict_args:
result.update(dictionary)
return result
This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a
to g
:
z = merge_dicts(a, b, c, d, e, f, g)
and key-value pairs in g
will take precedence over dictionaries a
to f
, and so on.
Critiques of Other Answers
Don"t use what you see in the formerly accepted answer:
z = dict(x.items() + y.items())
In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you"re adding two dict_items
objects together, not two lists -
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: "dict_items" and "dict_items"
and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items()))
. This is a waste of resources and computation power.
Similarly, taking the union of items()
in Python 3 (viewitems()
in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don"t do this:
>>> c = dict(a.items() | b.items())
This example demonstrates what happens when values are unhashable:
>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: "list"
Here"s an example where y
should have precedence, but instead the value from x
is retained due to the arbitrary order of sets:
>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}
Another hack you should not use:
z = dict(x, **y)
This uses the dict
constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it"s difficult to read, it"s not the intended usage, and so it is not Pythonic.
Here"s an example of the usage being remediated in django.
Dictionaries are intended to take hashable keys (e.g. frozenset
s or tuples), but this method fails in Python 3 when keys are not strings.
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
From the mailing list, Guido van Rossum, the creator of the language, wrote:
I am fine with
declaring dict({}, **{1:3}) illegal, since after all it is abuse of
the ** mechanism.
and
Apparently dict(x, **y) is going around as "cool hack" for "call
x.update(y) and return x". Personally, I find it more despicable than
cool.
It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y)
is for creating dictionaries for readability purposes, e.g.:
dict(a=1, b=10, c=11)
instead of
{"a": 1, "b": 10, "c": 11}
Response to comments
Despite what Guido says, dict(x, **y)
is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.
Again, it doesn"t work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict
broke this consistency in Python 2:
>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}
This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.
I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.
More comments:
dict(x.items() + y.items())
is still the most readable solution for Python 2. Readability counts.
My response: merge_two_dicts(x, y)
actually seems much clearer to me, if we"re actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.
{**x, **y}
does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.
Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first"s values being overwritten by the second"s - in a single expression.
Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:
from copy import deepcopy
def dict_of_dicts_merge(x, y):
z = {}
overlapping_keys = x.keys() & y.keys()
for key in overlapping_keys:
z[key] = dict_of_dicts_merge(x[key], y[key])
for key in x.keys() - overlapping_keys:
z[key] = deepcopy(x[key])
for key in y.keys() - overlapping_keys:
z[key] = deepcopy(y[key])
return z
Usage:
>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}
Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".
Less Performant But Correct Ad-hocs
These approaches are less performant, but they will provide correct behavior.
They will be much less performant than copy
and update
or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence)
You can also chain the dictionaries manually inside a dict comprehension:
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
itertools.chain
will chain the iterators over the key-value pairs in the correct order:
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
Performance Analysis
I"m only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)
from timeit import repeat
from itertools import chain
x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")
def merge_two_dicts(x, y):
z = x.copy()
z.update(y)
return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
In Python 3.8.1, NixOS:
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
Resources on Dictionaries
- My explanation of Python"s dictionary implementation, updated for 3.6.
- Answer on how to add new keys to a dictionary
- Mapping two lists into a dictionary
- The official Python docs on dictionaries
- The Dictionary Even Mightier - talk by Brandon Rhodes at Pycon 2017
- Modern Python Dictionaries, A Confluence of Great Ideas - talk by Raymond Hettinger at Pycon 2017
Answer #2:
In your case, what you can do is:
z = dict(list(x.items()) + list(y.items()))
This will, as you want it, put the final dict in z
, and make the value for key b
be properly overridden by the second (y
) dict"s value:
>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python 2, you can even remove the list()
calls. To create z:
>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python version 3.9.0a4 or greater, then you can directly use:
x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{"a": 1, "c": 11, "b": 10}
Answer #3:
An alternative:
z = x.copy()
z.update(y)
Answer #4:
Another, more concise, option:
z = dict(x, **y)
Note: this has become a popular answer, but it is important to point out that if y
has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can"t recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.
Answer #5:
This probably won"t be a popular answer, but you almost certainly do not want to do this. If you want a copy that"s a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.
In addition, when you use .items() (pre Python 3.0), you"re creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.
In terms of time:
>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()
temp.update(y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027
IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.
How do I open PDFs in a web browser using PHP?: StackOverflow Questions
How can I open multiple files using "with open" in Python?
I want to change a couple of files at one time, iff I can write to all of them. I"m wondering if I somehow can combine the multiple open calls with the with
statement:
try:
with open("a", "w") as a and open("b", "w") as b:
do_something()
except IOError as e:
print "Operation failed: %s" % e.strerror
If that"s not possible, what would an elegant solution to this problem look like?
Answer #1:
As of Python 2.7 (or 3.1 respectively) you can write
with open("a", "w") as a, open("b", "w") as b:
do_something()
In earlier versions of Python, you can sometimes use
contextlib.nested()
to nest context managers. This won"t work as expected for opening multiples files, though -- see the linked documentation for details.
In the rare case that you want to open a variable number of files all at the same time, you can use contextlib.ExitStack
, starting from Python version 3.3:
with ExitStack() as stack:
files = [stack.enter_context(open(fname)) for fname in filenames]
# Do something with "files"
Most of the time you have a variable set of files, you likely want to open them one after the other, though.
Answer #2:
For opening many files at once or for long file paths, it may be useful to break things up over multiple lines. From the Python Style Guide as suggested by @Sven Marnach in comments to another answer:
with open("/path/to/InFile.ext", "r") as file_1,
open("/path/to/OutFile.ext", "w") as file_2:
file_2.write(file_1.read())
open() in Python does not create a file if it doesn"t exist
What is the best way to open a file as read/write if it exists, or if it does not, then create it and open it as read/write? From what I read, file = open("myfile.dat", "rw")
should do this, right?
It is not working for me (Python 2.6.2) and I"m wondering if it is a version problem, or not supposed to work like that or what.
The bottom line is, I just need a solution for the problem. I am curious about the other stuff, but all I need is a nice way to do the opening part.
The enclosing directory was writeable by user and group, not other (I"m on a Linux system... so permissions 775 in other words), and the exact error was:
IOError: no such file or directory.
Answer #1:
You should use open
with the w+
mode:
file = open("myfile.dat", "w+")
Answer #2:
The advantage of the following approach is that the file is properly closed at the block"s end, even if an exception is raised on the way. It"s equivalent to try-finally
, but much shorter.
with open("file.dat";"a+") as f:
f.write(...)
...
a+ Opens a file for both appending and reading. The file pointer is
at the end of the file if the file exists. The file opens in the
append mode. If the file does not exist, it creates a new file for
reading and writing. -Python file modes
seek() method sets the file"s current position.
f.seek(pos [, (0|1|2)])
pos .. position of the r/w pointer
[] .. optionally
() .. one of ->
0 .. absolute position
1 .. relative position to current
2 .. relative position from end
Only "rwab+" characters are allowed; there must be exactly one of "rwa" - see Stack Overflow question Python file modes detail.
Difference between modes a, a+, w, w+, and r+ in built-in open function?
In the python built-in open function, what is the exact difference between the modes w
, a
, w+
, a+
, and r+
?
In particular, the documentation implies that all of these will allow writing to the file, and says that it opens the files for "appending", "writing", and "updating" specifically, but does not define what these terms mean.
Answer #1:
The opening modes are exactly the same as those for the C standard library function fopen()
.
The BSD fopen
manpage defines them as follows:
The argument mode points to a string beginning with one of the following
sequences (Additional characters may follow these sequences.):
``r"" Open text file for reading. The stream is positioned at the
beginning of the file.
``r+"" Open for reading and writing. The stream is positioned at the
beginning of the file.
``w"" Truncate file to zero length or create text file for writing.
The stream is positioned at the beginning of the file.
``w+"" Open for reading and writing. The file is created if it does not
exist, otherwise it is truncated. The stream is positioned at
the beginning of the file.
``a"" Open for writing. The file is created if it does not exist. The
stream is positioned at the end of the file. Subsequent writes
to the file will always end up at the then current end of file,
irrespective of any intervening fseek(3) or similar.
``a+"" Open for reading and writing. The file is created if it does not
exist. The stream is positioned at the end of the file. Subse-
quent writes to the file will always end up at the then current
end of file, irrespective of any intervening fseek(3) or similar.
How do I open PDFs in a web browser using PHP?: StackOverflow Questions
How do I merge two dictionaries in a single expression (taking union of dictionaries)?
Question by Carl Meyer
I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update()
method would be what I need, if it returned its result instead of modifying a dictionary in-place.
>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}
How can I get that final merged dictionary in z
, not x
?
(To be extra-clear, the last-one-wins conflict-handling of dict.update()
is what I"m looking for as well.)
Answer #1:
How can I merge two Python dictionaries in a single expression?
For dictionaries x
and y
, z
becomes a shallowly-merged dictionary with values from y
replacing those from x
.
In Python 3.9.0 or greater (released 17 October 2020): PEP-584, discussed here, was implemented and provides the simplest method:
z = x | y # NOTE: 3.9+ ONLY
In Python 3.5 or greater:
z = {**x, **y}
In Python 2, (or 3.4 or lower) write a function:
def merge_two_dicts(x, y):
z = x.copy() # start with keys and values of x
z.update(y) # modifies z with keys and values of y
return z
and now:
z = merge_two_dicts(x, y)
Explanation
Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}
The desired result is to get a new dictionary (z
) with the values merged, and the second dictionary"s values overwriting those from the first.
>>> z
{"a": 1, "b": 3, "c": 4}
A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is
z = {**x, **y}
And it is indeed a single expression.
Note that we can merge in with literal notation as well:
z = {**x, "foo": 1, "bar": 2, **y}
and now:
>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}
It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What"s New in Python 3.5 document.
However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:
z = x.copy()
z.update(y) # which returns None since it mutates z
In both approaches, y
will come second and its values will replace x
"s values, thus b
will point to 3
in our final result.
Not yet on Python 3.5, but want a single expression
If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant while the correct approach is to put it in a function:
def merge_two_dicts(x, y):
"""Given two dictionaries, merge them into a new dict as a shallow copy."""
z = x.copy()
z.update(y)
return z
and then you have a single expression:
z = merge_two_dicts(x, y)
You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:
def merge_dicts(*dict_args):
"""
Given any number of dictionaries, shallow copy and merge into a new dict,
precedence goes to key-value pairs in latter dictionaries.
"""
result = {}
for dictionary in dict_args:
result.update(dictionary)
return result
This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a
to g
:
z = merge_dicts(a, b, c, d, e, f, g)
and key-value pairs in g
will take precedence over dictionaries a
to f
, and so on.
Critiques of Other Answers
Don"t use what you see in the formerly accepted answer:
z = dict(x.items() + y.items())
In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you"re adding two dict_items
objects together, not two lists -
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: "dict_items" and "dict_items"
and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items()))
. This is a waste of resources and computation power.
Similarly, taking the union of items()
in Python 3 (viewitems()
in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don"t do this:
>>> c = dict(a.items() | b.items())
This example demonstrates what happens when values are unhashable:
>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: "list"
Here"s an example where y
should have precedence, but instead the value from x
is retained due to the arbitrary order of sets:
>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}
Another hack you should not use:
z = dict(x, **y)
This uses the dict
constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it"s difficult to read, it"s not the intended usage, and so it is not Pythonic.
Here"s an example of the usage being remediated in django.
Dictionaries are intended to take hashable keys (e.g. frozenset
s or tuples), but this method fails in Python 3 when keys are not strings.
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
From the mailing list, Guido van Rossum, the creator of the language, wrote:
I am fine with
declaring dict({}, **{1:3}) illegal, since after all it is abuse of
the ** mechanism.
and
Apparently dict(x, **y) is going around as "cool hack" for "call
x.update(y) and return x". Personally, I find it more despicable than
cool.
It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y)
is for creating dictionaries for readability purposes, e.g.:
dict(a=1, b=10, c=11)
instead of
{"a": 1, "b": 10, "c": 11}
Response to comments
Despite what Guido says, dict(x, **y)
is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.
Again, it doesn"t work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict
broke this consistency in Python 2:
>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}
This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.
I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.
More comments:
dict(x.items() + y.items())
is still the most readable solution for Python 2. Readability counts.
My response: merge_two_dicts(x, y)
actually seems much clearer to me, if we"re actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.
{**x, **y}
does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.
Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first"s values being overwritten by the second"s - in a single expression.
Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:
from copy import deepcopy
def dict_of_dicts_merge(x, y):
z = {}
overlapping_keys = x.keys() & y.keys()
for key in overlapping_keys:
z[key] = dict_of_dicts_merge(x[key], y[key])
for key in x.keys() - overlapping_keys:
z[key] = deepcopy(x[key])
for key in y.keys() - overlapping_keys:
z[key] = deepcopy(y[key])
return z
Usage:
>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}
Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".
Less Performant But Correct Ad-hocs
These approaches are less performant, but they will provide correct behavior.
They will be much less performant than copy
and update
or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence)
You can also chain the dictionaries manually inside a dict comprehension:
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
itertools.chain
will chain the iterators over the key-value pairs in the correct order:
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
Performance Analysis
I"m only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)
from timeit import repeat
from itertools import chain
x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")
def merge_two_dicts(x, y):
z = x.copy()
z.update(y)
return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
In Python 3.8.1, NixOS:
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
Resources on Dictionaries
- My explanation of Python"s dictionary implementation, updated for 3.6.
- Answer on how to add new keys to a dictionary
- Mapping two lists into a dictionary
- The official Python docs on dictionaries
- The Dictionary Even Mightier - talk by Brandon Rhodes at Pycon 2017
- Modern Python Dictionaries, A Confluence of Great Ideas - talk by Raymond Hettinger at Pycon 2017
Answer #2:
In your case, what you can do is:
z = dict(list(x.items()) + list(y.items()))
This will, as you want it, put the final dict in z
, and make the value for key b
be properly overridden by the second (y
) dict"s value:
>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python 2, you can even remove the list()
calls. To create z:
>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python version 3.9.0a4 or greater, then you can directly use:
x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{"a": 1, "c": 11, "b": 10}
Answer #3:
An alternative:
z = x.copy()
z.update(y)
Answer #4:
Another, more concise, option:
z = dict(x, **y)
Note: this has become a popular answer, but it is important to point out that if y
has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can"t recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.
Answer #5:
This probably won"t be a popular answer, but you almost certainly do not want to do this. If you want a copy that"s a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.
In addition, when you use .items() (pre Python 3.0), you"re creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.
In terms of time:
>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()
temp.update(y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027
IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.