- array_combine() function: array_combine() is a built-in PHP function that combines two arrays into a new array, using the elements of the first array as keys and the elements of the second array as values. Both arrays must contain the same number of elements.
- range() function: range() is a built-in PHP function that creates an array of sequential elements, such as integers or letters, running from a start value (low) to an end value (high). The first element of the result is the low value and the last element is the high value.
- count() function: count() returns the number of elements in an array. It returns 0 for a variable that has been set to an empty array. For a variable that is not set, older PHP versions also return 0 (with a warning since PHP 7.2), while PHP 8 throws a TypeError.
- array_values() function: array_values() returns an array of the values of another array, which may contain key-value pairs or just values. The new array stores all the values and assigns them consecutive numeric keys starting at 0.
<?php
// Declare an associative array
$arr = array(
    0 => 'Tony',
    1 => 'Stark',
    2 => 'Iron',
    3 => 'Man'
);

echo "Array before re-indexing\n";

// Use a foreach loop to print the array elements
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}

// The new indexes start at 3
$New_start_index = 3;

// range() builds the new keys (3..6) and array_combine() pairs them
// with the existing values
$arr = array_combine(
    range($New_start_index, count($arr) + ($New_start_index - 1)),
    array_values($arr)
);

echo "Array after re-indexing\n";

// Use a foreach loop to print the array elements
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}
?>
Output:
Array before re-indexing
Index: 0 Value: Tony
Index: 1 Value: Stark
Index: 2 Value: Iron
Index: 3 Value: Man
Array after re-indexing
Index: 3 Value: Tony
Index: 4 Value: Stark
Index: 5 Value: Iron
Index: 6 Value: Man
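For comparison, the same re-indexing idea can be sketched in Python, the language used in the questions further down this page. This is only a rough analogue: it assumes a dict stands in for the PHP associative array, and the helper name reindex_from is hypothetical.

```python
def reindex_from(mapping, new_start_index):
    """Return a new dict whose keys run from new_start_index upward.

    Hypothetical helper: range() builds the new keys and zip() pairs
    them with the existing values, much like array_combine() in PHP.
    """
    values = list(mapping.values())
    new_keys = range(new_start_index, new_start_index + len(values))
    return dict(zip(new_keys, values))


arr = {0: "Tony", 1: "Stark", 2: "Iron", 3: "Man"}
reindexed = reindex_from(arr, 3)
for key, value in reindexed.items():
    print("Index:", key, "Value:", value)
```

The original dict is left untouched; only the returned copy carries the new keys.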
Example 2. Prepend some placeholder data to the beginning of the array, then slice the array from the new start index.
<?php
// Declare an associative array
$arr = array(
    0 => 'Tony',
    1 => 'Stark',
    2 => 'Iron',
    3 => 'Man'
);

echo "Array before re-indexing\n";

// Use a foreach loop to print the array elements
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}

// The new indexes start at 3
$New_start_index = 3;
$raw_data = range(0, $New_start_index - 1);

// Add placeholder data to the beginning of the array
foreach ($raw_data as $key => $value) {
    array_unshift($arr, $value);
}
$arr = array_values($arr);

// Drop the placeholder elements so the keys start at $New_start_index;
// the fourth argument (true) preserves the numeric keys
$arr = array_slice($arr, $New_start_index, count($arr), true);

echo "Array after re-indexing\n";

// Use a foreach loop to print the array
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}
?>
Output:
Array before re-indexing
Index: 0 Value: Tony
Index: 1 Value: Stark
Index: 2 Value: Iron
Index: 3 Value: Man
Array after re-indexing
Index: 3 Value: Tony
Index: 4 Value: Stark
Index: 5 Value: Iron
Index: 6 Value: Man
In this example we first prepend placeholder data to the array with a loop and then remove the data we added, so this is also not the best choice for re-indexing an array. This method is not suitable for re-indexing alphabetic keys.
Example 3: This example re-indexes an array with alphabetic keys so that the keys start at 'p'. Two additional functions are used to re-index alphabetic keys:
- ord() function: ord() is a built-in PHP function that returns the ASCII value of the first character of a string.
- chr() function: chr() is a built-in PHP function that converts an ASCII value to a character.
<?php
// Declare an associative array
$arr = array(
    'a' => 'India',
    'b' => 'America',
    'c' => 'Russia',
    'd' => 'China'
);

echo "Array before re-indexing\n";

// Use a foreach loop to print the array elements
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}

// The new keys start at 'p'
$New_start_index = 'p';

// ord() gets the ASCII value of the start letter and
// chr() converts the ASCII value of the last key back to a letter
$arr = array_combine(
    range($New_start_index, chr(count($arr) + (ord($New_start_index) - 1))),
    array_values($arr)
);

echo "Array after re-indexing\n";

// Use a foreach loop to print the array
foreach ($arr as $key => $value) {
    echo "Index: " . $key . " Value: " . $value . "\n";
}
?>
Output:
Array before re-indexing
Index: a Value: India
Index: b Value: America
Index: c Value: Russia
Index: d Value: China
Array after re-indexing
Index: p Value: India
Index: q Value: America
Index: r Value: Russia
Index: s Value: China
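Python also has built-in ord() and chr() functions, so the alphabetic re-indexing can be sketched there as well. The snippet below is an illustration under assumptions, not a translation of the PHP code: a dict replaces the associative array and the helper name reindex_letters is hypothetical.

```python
def reindex_letters(mapping, start_letter):
    """Hypothetical helper: re-key a dict with consecutive letters."""
    values = list(mapping.values())
    first = ord(start_letter)  # code point of the first new key
    new_keys = [chr(first + offset) for offset in range(len(values))]
    return dict(zip(new_keys, values))


arr = {"a": "India", "b": "America", "c": "Russia", "d": "China"}
relabelled = reindex_letters(arr, "p")
print(relabelled)
```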
How to reindex an array in PHP?: StackOverflow Questions
How can I make a time delay in Python?
I would like to know how to put a time delay in a Python script.
Answer #1:
import time
time.sleep(5) # Delays for 5 seconds. You can also use a float value.
Here is another example where something is run approximately once a minute:
import time
while True:
    print("This prints once a minute.")
    time.sleep(60) # Delay for 1 minute (60 seconds).
Answer #2:
You can use the sleep() function in the time module. It can take a float argument for sub-second resolution.
from time import sleep
sleep(0.1) # Time in seconds
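To check how long a delay actually lasted, one option is to time the call with time.monotonic(). This is a minimal sketch; the helper name timed_sleep is an assumption made for illustration.

```python
import time


def timed_sleep(seconds):
    """Sleep for `seconds` and return the measured elapsed time.

    Hypothetical helper; time.monotonic() is used because it is not
    affected by system clock adjustments.
    """
    start = time.monotonic()
    time.sleep(seconds)  # accepts floats for sub-second delays
    return time.monotonic() - start


elapsed = timed_sleep(0.1)
print("slept for about %.3f seconds" % elapsed)
```

The measured value is usually slightly larger than the requested delay, because the operating system decides when the thread resumes.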
Answer #3:
How can I make a time delay in Python?
In a single thread I suggest the sleep function:
>>> from time import sleep
>>> sleep(4)
This function actually suspends the processing of the thread in which it is called by the operating system, allowing other threads and processes to execute while it sleeps.
Use it for that purpose, or simply to delay a function from executing. For example:
>>> def party_time():
...     print("hooray!")
...
>>> sleep(3); party_time()
hooray!
"hooray!" is printed 3 seconds after I hit Enter.
Example using sleep with multiple threads and processes
Again, sleep suspends your thread - it uses next to zero processing power.
To demonstrate, create a script like this (I first attempted this in an interactive Python 3.5 shell, but sub-processes can't find the party_later function for some reason):
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor, as_completed
from time import sleep, time

def party_later(kind="", n=""):
    sleep(3)
    return kind + n + " party time!: " + __name__

def main():
    with ProcessPoolExecutor() as proc_executor:
        with ThreadPoolExecutor() as thread_executor:
            start_time = time()
            proc_future1 = proc_executor.submit(party_later, kind="proc", n="1")
            proc_future2 = proc_executor.submit(party_later, kind="proc", n="2")
            thread_future1 = thread_executor.submit(party_later, kind="thread", n="1")
            thread_future2 = thread_executor.submit(party_later, kind="thread", n="2")
            for f in as_completed([
              proc_future1, proc_future2, thread_future1, thread_future2,]):
                print(f.result())
            end_time = time()
            print("total time to execute four 3-sec functions:", end_time - start_time)

if __name__ == "__main__":
    main()
Example output from this script:
thread1 party time!: __main__
thread2 party time!: __main__
proc1 party time!: __mp_main__
proc2 party time!: __mp_main__
total time to execute four 3-sec functions: 3.4519670009613037
Multithreading
You can trigger a function to be called at a later time in a separate thread with the Timer threading object:
>>> from threading import Timer
>>> t = Timer(3, party_time, args=None, kwargs=None)
>>> t.start()
>>>
>>> hooray!
>>>
The blank line illustrates that the function printed to my standard output, and I had to hit Enter to ensure I was on a prompt.
The upside of this method is that while the Timer thread was waiting, I was able to do other things, in this case, hitting Enter one time - before the function executed (see the first empty prompt).
There isn't a respective object in the multiprocessing library. You can create one, but it probably doesn't exist for a reason. A sub-thread makes a lot more sense for a simple timer than a whole new subprocess.
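Outside the interactive shell, the Timer approach might look like the sketch below. The short 0.2 second delay and the results list are assumptions chosen so the script finishes quickly and its effect can be inspected; the second timer shows that a pending Timer can be cancelled before it fires.

```python
from threading import Timer

results = []


def party_time():
    # Runs in the Timer's own thread once the delay has elapsed.
    results.append("hooray!")


t = Timer(0.2, party_time)   # call party_time after roughly 0.2 seconds
t.start()
results.append("main thread keeps working")
t.join()                     # wait for the timer thread to finish

cancelled = Timer(5, party_time)
cancelled.start()
cancelled.cancel()           # stops the timer before it fires
cancelled.join()

print(results)
```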
Answer #4:
Delays can be also implemented by using the following methods.
The first method:
import time
time.sleep(5) # Delay for 5 seconds.
The second method to delay would be using the implicit wait method of a Selenium WebDriver:
driver.implicitly_wait(5)
The third method (an explicit wait, also Selenium) is more useful when you have to wait until a particular action is completed or until an element is found:
self.wait.until(EC.presence_of_element_located((By.ID, "UserName")))
How to delete a file or folder in Python?
How do I delete a file or folder in Python?
Answer #1:
- os.remove() removes a file.
- os.rmdir() removes an empty directory.
- shutil.rmtree() deletes a directory and all its contents.
Path objects from the Python 3.4+ pathlib module also expose these instance methods:
- pathlib.Path.unlink() removes a file or symbolic link.
- pathlib.Path.rmdir() removes an empty directory.
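As an illustration, the calls listed above can be exercised safely inside a throwaway directory. The file and directory names below are made up for the sketch.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Work inside a scratch directory so nothing real is deleted.
base = Path(tempfile.mkdtemp())

file_a = base / "a.txt"
file_a.write_text("demo")
os.remove(file_a)                 # delete a single file

empty_dir = base / "empty"
empty_dir.mkdir()
os.rmdir(empty_dir)               # delete an empty directory

full_dir = base / "full"
full_dir.mkdir()
(full_dir / "b.txt").write_text("demo")
shutil.rmtree(full_dir)           # delete a directory and its contents

file_c = base / "c.txt"
file_c.write_text("demo")
file_c.unlink()                   # pathlib equivalent of os.remove

remaining = sorted(p.name for p in base.iterdir())
print(remaining)
base.rmdir()                      # the scratch directory is empty again
```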
Answer #3:
Python syntax to delete a file
import os
os.remove("/tmp/<file_name>.txt")
Or
import os
os.unlink("/tmp/<file_name>.txt")
Or
Or use the pathlib library (Python version >= 3.4):
import pathlib
file_to_rem = pathlib.Path("/tmp/<file_name>.txt")
file_to_rem.unlink()
Path.unlink(missing_ok=False)
The unlink method removes a file or symbolic link.
If missing_ok is false (the default), FileNotFoundError is raised if the path does not exist.
If missing_ok is true, FileNotFoundError exceptions will be ignored (same behavior as the POSIX rm -f command).
Changed in version 3.8: The missing_ok parameter was added.
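The effect of missing_ok can be seen with a path that never existed. This sketch assumes Python 3.8 or newer and uses a made-up file name inside a scratch directory.

```python
import tempfile
from pathlib import Path

# Hypothetical path inside a scratch directory; the file is never created.
scratch = Path(tempfile.mkdtemp())
missing = scratch / "does_not_exist.txt"

try:
    missing.unlink()                  # missing_ok defaults to False
    raised = False
except FileNotFoundError:
    raised = True

missing.unlink(missing_ok=True)       # Python 3.8+: silently does nothing
scratch.rmdir()
print("FileNotFoundError raised without missing_ok:", raised)
```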
Best practice
- First check whether the file or folder exists, and only then delete it. This can be achieved in two ways:
a. os.path.isfile("/path/to/file")
b. Use exception handling.
EXAMPLE for os.path.isfile
#!/usr/bin/python
import os
myfile="/tmp/foo.txt"
## If file exists, delete it ##
if os.path.isfile(myfile):
    os.remove(myfile)
else: ## Show an error ##
    print("Error: %s file not found" % myfile)
Exception Handling
#!/usr/bin/python
import os
## Get input ##
myfile = input("Enter file name to delete: ")  # use raw_input() on Python 2
## Try to delete the file ##
try:
    os.remove(myfile)
except OSError as e: ## if failed, report it back to the user ##
    print("Error: %s - %s." % (e.filename, e.strerror))
RESPECTIVE OUTPUT
Enter file name to delete : demo.txt
Error: demo.txt - No such file or directory.
Enter file name to delete : rrr.txt
Error: rrr.txt - Operation not permitted.
Enter file name to delete : foo.txt
Python syntax to delete a folder
shutil.rmtree()
Example for shutil.rmtree()
#!/usr/bin/python
import os
import sys
import shutil
# Get directory name
mydir = input("Enter directory name: ")  # use raw_input() on Python 2
## Try to remove tree; if failed show an error using try...except on screen
try:
    shutil.rmtree(mydir)
except OSError as e:
    print("Error: %s - %s." % (e.filename, e.strerror))
Answer #4:
Here is a robust function that uses both os.remove and shutil.rmtree:
import os
import shutil

def remove(path):
    """ param <path> could either be relative or absolute. """
    if os.path.isfile(path) or os.path.islink(path):
        os.remove(path)  # remove the file
    elif os.path.isdir(path):
        shutil.rmtree(path)  # remove the directory and all its contents
    else:
        raise ValueError("file {} is not a file or dir.".format(path))
Is there a simple way to delete a list element by value?
I want to remove a value from a list if it exists in the list (which it may not).
a = [1, 2, 3, 4]
b = a.index(6)
del a[b]
print(a)
The above case (in which it does not exist) shows the following error:
Traceback (most recent call last):
File "D:zjm_codea.py", line 6, in <module>
b = a.index(6)
ValueError: list.index(x): x not in list
So I have to do this:
a = [1, 2, 3, 4]
try:
    b = a.index(6)
    del a[b]
except:
    pass
print(a)
But is there not a simpler way to do this?
Answer #1:
To remove an element's first occurrence in a list, simply use list.remove:
>>> a = ["a", "b", "c", "d"]
>>> a.remove("b")
>>> print(a)
["a", "c", "d"]
Mind that it does not remove all occurrences of your element. Use a list comprehension for that.
>>> a = [10, 20, 30, 40, 20, 30, 40, 20, 70, 20]
>>> a = [x for x in a if x != 20]
>>> print(a)
[10, 30, 40, 30, 40, 70]
Answer #2:
Usually Python will throw an Exception if you tell it to do something it can't, so you'll have to do either:
if c in a:
    a.remove(c)
or:
try:
    a.remove(c)
except ValueError:
    pass
An Exception isn't necessarily a bad thing as long as it's one you're expecting and handle properly.
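If this pattern is needed in several places, it could be wrapped in a small function. The helper name remove_first is hypothetical; it simply packages the try/except form shown above and reports whether anything was removed.

```python
def remove_first(items, value):
    """Remove the first occurrence of value in place; report success.

    Hypothetical helper wrapping list.remove in the try/except pattern.
    """
    try:
        items.remove(value)
    except ValueError:
        return False
    return True


a = [1, 2, 3, 4]
removed_missing = remove_first(a, 6)   # 6 is absent, list is unchanged
removed_present = remove_first(a, 2)   # 2 is present and gets removed
print(a, removed_missing, removed_present)
```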
InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately
Tried to perform REST GET through python requests with the following code and I got error.
Code snip:
import requests
header = {"Authorization": "Bearer..."}
url = az_base_url + az_subscription_id + "/resourcegroups/Default-Networking/resources?" + az_api_version
r = requests.get(url, headers=header)
Error:
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:79:
InsecurePlatformWarning: A true SSLContext object is not available.
This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail.
For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
My python version is 2.7.3. I tried to install urllib3 and requests[security] as some other thread suggests, I still got the same error.
Wonder if anyone can provide some tips?
Answer #1:
The docs give a fair indication of what's required; however, requests allows us to skip a few steps:
You only need to install the security package extras (thanks @admdrew for pointing it out)
$ pip install requests[security]
or install them directly:
$ pip install pyopenssl ndg-httpsclient pyasn1
Requests will then automatically inject pyopenssl into urllib3.
If you're on Ubuntu, you may run into trouble installing pyopenssl; you'll need these dependencies:
$ apt-get install libffi-dev libssl-dev
Answer #2:
If you are not able to upgrade your Python version to 2.7.9, and want to suppress warnings,
you can downgrade your "requests" version to 2.5.3:
pip install requests==2.5.3
Bugfix disclosure / Warning introduced in 2.6.0
Dynamic instantiation from string name of a class in dynamically imported module?
In python, I have to instantiate certain class, knowing its name in a string, but this class "lives" in a dynamically imported module. An example follows:
loader-class script:
import sys
class loader:
    def __init__(self, module_name, class_name): # both args are strings
        try:
            __import__(module_name)
            modul = sys.modules[module_name]
            instance = modul.class_name() # obviously this doesn't work, here is my main problem!
        except ImportError:
            # manage import error
some-dynamically-loaded-module script:
class myName:
    # etc...
I use this arrangement to make any dynamically-loaded-module to be used by the loader-class following certain predefined behaviours in the dyn-loaded-modules...
Answer #1:
You can use getattr
getattr(module, class_name)
to access the class. More complete code:
module = __import__(module_name)
class_ = getattr(module, class_name)
instance = class_()
As mentioned below, we may use importlib
import importlib
module = importlib.import_module(module_name)
class_ = getattr(module, class_name)
instance = class_()
Answer #2:
tl;dr
Import the root module with importlib.import_module and load the class by its name using the getattr function:
# Standard import
import importlib
# Load "module.submodule.MyClass"
MyClass = getattr(importlib.import_module("module.submodule"), "MyClass")
# Instantiate the class (pass arguments to the constructor, if needed)
instance = MyClass()
explanations
You probably don't want to use __import__ to dynamically import a module by name, as it does not allow you to import submodules:
>>> mod = __import__("os.path")
>>> mod.join
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: "module" object has no attribute "join"
Here is what the Python doc says about __import__:
Note: This is an advanced function that is not needed in everyday
Python programming, unlike importlib.import_module().
Instead, use the standard importlib module to dynamically import a module by name. With getattr you can then instantiate a class by its name:
import importlib
my_module = importlib.import_module("module.submodule")
MyClass = getattr(my_module, "MyClass")
instance = MyClass()
You could also write:
import importlib
module_name, class_name = "module.submodule.MyClass".rsplit(".", 1)
MyClass = getattr(importlib.import_module(module_name), class_name)
instance = MyClass()
This code is valid in Python ≥ 2.7 (including Python 3).
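Put together as a reusable function, the approach might look like this. The helper name load_class is hypothetical, and collections.OrderedDict is used only because it is a standard-library class that is always available.

```python
import importlib


def load_class(dotted_path):
    """Hypothetical helper: return the class named by 'package.module.Class'."""
    module_name, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# collections.OrderedDict is only a convenient standard-library stand-in.
OrderedDict = load_class("collections.OrderedDict")
instance = OrderedDict(a=1)
print(type(instance).__name__, dict(instance))
```

A missing module raises ImportError and a missing class raises AttributeError, so a caller can handle the two failures separately.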
pandas loc vs. iloc vs. at vs. iat?
Recently began branching out from my safe place (R) into Python and am a bit confused by the cell localization/selection in Pandas. I've read the documentation but I'm struggling to understand the practical implications of the various localization/selection options.
Is there a reason why I should ever use .loc or .iloc over at and iat, or vice versa? In what situations should I use which method?
Note: future readers be aware that this question is old and was written before pandas v0.20 when there used to exist a function called .ix. This method was later split into two - loc and iloc - to make the explicit distinction between positional and label based indexing. Please beware that ix was discontinued due to inconsistent behavior and being hard to grok, and no longer exists in current versions of pandas (>= 1.0).
Answer #1:
loc: only works on index labels
iloc: works on position
at: gets scalar values. It's a very fast loc
iat: gets scalar values. It's a very fast iloc
Also, at and iat are meant to access a scalar, that is, a single element in the dataframe, while loc and iloc are meant to access several elements at the same time, potentially to perform vectorized operations.
http://pyciencia.blogspot.com/2015/05/obtener-y-filtrar-datos-de-un-dataframe.html
How do I merge two dictionaries in a single expression (taking union of dictionaries)?
Question by Carl Meyer
I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update() method would be what I need, if it returned its result instead of modifying a dictionary in-place.
>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}
How can I get that final merged dictionary in z, not x?
(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I'm looking for as well.)
Answer #1:
How can I merge two Python dictionaries in a single expression?
For dictionaries x and y, z becomes a shallowly-merged dictionary with values from y replacing those from x.
In Python 3.9.0 or greater (released 5 October 2020), PEP 584 was implemented and provides the simplest method:
z = x | y # NOTE: 3.9+ ONLY
In Python 3.5 or greater:
z = {**x, **y}
In Python 2, (or 3.4 or lower) write a function:
def merge_two_dicts(x, y):
    z = x.copy() # start with keys and values of x
    z.update(y) # modifies z with keys and values of y
    return z
and now:
z = merge_two_dicts(x, y)
Explanation
Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:
x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}
The desired result is to get a new dictionary (z) with the values merged, and the second dictionary's values overwriting those from the first.
>>> z
{"a": 1, "b": 3, "c": 4}
A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is
z = {**x, **y}
And it is indeed a single expression.
Note that we can merge in with literal notation as well:
z = {**x, "foo": 1, "bar": 2, **y}
and now:
>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}
It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What's New in Python 3.5 document.
However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:
z = x.copy()
z.update(y) # which returns None since it mutates z
In both approaches, y will come second and its values will replace x's values, thus b will point to 3 in our final result.
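The approaches can be compared side by side in a short script. The version check is an addition for this sketch so that it also runs on interpreters older than 3.9; the variable names are illustrative.

```python
import sys

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

unpacked = {**x, **y}        # Python 3.5+

copied = x.copy()            # two-step version, works on any Python
copied.update(y)

if sys.version_info >= (3, 9):
    piped = x | y            # Python 3.9+ only
else:
    piped = unpacked         # fallback so the sketch runs on older versions

print(unpacked, copied, piped)
print(x, y)                  # the inputs are left unchanged
```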
Not yet on Python 3.5, but want a single expression
If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant approach that is still correct is to put it in a function:
def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z
and then you have a single expression:
z = merge_two_dicts(x, y)
You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:
def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result
This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g:
z = merge_dicts(a, b, c, d, e, f, g)
and key-value pairs in g will take precedence over dictionaries a to f, and so on.
Critiques of Other Answers
Don't use what you see in the formerly accepted answer:
z = dict(x.items() + y.items())
In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists -
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: "dict_items" and "dict_items"
and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.
Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this:
>>> c = dict(a.items() | b.items())
This example demonstrates what happens when values are unhashable:
>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: "list"
Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:
>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}
Another hack you should not use:
z = dict(x, **y)
This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic.
Here's an example of the usage being remediated in django.
Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
From the mailing list, Guido van Rossum, the creator of the language, wrote:
I am fine with
declaring dict({}, **{1:3}) illegal, since after all it is abuse of
the ** mechanism.
and
Apparently dict(x, **y) is going around as "cool hack" for "call
x.update(y) and return x". Personally, I find it more despicable than
cool.
It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:
dict(a=1, b=10, c=11)
instead of
{"a": 1, "b": 10, "c": 11}
Response to comments
Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.
Again, it doesn't work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:
>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}
This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.
I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.
More comments:
dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.
My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.
{**x, **y} does not seem to handle nested dictionaries. The contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.
Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first's values being overwritten by the second's - in a single expression.
Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:
from copy import deepcopy
def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z
Usage:
>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}
Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".
Less Performant But Correct Ad-hocs
These approaches are less performant, but they will provide correct behavior.
They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence).
You can also chain the dictionaries manually inside a dict comprehension:
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
itertools.chain will chain the iterators over the key-value pairs in the correct order:
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
Performance Analysis
I'm only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)
from timeit import repeat
from itertools import chain
x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")
def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
In Python 3.8.1, NixOS:
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
Resources on Dictionaries
- My explanation of Python's dictionary implementation, updated for 3.6.
- Answer on how to add new keys to a dictionary
- Mapping two lists into a dictionary
- The official Python docs on dictionaries
- The Dictionary Even Mightier - talk by Brandon Rhodes at Pycon 2017
- Modern Python Dictionaries, A Confluence of Great Ideas - talk by Raymond Hettinger at Pycon 2017
Answer #2:
In your case, what you can do is:
z = dict(list(x.items()) + list(y.items()))
This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value:
>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python 2, you can even remove the list() calls. To create z:
>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}
If you use Python version 3.9.0a4 or greater, then you can directly use:
x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{"a": 1, "c": 11, "b": 10}
Answer #3:
An alternative:
z = x.copy()
z.update(y)
Answer #4:
Another, more concise, option:
z = dict(x, **y)
Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can't recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.
Answer #5:
This probably won't be a popular answer, but you almost certainly do not want to do this. If you want a copy that's a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.
In addition, when you use .items() (pre Python 3.0), you're creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.
In terms of time:
>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()\ntemp.update(y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027
IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation were only added in Python 2.3, whereas copy() and update() will work in older versions.