NLP | Parallel list processing with execnet


In the code below, each integer is simply doubled, but any pure computation could be done instead. This is the module that execnet executes. It receives 2-tuples of (i, arg), assumes arg is a number, and sends back (i, arg * 2).

Code:

if __name__ == '__channelexec__':
    for (i, arg) in channel:
        channel.send((i, arg * 2))

To use this module to double each item in a list, import the plists module and call plists.map() with the remote_double module and a list of integers to double.

Code: Using plists

import plists, remote_double

plists.map(remote_double, range(10))

Output:

 [0, 2, 4, 6, 8, 10, 12, 14, 16, 18] 

The map() function is defined in plists.py. It takes a pure module, a list of arguments, and an optional list of (spec, count) two-tuples. The default specs are [('popen', 2)], which means that two local gateways and channels are opened. Once these channels are open, they can be placed in an itertools.cycle, which creates an infinite iterator that goes back to the beginning as soon as it reaches the end.
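The round-robin behaviour of itertools.cycle can be seen in isolation with a small sketch. The channel names below are hypothetical string stand-ins, not real execnet channels; the point is only to show how cycling spreads arguments almost evenly.

```python
import itertools

# Hypothetical stand-ins for two open execnet channels.
channels = ["channel-0", "channel-1"]

# Record which arguments each stand-in channel would receive.
assignments = {name: [] for name in channels}

# zip() stops when the finite argument list is exhausted,
# even though cycle() itself never ends.
for arg, name in zip(range(5), itertools.cycle(channels)):
    assignments[name].append(arg)

print(assignments)
```

With five arguments and two channels, the first channel ends up with three arguments and the second with two.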

Now each argument can be sent to a channel for processing, and since the channels are cycled, each channel receives an almost equal share of the arguments. This is where i comes in: the order in which the results come back is unknown, so i, the index of each argument in the list, is passed to and from the channel so that the results can be combined in the original order. The function then waits for the results with a MultiChannel receive queue and inserts them into a prefilled list of the same length as the original arguments. After all expected results have arrived, it exits the gateways and returns the results, as shown in the code below:

Code:

import itertools, execnet

def map(mod, args, specs=[('popen', 2)]):
    gateways = []
    channels = []

    # open `count` gateways per spec, each executing `mod` remotely
    for spec, count in specs:
        for _ in range(count):
            gw = execnet.makegateway(spec)
            gateways.append(gw)
            channels.append(gw.remote_exec(mod))

    # send the arguments round-robin across the channels
    cyc = itertools.cycle(channels)

    for i, arg in enumerate(args):
        channel = next(cyc)
        channel.send((i, arg))

    mch = execnet.MultiChannel(channels)
    queue = mch.make_receive_queue()
    l = len(args)
    # creates a list of length l,
    # where each element is None
    results = [None] * l

    # results arrive in arbitrary order; i restores the original position
    for _ in range(l):
        channel, (i, result) = queue.get()
        results[i] = result

    for gw in gateways:
        gw.exit()

    return results
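The same send, collect, and reorder pattern can be tried without execnet installed. The sketch below is only an illustration under stated assumptions: threads and queue.Queue objects stand in for gateways and channels, and the names worker and local_map are hypothetical, not part of plists or execnet.

```python
import itertools
import queue
import threading

def worker(inbox, outbox):
    # Stand-in for the remote module: double each (i, arg)
    # until a None sentinel arrives.
    for i, arg in iter(inbox.get, None):
        outbox.put((i, arg * 2))

def local_map(args, workers=2):
    args = list(args)
    outbox = queue.Queue()                      # shared result queue
    inboxes = [queue.Queue() for _ in range(workers)]
    threads = [threading.Thread(target=worker, args=(q, outbox))
               for q in inboxes]
    for t in threads:
        t.start()

    # Round-robin distribution, tagging each argument with its index.
    for (i, arg), inbox in zip(enumerate(args), itertools.cycle(inboxes)):
        inbox.put((i, arg))

    # Results may arrive in any order; the index puts them back in place.
    results = [None] * len(args)
    for _ in args:
        i, result = outbox.get()
        results[i] = result

    # Shut the workers down, mirroring gw.exit().
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    return results

doubled = local_map(range(10))
```

Because every result carries its index, the returned list matches the input order no matter which worker finishes first.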

Code: Increase parallelization by changing the spec

plists.map(remote_double, range(10), [('popen', 4)])

Output:

 [0, 2, 4, 6, 8, 10, 12, 14, 16, 18] 

However, more parallelization does not necessarily mean faster processing. It depends on the available resources, and the more gateways and channels are open, the more overhead there is. Ideally, there should be one gateway and channel per CPU core to maximize resource utilization. plists.map() can be used with any pure module as long as it receives and sends back 2-tuples in which i is the first element. This pattern is most useful when there are many numbers to crunch and they need to be processed as quickly as possible.
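One way to follow the one-gateway-per-core guideline is to derive the count from os.cpu_count(). This is a minimal sketch: the helper name default_specs is hypothetical, and the fallback of 2 simply mirrors the default used by map() above.

```python
import os

def default_specs():
    # os.cpu_count() may return None when the count cannot be determined,
    # so fall back to 2, the same count as the map() default.
    cores = os.cpu_count() or 2
    return [("popen", cores)]

specs = default_specs()
print(specs)
```

The resulting list could then be passed as the third argument, for example plists.map(remote_double, range(10), default_specs()).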


Questions

How do I merge two dictionaries in a single expression (taking union of dictionaries)?

5 answers

By Carl Meyer

I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update() method would be what I need, if it returned its result instead of modifying a dictionary in-place.

>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}

How can I get that final merged dictionary in z, not x?

(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I'm looking for as well.)

Answer #1

How can I merge two Python dictionaries in a single expression?

For dictionaries x and y, z becomes a shallowly-merged dictionary with values from y replacing those from x.

  • In Python 3.9.0 or greater (released 17 October 2020): PEP 584 was implemented and provides the simplest method:

    z = x | y          # NOTE: 3.9+ ONLY
    
  • In Python 3.5 or greater:

    z = {**x, **y}
    
  • In Python 2 (or 3.4 or lower), write a function:

    def merge_two_dicts(x, y):
        z = x.copy()   # start with keys and values of x
        z.update(y)    # modifies z with keys and values of y
        return z
    

    and now:

    z = merge_two_dicts(x, y)
    

Explanation

Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

The desired result is to get a new dictionary (z) with the values merged, and the second dictionary's values overwriting those from the first.

>>> z
{"a": 1, "b": 3, "c": 4}

A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is

z = {**x, **y}

And it is indeed a single expression.

Note that we can merge in with literal notation as well:

z = {**x, "foo": 1, "bar": 2, **y}

and now:

>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}

It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What's New in Python 3.5 document.

However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:

z = x.copy()
z.update(y) # which returns None since it mutates z

In both approaches, y will come second and its values will replace x's values, thus b will point to 3 in our final result.
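A short check makes the equivalence of the two approaches concrete and also shows what "shallow" means in practice. The extra "nested" key is an assumption added purely for illustration; it is not part of the original example.

```python
x = {"a": 1, "b": 2, "nested": [1]}   # "nested" is an illustrative extra key
y = {"b": 3, "c": 4}

# Two-step copy-and-update (works on Python 2 and 3).
z1 = x.copy()
z1.update(y)

# Unpacking syntax (Python 3.5+).
z2 = {**x, **y}

# The merge is shallow: the nested list is shared with x, not copied.
shared = z2["nested"] is x["nested"]

print(z1 == z2, shared)
```

Neither approach modifies x or y, but mutable values such as the list are shared between the source and the merged dictionary.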

Not yet on Python 3.5, but want a single expression

If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant approach that is still correct is to put it in a function:

def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z

and then you have a single expression:

z = merge_two_dicts(x, y)

You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:

def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g:

z = merge_dicts(a, b, c, d, e, f, g) 

and key-value pairs in g will take precedence over dictionaries a to f, and so on.
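The precedence rule can be checked with three small dictionaries. This sketch repeats the merge_dicts definition so that it runs on its own; the dictionaries a, b, and c are made-up sample data.

```python
def merge_dicts(*dict_args):
    """Shallow-merge any number of dicts; later dicts take precedence."""
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

a = {"k": 1, "only_a": True}
b = {"k": 2}
c = {"k": 3, "only_c": True}

merged = merge_dicts(a, b, c)   # "k" comes from c, the last dictionary
empty = merge_dicts()           # zero dictionaries gives an empty dict

print(merged, empty)
```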

Critiques of Other Answers

Don't use what you see in the formerly accepted answer:

z = dict(x.items() + y.items())

In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists:

>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'

and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.

Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this:

>>> c = dict(a.items() | b.items())

This example demonstrates what happens when values are unhashable:

>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:

>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}

Another hack you should not use:

z = dict(x, **y)

This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic.

Here's an example of the usage being remediated in django.

Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.

>>> c = dict(a, **b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

From the mailing list, Guido van Rossum, the creator of the language, wrote:

I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism.

and

Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally, I find it more despicable than cool.

It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:

dict(a=1, b=10, c=11)

instead of

{"a": 1, "b": 10, "c": 11}

Response to comments

Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.

Again, it doesn't work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:

>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}

This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.

I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.

More comments:

dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.

My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.

{**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.

Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first's values being overwritten by the second's - in a single expression.

Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:

from copy import deepcopy

def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z

Usage:

>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}

Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".

Less Performant But Correct Ad-hocs

These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence).

You can also chain the dictionaries manually inside a dict comprehension:

{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7

or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):

dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2

itertools.chain will chain the iterators over the key-value pairs in the correct order:

from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2

Performance Analysis

I'm only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)

from timeit import repeat
from itertools import chain

x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")

def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))

In Python 3.8.1, NixOS:

>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux


Answer #2

In your case, what you can do is:

z = dict(list(x.items()) + list(y.items()))

This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value:

>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python 2, you can even remove the list() calls. To create z:

>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python version 3.9.0a4 or greater, then you can directly use:

x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{'a': 1, 'b': 10, 'c': 11}

Answer #3

An alternative:

z = x.copy()
z.update(y)

How to insert newlines on argparse help text?

5 answers

I'm using argparse in Python 2.7 for parsing input options. One of my options is a multiple choice. I want to make a list in its help text, e.g.

from argparse import ArgumentParser

parser = ArgumentParser(description="test")

parser.add_argument("-g", choices=["a", "b", "g", "d", "e"], default="a",
    help="Some option, where\n"
         " a = alpha\n"
         " b = beta\n"
         " g = gamma\n"
         " d = delta\n"
         " e = epsilon")

parser.parse_args()

However, argparse strips all newlines and consecutive spaces. The result looks like

~/Downloads:52$ python2.7 x.py -h
usage: x.py [-h] [-g {a,b,g,d,e}]

test

optional arguments:
  -h, --help      show this help message and exit
  -g {a,b,g,d,e}  Some option, where a = alpha b = beta g = gamma d = delta e
                  = epsilon

How to insert newlines in the help text?


Answer #1

Try using RawTextHelpFormatter:

from argparse import RawTextHelpFormatter
parser = ArgumentParser(description="test", formatter_class=RawTextHelpFormatter)
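Put together with the option from the question, a complete sketch might look like the following. It uses format_help() instead of parse_args() so it can run without command-line arguments, and prog="x.py" is an assumption chosen to match the script name in the question.

```python
from argparse import ArgumentParser, RawTextHelpFormatter

# RawTextHelpFormatter keeps the newlines in help strings as written.
parser = ArgumentParser(prog="x.py", description="test",
                        formatter_class=RawTextHelpFormatter)

parser.add_argument("-g", choices=["a", "b", "g", "d", "e"], default="a",
                    help="Some option, where\n"
                         " a = alpha\n"
                         " b = beta\n"
                         " g = gamma\n"
                         " d = delta\n"
                         " e = epsilon")

# Render the help text as a string rather than parsing sys.argv.
help_text = parser.format_help()
print(help_text)
```

Each choice now appears on its own line in the rendered help instead of being reflowed into one paragraph.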

Is a Python list guaranteed to have its elements stay in the order they are inserted in?

5 answers

If I have the following Python code

>>> x = []
>>> x = x + [1]
>>> x = x + [2]
>>> x = x + [3]
>>> x
[1, 2, 3]

Will x be guaranteed to always be [1,2,3], or are other orderings of the interim elements possible?


Answer #1

Yes, the order of elements in a Python list is persistent.
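A quick illustration, using made-up values: elements stay in the order they were added unless the list is explicitly changed, for example with insert() or sort().

```python
x = []
for value in (3, 1, 2):
    x.append(value)        # each append places the value at the end

snapshot = list(x)         # copy of the list after the appends

x.insert(0, 0)             # order only changes through explicit operations

print(snapshot, x)
```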

Inserting image into IPython notebook markdown

5 answers

I am starting to depend heavily on the IPython notebook app to develop and document algorithms. It is awesome; but there is something that seems like it should be possible, but I can't figure out how to do it:

I would like to insert a local image into my (local) IPython notebook markdown to aid in documenting an algorithm. I know enough to add something like <img src="image.png"> to the markdown, but that is about as far as my knowledge goes. I assume I could put the image in the directory represented by 127.0.0.1:8888 (or some subdirectory) to be able to access it, but I can't figure out where that directory is. (I'm working on a mac.) So, is it possible to do what I'm trying to do without too much trouble?

Answer #1

Most of the answers given so far go in the wrong direction, suggesting to load additional libraries and use code instead of markup. In IPython/Jupyter notebooks it is very simple. Make sure the cell is indeed a Markdown cell, and to display an image use:

![alt text](imagename.png "Title")

A further advantage compared to the other methods proposed is that you can display all common file formats, including jpg, png, and gif (animations).
