Javascript Side Project Ideas

| | | | | | | | |

Contents

To become an expert programmer, you need to practice. There are many fun and exciting Python projects for beginners. These projects allow you to create something useful while learning this fun programming language.

Books and videos can only take your knowledge to a certain level. The best way to hone your skills is to challenge yourself. Improve yourself by creating projects, such as the Python Beginner Projects in the list below.

Now is the time to put that knowledge to the test and start honing your programming experience with Python projects.

Reasons why you should learn Python

Proficiency in one or more programming languages ‚Äã‚Äãhas become desirable, because programming knowledge can lead to profitable and rewarding careers. The demand for Python developers continues to grow, especially as Python is the third largest / a> programming language in the world.

Also, the best companies like Instagram, Google, Spotify, Netflix, Dropbox, Instacart, and Reddit (to name a few) rely on Python. In short, adding Python to your resume will make you a more attractive candidate to potential employers.

There are many ways to learn Python. Some people learn Python from books. Others learn Python through online courses. If you need to be more convincing, check out this great resource to learn more about why you should learn Python .

Choosing a Python project for beginners

Unless you already have some programming skills behind you, you’ll want to make sure you’ve learned the basics of Python. If you’re new to learning Python, check out our beginner’s resources . You can also visit Codecademy and DataQuest for free courses.

Make sure not to confuse Python 2 & Python 3 . It is important to understand both languages. However, learning Python 3 keeps you the most up to date in the language.

Pick a topic that interests you

Don’t start with a project. Browse and find a topic that looks interesting. Not only will you have more fun if you are interested, but fueling that curiosity will keep you motivated to complete the project.

Choosing fun Python projects for beginners can be the difference between starting and ending a project. Often times, new programmers practice choosing a project that solves a daily problem.

Think about how the project will fit into your overall goals. For example, if your business goal is to develop applications, create a simple web application project.

Conversely, if your professional interest is data science , find a project that can analyze a dataset. In summary, there are a ton of great ideas for Python projects. They can be fun and help you reach your career goals or your career path.

Think small to make big gains

In other words Don’t choose a project that requires an expert skill level. Unless you really like the pressure, choosing one that is too difficult at first will only stress you out. It’s fine to dream of a big goal, but recognize that each step of that goal needs to be broken down into smaller steps.

Instead, start with simple Python projects first. Develop bigger ideas, such as web and desktop apps, 3D games, or even social media platforms.

Python projects for beginner developers: games and challenges

 Python games and challenges for beginners
Python games and challenges can improve troubleshooting skills.

It is said that the practice makes perfect. And you are not an expert at anything until you have practiced for over 10,000 hours. It may sound intimidating, but don’t be discouraged. These simple games and challenges will help you increase your understanding and your self-confidence.

For more game ideas, check out PyGame wiki for more Python tutorials and to get started typing on this command line.

Here is a list of nine great Python projects for beginners:

Rock, Paper, Scissors Game

One of the most beloved games of all time and a simple Python project to test your skills. Start by making the player against the computer. Skills Used: Better understand while loops and if statements.

Create a Twitter bot

Want to engage your Twitter followers even when working offline on other projects? You will need to register as a Twitter developer to do so , but don’t worry, it’s not as difficult as you might think.

Guess the number

This could be a fun Python project for groups or events where a random generator is needed. It is useful for organizing lotteries, table games or simply between players to guess a random number. Skills used: Familiarize yourself with the random function, variables, integers, printing, if / else and while loops.

MadLibs generator

Remember that game we played when we were kids? The game where we put silly words in the blanks and laugh hysterically as they are read back to us?

With a Mad Libs generator, you can relive those hilarious moments. This generator allows you to work on a wide range of Python skills. Skills used: strings, variables, concatenation, printing.

Hangman

Similar to generating a random number, this Python game substitutes a word where the user guesses the letters. You will also need to create a counter to count the number of bad letter attempts. Skills used: random library, boolean, input / output, character, string and length.

Password generator

Create a random password generator for your friends and family to protect their accounts ! Skills used: random library and sequences.

Roll the dice

Similar to the "Guess the Number" game above, the construction of a roll of the dice can be used to play. Or you can create one similar to a Magic 8-Ball to answer your deeper questions! Skills used: random library, printing, while loops.

Text-based adventure

This Python project is a simple mission game in which the user can browse different rooms and get a description of each. You will set limits on how far characters can travel, where they are going, and how to track their position. Skills used: variables, strings, input / output, if / else, print and list.

Secret Encrypt

Generate and decrypt secret ciphers. It works well with a fellow programmer where one of you creates a cipher and the other decrypts the secret message. Skills used: encryption methods.

Python projects for intermediate Python developers

Once you understand the Python programming language, addresses more advanced projects. Intermediate projects use more technical skills. They require extensive knowledge of Python. Although these projects are more difficult to achieve, you will learn a lot by taking them on.

Alarm Clock

Creating an alarm clock is an effective way to demonstrate your programming skills. It lets you design something that gives you a specific notification at a particular time. Make your alarm clock more advanced by making it play music or videos on the fly.

Tic-Tac-Toe

It’s time to take a new step in the development of the Python game. Tic-Tac-Toe might be a simple game to play, but it’s not that easy to program. The Pygame library is useful for this type of project. It comes with the necessary modules for sound and graphics.

Wikipedia article generator

In terms of what it does, it’s a pretty straightforward program. However, it can get quite complicated. The goal of the program is to find a random article on Wikipedia.

Next, the program asks the user if they want to view the article. If the user says yes, the program displays it.

Python projects for advanced Python developers

Finally, the next step to test your skills as a Python developer goes through advanced projects. These projects address more unusual aspects of programming and development.

Don’t worry if you have problems with projects like these, even experienced developers have problems with advanced programs. Take your time and try to learn something new from each of them.

Create an MP3 player

It’s time toditch the CDs and start working on your MP3 player. This Python project is to create a tool that plays audio files. The goal is to create a user interface that emulates the physical music player. When finished, you will have an MP3 player that works on your computer or laptop.

Quiz program

It’s quiz time ! Take your Python skills to the next level by creating a quiz app. Quiz apps present a series of questions to users and give them the opportunity to answer them. The quiz then provides the user’s results.

Experiment with your application. Design a quiz that answers immediately after a user gives an answer. Then create a quiz where users only receive the results after the quiz is complete. You can also put a timer on the quiz for each question.

Typing test

Creating a typing test in Python allows you to develop a unique program. It tests your typing speed, lets you create a GUI, and gives you a random phrase. It’s an advanced project, but it will teach you a lot about the design.

Python projects for beginners in data science

Python Data Science Projects
What does the data tell us?

Here is a list of free Python projects for beginners where you will surely find something that intrigues you and invites you to dig.With these you can create a visually stunning data structure project to present to classmates, friends , colleagues, or anyone else !

These datasets can be used for neural networks, deep learning, and machine learning projects:

Javascript Side Project Ideas: StackOverflow Questions

How do I list all files of a directory?

How can I list all files of a directory in Python and add them to a list?

Answer #1:

os.listdir() will get you everything that"s in a directory - files and directories.

If you want just files, you could either filter this down using os.path:

from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]

or you could use os.walk() which will yield two lists for each directory it visits - splitting into files and dirs for you. If you only want the top directory you can break the first time it yields

from os import walk

f = []
for (dirpath, dirnames, filenames) in walk(mypath):
    f.extend(filenames)
    break

or, shorter:

from os import walk

filenames = next(walk(mypath), (None, None, []))[2]  # [] if no file

Answer #2:

I prefer using the glob module, as it does pattern matching and expansion.

import glob
print(glob.glob("/home/adam/*"))

It does pattern matching intuitively

import glob
# All files ending with .txt
print(glob.glob("/home/adam/*.txt")) 
# All files ending with .txt with depth of 2 folder
print(glob.glob("/home/adam/*/*.txt")) 

It will return a list with the queried files:

["/home/adam/file1.txt", "/home/adam/file2.txt", .... ]

Answer #3:

os.listdir() - list in the current directory

With listdir in os module you get the files and the folders in the current dir

 import os
 arr = os.listdir()
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

Looking in a directory

arr = os.listdir("c:\files")

glob from glob

with glob you can specify a type of file to list like this

import glob

txtfiles = []
for file in glob.glob("*.txt"):
    txtfiles.append(file)

glob in a list comprehension

mylist = [f for f in glob.glob("*.txt")]

get the full path of only files in the current directory

import os
from os import listdir
from os.path import isfile, join

cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if 
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles) 

["G:\getfilesname\getfilesname.py", "G:\getfilesname\example.txt"]

Getting the full path name with os.path.abspath

You get the full path in return

 import os
 files_path = [os.path.abspath(x) for x in os.listdir()]
 print(files_path)
 
 ["F:\documentiapplications.txt", "F:\documenticollections.txt"]

Walk: going through sub directories

os.walk returns the root, the directories list and the files list, that is why I unpacked them in r, d, f in the for loop; it, then, looks for other files and directories in the subfolders of the root and so on until there are no subfolders.

import os

# Getting the current work directory (cwd)
thisdir = os.getcwd()

# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
    for file in f:
        if file.endswith(".docx"):
            print(os.path.join(r, file))

os.listdir(): get files in the current directory (Python 2)

In Python 2, if you want the list of the files in the current directory, you have to give the argument as "." or os.getcwd() in the os.listdir method.

 import os
 arr = os.listdir(".")
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

To go up in the directory tree

# Method 1
x = os.listdir("..")

# Method 2
x= os.listdir("/")

Get files: os.listdir() in a particular directory (Python 2 and 3)

 import os
 arr = os.listdir("F:\python")
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

Get files of a particular subdirectory with os.listdir()

import os

x = os.listdir("./content")

os.walk(".") - current directory

 import os
 arr = next(os.walk("."))[2]
 print(arr)
 
 >>> ["5bs_Turismo1.pdf", "5bs_Turismo1.pptx", "esperienza.txt"]

next(os.walk(".")) and os.path.join("dir", "file")

 import os
 arr = []
 for d,r,f in next(os.walk("F:\_python")):
     for file in f:
         arr.append(os.path.join(r,file))

 for f in arr:
     print(files)

>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt

next(os.walk("F:\") - get the full path - list comprehension

 [os.path.join(r,file) for r,d,f in next(os.walk("F:\_python")) for file in f]
 
 >>> ["F:\_python\dict_class.py", "F:\_python\programmi.txt"]

os.walk - get full path - all files in sub dirs**

x = [os.path.join(r,file) for r,d,f in os.walk("F:\_python") for file in f]
print(x)

>>> ["F:\_python\dict.py", "F:\_python\progr.txt", "F:\_python\readl.py"]

os.listdir() - get only txt files

 arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
 print(arr_txt)
 
 >>> ["work.txt", "3ebooks.txt"]

Using glob to get the full path of the files

If I should need the absolute path of the files:

from path import path
from glob import glob
x = [path(f).abspath() for f in glob("F:\*.txt")]
for f in x:
    print(f)

>>> F:acquistionline.txt
>>> F:acquisti_2018.txt
>>> F:ootstrap_jquery_ecc.txt

Using os.path.isfile to avoid directories in the list

import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)

>>> ["a simple game.py", "data.txt", "decorator.py"]

Using pathlib from Python 3.4

import pathlib

flist = []
for p in pathlib.Path(".").iterdir():
    if p.is_file():
        print(p)
        flist.append(p)

 >>> error.PNG
 >>> exemaker.bat
 >>> guiprova.mp3
 >>> setup.py
 >>> speak_gui2.py
 >>> thumb.PNG

With list comprehension:

flist = [p for p in pathlib.Path(".").iterdir() if p.is_file()]

Alternatively, use pathlib.Path() instead of pathlib.Path(".")

Use glob method in pathlib.Path()

import pathlib

py = pathlib.Path().glob("*.py")
for file in py:
    print(file)

>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py

Get all and only files with os.walk

import os
x = [i[2] for i in os.walk(".")]
y=[]
for t in x:
    for f in t:
        y.append(f)
print(y)

>>> ["append_to_list.py", "data.txt", "data1.txt", "data2.txt", "data_180617", "os_walk.py", "READ2.py", "read_data.py", "somma_defaltdic.py", "substitute_words.py", "sum_data.py", "data.txt", "data1.txt", "data_180617"]

Get only files with next and walk in a directory

 import os
 x = next(os.walk("F://python"))[2]
 print(x)
 
 >>> ["calculator.bat","calculator.py"]

Get only directories with next and walk in a directory

 import os
 next(os.walk("F://python"))[1] # for the current dir use (".")
 
 >>> ["python3","others"]

Get all the subdir names with walk

for r,d,f in os.walk("F:\_python"):
    for dirs in d:
        print(dirs)

>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints

os.scandir() from Python 3.5 and greater

import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)

>>> ["calculator.bat","calculator.py"]

# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.

import os
with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG

Examples:

Ex. 1: How many files are there in the subdirectories?

In this example, we look for the number of files that are included in all the directory and its subdirectories.

import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + "files"

print(count("F:\python"))

>>> "F:\python" : 12057 files"

Ex.2: How to copy all files from a directory to another?

A script to make order in your computer finding all files of a type (default: pptx) and copying them in a new folder.

import os
import shutil
from path import path

destination = "F:\file_copied"
# os.makedirs(destination)

def copyfile(dir, filetype="pptx", counter=0):
    "Searches for pptx (or other - pptx is the default) files and copies them"
    for pack in os.walk(dir):
        for f in pack[2]:
            if f.endswith(filetype):
                fullpath = pack[0] + "\" + f
                print(fullpath)
                shutil.copy(fullpath, destination)
                counter += 1
    if counter > 0:
        print("-" * 30)
        print("	==> Found in: `" + dir + "` : " + str(counter) + " files
")

for dir in os.listdir():
    "searches for folders that starts with `_`"
    if dir[0] == "_":
        # copyfile(dir, filetype="pdf")
        copyfile(dir, filetype="txt")


>>> _compiti18Compito Contabilità 1conti.txt
>>> _compiti18Compito Contabilità 1modula4.txt
>>> _compiti18Compito Contabilità 1moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files

Ex. 3: How to get all the files in a txt file

In case you want to create a txt file with all the file names:

import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
    for eachfile in os.listdir():
        mylist += eachfile + "
"
    file.write(mylist)

Example: txt with all the files of an hard drive

"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""

import os

# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding="utf-8") as testo:
    for root, dirs, files in os.walk("D:\"):
        for file in files:
            listafile.append(file)
            percorso.append(root + "\" + file)
            testo.write(file + "
")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
    for file in listafile:
        testo_ordinato.write(file + "
")

with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
    for file in percorso:
        file_percorso.write(file + "
")

os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")

All the file of C: in one text file

This is a shorter version of the previous code. Change the folder where to start finding the files if you need to start from another position. This code generate a 50 mb on text file on my computer with something less then 500.000 lines with files with the complete path.

import os

with open("file.txt", "w", encoding="utf-8") as filewrite:
    for r, d, f in os.walk("C:\"):
        for file in f:
            filewrite.write(f"{r + file}
")

How to write a file with all paths in a folder of a type

With this function you can create a txt file that will have the name of a type of file that you look for (ex. pngfile.txt) with all the full path of all the files of that type. It can be useful sometimes, I think.

import os

def searchfiles(extension=".ttf", folder="H:\"):
    "Create a txt file with all the file of a type"
    with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk(folder):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(f"{r + file}
")

# looking for png file (fonts) in the hard disk H:
searchfiles(".png", "H:\")

>>> H:4bs_18Dolphins5.png
>>> H:4bs_18Dolphins6.png
>>> H:4bs_18Dolphins7.png
>>> H:5_18marketing htmlassetsimageslogo2.png
>>> H:7z001.png
>>> H:7z002.png

(New) Find all files and open them with tkinter GUI

I just wanted to add in this 2019 a little app to search for all files in a dir and be able to open them by doubleclicking on the name of the file in the list. enter image description here

import tkinter as tk
import os

def searchfiles(extension=".txt", folder="H:\"):
    "insert all files in the listbox"
    for r, d, f in os.walk(folder):
        for file in f:
            if file.endswith(extension):
                lb.insert(0, r + "\" + file)

def open_file():
    os.startfile(lb.get(lb.curselection()[0]))

root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles(".png", "H:\"))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()

Answer #4:

import os
os.listdir("somedirectory")

will return a list of all files and directories in "somedirectory".

Answer #5:

A one-line solution to get only list of files (no subdirectories):

filenames = next(os.walk(path))[2]

or absolute pathnames:

paths = [os.path.join(path, fn) for fn in next(os.walk(path))[2]]

Javascript Side Project Ideas: StackOverflow Questions

How do I merge two dictionaries in a single expression (taking union of dictionaries)?

Question by Carl Meyer

I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update() method would be what I need, if it returned its result instead of modifying a dictionary in-place.

>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}

How can I get that final merged dictionary in z, not x?

(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I"m looking for as well.)

Answer #1:

How can I merge two Python dictionaries in a single expression?

For dictionaries x and y, z becomes a shallowly-merged dictionary with values from y replacing those from x.

  • In Python 3.9.0 or greater (released 17 October 2020): PEP-584, discussed here, was implemented and provides the simplest method:

    z = x | y          # NOTE: 3.9+ ONLY
    
  • In Python 3.5 or greater:

    z = {**x, **y}
    
  • In Python 2, (or 3.4 or lower) write a function:

    def merge_two_dicts(x, y):
        z = x.copy()   # start with keys and values of x
        z.update(y)    # modifies z with keys and values of y
        return z
    

    and now:

    z = merge_two_dicts(x, y)
    

Explanation

Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

The desired result is to get a new dictionary (z) with the values merged, and the second dictionary"s values overwriting those from the first.

>>> z
{"a": 1, "b": 3, "c": 4}

A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is

z = {**x, **y}

And it is indeed a single expression.

Note that we can merge in with literal notation as well:

z = {**x, "foo": 1, "bar": 2, **y}

and now:

>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}

It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What"s New in Python 3.5 document.

However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:

z = x.copy()
z.update(y) # which returns None since it mutates z

In both approaches, y will come second and its values will replace x"s values, thus b will point to 3 in our final result.

Not yet on Python 3.5, but want a single expression

If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant while the correct approach is to put it in a function:

def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z

and then you have a single expression:

z = merge_two_dicts(x, y)

You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:

def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g:

z = merge_dicts(a, b, c, d, e, f, g) 

and key-value pairs in g will take precedence over dictionaries a to f, and so on.

Critiques of Other Answers

Don"t use what you see in the formerly accepted answer:

z = dict(x.items() + y.items())

In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you"re adding two dict_items objects together, not two lists -

>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: "dict_items" and "dict_items"

and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.

Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don"t do this:

>>> c = dict(a.items() | b.items())

This example demonstrates what happens when values are unhashable:

>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: "list"

Here"s an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:

>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}

Another hack you should not use:

z = dict(x, **y)

This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it"s difficult to read, it"s not the intended usage, and so it is not Pythonic.

Here"s an example of the usage being remediated in django.

Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.

>>> c = dict(a, **b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

From the mailing list, Guido van Rossum, the creator of the language, wrote:

I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism.

and

Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally, I find it more despicable than cool.

It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:

dict(a=1, b=10, c=11)

instead of

{"a": 1, "b": 10, "c": 11}

Response to comments

Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.

Again, it doesn"t work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:

>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}

This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.

I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.

More comments:

dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.

My response: merge_two_dicts(x, y) actually seems much clearer to me, if we"re actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.

{**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.

Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first"s values being overwritten by the second"s - in a single expression.

Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:

from copy import deepcopy

def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z

Usage:

>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}

Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".

Less Performant But Correct Ad-hocs

These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence)

You can also chain the dictionaries manually inside a dict comprehension:

{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7

or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):

dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2

itertools.chain will chain the iterators over the key-value pairs in the correct order:

from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2

Performance Analysis

I"m only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)

from timeit import repeat
from itertools import chain

x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")

def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))

In Python 3.8.1, NixOS:

>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux

Resources on Dictionaries

Answer #2:

In your case, what you can do is:

z = dict(list(x.items()) + list(y.items()))

This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict"s value:

>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python 2, you can even remove the list() calls. To create z:

>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python version 3.9.0a4 or greater, then you can directly use:

x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{"a": 1, "c": 11, "b": 10}

Answer #3:

An alternative:

z = x.copy()
z.update(y)

Answer #4:

Another, more concise, option:

z = dict(x, **y)

Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can"t recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.

Answer #5:

This probably won"t be a popular answer, but you almost certainly do not want to do this. If you want a copy that"s a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.

In addition, when you use .items() (pre Python 3.0), you"re creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.

In terms of time:

>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()
temp.update(y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027

IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.

Javascript Side Project Ideas: StackOverflow Questions

Finding the index of an item in a list

Given a list ["foo", "bar", "baz"] and an item in the list "bar", how do I get its index (1) in Python?

Answer #1:

>>> ["foo", "bar", "baz"].index("bar")
1

Reference: Data Structures > More on Lists

Caveats follow

Note that while this is perhaps the cleanest way to answer the question as asked, index is a rather weak component of the list API, and I can"t remember the last time I used it in anger. It"s been pointed out to me in the comments that because this answer is heavily referenced, it should be made more complete. Some caveats about list.index follow. It is probably worth initially taking a look at the documentation for it:

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.

Linear time-complexity in list length

An index call checks every element of the list in order, until it finds a match. If your list is long, and you don"t know roughly where in the list it occurs, this search could become a bottleneck. In that case, you should consider a different data structure. Note that if you know roughly where to find the match, you can give index a hint. For instance, in this snippet, l.index(999_999, 999_990, 1_000_000) is roughly five orders of magnitude faster than straight l.index(999_999), because the former only has to search 10 entries, while the latter searches a million:

>>> import timeit
>>> timeit.timeit("l.index(999_999)", setup="l = list(range(0, 1_000_000))", number=1000)
9.356267921015387
>>> timeit.timeit("l.index(999_999, 999_990, 1_000_000)", setup="l = list(range(0, 1_000_000))", number=1000)
0.0004404920036904514
 

Only returns the index of the first match to its argument

A call to index searches through the list in order until it finds a match, and stops there. If you expect to need indices of more matches, you should use a list comprehension, or generator expression.

>>> [1, 1].index(1)
0
>>> [i for i, e in enumerate([1, 2, 1]) if e == 1]
[0, 2]
>>> g = (i for i, e in enumerate([1, 2, 1]) if e == 1)
>>> next(g)
0
>>> next(g)
2

Most places where I once would have used index, I now use a list comprehension or generator expression because they"re more generalizable. So if you"re considering reaching for index, take a look at these excellent Python features.

Throws if element not present in list

A call to index results in a ValueError if the item"s not present.

>>> [1, 1].index(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 2 is not in list

If the item might not be present in the list, you should either

  1. Check for it first with item in my_list (clean, readable approach), or
  2. Wrap the index call in a try/except block which catches ValueError (probably faster, at least when the list to search is long, and the item is usually present.)

Answer #2:

One thing that is really helpful in learning Python is to use the interactive help function:

>>> help(["foo", "bar", "baz"])
Help on list object:

class list(object)
 ...

 |
 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value
 |

which will often lead you to the method you are looking for.

Answer #3:

The majority of answers explain how to find a single index, but their methods do not return multiple indexes if the item is in the list multiple times. Use enumerate():

for i, j in enumerate(["foo", "bar", "baz"]):
    if j == "bar":
        print(i)

The index() function only returns the first occurrence, while enumerate() returns all occurrences.

As a list comprehension:

[i for i, j in enumerate(["foo", "bar", "baz"]) if j == "bar"]

Here"s also another small solution with itertools.count() (which is pretty much the same approach as enumerate):

from itertools import izip as zip, count # izip for maximum efficiency
[i for i, j in zip(count(), ["foo", "bar", "baz"]) if j == "bar"]

This is more efficient for larger lists than using enumerate():

$ python -m timeit -s "from itertools import izip as zip, count" "[i for i, j in zip(count(), ["foo", "bar", "baz"]*500) if j == "bar"]"
10000 loops, best of 3: 174 usec per loop
$ python -m timeit "[i for i, j in enumerate(["foo", "bar", "baz"]*500) if j == "bar"]"
10000 loops, best of 3: 196 usec per loop

Answer #4:

To get all indexes:

indexes = [i for i,x in enumerate(xs) if x == "foo"]

Answer #5:

index() returns the first index of value!

| index(...)
| L.index(value, [start, [stop]]) -> integer -- return first index of value

def all_indices(value, qlist):
    indices = []
    idx = -1
    while True:
        try:
            idx = qlist.index(value, idx+1)
            indices.append(idx)
        except ValueError:
            break
    return indices

all_indices("foo", ["foo";"bar";"baz";"foo"])

Javascript Side Project Ideas: StackOverflow Questions

InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately

Tried to perform REST GET through python requests with the following code and I got error.

Code snip:

import requests
header = {"Authorization": "Bearer..."}
url = az_base_url + az_subscription_id + "/resourcegroups/Default-Networking/resources?" + az_api_version
r = requests.get(url, headers=header)

Error:

/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:79: 
          InsecurePlatformWarning: A true SSLContext object is not available. 
          This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. 
          For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning

My python version is 2.7.3. I tried to install urllib3 and requests[security] as some other thread suggests, I still got the same error.

Wonder if anyone can provide some tips?

Answer #1:

The docs give a fair indicator of what"s required., however requests allow us to skip a few steps:

You only need to install the security package extras (thanks @admdrew for pointing it out)

$ pip install requests[security]

or, install them directly:

$ pip install pyopenssl ndg-httpsclient pyasn1

Requests will then automatically inject pyopenssl into urllib3


If you"re on ubuntu, you may run into trouble installing pyopenssl, you"ll need these dependencies:

$ apt-get install libffi-dev libssl-dev

Answer #2:

If you are not able to upgrade your Python version to 2.7.9, and want to suppress warnings,

you can downgrade your "requests" version to 2.5.3:

pip install requests==2.5.3

Bugfix disclosure / Warning introduced in 2.6.0

Dynamic instantiation from string name of a class in dynamically imported module?

In python, I have to instantiate certain class, knowing its name in a string, but this class "lives" in a dynamically imported module. An example follows:

loader-class script:

import sys
class loader:
  def __init__(self, module_name, class_name): # both args are strings
    try:
      __import__(module_name)
      modul = sys.modules[module_name]
      instance = modul.class_name() # obviously this doesn"t works, here is my main problem!
    except ImportError:
       # manage import error

some-dynamically-loaded-module script:

class myName:
  # etc...

I use this arrangement to make any dynamically-loaded-module to be used by the loader-class following certain predefined behaviours in the dyn-loaded-modules...

Answer #1:

You can use getattr

getattr(module, class_name)

to access the class. More complete code:

module = __import__(module_name)
class_ = getattr(module, class_name)
instance = class_()

As mentioned below, we may use importlib

import importlib
module = importlib.import_module(module_name)
class_ = getattr(module, class_name)
instance = class_()

Answer #2:

tl;dr

Import the root module with importlib.import_module and load the class by its name using getattr function:

# Standard import
import importlib
# Load "module.submodule.MyClass"
MyClass = getattr(importlib.import_module("module.submodule"), "MyClass")
# Instantiate the class (pass arguments to the constructor, if needed)
instance = MyClass()

explanations

You probably don"t want to use __import__ to dynamically import a module by name, as it does not allow you to import submodules:

>>> mod = __import__("os.path")
>>> mod.join
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: "module" object has no attribute "join"

Here is what the python doc says about __import__:

Note: This is an advanced function that is not needed in everyday Python programming, unlike importlib.import_module().

Instead, use the standard importlib module to dynamically import a module by name. With getattr you can then instantiate a class by its name:

import importlib
my_module = importlib.import_module("module.submodule")
MyClass = getattr(my_module, "MyClass")
instance = MyClass()

You could also write:

import importlib
module_name, class_name = "module.submodule.MyClass".rsplit(".", 1)
MyClass = getattr(importlib.import_module(module_name), class_name)
instance = MyClass()

This code is valid in python ‚â• 2.7 (including python 3).

pandas loc vs. iloc vs. at vs. iat?

Recently began branching out from my safe place (R) into Python and and am a bit confused by the cell localization/selection in Pandas. I"ve read the documentation but I"m struggling to understand the practical implications of the various localization/selection options.

Is there a reason why I should ever use .loc or .iloc over at, and iat or vice versa? In what situations should I use which method?


Note: future readers be aware that this question is old and was written before pandas v0.20 when there used to exist a function called .ix. This method was later split into two - loc and iloc - to make the explicit distinction between positional and label based indexing. Please beware that ix was discontinued due to inconsistent behavior and being hard to grok, and no longer exists in current versions of pandas (>= 1.0).

Answer #1:

loc: only work on index
iloc: work on position
at: get scalar values. It"s a very fast loc
iat: Get scalar values. It"s a very fast iloc

Also,

at and iat are meant to access a scalar, that is, a single element in the dataframe, while loc and iloc are ments to access several elements at the same time, potentially to perform vectorized operations.

http://pyciencia.blogspot.com/2015/05/obtener-y-filtrar-datos-de-un-dataframe.html

Javascript Side Project Ideas: StackOverflow Questions

JSON datetime between Python and JavaScript

Question by kevin

I want to send a datetime.datetime object in serialized form from Python using JSON and de-serialize in JavaScript using JSON. What is the best way to do this?

Answer #1:

You can add the "default" parameter to json.dumps to handle this:

date_handler = lambda obj: (
    obj.isoformat()
    if isinstance(obj, (datetime.datetime, datetime.date))
    else None
)
json.dumps(datetime.datetime.now(), default=date_handler)
""2010-04-20T20:08:21.634121""

Which is ISO 8601 format.

A more comprehensive default handler function:

def handler(obj):
    if hasattr(obj, "isoformat"):
        return obj.isoformat()
    elif isinstance(obj, ...):
        return ...
    else:
        raise TypeError, "Object of type %s with value of %s is not JSON serializable" % (type(obj), repr(obj))

Update: Added output of type as well as value.
Update: Also handle date

Javascript equivalent of Python"s zip function

Is there a javascript equivalent of Python"s zip function? That is, given multiple arrays of equal lengths create an array of pairs.

For instance, if I have three arrays that look like this:

var array1 = [1, 2, 3];
var array2 = ["a","b","c"];
var array3 = [4, 5, 6];

The output array should be:

var output array:[[1,"a",4], [2,"b",5], [3,"c",6]]

Answer #1:

2016 update:

Here"s a snazzier Ecmascript 6 version:

zip= rows=>rows[0].map((_,c)=>rows.map(row=>row[c]))

Illustration equiv. to Python{zip(*args)}:

> zip([["row0col0", "row0col1", "row0col2"],
       ["row1col0", "row1col1", "row1col2"]]);
[["row0col0","row1col0"],
 ["row0col1","row1col1"],
 ["row0col2","row1col2"]]

(and FizzyTea points out that ES6 has variadic argument syntax, so the following function definition will act like python, but see below for disclaimer... this will not be its own inverse so zip(zip(x)) will not equal x; though as Matt Kramer points out zip(...zip(...x))==x (like in regular python zip(*zip(*x))==x))

Alternative definition equiv. to Python{zip}:

> zip = (...rows) => [...rows[0]].map((_,c) => rows.map(row => row[c]))
> zip( ["row0col0", "row0col1", "row0col2"] ,
       ["row1col0", "row1col1", "row1col2"] );
             // note zip(row0,row1), not zip(matrix)
same answer as above

(Do note that the ... syntax may have performance issues at this time, and possibly in the future, so if you use the second answer with variadic arguments, you may want to perf test it. That said it"s been quite a while since it"s been in the standard.)

Make sure to note the addendum if you wish to use this on strings (perhaps there"s a better way to do it now with es6 iterables).


Here"s a oneliner:

function zip(arrays) {
    return arrays[0].map(function(_,i){
        return arrays.map(function(array){return array[i]})
    });
}

// > zip([[1,2],[11,22],[111,222]])
// [[1,11,111],[2,22,222]]]

// If you believe the following is a valid return value:
//   > zip([])
//   []
// then you can special-case it, or just do
//  return arrays.length==0 ? [] : arrays[0].map(...)

The above assumes that the arrays are of equal size, as they should be. It also assumes you pass in a single list of lists argument, unlike Python"s version where the argument list is variadic. If you want all of these "features", see below. It takes just about 2 extra lines of code.

The following will mimic Python"s zip behavior on edge cases where the arrays are not of equal size, silently pretending the longer parts of arrays don"t exist:

function zip() {
    var args = [].slice.call(arguments);
    var shortest = args.length==0 ? [] : args.reduce(function(a,b){
        return a.length<b.length ? a : b
    });

    return shortest.map(function(_,i){
        return args.map(function(array){return array[i]})
    });
}

// > zip([1,2],[11,22],[111,222,333])
// [[1,11,111],[2,22,222]]]

// > zip()
// []

This will mimic Python"s itertools.zip_longest behavior, inserting undefined where arrays are not defined:

function zip() {
    var args = [].slice.call(arguments);
    var longest = args.reduce(function(a,b){
        return a.length>b.length ? a : b
    }, []);

    return longest.map(function(_,i){
        return args.map(function(array){return array[i]})
    });
}

// > zip([1,2],[11,22],[111,222,333])
// [[1,11,111],[2,22,222],[null,null,333]]

// > zip()
// []

If you use these last two version (variadic aka. multiple-argument versions), then zip is no longer its own inverse. To mimic the zip(*[...]) idiom from Python, you will need to do zip.apply(this, [...]) when you want to invert the zip function or if you want to similarly have a variable number of lists as input.


addendum:

To make this handle any iterable (e.g. in Python you can use zip on strings, ranges, map objects, etc.), you could define the following:

function iterView(iterable) {
    // returns an array equivalent to the iterable
}

However if you write zip in the following way, even that won"t be necessary:

function zip(arrays) {
    return Array.apply(null,Array(arrays[0].length)).map(function(_,i){
        return arrays.map(function(array){return array[i]})
    });
}

Demo:

> JSON.stringify( zip(["abcde",[1,2,3,4,5]]) )
[["a",1],["b",2],["c",3],["d",4],["e",5]]

(Or you could use a range(...) Python-style function if you"ve written one already. Eventually you will be able to use ECMAScript array comprehensions or generators.)

What blocks Ruby, Python to get Javascript V8 speed?

Are there any Ruby / Python features that are blocking implementation of optimizations (e.g. inline caching) V8 engine has?

Python is co-developed by Google guys so it shouldn"t be blocked by software patents.

Or this is rather matter of resources put into the V8 project by Google.

Answer #1:

What blocks Ruby, Python to get Javascript V8 speed?

Nothing.

Well, okay: money. (And time, people, resources, but if you have money, you can buy those.)

V8 has a team of brilliant, highly-specialized, highly-experienced (and thus highly-paid) engineers working on it, that have decades of experience (I"m talking individually – collectively it"s more like centuries) in creating high-performance execution engines for dynamic OO languages. They are basically the same people who also created the Sun HotSpot JVM (among many others).

Lars Bak, the lead developer, has been literally working on VMs for 25 years (and all of those VMs have lead up to V8), which is basically his entire (professional) life. Some of the people writing Ruby VMs aren"t even 25 years old.

Are there any Ruby / Python features that are blocking implementation of optimizations (e.g. inline caching) V8 engine has?

Given that at least IronRuby, JRuby, MagLev, MacRuby and Rubinius have either monomorphic (IronRuby) or polymorphic inline caching, the answer is obviously no.

Modern Ruby implementations already do a great deal of optimizations. For example, for certain operations, Rubinius"s Hash class is faster than YARV"s. Now, this doesn"t sound terribly exciting until you realize that Rubinius"s Hash class is implemented in 100% pure Ruby, while YARV"s is implemented in 100% hand-optimized C.

So, at least in some cases, Rubinius can generate better code than GCC!

Or this is rather matter of resources put into the V8 project by Google.

Yes. Not just Google. The lineage of V8"s source code is 25 years old now. The people who are working on V8 also created the Self VM (to this day one of the fastest dynamic OO language execution engines ever created), the Animorphic Smalltalk VM (to this day one of the fastest Smalltalk execution engines ever created), the HotSpot JVM (the fastest JVM ever created, probably the fastest VM period) and OOVM (one of the most efficient Smalltalk VMs ever created).

In fact, Lars Bak, the lead developer of V8, worked on every single one of those, plus a few others.

Django Template Variables and Javascript

When I render a page using the Django template renderer, I can pass in a dictionary variable containing various values to manipulate them in the page using {{ myVar }}.

Is there a way to access the same variable in Javascript (perhaps using the DOM, I don"t know how Django makes the variables accessible)? I want to be able to lookup details using an AJAX lookup based on the values contained in the variables passed in.

Answer #1:

The {{variable}} is substituted directly into the HTML. Do a view source; it isn"t a "variable" or anything like it. It"s just rendered text.

Having said that, you can put this kind of substitution into your JavaScript.

<script type="text/javascript"> 
   var a = "{{someDjangoVariable}}";
</script>

This gives you "dynamic" javascript.

Web-scraping JavaScript page with Python

I"m trying to develop a simple web scraper. I want to extract text without the HTML code. In fact, I achieve this goal, but I have seen that in some pages where JavaScript is loaded I didn"t obtain good results.

For example, if some JavaScript code adds some text, I can"t see it, because when I call

response = urllib2.urlopen(request)

I get the original text without the added one (because JavaScript is executed in the client).

So, I"m looking for some ideas to solve this problem.

Answer #1:

EDIT 30/Dec/2017: This answer appears in top results of Google searches, so I decided to update it. The old answer is still at the end.

dryscape isn"t maintained anymore and the library dryscape developers recommend is Python 2 only. I have found using Selenium"s python library with Phantom JS as a web driver fast enough and easy to get the work done.

Once you have installed Phantom JS, make sure the phantomjs binary is available in the current path:

phantomjs --version
# result:
2.1.1

Example

To give an example, I created a sample page with following HTML code. (link):

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Javascript scraping test</title>
</head>
<body>
  <p id="intro-text">No javascript support</p>
  <script>
     document.getElementById("intro-text").innerHTML = "Yay! Supports javascript";
  </script> 
</body>
</html>

without javascript it says: No javascript support and with javascript: Yay! Supports javascript

Scraping without JS support:

import requests
from bs4 import BeautifulSoup
response = requests.get(my_url)
soup = BeautifulSoup(response.text)
soup.find(id="intro-text")
# Result:
<p id="intro-text">No javascript support</p>

Scraping with JS support:

from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get(my_url)
p_element = driver.find_element_by_id(id_="intro-text")
print(p_element.text)
# result:
"Yay! Supports javascript"

You can also use Python library dryscrape to scrape javascript driven websites.

Scraping with JS support:

import dryscrape
from bs4 import BeautifulSoup
session = dryscrape.Session()
session.visit(my_url)
response = session.body()
soup = BeautifulSoup(response)
soup.find(id="intro-text")
# Result:
<p id="intro-text">Yay! Supports javascript</p>

Javascript Side Project Ideas: StackOverflow Questions

Python"s equivalent of && (logical-and) in an if-statement

Question by delete

Here"s my code:

def front_back(a, b):
  # +++your code here+++
  if len(a) % 2 == 0 && len(b) % 2 == 0:
    return a[:(len(a)/2)] + b[:(len(b)/2)] + a[(len(a)/2):] + b[(len(b)/2):] 
  else:
    #todo! Not yet done. :P
  return

I"m getting an error in the IF conditional.
What am I doing wrong?

Answer #1:

You would want and instead of &&.

Answer #2:

Python uses and and or conditionals.

i.e.

if foo == "abc" and bar == "bac" or zoo == "123":
  # do something

Answer #3:

I"m getting an error in the IF conditional. What am I doing wrong?

There reason that you get a SyntaxError is that there is no && operator in Python. Likewise || and ! are not valid Python operators.

Some of the operators you may know from other languages have a different name in Python. The logical operators && and || are actually called and and or. Likewise the logical negation operator ! is called not.

So you could just write:

if len(a) % 2 == 0 and len(b) % 2 == 0:

or even:

if not (len(a) % 2 or len(b) % 2):

Some additional information (that might come in handy):

I summarized the operator "equivalents" in this table:

+------------------------------+---------------------+
|  Operator (other languages)  |  Operator (Python)  |
+==============================+=====================+
|              &&              |         and         |
+------------------------------+---------------------+
|              ||              |         or          |
+------------------------------+---------------------+
|              !               |         not         |
+------------------------------+---------------------+

See also Python documentation: 6.11. Boolean operations.

Besides the logical operators Python also has bitwise/binary operators:

+--------------------+--------------------+
|  Logical operator  |  Bitwise operator  |
+====================+====================+
|        and         |         &          |
+--------------------+--------------------+
|         or         |         |          |
+--------------------+--------------------+

There is no bitwise negation in Python (just the bitwise inverse operator ~ - but that is not equivalent to not).

See also 6.6. Unary arithmetic and bitwise/binary operations and 6.7. Binary arithmetic operations.

The logical operators (like in many other languages) have the advantage that these are short-circuited. That means if the first operand already defines the result, then the second operator isn"t evaluated at all.

To show this I use a function that simply takes a value, prints it and returns it again. This is handy to see what is actually evaluated because of the print statements:

>>> def print_and_return(value):
...     print(value)
...     return value

>>> res = print_and_return(False) and print_and_return(True)
False

As you can see only one print statement is executed, so Python really didn"t even look at the right operand.

This is not the case for the binary operators. Those always evaluate both operands:

>>> res = print_and_return(False) & print_and_return(True);
False
True

But if the first operand isn"t enough then, of course, the second operator is evaluated:

>>> res = print_and_return(True) and print_and_return(False);
True
False

To summarize this here is another Table:

+-----------------+-------------------------+
|   Expression    |  Right side evaluated?  |
+=================+=========================+
| `True` and ...  |           Yes           |
+-----------------+-------------------------+
| `False` and ... |           No            |
+-----------------+-------------------------+
|  `True` or ...  |           No            |
+-----------------+-------------------------+
| `False` or ...  |           Yes           |
+-----------------+-------------------------+

The True and False represent what bool(left-hand-side) returns, they don"t have to be True or False, they just need to return True or False when bool is called on them (1).

So in Pseudo-Code(!) the and and or functions work like these:

def and(expr1, expr2):
    left = evaluate(expr1)
    if bool(left):
        return evaluate(expr2)
    else:
        return left

def or(expr1, expr2):
    left = evaluate(expr1)
    if bool(left):
        return left
    else:
        return evaluate(expr2)

Note that this is pseudo-code not Python code. In Python you cannot create functions called and or or because these are keywords. Also you should never use "evaluate" or if bool(...).

Customizing the behavior of your own classes

This implicit bool call can be used to customize how your classes behave with and, or and not.

To show how this can be customized I use this class which again prints something to track what is happening:

class Test(object):
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        print("__bool__ called on {!r}".format(self))
        return bool(self.value)

    __nonzero__ = __bool__  # Python 2 compatibility

    def __repr__(self):
        return "{self.__class__.__name__}({self.value})".format(self=self)

So let"s see what happens with that class in combination with these operators:

>>> if Test(True) and Test(False):
...     pass
__bool__ called on Test(True)
__bool__ called on Test(False)

>>> if Test(False) or Test(False):
...     pass
__bool__ called on Test(False)
__bool__ called on Test(False)

>>> if not Test(True):
...     pass
__bool__ called on Test(True)

If you don"t have a __bool__ method then Python also checks if the object has a __len__ method and if it returns a value greater than zero. That might be useful to know in case you create a sequence container.

See also 4.1. Truth Value Testing.

NumPy arrays and subclasses

Probably a bit beyond the scope of the original question but in case you"re dealing with NumPy arrays or subclasses (like Pandas Series or DataFrames) then the implicit bool call will raise the dreaded ValueError:

>>> import numpy as np
>>> arr = np.array([1,2,3])
>>> bool(arr)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> arr and arr
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

>>> import pandas as pd
>>> s = pd.Series([1,2,3])
>>> bool(s)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> s and s
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

In these cases you can use the logical and function from NumPy which performs an element-wise and (or or):

>>> np.logical_and(np.array([False,False,True,True]), np.array([True, False, True, False]))
array([False, False,  True, False])
>>> np.logical_or(np.array([False,False,True,True]), np.array([True, False, True, False]))
array([ True, False,  True,  True])

If you"re dealing just with boolean arrays you could also use the binary operators with NumPy, these do perform element-wise (but also binary) comparisons:

>>> np.array([False,False,True,True]) & np.array([True, False, True, False])
array([False, False,  True, False])
>>> np.array([False,False,True,True]) | np.array([True, False, True, False])
array([ True, False,  True,  True])

(1)

That the bool call on the operands has to return True or False isn"t completely correct. It"s just the first operand that needs to return a boolean in it"s __bool__ method:

class Test(object):
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        return self.value

    __nonzero__ = __bool__  # Python 2 compatibility

    def __repr__(self):
        return "{self.__class__.__name__}({self.value})".format(self=self)

>>> x = Test(10) and Test(10)
TypeError: __bool__ should return bool, returned int
>>> x1 = Test(True) and Test(10)
>>> x2 = Test(False) and Test(10)

That"s because and actually returns the first operand if the first operand evaluates to False and if it evaluates to True then it returns the second operand:

>>> x1
Test(10)
>>> x2
Test(False)

Similarly for or but just the other way around:

>>> Test(True) or Test(10)
Test(True)
>>> Test(False) or Test(10)
Test(10)

However if you use them in an if statement the if will also implicitly call bool on the result. So these finer points may not be relevant for you.

How do you get the logical xor of two variables in Python?

Question by Zach Hirsch

How do you get the logical xor of two variables in Python?

For example, I have two variables that I expect to be strings. I want to test that only one of them contains a True value (is not None or the empty string):

str1 = raw_input("Enter string one:")
str2 = raw_input("Enter string two:")
if logical_xor(str1, str2):
    print "ok"
else:
    print "bad"

The ^ operator seems to be bitwise, and not defined on all objects:

>>> 1 ^ 1
0
>>> 2 ^ 1
3
>>> "abc" ^ ""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for ^: "str" and "str"

Answer #1:

If you"re already normalizing the inputs to booleans, then != is xor.

bool(a) != bool(b)

Answer #2:

You can always use the definition of xor to compute it from other logical operations:

(a and not b) or (not a and b)

But this is a little too verbose for me, and isn"t particularly clear at first glance. Another way to do it is:

bool(a) ^ bool(b)

The xor operator on two booleans is logical xor (unlike on ints, where it"s bitwise). Which makes sense, since bool is just a subclass of int, but is implemented to only have the values 0 and 1. And logical xor is equivalent to bitwise xor when the domain is restricted to 0 and 1.

So the logical_xor function would be implemented like:

def logical_xor(str1, str2):
    return bool(str1) ^ bool(str2)

Credit to Nick Coghlan on the Python-3000 mailing list.

Javascript Side Project Ideas: StackOverflow Questions

How can I open multiple files using "with open" in Python?

I want to change a couple of files at one time, iff I can write to all of them. I"m wondering if I somehow can combine the multiple open calls with the with statement:

try:
  with open("a", "w") as a and open("b", "w") as b:
    do_something()
except IOError as e:
  print "Operation failed: %s" % e.strerror

If that"s not possible, what would an elegant solution to this problem look like?

Answer #1:

As of Python 2.7 (or 3.1 respectively) you can write

with open("a", "w") as a, open("b", "w") as b:
    do_something()

In earlier versions of Python, you can sometimes use contextlib.nested() to nest context managers. This won"t work as expected for opening multiples files, though -- see the linked documentation for details.


In the rare case that you want to open a variable number of files all at the same time, you can use contextlib.ExitStack, starting from Python version 3.3:

with ExitStack() as stack:
    files = [stack.enter_context(open(fname)) for fname in filenames]
    # Do something with "files"

Most of the time you have a variable set of files, you likely want to open them one after the other, though.

Answer #2:

For opening many files at once or for long file paths, it may be useful to break things up over multiple lines. From the Python Style Guide as suggested by @Sven Marnach in comments to another answer:

with open("/path/to/InFile.ext", "r") as file_1, 
     open("/path/to/OutFile.ext", "w") as file_2:
    file_2.write(file_1.read())

open() in Python does not create a file if it doesn"t exist

What is the best way to open a file as read/write if it exists, or if it does not, then create it and open it as read/write? From what I read, file = open("myfile.dat", "rw") should do this, right?

It is not working for me (Python 2.6.2) and I"m wondering if it is a version problem, or not supposed to work like that or what.

The bottom line is, I just need a solution for the problem. I am curious about the other stuff, but all I need is a nice way to do the opening part.

The enclosing directory was writeable by user and group, not other (I"m on a Linux system... so permissions 775 in other words), and the exact error was:

IOError: no such file or directory.

Answer #1:

You should use open with the w+ mode:

file = open("myfile.dat", "w+")

Answer #2:

The advantage of the following approach is that the file is properly closed at the block"s end, even if an exception is raised on the way. It"s equivalent to try-finally, but much shorter.

with open("file.dat";"a+") as f:
    f.write(...)
    ...

a+ Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing. -Python file modes

seek() method sets the file"s current position.

f.seek(pos [, (0|1|2)])
pos .. position of the r/w pointer
[] .. optionally
() .. one of ->
  0 .. absolute position
  1 .. relative position to current
  2 .. relative position from end

Only "rwab+" characters are allowed; there must be exactly one of "rwa" - see Stack Overflow question Python file modes detail.

Difference between modes a, a+, w, w+, and r+ in built-in open function?

In the python built-in open function, what is the exact difference between the modes w, a, w+, a+, and r+?

In particular, the documentation implies that all of these will allow writing to the file, and says that it opens the files for "appending", "writing", and "updating" specifically, but does not define what these terms mean.

Answer #1:

The opening modes are exactly the same as those for the C standard library function fopen().

The BSD fopen manpage defines them as follows:

 The argument mode points to a string beginning with one of the following
 sequences (Additional characters may follow these sequences.):

 ``r""   Open text file for reading.  The stream is positioned at the
         beginning of the file.

 ``r+""  Open for reading and writing.  The stream is positioned at the
         beginning of the file.

 ``w""   Truncate file to zero length or create text file for writing.
         The stream is positioned at the beginning of the file.

 ``w+""  Open for reading and writing.  The file is created if it does not
         exist, otherwise it is truncated.  The stream is positioned at
         the beginning of the file.

 ``a""   Open for writing.  The file is created if it does not exist.  The
         stream is positioned at the end of the file.  Subsequent writes
         to the file will always end up at the then current end of file,
         irrespective of any intervening fseek(3) or similar.

 ``a+""  Open for reading and writing.  The file is created if it does not
         exist.  The stream is positioned at the end of the file.  Subse-
         quent writes to the file will always end up at the then current
         end of file, irrespective of any intervening fseek(3) or similar.

Javascript Side Project Ideas: StackOverflow Questions

What is the difference between flatten and ravel functions in numpy?

import numpy as np
y = np.array(((1,2,3),(4,5,6),(7,8,9)))
OUTPUT:
print(y.flatten())
[1   2   3   4   5   6   7   8   9]
print(y.ravel())
[1   2   3   4   5   6   7   8   9]

Both function return the same list. Then what is the need of two different functions performing same job.

Answer #1:

The current API is that:

  • flatten always returns a copy.
  • ravel returns a view of the original array whenever possible. This isn"t visible in the printed output, but if you modify the array returned by ravel, it may modify the entries in the original array. If you modify the entries in an array returned from flatten this will never happen. ravel will often be faster since no memory is copied, but you have to be more careful about modifying the array it returns.
  • reshape((-1,)) gets a view whenever the strides of the array allow it even if that means you don"t always get a contiguous array.

Answer #2:

As explained here a key difference is that:

  • flatten is a method of an ndarray object and hence can only be called for true numpy arrays.

  • ravel is a library-level function and hence can be called on any object that can successfully be parsed.

For example ravel will work on a list of ndarrays, while flatten is not available for that type of object.

@IanH also points out important differences with memory handling in his answer.

List to array conversion to use ravel() function

I have a list in python and I want to convert it to an array to be able to use ravel() function.

Answer #1:

Use numpy.asarray:

import numpy as np
myarray = np.asarray(mylist)

Javascript Side Project Ideas: StackOverflow Questions

How do I merge two dictionaries in a single expression (taking union of dictionaries)?

Question by Carl Meyer

I have two Python dictionaries, and I want to write a single expression that returns these two dictionaries, merged (i.e. taking the union). The update() method would be what I need, if it returned its result instead of modifying a dictionary in-place.

>>> x = {"a": 1, "b": 2}
>>> y = {"b": 10, "c": 11}
>>> z = x.update(y)
>>> print(z)
None
>>> x
{"a": 1, "b": 10, "c": 11}

How can I get that final merged dictionary in z, not x?

(To be extra-clear, the last-one-wins conflict-handling of dict.update() is what I"m looking for as well.)

Answer #1:

How can I merge two Python dictionaries in a single expression?

For dictionaries x and y, z becomes a shallowly-merged dictionary with values from y replacing those from x.

  • In Python 3.9.0 or greater (released 17 October 2020): PEP-584, discussed here, was implemented and provides the simplest method:

    z = x | y          # NOTE: 3.9+ ONLY
    
  • In Python 3.5 or greater:

    z = {**x, **y}
    
  • In Python 2, (or 3.4 or lower) write a function:

    def merge_two_dicts(x, y):
        z = x.copy()   # start with keys and values of x
        z.update(y)    # modifies z with keys and values of y
        return z
    

    and now:

    z = merge_two_dicts(x, y)
    

Explanation

Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:

x = {"a": 1, "b": 2}
y = {"b": 3, "c": 4}

The desired result is to get a new dictionary (z) with the values merged, and the second dictionary"s values overwriting those from the first.

>>> z
{"a": 1, "b": 3, "c": 4}

A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is

z = {**x, **y}

And it is indeed a single expression.

Note that we can merge in with literal notation as well:

z = {**x, "foo": 1, "bar": 2, **y}

and now:

>>> z
{"a": 1, "b": 3, "foo": 1, "bar": 2, "c": 4}

It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What"s New in Python 3.5 document.

However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:

z = x.copy()
z.update(y) # which returns None since it mutates z

In both approaches, y will come second and its values will replace x"s values, thus b will point to 3 in our final result.

Not yet on Python 3.5, but want a single expression

If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant while the correct approach is to put it in a function:

def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z

and then you have a single expression:

z = merge_two_dicts(x, y)

You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:

def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result

This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g:

z = merge_dicts(a, b, c, d, e, f, g) 

and key-value pairs in g will take precedence over dictionaries a to f, and so on.

Critiques of Other Answers

Don"t use what you see in the formerly accepted answer:

z = dict(x.items() + y.items())

In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you"re adding two dict_items objects together, not two lists -

>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: "dict_items" and "dict_items"

and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.

Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don"t do this:

>>> c = dict(a.items() | b.items())

This example demonstrates what happens when values are unhashable:

>>> x = {"a": []}
>>> y = {"b": []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: "list"

Here"s an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:

>>> x = {"a": 2}
>>> y = {"a": 1}
>>> dict(x.items() | y.items())
{"a": 2}

Another hack you should not use:

z = dict(x, **y)

This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it"s difficult to read, it"s not the intended usage, and so it is not Pythonic.

Here"s an example of the usage being remediated in django.

Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.

>>> c = dict(a, **b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings

From the mailing list, Guido van Rossum, the creator of the language, wrote:

I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism.

and

Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally, I find it more despicable than cool.

It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:

dict(a=1, b=10, c=11)

instead of

{"a": 1, "b": 10, "c": 11}

Response to comments

Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords.

Again, it doesn"t work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2:

>>> foo(**{("a", "b"): None})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{("a", "b"): None})
{("a", "b"): None}

This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change.

I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.

More comments:

dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.

My response: merge_two_dicts(x, y) actually seems much clearer to me, if we"re actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.

{**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.

Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first"s values being overwritten by the second"s - in a single expression.

Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:

from copy import deepcopy

def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z

Usage:

>>> x = {"a":{1:{}}, "b": {2:{}}}
>>> y = {"b":{10:{}}, "c": {11:{}}}
>>> dict_of_dicts_merge(x, y)
{"b": {2: {}, 10: {}}, "a": {1: {}}, "c": {11: {}}}

Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".

Less Performant But Correct Ad-hocs

These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence)

You can also chain the dictionaries manually inside a dict comprehension:

{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7

or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):

dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2

itertools.chain will chain the iterators over the key-value pairs in the correct order:

from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2

Performance Analysis

I"m only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)

from timeit import repeat
from itertools import chain

x = dict.fromkeys("abcdefg")
y = dict.fromkeys("efghijk")

def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z

min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))

In Python 3.8.1, NixOS:

>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux

Resources on Dictionaries

Answer #2:

In your case, what you can do is:

z = dict(list(x.items()) + list(y.items()))

This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict"s value:

>>> x = {"a":1, "b": 2}
>>> y = {"b":10, "c": 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python 2, you can even remove the list() calls. To create z:

>>> z = dict(x.items() + y.items())
>>> z
{"a": 1, "c": 11, "b": 10}

If you use Python version 3.9.0a4 or greater, then you can directly use:

x = {"a":1, "b": 2}
y = {"b":10, "c": 11}
z = x | y
print(z)
{"a": 1, "c": 11, "b": 10}

Answer #3:

An alternative:

z = x.copy()
z.update(y)

Answer #4:

Another, more concise, option:

z = dict(x, **y)

Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can"t recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.

Answer #5:

This probably won"t be a popular answer, but you almost certainly do not want to do this. If you want a copy that"s a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.

In addition, when you use .items() (pre Python 3.0), you"re creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.

In terms of time:

>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()
temp.update(y)", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))
y=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027

IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.

Javascript Side Project Ideas: StackOverflow Questions

Separation of business logic and data access in django

I am writing a project in Django and I see that 80% of the code is in the file models.py. This code is confusing and, after a certain time, I cease to understand what is really happening.

Here is what bothers me:

  1. I find it ugly that my model level (which was supposed to be responsible only for the work with data from a database) is also sending email, walking on API to other services, etc.
  2. Also, I find it unacceptable to place business logic in the view, because this way it becomes difficult to control. For example, in my application there are at least three ways to create new instances of User, but technically it should create them uniformly.
  3. I do not always notice when the methods and properties of my models become non-deterministic and when they develop side effects.

Here is a simple example. At first, the User model was like this:

class User(db.Models):

    def get_present_name(self):
        return self.name or "Anonymous"

    def activate(self):
        self.status = "activated"
        self.save()

Over time, it turned into this:

class User(db.Models):

    def get_present_name(self): 
        # property became non-deterministic in terms of database
        # data is taken from another service by api
        return remote_api.request_user_name(self.uid) or "Anonymous" 

    def activate(self):
        # method now has a side effect (send message to user)
        self.status = "activated"
        self.save()
        send_mail("Your account is activated!", "…", [self.email])

What I want is to separate entities in my code:

  1. Entities of my database, persistence level: What data does my application keep?
  2. Entities of my application, business logic level: What does my application do?

What are the good practices to implement such an approach that can be applied in Django?

Answer #1:

It seems like you are asking about the difference between the data model and the domain model – the latter is where you can find the business logic and entities as perceived by your end user, the former is where you actually store your data.

Furthermore, I"ve interpreted the 3rd part of your question as: how to notice failure to keep these models separate.

These are two very different concepts and it"s always hard to keep them separate. However, there are some common patterns and tools that can be used for this purpose.

About the Domain Model

The first thing you need to recognize is that your domain model is not really about data; it is about actions and questions such as "activate this user", "deactivate this user", "which users are currently activated?", and "what is this user"s name?". In classical terms: it"s about queries and commands.

Thinking in Commands

Let"s start by looking at the commands in your example: "activate this user" and "deactivate this user". The nice thing about commands is that they can easily be expressed by small given-when-then scenario"s:

given an inactive user
when the admin activates this user
then the user becomes active
and a confirmation e-mail is sent to the user
and an entry is added to the system log
(etc. etc.)

Such scenario"s are useful to see how different parts of your infrastructure can be affected by a single command – in this case your database (some kind of "active" flag), your mail server, your system log, etc.

Such scenario"s also really help you in setting up a Test Driven Development environment.

And finally, thinking in commands really helps you create a task-oriented application. Your users will appreciate this :-)

Expressing Commands

Django provides two easy ways of expressing commands; they are both valid options and it is not unusual to mix the two approaches.

The service layer

The service module has already been described by @Hedde. Here you define a separate module and each command is represented as a function.

services.py

def activate_user(user_id):
    user = User.objects.get(pk=user_id)

    # set active flag
    user.active = True
    user.save()

    # mail user
    send_mail(...)

    # etc etc

Using forms

The other way is to use a Django Form for each command. I prefer this approach, because it combines multiple closely related aspects:

  • execution of the command (what does it do?)
  • validation of the command parameters (can it do this?)
  • presentation of the command (how can I do this?)

forms.py

class ActivateUserForm(forms.Form):

    user_id = IntegerField(widget = UsernameSelectWidget, verbose_name="Select a user to activate")
    # the username select widget is not a standard Django widget, I just made it up

    def clean_user_id(self):
        user_id = self.cleaned_data["user_id"]
        if User.objects.get(pk=user_id).active:
            raise ValidationError("This user cannot be activated")
        # you can also check authorizations etc. 
        return user_id

    def execute(self):
        """
        This is not a standard method in the forms API; it is intended to replace the 
        "extract-data-from-form-in-view-and-do-stuff" pattern by a more testable pattern. 
        """
        user_id = self.cleaned_data["user_id"]

        user = User.objects.get(pk=user_id)

        # set active flag
        user.active = True
        user.save()

        # mail user
        send_mail(...)

        # etc etc

Thinking in Queries

You example did not contain any queries, so I took the liberty of making up a few useful queries. I prefer to use the term "question", but queries is the classical terminology. Interesting queries are: "What is the name of this user?", "Can this user log in?", "Show me a list of deactivated users", and "What is the geographical distribution of deactivated users?"

Before embarking on answering these queries, you should always ask yourself this question, is this:

  • a presentational query just for my templates, and/or
  • a business logic query tied to executing my commands, and/or
  • a reporting query.

Presentational queries are merely made to improve the user interface. The answers to business logic queries directly affect the execution of your commands. Reporting queries are merely for analytical purposes and have looser time constraints. These categories are not mutually exclusive.

The other question is: "do I have complete control over the answers?" For example, when querying the user"s name (in this context) we do not have any control over the outcome, because we rely on an external API.

Making Queries

The most basic query in Django is the use of the Manager object:

User.objects.filter(active=True)

Of course, this only works if the data is actually represented in your data model. This is not always the case. In those cases, you can consider the options below.

Custom tags and filters

The first alternative is useful for queries that are merely presentational: custom tags and template filters.

template.html

<h1>Welcome, {{ user|friendly_name }}</h1>

template_tags.py

@register.filter
def friendly_name(user):
    return remote_api.get_cached_name(user.id)

Query methods

If your query is not merely presentational, you could add queries to your services.py (if you are using that), or introduce a queries.py module:

queries.py

def inactive_users():
    return User.objects.filter(active=False)


def users_called_publysher():
    for user in User.objects.all():
        if remote_api.get_cached_name(user.id) == "publysher":
            yield user 

Proxy models

Proxy models are very useful in the context of business logic and reporting. You basically define an enhanced subset of your model. You can override a Manager’s base QuerySet by overriding the Manager.get_queryset() method.

models.py

class InactiveUserManager(models.Manager):
    def get_queryset(self):
        query_set = super(InactiveUserManager, self).get_queryset()
        return query_set.filter(active=False)

class InactiveUser(User):
    """
    >>> for user in InactiveUser.objects.all():
    …        assert user.active is False 
    """

    objects = InactiveUserManager()
    class Meta:
        proxy = True

Query models

For queries that are inherently complex, but are executed quite often, there is the possibility of query models. A query model is a form of denormalization where relevant data for a single query is stored in a separate model. The trick of course is to keep the denormalized model in sync with the primary model. Query models can only be used if changes are entirely under your control.

models.py

class InactiveUserDistribution(models.Model):
    country = CharField(max_length=200)
    inactive_user_count = IntegerField(default=0)

The first option is to update these models in your commands. This is very useful if these models are only changed by one or two commands.

forms.py

class ActivateUserForm(forms.Form):
    # see above
   
    def execute(self):
        # see above
        query_model = InactiveUserDistribution.objects.get_or_create(country=user.country)
        query_model.inactive_user_count -= 1
        query_model.save()

A better option would be to use custom signals. These signals are of course emitted by your commands. Signals have the advantage that you can keep multiple query models in sync with your original model. Furthermore, signal processing can be offloaded to background tasks, using Celery or similar frameworks.

signals.py

user_activated = Signal(providing_args = ["user"])
user_deactivated = Signal(providing_args = ["user"])

forms.py

class ActivateUserForm(forms.Form):
    # see above
   
    def execute(self):
        # see above
        user_activated.send_robust(sender=self, user=user)

models.py

class InactiveUserDistribution(models.Model):
    # see above

@receiver(user_activated)
def on_user_activated(sender, **kwargs):
        user = kwargs["user"]
        query_model = InactiveUserDistribution.objects.get_or_create(country=user.country)
        query_model.inactive_user_count -= 1
        query_model.save()
    

Keeping it clean

When using this approach, it becomes ridiculously easy to determine if your code stays clean. Just follow these guidelines:

  • Does my model contain methods that do more than managing database state? You should extract a command.
  • Does my model contain properties that do not map to database fields? You should extract a query.
  • Does my model reference infrastructure that is not my database (such as mail)? You should extract a command.

The same goes for views (because views often suffer from the same problem).

  • Does my view actively manage database models? You should extract a command.

Some References

Django documentation: proxy models

Django documentation: signals

Architecture: Domain Driven Design

Answer #2:

I usually implement a service layer in between views and models. This acts like your project"s API and gives you a good helicopter view of what is going on. I inherited this practice from a colleague of mine that uses this layering technique a lot with Java projects (JSF), e.g:

models.py

class Book:
   author = models.ForeignKey(User)
   title = models.CharField(max_length=125)

   class Meta:
       app_label = "library"

services.py

from library.models import Book

def get_books(limit=None, **filters):
    """ simple service function for retrieving books can be widely extended """
    return Book.objects.filter(**filters)[:limit]  # list[:None] will return the entire list

views.py

from library.services import get_books

class BookListView(ListView):
    """ simple view, e.g. implement a _build and _apply filters function """
    queryset = get_books()

Mind you, I usually take models, views and services to module level and separate even further depending on the project"s size

Cosine Similarity between 2 Number Lists

I want to calculate the cosine similarity between two lists, let"s say for example list 1 which is dataSetI and list 2 which is dataSetII.

Let"s say dataSetI is [3, 45, 7, 2] and dataSetII is [2, 54, 13, 15]. The length of the lists are always equal. I want to report cosine similarity as a number between 0 and 1.

dataSetI = [3, 45, 7, 2]
dataSetII = [2, 54, 13, 15]

def cosine_similarity(list1, list2):
  # How to?
  pass

print(cosine_similarity(dataSetI, dataSetII))

Answer #1:

You should try SciPy. It has a bunch of useful scientific routines for example, "routines for computing integrals numerically, solving differential equations, optimization, and sparse matrices." It uses the superfast optimized NumPy for its number crunching. See here for installing.

Note that spatial.distance.cosine computes the distance, and not the similarity. So, you must subtract the value from 1 to get the similarity.

from scipy import spatial

dataSetI = [3, 45, 7, 2]
dataSetII = [2, 54, 13, 15]
result = 1 - spatial.distance.cosine(dataSetI, dataSetII)

Answer #2:

another version based on numpy only

from numpy import dot
from numpy.linalg import norm

cos_sim = dot(a, b)/(norm(a)*norm(b))

Haversine Formula in Python (Bearing and Distance between two GPS points)

Problem

I would like to know how to get the distance and bearing between 2 GPS points. I have researched on the haversine formula. Someone told me that I could also find the bearing using the same data.

Edit

Everything is working fine but the bearing doesn"t quite work right yet. The bearing outputs negative but should be between 0 - 360 degrees. The set data should make the horizontal bearing 96.02166666666666 and is:

Start point: 53.32055555555556 , -1.7297222222222221   
Bearing:  96.02166666666666  
Distance: 2 km  
Destination point: 53.31861111111111, -1.6997222222222223  
Final bearing: 96.04555555555555

Here is my new code:

from math import *

Aaltitude = 2000
Oppsite  = 20000

lat1 = 53.32055555555556
lat2 = 53.31861111111111
lon1 = -1.7297222222222221
lon2 = -1.6997222222222223

lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
c = 2 * atan2(sqrt(a), sqrt(1-a))
Base = 6371 * c


Bearing =atan2(cos(lat1)*sin(lat2)-sin(lat1)*cos(lat2)*cos(lon2-lon1), sin(lon2-lon1)*cos(lat2)) 

Bearing = degrees(Bearing)
print ""
print ""
print "--------------------"
print "Horizontal Distance:"
print Base
print "--------------------"
print "Bearing:"
print Bearing
print "--------------------"


Base2 = Base * 1000
distance = Base * 2 + Oppsite * 2 / 2
Caltitude = Oppsite - Aaltitude

a = Oppsite/Base
b = atan(a)
c = degrees(b)

distance = distance / 1000

print "The degree of vertical angle is:"
print c
print "--------------------"
print "The distance between the Balloon GPS and the Antenna GPS is:"
print distance
print "--------------------"

Answer #1:

Here"s a Python version:

from math import radians, cos, sin, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance in kilometers between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians 
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) 
    r = 6371 # Radius of earth in kilometers. Use 3956 for miles. Determines return value units.
    return c * r

Shop

Best laptop for Sims 4

$

Best laptop for Zoom

$499

Best laptop for Minecraft

$590

Best laptop for engineering student

$

Best laptop for development

$

Best laptop for Cricut Maker

$

Best laptop for hacking

$890

Best laptop for Machine Learning

$950

Latest questions

NUMPYNUMPY

psycopg2: insert multiple rows with one query

12 answers

NUMPYNUMPY

How to convert Nonetype to int or string?

12 answers

NUMPYNUMPY

How to specify multiple return types using type-hints

12 answers

NUMPYNUMPY

Javascript Error: IPython is not defined in JupyterLab

12 answers

News

Wiki

Python OpenCV | cv2.putText () method

numpy.arctan2 () in Python

Python | os.path.realpath () method

Python OpenCV | cv2.circle () method

Python OpenCV cv2.cvtColor () method

Python - Move item to the end of the list

time.perf_counter () function in Python

Check if one list is a subset of another in Python

Python os.path.join () method