How can I iterate over files in a given directory?


I need to iterate through all .asm files inside a given directory and do some actions on them.

How can this be done in an efficient way?

Answer rating: 1015

Original answer:

import os

for filename in os.listdir(directory):
    if filename.endswith(".asm") or filename.endswith(".py"):
        # print(os.path.join(directory, filename))
        continue
    else:
        continue

Python 3.6 version of the above answer, using os - assuming that you have the directory path as a str object in a variable called directory_in_str:

import os

directory = os.fsencode(directory_in_str)
    
for file in os.listdir(directory):
    filename = os.fsdecode(file)
    if filename.endswith(".asm") or filename.endswith(".py"):
        # print(os.path.join(directory, filename))
        continue
    else:
        continue

Or recursively, using pathlib:

from pathlib import Path

pathlist = Path(directory_in_str).glob("**/*.asm")
for path in pathlist:
    # because path is a Path object, not a string
    path_in_str = str(path)
    # print(path_in_str)

Alternatively, use rglob to replace glob("**/*.asm") with rglob("*.asm"). This is like calling Path.glob() with "**/" added in front of the given relative pattern:
from pathlib import Path

pathlist = Path(directory_in_str).rglob("*.asm")
for path in pathlist:
    # because path is a Path object, not a string
    path_in_str = str(path)
    # print(path_in_str)
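If you need to do more than print the path, a minimal sketch of processing each match (the process function and the line count are hypothetical stand-ins for whatever action you need):

from pathlib import Path

def process(path):
    # hypothetical action: read the file and report its line count
    text = path.read_text()
    print(path, len(text.splitlines()))

for path in Path(directory_in_str).rglob("*.asm"):
    process(path)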

Answer rating: 182

This will iterate over all descendant files, not just the immediate children of the directory:

import os

for subdir, dirs, files in os.walk(rootdir):
    for file in files:
        # print(os.path.join(subdir, file))
        filepath = subdir + os.sep + file

        if filepath.endswith(".asm"):
            print(filepath)

Answer rating: 156

You can try using the glob module:

import glob

for filepath in glob.iglob("my_dir/*.asm"):
    print(filepath)

and since Python 3.5 you can search subdirectories as well:

glob.glob("**/*.txt", recursive=True) # => ["2.txt", "sub/3.txt"]

From the docs:

The glob module finds all the pathnames matching a specified pattern according to the rules used by the Unix shell, although results are returned in arbitrary order. No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched.
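For instance, a quick sketch of those wildcard rules (the file names in the comments are hypothetical):

import glob

print(glob.glob("*.asm"))          # * matches everything, e.g. ['boot.asm', 'io.asm']
print(glob.glob("file?.asm"))      # ? matches a single character, e.g. ['file1.asm']
print(glob.glob("file[0-9].asm"))  # [] matches a character range, digits here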





How can I iterate over files in a given directory?: StackOverflow Questions

How do I list all files of a directory?

How can I list all files of a directory in Python and add them to a list?

Importing files from different folder

I have the following folder structure.

application
├── app
│   └── folder
│       └── file.py
└── app2
    └── some_folder
        └── some_file.py

I want to import some functions from file.py in some_file.py.

I"ve tried

from application.app.folder.file import func_name

and some other various attempts but so far I couldn"t manage to import properly. How can I do this?

If Python is interpreted, what are .pyc files?

I"ve been given to understand that Python is an interpreted language...
However, when I look at my Python source code I see .pyc files, which Windows identifies as "Compiled Python Files".

Where do these come in?

Find all files in a directory with extension .txt in Python

How can I find all the files in a directory having the extension .txt in Python?

How to import other Python files?

How do I import other files in Python?

  1. How exactly can I import a specific python file like import file.py?
  2. How can I import a folder instead of a specific file?
  3. I want to load a Python file dynamically at runtime, based on user input.
  4. I want to know how to load just one specific part from the file.

For example, in main.py I have:

from extra import * 

Although this gives me all the definitions in extra.py, when maybe all I want is a single definition:

def gap():
    print
    print

What do I add to the import statement to just get gap from extra.py?

How to use glob() to find files recursively?

This is what I have:

glob(os.path.join("src","*.c"))

but I want to search the subfolders of src. Something like this would work:

glob(os.path.join("src","*.c"))
glob(os.path.join("src","*","*.c"))
glob(os.path.join("src","*","*","*.c"))
glob(os.path.join("src","*","*","*","*.c"))

But this is obviously limited and clunky.

How can I open multiple files using "with open" in Python?

I want to change a couple of files at one time, iff I can write to all of them. I'm wondering if I somehow can combine the multiple open calls with the with statement:

try:
  with open("a", "w") as a and open("b", "w") as b:
    do_something()
except IOError as e:
  print "Operation failed: %s" % e.strerror

If that"s not possible, what would an elegant solution to this problem look like?

How can I iterate over files in a given directory?

I need to iterate through all .asm files inside a given directory and do some actions on them.

How can this be done in an efficient way?

How to serve static files in Flask

So this is embarrassing. I've got an application that I threw together in Flask and for now it is just serving up a single static HTML page with some links to CSS and JS. And I can't find where in the documentation Flask describes returning static files. Yes, I could use render_template but I know the data is not templatized. I'd have thought send_file or url_for was the right thing, but I could not get those to work. In the meantime, I am opening the files, reading content, and rigging up a Response with appropriate mimetype:

import os.path

from flask import Flask, Response


app = Flask(__name__)
app.config.from_object(__name__)


def root_dir():  # pragma: no cover
    return os.path.abspath(os.path.dirname(__file__))


def get_file(filename):  # pragma: no cover
    try:
        src = os.path.join(root_dir(), filename)
        # Figure out how flask returns static files
        # Tried:
        # - render_template
        # - send_file
        # This should not be so non-obvious
        return open(src).read()
    except IOError as exc:
        return str(exc)


@app.route("/", methods=["GET"])
def metrics():  # pragma: no cover
    content = get_file("jenkins_analytics.html")
    return Response(content, mimetype="text/html")


@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def get_resource(path):  # pragma: no cover
    mimetypes = {
        ".css": "text/css",
        ".html": "text/html",
        ".js": "application/javascript",
    }
    complete_path = os.path.join(root_dir(), path)
    ext = os.path.splitext(path)[1]
    mimetype = mimetypes.get(ext, "text/html")
    content = get_file(complete_path)
    return Response(content, mimetype=mimetype)


if __name__ == "__main__":  # pragma: no cover
    app.run(port=80)

Someone want to give a code sample or url for this? I know this is going to be dead simple.

Unzipping files in Python

I read through the zipfile documentation, but couldn't understand how to unzip a file, only how to zip a file. How do I unzip all the contents of a zip file into the same directory?

Answer #1

Recommendation for beginners:

This is my personal recommendation for beginners: start by learning virtualenv and pip, tools which work with both Python 2 and 3 and in a variety of situations, and pick up other tools once you start needing them.

PyPI packages not in the standard library:

  • virtualenv is a very popular tool that creates isolated Python environments for Python libraries. If you're not familiar with this tool, I highly recommend learning it, as it is a very useful tool, and I'll be making comparisons to it for the rest of this answer.

It works by installing a bunch of files in a directory (eg: env/), and then modifying the PATH environment variable to prefix it with a custom bin directory (eg: env/bin/). An exact copy of the python or python3 binary is placed in this directory, but Python is programmed to look for libraries relative to its path first, in the environment directory. It's not part of Python's standard library, but is officially blessed by the PyPA (Python Packaging Authority). Once activated, you can install packages in the virtual environment using pip.

  • pyenv is used to isolate Python versions. For example, you may want to test your code against Python 2.7, 3.6, 3.7 and 3.8, so you'll need a way to switch between them. Once activated, it prefixes the PATH environment variable with ~/.pyenv/shims, where there are special files matching the Python commands (python, pip). These are not copies of the Python-shipped commands; they are special scripts that decide on the fly which version of Python to run based on the PYENV_VERSION environment variable, or the .python-version file, or the ~/.pyenv/version file. pyenv also makes the process of downloading and installing multiple Python versions easier, using the command pyenv install.

  • pyenv-virtualenv is a plugin for pyenv by the same author as pyenv, to allow you to use pyenv and virtualenv at the same time conveniently. However, if you're using Python 3.3 or later, pyenv-virtualenv will try to run python -m venv if it is available, instead of virtualenv. You can use virtualenv and pyenv together without pyenv-virtualenv, if you don't want the convenience features.

  • virtualenvwrapper is a set of extensions to virtualenv (see docs). It gives you commands like mkvirtualenv, lssitepackages, and especially workon for switching between different virtualenv directories. This tool is especially useful if you want multiple virtualenv directories.

  • pyenv-virtualenvwrapper is a plugin for pyenv by the same author as pyenv, to conveniently integrate virtualenvwrapper into pyenv.

  • pipenv aims to combine Pipfile, pip and virtualenv into one command on the command-line. The virtualenv directory typically gets placed in ~/.local/share/virtualenvs/XXX, with XXX being a hash of the path of the project directory. This is different from virtualenv, where the directory is typically in the current working directory. pipenv is meant to be used when developing Python applications (as opposed to libraries). There are alternatives to pipenv, such as poetry, which I won't list here since this question is only about the packages that are similarly named.

Standard library:

  • pyvenv (not to be confused with pyenv in the previous section) is a script shipped with Python 3 but deprecated in Python 3.6 as it had problems (not to mention the confusing name). In Python 3.6+, the exact equivalent is python3 -m venv.

  • venv is a package shipped with Python 3, which you can run using python3 -m venv (although for some reason some distros separate it out into a separate distro package, such as python3-venv on Ubuntu/Debian). It serves the same purpose as virtualenv, but only has a subset of its features (see a comparison here). virtualenv continues to be more popular than venv, especially since the former supports both Python 2 and 3.
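As a side note, the venv module can also be driven from Python itself rather than from the command line; a minimal sketch (the env directory name is arbitrary):

import venv

# equivalent to `python3 -m venv env`: creates an isolated
# environment in ./env with its own pip
venv.create("env", with_pip=True)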

Answer #2

os.listdir() - list in the current directory

With listdir in the os module, you get the files and the folders in the current directory.

import os
arr = os.listdir()
print(arr)

>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

Looking in a directory

arr = os.listdir("c:\files")

glob from glob

With glob you can specify a type of file to list, like this:

import glob

txtfiles = []
for file in glob.glob("*.txt"):
    txtfiles.append(file)

glob in a list comprehension

mylist = [f for f in glob.glob("*.txt")]

get the full path of only files in the current directory

import os
from os import listdir
from os.path import isfile, join

cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd)
             if os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles)

["G:\getfilesname\getfilesname.py", "G:\getfilesname\example.txt"]

Getting the full path name with os.path.abspath

You get the full path in return

import os
files_path = [os.path.abspath(x) for x in os.listdir()]
print(files_path)

>>> ['F:\\documenti\\applications.txt', 'F:\\documenti\\collections.txt']

Walk: going through sub directories

os.walk yields the root, a list of directories, and a list of files; that is why I unpack them into r, d, f in the for loop. It then looks for other files and directories in the subfolders of the root, and so on, until there are no more subfolders.

import os

# Getting the current work directory (cwd)
thisdir = os.getcwd()

# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
    for file in f:
        if file.endswith(".docx"):
            print(os.path.join(r, file))

os.listdir(): get files in the current directory (Python 2)

In Python 2, if you want the list of the files in the current directory, you have to give the argument as "." or os.getcwd() in the os.listdir method.

import os
arr = os.listdir(".")
print(arr)

>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

To go up in the directory tree

# Method 1: one level up
x = os.listdir("..")

# Method 2: from the filesystem root
x = os.listdir("/")

Get files: os.listdir() in a particular directory (Python 2 and 3)

import os
arr = os.listdir(r"F:\python")
print(arr)

>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']

Get files of a particular subdirectory with os.listdir()

import os

x = os.listdir("./content")

os.walk(".") - current directory

import os
arr = next(os.walk("."))[2]
print(arr)

>>> ['5bs_Turismo1.pdf', '5bs_Turismo1.pptx', 'esperienza.txt']

next(os.walk(".")) and os.path.join("dir", "file")

import os

arr = []
r, d, f = next(os.walk(r"F:\_python"))  # next() yields one (root, dirs, files) tuple
for file in f:
    arr.append(os.path.join(r, file))

for f in arr:
    print(f)

>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt

next(os.walk("F:\") - get the full path - list comprehension

r, d, f = next(os.walk(r"F:\_python"))
x = [os.path.join(r, file) for file in f]

>>> ['F:\\_python\\dict_class.py', 'F:\\_python\\programmi.txt']

os.walk - get full path - all files in sub dirs

x = [os.path.join(r, file) for r, d, f in os.walk(r"F:\_python") for file in f]
print(x)

>>> ['F:\\_python\\dict.py', 'F:\\_python\\progr.txt', 'F:\\_python\\readl.py']

os.listdir() - get only txt files

arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
print(arr_txt)

>>> ['work.txt', '3ebooks.txt']

Using glob to get the full path of the files

If I should need the absolute path of the files:

from path import path  # third-party path.py package
from glob import glob
x = [path(f).abspath() for f in glob(r"F:\*.txt")]
for f in x:
    print(f)

>>> F:\acquistionline.txt
>>> F:\acquisti_2018.txt
>>> F:\bootstrap_jquery_ecc.txt

Using os.path.isfile to avoid directories in the list

import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)

>>> ["a simple game.py", "data.txt", "decorator.py"]

Using pathlib from Python 3.4

import pathlib

flist = []
for p in pathlib.Path(".").iterdir():
    if p.is_file():
        print(p)
        flist.append(p)

 >>> error.PNG
 >>> exemaker.bat
 >>> guiprova.mp3
 >>> setup.py
 >>> speak_gui2.py
 >>> thumb.PNG

With list comprehension:

flist = [p for p in pathlib.Path(".").iterdir() if p.is_file()]

Alternatively, use pathlib.Path() instead of pathlib.Path(".")

Use glob method in pathlib.Path()

import pathlib

py = pathlib.Path().glob("*.py")
for file in py:
    print(file)

>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py

Get all and only files with os.walk

import os
x = [i[2] for i in os.walk(".")]
y = []
for t in x:
    for f in t:
        y.append(f)
print(y)

>>> ["append_to_list.py", "data.txt", "data1.txt", "data2.txt", "data_180617", "os_walk.py", "READ2.py", "read_data.py", "somma_defaltdic.py", "substitute_words.py", "sum_data.py", "data.txt", "data1.txt", "data_180617"]

Get only files with next and walk in a directory

import os
x = next(os.walk("F://python"))[2]
print(x)

>>> ['calculator.bat', 'calculator.py']

Get only directories with next and walk in a directory

import os
next(os.walk("F://python"))[1] # for the current dir use (".")

>>> ['python3', 'others']

Get all the subdir names with walk

for r,d,f in os.walk("F:\_python"):
    for dirs in d:
        print(dirs)

>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints

os.scandir() from Python 3.5 and greater

import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)

>>> ["calculator.bat","calculator.py"]

# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.

import os
with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG

Examples:

Ex. 1: How many files are there in the subdirectories?

In this example, we count the number of files in a directory and all of its subdirectories.

import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + " files"

print(count(r"F:\python"))

>>> F:\python : 12057 files

Ex. 2: How to copy all files from one directory to another?

A script to tidy up your computer, finding all files of a type (default: pptx) and copying them to a new folder.

import os
import shutil

destination = r"F:\file_copied"
# os.makedirs(destination)

def copyfile(dir, filetype="pptx", counter=0):
    "Searches for pptx (or other - pptx is the default) files and copies them"
    for pack in os.walk(dir):
        for f in pack[2]:
            if f.endswith(filetype):
                fullpath = os.path.join(pack[0], f)
                print(fullpath)
                shutil.copy(fullpath, destination)
                counter += 1
    if counter > 0:
        print("-" * 30)
        print("\t==> Found in: `" + dir + "` : " + str(counter) + " files\n")

# searches for folders that start with `_`
for dir in os.listdir():
    if dir[0] == "_":
        # copyfile(dir, filetype="pdf")
        copyfile(dir, filetype="txt")


>>> _compiti18\Compito Contabilità 1\conti.txt
>>> _compiti18\Compito Contabilità 1\modula4.txt
>>> _compiti18\Compito Contabilità 1\moduloa4.txt
>>> ------------------------------
>>> ==> Found in: `_compiti18` : 3 files

Ex. 3: How to get all the files in a txt file

In case you want to create a txt file with all the file names:

import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
    for eachfile in os.listdir():
        mylist += eachfile + "\n"
    file.write(mylist)

Example: a txt file with all the files of a hard drive

"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""

import os

# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding="utf-8") as testo:
    for root, dirs, files in os.walk("D:\"):
        for file in files:
            listafile.append(file)
            percorso.append(root + "\" + file)
            testo.write(file + "
")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
    for file in listafile:
        testo_ordinato.write(file + "
")

with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
    for file in percorso:
        file_percorso.write(file + "
")

os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")

All the files of C: in one text file

This is a shorter version of the previous code. Change the folder where to start finding the files if you need to start from another position. On my computer this code generated a text file of about 50 MB, with a little fewer than 500,000 lines listing files with their complete paths.

import os

with open("file.txt", "w", encoding="utf-8") as filewrite:
    for r, d, f in os.walk("C:\"):
        for file in f:
            filewrite.write(f"{r + file}
")

How to write a file with all paths in a folder of a type

With this function you can create a txt file named after the type of file you look for (e.g. pngfile.txt), containing the full paths of all the files of that type. It can be useful sometimes, I think.

import os

def searchfiles(extension=".ttf", folder="H:\"):
    "Create a txt file with all the file of a type"
    with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk(folder):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(f"{r + file}
")

# looking for png file (fonts) in the hard disk H:
searchfiles(".png", "H:\")

>>> H:4bs_18Dolphins5.png
>>> H:4bs_18Dolphins6.png
>>> H:4bs_18Dolphins7.png
>>> H:5_18marketing htmlassetsimageslogo2.png
>>> H:7z001.png
>>> H:7z002.png

(New) Find all files and open them with tkinter GUI

I just wanted to add, in this 2019, a little app to search for all files in a dir and be able to open them by double-clicking on the name of the file in the list.

import tkinter as tk
import os

def searchfiles(extension=".txt", folder="H:\"):
    "insert all files in the listbox"
    for r, d, f in os.walk(folder):
        for file in f:
            if file.endswith(extension):
                lb.insert(0, r + "\" + file)

def open_file():
    os.startfile(lb.get(lb.curselection()[0]))

root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles(".png", "H:\"))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()

Answer #3

-----> pip install gensim config --global http.sslVerify false

Just install any package with the "config --global http.sslVerify false" statement

You can ignore SSL errors by setting pypi.org and files.pythonhosted.org as trusted hosts.

$ pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org <package_name>

Note: Sometime during April 2018, the Python Package Index was migrated from pypi.python.org to pypi.org. This means "trusted-host" commands using the old domain no longer work.

Permanent Fix

Since the release of pip 10.0, you should be able to fix this permanently just by upgrading pip itself:

$ pip install --trusted-host pypi.org --trusted-host files.pythonhosted.org pip setuptools

Or by just reinstalling it to get the latest version:

$ curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

(… and then running get-pip.py with the relevant Python interpreter).

pip install <otherpackage> should just work after this. If not, then you will need to do more, as explained below.


You may want to add the trusted hosts and proxy to your config file.

pip.ini (Windows) or pip.conf (unix)

[global]
trusted-host = pypi.python.org
               pypi.org
               files.pythonhosted.org

Alternate Solutions (Less secure)

Most of the answers could pose a security issue.

Two of the workarounds that help in installing most Python packages with ease are:

  • Using easy_install: if you are really lazy and don't want to waste much time, use easy_install <package_name>. Note that some packages won't be found or will give small errors.
  • Using Wheel: download the Wheel of the Python package and use the pip command pip install wheel_package_name.whl to install the package.

Answer #4

It helps you install a Python package foo on your machine (can also be in a virtualenv) so that you can import the package foo from other projects and also from [I]Python prompts.

It does a similar job to pip, easy_install, etc.


Using setup.py

Let"s start with some definitions:

Package - A folder/directory that contains __init__.py file.
Module - A valid python file with .py extension.
Distribution - How one package relates to other packages and modules.

Let"s say you want to install a package named foo. Then you do,

$ git clone https://github.com/user/foo  
$ cd foo
$ python setup.py install

Instead, if you don"t want to actually install it but still would like to use it. Then do,

$ python setup.py develop  

This command will create symlinks to the source directory within site-packages instead of copying things. Because of this, it is quite fast (particularly for large packages).


Creating setup.py

If you have your package tree like,

foo
├── foo
│   ├── data_struct.py
│   ├── __init__.py
│   └── internals.py
├── README
├── requirements.txt
└── setup.py

Then, you do the following in your setup.py script so that it can be installed on some machine:

from setuptools import setup

setup(
   name="foo",
   version="1.0",
   description="A useful module",
   author="Man Foo",
   author_email="foo@example.com",
   packages=["foo"],  #same as name
   install_requires=["wheel", "bar", "greek"], #external packages as dependencies
)

Instead, if your package tree is more complex like the one below:

foo
├── foo
│   ├── data_struct.py
│   ├── __init__.py
│   └── internals.py
├── README
├── requirements.txt
├── scripts
│   ├── cool
│   └── skype
└── setup.py

Then, your setup.py in this case would look like:

from setuptools import setup

setup(
   name="foo",
   version="1.0",
   description="A useful module",
   author="Man Foo",
   author_email="foo@example.com",
   packages=["foo"],  #same as name
   install_requires=["wheel", "bar", "greek"], #external packages as dependencies
   scripts=[
            "scripts/cool",
            "scripts/skype",
           ]
)

Add more stuff to (setup.py) & make it decent:

from setuptools import setup

with open("README", "r") as f:
    long_description = f.read()

setup(
   name="foo",
   version="1.0",
   description="A useful module",
   license="MIT",
   long_description=long_description,
   author="Man Foo",
   author_email="foo@example.com",
   url="http://www.foopackage.com/",
   packages=["foo"],  #same as name
   install_requires=["wheel", "bar", "greek"], #external packages as dependencies
   scripts=[
            "scripts/cool",
            "scripts/skype",
           ]
)

The long_description is used in pypi.org as the README description of your package.


And finally, you"re now ready to upload your package to PyPi.org so that others can install your package using pip install yourpackage.

At this point there are two options.

  • publish on the temporary test.pypi.org server to familiarize yourself with the procedure, and then publish it on the permanent pypi.org server for the public to use your package.
  • publish straight away on the permanent pypi.org server, if you are already familiar with the procedure and have your user credentials (e.g., username, password, package name)

Once your package name is registered on pypi.org, nobody else can claim or use it. Python packaging suggests the twine package for uploading purposes (of your package to PyPI). Thus,

(1) the first step is to locally build the distributions using:

# prereq: wheel (pip install wheel)  
$ python setup.py sdist bdist_wheel   

(2) then using twine for uploading either to test.pypi.org or pypi.org:

$ twine upload --repository testpypi dist/*  
username: ***  
password: ***  

It will take a few minutes for the package to appear on test.pypi.org. Once you're satisfied with it, you can then upload your package to the real & permanent index of pypi.org simply with:

$ twine upload dist/*  

Optionally, you can also sign the files in your package with a GPG key by:

$ twine upload dist/* --sign 


Answer #5

tl;dr / quick fix

  • Don"t decode/encode willy nilly
  • Don"t assume your strings are UTF-8 encoded
  • Try to convert strings to Unicode strings as soon as possible in your code
  • Fix your locale: How to solve UnicodeDecodeError in Python 3.6?
  • Don"t be tempted to use quick reload hacks

Unicode Zen in Python 2.x - The Long Version

Without seeing the source it's difficult to know the root cause, so I'll have to speak generally.

UnicodeDecodeError: 'ascii' codec can't decode byte generally happens when you try to convert a Python 2.x str that contains non-ASCII to a Unicode string without specifying the encoding of the original string.

In brief, Unicode strings are an entirely separate type of Python string that does not contain any encoding. They only hold Unicode code points and therefore can hold any code point from across the entire spectrum. Strings contain encoded text, be it UTF-8, UTF-16, ISO-8859-1, GBK, Big5, etc. Strings are decoded to Unicode, and Unicodes are encoded to strings. Files and text data are always transferred in encoded strings.
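To make the decode/encode direction concrete, a minimal Python 2 sketch (the byte values assume UTF-8):

# Python 2
byte_string = "caf\xc3\xa9"                   # str holding UTF-8 encoded bytes
unicode_string = byte_string.decode("utf-8")  # u'caf\xe9' - decode to Unicode
assert unicode_string == u"caf\xe9"
assert unicode_string.encode("utf-8") == byte_string  # encode back to bytes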

The Markdown module authors probably use unicode() (where the exception is thrown) as a quality gate to the rest of the code - it will convert ASCII or re-wrap existing Unicode strings to a new Unicode string. The Markdown authors can't know the encoding of the incoming string, so they will rely on you to decode strings to Unicode strings before passing them to Markdown.

Unicode strings can be declared in your code using the u prefix to strings. E.g.

>>> my_u = u"my ünicôdé strįng"
>>> type(my_u)
<type "unicode">

Unicode strings may also come from files, databases, and network modules. When this happens, you don't need to worry about the encoding.

Gotchas

Conversion from str to Unicode can happen even when you don't explicitly call unicode().

The following scenarios cause UnicodeDecodeError exceptions:

# Explicit conversion without encoding
unicode("€")

# New style format string into Unicode string
# Python will try to convert value string to Unicode first
u"The currency is: {}".format("€")

# Old style format string into Unicode string
# Python will try to convert value string to Unicode first
u"The currency is: %s" % "€"

# Append string to Unicode
# Python will try to convert string to Unicode first
u"The currency is: " + "€"         

Examples

In the following diagram, you can see how the word café has been encoded in either UTF-8 or Cp1252, depending on the terminal type. In both examples, caf is just regular ASCII. In UTF-8, é is encoded using two bytes. In Cp1252, é is 0xE9 (which also happens to be the Unicode code point value; it's no coincidence). The correct decode() is invoked and conversion to a Python Unicode is successful:

[diagram: a string being converted to a Python Unicode string]

In this diagram, decode() is called with ascii (which is the same as calling unicode() without an encoding given). As ASCII can't contain bytes greater than 0x7F, this will throw a UnicodeDecodeError exception:

[diagram: a string being converted to a Python Unicode string with the wrong encoding]
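The two diagrams boil down to this little Python 2 sketch (the byte values for é assume UTF-8 and Cp1252, as in the diagrams):

# Python 2
utf8_bytes = "caf\xc3\xa9"    # é encoded as two UTF-8 bytes
cp1252_bytes = "caf\xe9"      # é encoded as one Cp1252 byte

print utf8_bytes.decode("utf-8")     # u'caf\xe9' - correct encoding given
print cp1252_bytes.decode("cp1252")  # u'caf\xe9' - correct encoding given

cp1252_bytes.decode("ascii")  # raises UnicodeDecodeError: 0xe9 is > 0x7F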

The Unicode Sandwich

It"s good practice to form a Unicode sandwich in your code, where you decode all incoming data to Unicode strings, work with Unicodes, then encode to strs on the way out. This saves you from worrying about the encoding of strings in the middle of your code.

Input / Decode

Source code

If you need to bake non-ASCII into your source code, just create Unicode strings by prefixing the string with a u. E.g.

u"Zürich"

To allow Python to decode your source code, you will need to add an encoding header to match the actual encoding of your file. For example, if your file was encoded as "UTF-8", you would use:

# encoding: utf-8

This is only necessary when you have non-ASCII in your source code.

Files

Usually non-ASCII data is received from a file. The io module provides a TextWrapper that decodes your file on the fly, using a given encoding. You must use the correct encoding for the file - it can"t be easily guessed. For example, for a UTF-8 file:

import io
with io.open("my_utf8_file.txt", "r", encoding="utf-8") as my_file:
     my_unicode_string = my_file.read() 

my_unicode_string would then be suitable for passing to Markdown. If you get a UnicodeDecodeError from the read() line, then you've probably used the wrong encoding value.

CSV Files

The Python 2.7 CSV module does not support non-ASCII characters. Help is at hand, however, with https://pypi.python.org/pypi/backports.csv.

Use it like above but pass the opened file to it:

from backports import csv
import io
with io.open("my_utf8_file.txt", "r", encoding="utf-8") as my_file:
    for row in csv.reader(my_file):
        yield row

Databases

Most Python database drivers can return data in Unicode, but usually require a little configuration. Always use Unicode strings for SQL queries.

MySQL

In the connection string add:

charset="utf8",
use_unicode=True

E.g.

>>> db = MySQLdb.connect(host="localhost", user="root", passwd="passwd", db="sandbox", use_unicode=True, charset="utf8")

PostgreSQL

Add:

psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)

HTTP

Web pages can be encoded in just about any encoding. The Content-type header should contain a charset field to hint at the encoding. The content can then be decoded manually against this value. Alternatively, Python-Requests returns Unicodes in response.text.
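For example, a short sketch with Python-Requests (the URL is a placeholder):

import requests

response = requests.get("https://example.com/")
text = response.text      # unicode, decoded using the charset from Content-Type
raw = response.content    # the raw bytes, if you prefer to decode manually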

Manually

If you must decode strings manually, you can simply do my_string.decode(encoding), where encoding is the appropriate encoding. Python 2.x supported codecs are given here: Standard Encodings. Again, if you get UnicodeDecodeError then you've probably got the wrong encoding.

The meat of the sandwich

Work with Unicodes as you would normal strs.

Output

stdout / printing

print writes through the stdout stream. Python tries to configure an encoder on stdout so that Unicodes are encoded to the console's encoding. For example, if a Linux shell's locale is en_GB.UTF-8, the output will be encoded to UTF-8. On Windows, you will be limited to an 8-bit code page.

An incorrectly configured console, such as a corrupt locale, can lead to unexpected print errors. The PYTHONIOENCODING environment variable can force the encoding for stdout.

Files

Just like input, io.open can be used to transparently convert Unicodes to encoded byte strings.

Database

The same configuration for reading will allow Unicodes to be written directly.

Python 3

Python 3 is no more Unicode capable than Python 2.x, but it is slightly less confused on the topic. E.g., the regular str is now a Unicode string and the old str is now bytes.

The default encoding is UTF-8, so if you .decode() a byte string without giving an encoding, Python 3 uses UTF-8 encoding. This probably fixes 50% of people's Unicode problems.
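A quick Python 3 sketch of that default:

# Python 3
data = b"caf\xc3\xa9"        # bytes
print(data.decode())         # 'café' - UTF-8 assumed by default
print(data.decode("utf-8"))  # same thing, explicit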

Further, open() operates in text mode by default, so returns decoded str (Unicode ones). The encoding is derived from your locale, which tends to be UTF-8 on Un*x systems or an 8-bit code page, such as windows-1251, on Windows boxes.

Why you shouldn"t use sys.setdefaultencoding("utf8")

It"s a nasty hack (there"s a reason you have to use reload) that will only mask problems and hinder your migration to Python 3.x. Understand the problem, fix the root cause and enjoy Unicode zen. See Why should we NOT use sys.setdefaultencoding("utf-8") in a py script? for further details

Answer #6

You can"t.

One workaround is to create clone a new environment and then remove the original one.

First, remember to deactivate your current environment. You can do this with the commands:

  • deactivate on Windows or
  • source deactivate on macOS/Linux.

Then:

conda create --name new_name --clone old_name
conda remove --name old_name --all # or its alias: `conda env remove --name old_name`

Notice there are several drawbacks of this method:

  1. It redownloads packages (you can use --offline flag to disable it)
  2. Time consumed on copying the environment's files
  3. Temporary double disk usage

There is an open issue requesting this feature.

Answer #7

Explanation

From PEP 328

Relative imports use a module"s __name__ attribute to determine that module"s position in the package hierarchy. If the module"s name does not contain any package information (e.g. it is set to "__main__") then relative imports are resolved as if the module were a top level module, regardless of where the module is actually located on the file system.

At some point PEP 338 conflicted with PEP 328:

... relative imports rely on __name__ to determine the current module's position in the package hierarchy. In a main module, the value of __name__ is always "__main__", so explicit relative imports will always fail (as they only work for a module inside a package)

and to address the issue, PEP 366 introduced the top level variable __package__:

By adding a new module level attribute, this PEP allows relative imports to work automatically if the module is executed using the -m switch. A small amount of boilerplate in the module itself will allow the relative imports to work when the file is executed by name. [...] When it [the attribute] is present, relative imports will be based on this attribute rather than the module __name__ attribute. [...] When the main module is specified by its filename, then the __package__ attribute will be set to None. [...] When the import system encounters an explicit relative import in a module without __package__ set (or with it set to None), it will calculate and store the correct value (__name__.rpartition(".")[0] for normal modules and __name__ for package initialisation modules)

(emphasis mine)

If the __name__ is "__main__", __name__.rpartition(".")[0] returns an empty string. This is why there's an empty string literal in the error description:

SystemError: Parent module '' not loaded, cannot perform relative import
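A quick demonstration of that computation:

>>> "__main__".rpartition(".")[0]  # no dot, so the parent package is ''
''
>>> "package.module".rpartition(".")[0]
'package'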

The relevant part of the CPython"s PyImport_ImportModuleLevelObject function:

if (PyDict_GetItem(interp->modules, package) == NULL) {
    PyErr_Format(PyExc_SystemError,
            "Parent module %R not loaded, cannot perform relative "
            "import", package);
    goto error;
}

CPython raises this exception if it was unable to find package (the name of the package) in interp->modules (accessible as sys.modules). Since sys.modules is "a dictionary that maps module names to modules which have already been loaded", it's now clear that the parent module must be explicitly absolute-imported before performing a relative import.

Note: The patch from the issue 18018 has added another if block, which will be executed before the code above:

if (PyUnicode_CompareWithASCIIString(package, "") == 0) {
    PyErr_SetString(PyExc_ImportError,
            "attempted relative import with no known parent package");
    goto error;
} /* else if (PyDict_GetItem(interp->modules, package) == NULL) {
    ...
*/

If package (same as above) is empty string, the error message will be

ImportError: attempted relative import with no known parent package

However, you will only see this in Python 3.6 or newer.

Solution #1: Run your script using -m

Consider a directory (which is a Python package):

.
├── package
│   ├── __init__.py
│   ├── module.py
│   └── standalone.py

All of the files in package begin with the same 2 lines of code:

from pathlib import Path
print("Running" if __name__ == "__main__" else "Importing", Path(__file__).resolve())

I"m including these two lines only to make the order of operations obvious. We can ignore them completely, since they don"t affect the execution.

__init__.py and module.py contain only those two lines (i.e., they are effectively empty).

standalone.py additionally attempts to import module.py via relative import:

from . import module  # explicit relative import

We"re well aware that /path/to/python/interpreter package/standalone.py will fail. However, we can run the module with the -m command line option that will "search sys.path for the named module and execute its contents as the __main__ module":

vaultah@host:~$ python3 -i -m package.standalone
Importing /home/vaultah/package/__init__.py
Running /home/vaultah/package/standalone.py
Importing /home/vaultah/package/module.py
>>> __file__
'/home/vaultah/package/standalone.py'
>>> __package__
'package'
>>> # The __package__ has been correctly set and module.py has been imported.
... # What's inside sys.modules?
... import sys
>>> sys.modules['__main__']
<module 'package.standalone' from '/home/vaultah/package/standalone.py'>
>>> sys.modules['package.module']
<module 'package.module' from '/home/vaultah/package/module.py'>
>>> sys.modules['package']
<module 'package' from '/home/vaultah/package/__init__.py'>

-m does all the importing stuff for you and automatically sets __package__, but you can do that yourself in the next solution:

Solution #2: Set __package__ manually

Please treat it as a proof of concept rather than an actual solution. It isn't well-suited for use in real-world code.

PEP 366 has a workaround to this problem; however, it's incomplete, because setting __package__ alone is not enough. You're going to need to import at least N preceding packages in the module hierarchy, where N is the number of parent directories (relative to the directory of the script) that will be searched for the module being imported.

Thus,

  1. Add the parent directory of the Nth predecessor of the current module to sys.path

  2. Remove the current file"s directory from sys.path

  3. Import the parent module of the current module using its fully-qualified name

  4. Set __package__ to the fully-qualified name from step 3

  5. Perform the relative import

I"ll borrow files from the Solution #1 and add some more subpackages:

package
├── __init__.py
├── module.py
└── subpackage
    ├── __init__.py
    └── subsubpackage
        ├── __init__.py
        └── standalone.py

This time standalone.py will import module.py from the package package using the following relative import

from ... import module  # N = 3

We"ll need to precede that line with the boilerplate code, to make it work.

import sys
from pathlib import Path

if __name__ == "__main__" and __package__ is None:
    file = Path(__file__).resolve()
    parent, top = file.parent, file.parents[3]

    sys.path.append(str(top))
    try:
        sys.path.remove(str(parent))
    except ValueError: # Already removed
        pass

    import package.subpackage.subsubpackage
    __package__ = "package.subpackage.subsubpackage"

from ... import module # N = 3

It allows us to execute standalone.py by filename:

vaultah@host:~$ python3 package/subpackage/subsubpackage/standalone.py
Running /home/vaultah/package/subpackage/subsubpackage/standalone.py
Importing /home/vaultah/package/__init__.py
Importing /home/vaultah/package/subpackage/__init__.py
Importing /home/vaultah/package/subpackage/subsubpackage/__init__.py
Importing /home/vaultah/package/module.py

A more general solution wrapped in a function can be found here. Example usage:

if __name__ == "__main__" and __package__ is None:
    import_parents(level=3) # N = 3

from ... import module
from ...module.submodule import thing

Solution #3: Use absolute imports and setuptools

The steps are -

  1. Replace explicit relative imports with equivalent absolute imports

  2. Install package to make it importable

For instance, the directory structure may be as follows

.
├── project
│   ├── package
│   │   ├── __init__.py
│   │   ├── module.py
│   │   └── standalone.py
│   └── setup.py

where setup.py is

from setuptools import setup, find_packages
setup(
    name = "your_package_name",
    packages = find_packages(),
)

The rest of the files were borrowed from the Solution #1.

Installation will allow you to import the package regardless of your working directory (assuming there'll be no naming issues).

We can modify standalone.py to use this advantage (step 1):

from package import module  # absolute import

Change your working directory to project and run /path/to/python/interpreter setup.py install --user (--user installs the package in your site-packages directory) (step 2):

vaultah@host:~$ cd project
vaultah@host:~/project$ python3 setup.py install --user

Let"s verify that it"s now possible to run standalone.py as a script:

vaultah@host:~/project$ python3 -i package/standalone.py
Running /home/vaultah/project/package/standalone.py
Importing /home/vaultah/.local/lib/python3.6/site-packages/your_package_name-0.0.0-py3.6.egg/package/__init__.py
Importing /home/vaultah/.local/lib/python3.6/site-packages/your_package_name-0.0.0-py3.6.egg/package/module.py
>>> module
<module 'package.module' from '/home/vaultah/.local/lib/python3.6/site-packages/your_package_name-0.0.0-py3.6.egg/package/module.py'>
>>> import sys
>>> sys.modules['package']
<module 'package' from '/home/vaultah/.local/lib/python3.6/site-packages/your_package_name-0.0.0-py3.6.egg/package/__init__.py'>
>>> sys.modules['package.module']
<module 'package.module' from '/home/vaultah/.local/lib/python3.6/site-packages/your_package_name-0.0.0-py3.6.egg/package/module.py'>

Note: If you decide to go down this route, you'd be better off using virtual environments to install packages in isolation.

Solution #4: Use absolute imports and some boilerplate code

Frankly, the installation is not necessary - you could add some boilerplate code to your script to make absolute imports work.

I"m going to borrow files from Solution #1 and change standalone.py:

  1. Add the parent directory of package to sys.path before attempting to import anything from package using absolute imports:

    import sys
    from pathlib import Path # if you haven"t already done so
    file = Path(__file__).resolve()
    parent, root = file.parent, file.parents[1]
    sys.path.append(str(root))
    
    # Additionally remove the current file"s directory from sys.path
    try:
        sys.path.remove(str(parent))
    except ValueError: # Already removed
        pass
    
  2. Replace the relative import by the absolute import:

    from package import module  # absolute import
    

standalone.py runs without problems:

vaultah@host:~$ python3 -i package/standalone.py
Running /home/vaultah/package/standalone.py
Importing /home/vaultah/package/__init__.py
Importing /home/vaultah/package/module.py
>>> module
<module 'package.module' from '/home/vaultah/package/module.py'>
>>> import sys
>>> sys.modules['package']
<module 'package' from '/home/vaultah/package/__init__.py'>
>>> sys.modules['package.module']
<module 'package.module' from '/home/vaultah/package/module.py'>

I feel that I should warn you: try not to do this, especially if your project has a complex structure.


As a side note, PEP 8 recommends the use of absolute imports, but states that in some scenarios explicit relative imports are acceptable:

Absolute imports are recommended, as they are usually more readable and tend to be better behaved (or at least give better error messages). [...] However, explicit relative imports are an acceptable alternative to absolute imports, especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose.

Answer #8

Is this the correct use of conftest.py?

Yes it is. Fixtures are a potential and common use of conftest.py. The fixtures that you will define will be shared among all tests in your test suite. However, defining fixtures in the root conftest.py might be useless and it would slow down testing if such fixtures are not used by all tests.

Does it have other uses?

Yes it does.

  • Fixtures: Define fixtures for static data used by tests. This data can be accessed by all tests in the suite unless specified otherwise. This could be data as well as helpers of modules which will be passed to all tests.

  • External plugin loading: conftest.py is used to import external plugins or modules. By defining the following global variable, pytest will load the module and make it available for its test. Plugins are generally files defined in your project or other modules which might be needed in your tests. You can also load a set of predefined plugins as explained here.

    pytest_plugins = "someapp.someplugin"

  • Hooks: You can specify hooks such as setup and teardown methods and much more to improve your tests. For a set of available hooks, read the Hooks link. Example:

      def pytest_runtest_setup(item):
          """called before pytest_runtest_call(item)"""
          # do some stuff

  • Test root path: This is a bit of a hidden feature. By defining conftest.py in your root path, you will have pytest recognizing your application modules without specifying PYTHONPATH. In the background, py.test modifies your sys.path by including all submodules which are found from the root path.

Can I have more than one conftest.py file?

Yes you can and it is strongly recommended if your test structure is somewhat complex. conftest.py files have directory scope. Therefore, creating targeted fixtures and helpers is good practice.

When would I want to do that? Examples will be appreciated.

Several cases could fit:

Creating a set of tools or hooks for a particular group of tests.

root/mod/conftest.py

def pytest_runtest_setup(item):
    print("I am mod")
    #do some stuff


test root/mod2/test.py will NOT produce "I am mod"

Loading a set of fixtures for some tests but not for others.

root/mod/conftest.py

@pytest.fixture()
def fixture():
    return "some stuff"

root/mod2/conftest.py

@pytest.fixture()
def fixture():
    return "some other stuff"

root/mod2/test.py

def test(fixture):
    print(fixture)

Will print "some other stuff".

Overriding hooks inherited from the root conftest.py.

root/mod/conftest.py

def pytest_runtest_setup(item):
    print("I am mod")
    #do some stuff

root/conftest.py

def pytest_runtest_setup(item):
    print("I am root")
    #do some stuff

By running any test inside root/mod, only "I am mod" is printed.

You can read more about conftest.py here.

EDIT:

What if I need plain-old helper functions to be called from a number of tests in different modules - will they be available to me if I put them in a conftest.py? Or should I simply put them in a helpers.py module and import and use it in my test modules?

You can use conftest.py to define your helpers. However, you should follow common practice. Helpers can be used as fixtures at least in pytest. For example in my tests I have a mock redis helper which I inject into my tests this way.

root/helper/redis/redis.py

@pytest.fixture
def mock_redis():
    return MockRedis()

root/tests/stuff/conftest.py

pytest_plugin="helper.redis.redis"

root/tests/stuff/test.py

def test(mock_redis):
    print(mock_redis.get("stuff"))

This will be a test module that you can freely import in your tests. NOTE that you could potentially name redis.py as conftest.py if your module redis contains more tests. However, that practice is discouraged because of ambiguity.

If you want to use conftest.py, you can simply put that helper in your root conftest.py and inject it when needed.

root/tests/conftest.py

@pytest.fixture
def mock_redis():
    return MockRedis()

root/tests/stuff/test.py

def test(mock_redis):
    print(mock_redis.get("stuff"))

Another thing you can do is to write an installable plugin. In that case your helper can be written anywhere but it needs to define an entry point to be installed in your and other potential test frameworks. See this.

If you don"t want to use fixtures, you could of course define a simple helper and just use the plain old import wherever it is needed.

root/tests/helper/redis.py

class MockRedis:
    # stuff
    pass

root/tests/stuff/test.py

from helper.redis import MockRedis

def test():
    print(MockRedis().get("stuff"))

However, here you might have problems with the path since the module is not in a child folder of the test. You should be able to overcome this (not tested) by adding an __init__.py to your helper

root/tests/helper/__init__.py

from .redis import MockRedis

Or simply adding the helper module to your PYTHONPATH.

Answer #9

I would suggest reading PEP 483 and PEP 484 and watching this presentation by Guido on type hinting.

In a nutshell: Type hinting is literally what the words mean. You hint the type of the object(s) you're using.

Due to the dynamic nature of Python, inferring or checking the type of an object being used is especially hard. This fact makes it hard for developers to understand what exactly is going on in code they haven't written and, most importantly, for type checking tools found in many IDEs (PyCharm and PyDev come to mind) that are limited due to the fact that they don't have any indicator of what type the objects are. As a result they resort to trying to infer the type with (as mentioned in the presentation) around 50% success rate.


To take two important slides from the type hinting presentation:

Why type hints?

  1. Helps type checkers: By hinting at what type you want the object to be, the type checker can easily detect if, for instance, you're passing an object with a type that isn't expected.
  2. Helps with documentation: A third person viewing your code will know what is expected where, ergo, how to use it without getting TypeErrors.
  3. Helps IDEs develop more accurate and robust tools: Development Environments will be better suited at suggesting appropriate methods when they know what type your object is. You have probably experienced this with some IDE at some point, hitting the . and having methods/attributes pop up which aren't defined for an object.

Why use static type checkers?

  • Find bugs sooner: This is self-evident, I believe.
  • The larger your project the more you need it: Again, makes sense. Static languages offer a robustness and control that dynamic languages lack. The bigger and more complex your application becomes the more control and predictability (from a behavioral aspect) you require.
  • Large teams are already running static analysis: I'm guessing this verifies the first two points.

As a closing note for this small introduction: This is an optional feature and, from what I understand, it has been introduced in order to reap some of the benefits of static typing.

You generally do not need to worry about it and definitely don't need to use it (especially in cases where you use Python as an auxiliary scripting language). It should be helpful when developing large projects as it offers much needed robustness, control and additional debugging capabilities.


Type hinting with mypy:

In order to make this answer more complete, I think a little demonstration would be suitable. I'll be using mypy, the library which inspired Type Hints as they are presented in the PEP. This is mainly written for anybody bumping into this question and wondering where to begin.

Before I do that, let me reiterate the following: PEP 484 doesn't enforce anything; it is simply setting a direction for function annotations and proposing guidelines for how type checking can/should be performed. You can annotate your functions and hint as many things as you want; your scripts will still run regardless of the presence of annotations because Python itself doesn't use them.
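A tiny sketch of that point - the annotations below are wrong on purpose, yet the script runs fine because Python ignores them at runtime:

def greet(name: int) -> int:   # deliberately wrong hints
    return "Hello, " + name

print(greet("world"))  # prints: Hello, world - no runtime check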

Anyways, as noted in the PEP, hinting types should generally take three forms:

  • Function annotations (PEP 3107).
  • Stub files for built-in/user modules.
  • Special # type: type comments that complement the first two forms. (See: What are variable annotations? for a Python 3.6 update for # type: type comments)

Additionally, you"ll want to use type hints in conjunction with the new typing module introduced in Py3.5. In it, many (additional) ABCs (abstract base classes) are defined along with helper functions and decorators for use in static checking. Most ABCs in collections.abc are included, but in a generic form in order to allow subscription (by defining a __getitem__() method).

For anyone interested in a more in-depth explanation of these, the mypy documentation is written very nicely and has a lot of code samples demonstrating/describing the functionality of their checker; it is definitely worth a read.

Function annotations and special comments:

First, it"s interesting to observe some of the behavior we can get when using special comments. Special # type: type comments can be added during variable assignments to indicate the type of an object if one cannot be directly inferred. Simple assignments are generally easily inferred but others, like lists (with regard to their contents), cannot.

Note: If we want to use any derivative of containers and need to specify the contents for that container we must use the generic types from the typing module. These support indexing.

# Generic List, supports indexing.
from typing import List

# In this case, the type is easily inferred as type: int.
i = 0

# Even though the type can be inferred as of type list
# there is no way to know the contents of this list.
# By using type: List[str] we indicate we want to use a list of strings.
a = []  # type: List[str]

# Appending an int to our list
# is statically not correct.
a.append(i)

# Appending a string is fine.
a.append("i")

print(a)  # [0, 'i']

If we add these commands to a file and execute them with our interpreter, everything works just fine and print(a) just prints the contents of list a. The # type comments have been discarded, treated as plain comments which have no additional semantic meaning.

By running this with mypy, on the other hand, we get the following response:

(Python3) $ mypy typeHintsCode.py
typeHintsCode.py:14: error: Argument 1 to "append" of "list" has incompatible type "int"; expected "str"

Indicating that a list of str objects cannot contain an int, which, statically speaking, is sound. This can be fixed either by abiding by the declared type of a and only appending str objects, or by changing the type of the contents of a to indicate that any value is acceptable (intuitively performed with List[Any], after Any has been imported from typing).
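A sketch of the second fix, using List[Any]:

from typing import Any, List

# List[Any] tells the checker that values of any type may be appended.
a = []  # type: List[Any]

a.append(0)    # no longer a static error
a.append("i")  # still fine

print(a)  # [0, 'i']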

Function annotations are added in the form param_name: type after each parameter in your function signature, and a return type is specified using the -> type notation before the ending colon of the function signature; all annotations are stored in the __annotations__ attribute of that function in a handy dictionary form. Using a trivial example (which doesn't require extra types from the typing module):

def annotated(x: int, y: str) -> bool:
    return x < y

The annotated.__annotations__ attribute now has the following values:

{"y": <class "str">, "return": <class "bool">, "x": <class "int">}

If we're complete newbies, or we are familiar with Python 2.7 concepts and are consequently unaware of the TypeError lurking in the comparison inside annotated, we can perform another static check, catch the error and save ourselves some trouble:

(Python3) $ mypy typeHintsCode.py
typeHintsCode.py: note: In function "annotated":
typeHintsCode.py:2: error: Unsupported operand types for < ("int" and "str")

Among other things, calling the function with invalid arguments will also get caught:

annotated(20, 20)

# mypy complains:
typeHintsCode.py:4: error: Argument 2 to "annotated" has incompatible type "int"; expected "str"

These can be extended to basically any use case, and the errors caught extend further than basic calls and operations. The types you can check for are really flexible, and I have merely given a small sneak peek of the potential. A look at the typing module, the PEPs or the mypy documentation will give you a more comprehensive idea of the capabilities offered.

Stub files:

Stub files can be used in two different, non-mutually-exclusive cases:

  • You need to type-check a module whose function signatures you do not want to alter directly.
  • You want to write modules with type checking but additionally want to separate the annotations from the content.

Stub files (which carry a .pyi extension) are an annotated interface of the module you are making or want to use. They contain the signatures of the functions you want to type-check, with the bodies of the functions discarded. To get a feel for this, given a set of three random functions in a module named randfunc.py:

def message(s):
    # Simply print whatever is passed in.
    print(s)

def alterContents(myIterable):
    # Keep only the even numbers from the iterable.
    return [i for i in myIterable if i % 2 == 0]

def combine(messageFunc, itFunc):
    messageFunc("Printing the Iterable")
    # Call the passed-in transformation, so the itFunc parameter
    # is actually used (matching the stub below).
    a = itFunc(range(1, 20))
    return set(a)

We can create a stub file, randfunc.pyi, in which we can place some restrictions if we wish to do so. The downside is that somebody viewing the source without the stub won't really get that annotation assistance when trying to understand what is supposed to be passed where.

Anyway, the structure of a stub file is pretty simple: add all function definitions with empty bodies (filled with pass) and supply the annotations based on your requirements. Here, let's assume we only want to work with int types for our containers.

# Stub for randfunc.py
from typing import Any, Callable, Iterable, List, Set

def message(s: str) -> None: pass

def alterContents(myIterable: Iterable[int]) -> List[int]: pass

def combine(
    messageFunc: Callable[[str], Any],
    itFunc: Callable[[Iterable[int]], List[int]]
) -> Set[int]: pass

The combine function gives an indication of why you might want to use annotations in a different file: they sometimes clutter up the code and reduce readability (a big no-no for Python). You could of course use type aliases, but that sometimes confuses more than it helps (so use them wisely); a sketch follows below.
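For illustration, a sketch of the alias approach applied to combine (the alias names MessageFunc and TransformFunc are made up):

from typing import Any, Callable, Iterable, List, Set

# Hypothetical aliases; they shorten the signature at the cost of
# one extra indirection for the reader.
MessageFunc = Callable[[str], Any]
TransformFunc = Callable[[Iterable[int]], List[int]]

def combine(messageFunc: MessageFunc, itFunc: TransformFunc) -> Set[int]: pass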


This should get you familiarized with the basic concepts of type hints in Python. Even though the type checker used here has been mypy, you should gradually start to see more of them pop up, some internally in IDEs (e.g., PyCharm) and others as standard Python modules.

I'll try and add additional checkers/related packages to the following list when and if I find them (or if suggested).

Checkers I know of:

  • Mypy: as described here.
  • PyType: By Google; uses a different notation, from what I gather, so probably worth a look.

Related Packages/Projects:

  • typeshed: Official Python repository housing an assortment of stub files for the standard library.

The typeshed project is actually one of the best places you can look to see how type hinting might be used in a project of your own. Let's take as an example the __init__ overloads of the Counter class in the corresponding .pyi file:

class Counter(Dict[_T, int], Generic[_T]):
    @overload
    def __init__(self) -> None: ...
    @overload
    def __init__(self, Mapping: Mapping[_T, int]) -> None: ...
    @overload
    def __init__(self, iterable: Iterable[_T]) -> None: ...

Where _T = TypeVar("_T") is used to define generic classes. For the Counter class we can see that its initializer can either take no arguments, a single Mapping from any type to an int, or an Iterable of any type.
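If you want to build something similar yourself, here is a minimal sketch of a user-defined generic class (the Stack class is made up for illustration):

from typing import Generic, List, TypeVar

_T = TypeVar("_T")

# Under static checking, Stack[int] only accepts and returns ints.
class Stack(Generic[_T]):
    def __init__(self) -> None:
        self._items = []  # type: List[_T]

    def push(self, item: _T) -> None:
        self._items.append(item)

    def pop(self) -> _T:
        return self._items.pop()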


Notice: One thing I forgot to mention was that the typing module has been introduced on a provisional basis. From PEP 411:

A provisional package may have its API modified prior to "graduating" into a "stable" state. On one hand, this state provides the package with the benefits of being formally part of the Python distribution. On the other hand, the core development team explicitly states that no promises are made with regards to the stability of the package's API, which may change for the next release. While it is considered an unlikely outcome, such packages may even be removed from the standard library without a deprecation period if the concerns regarding their API or maintenance prove well-founded.

So take things here with a pinch of salt; I'm doubtful it will be removed or altered in significant ways, but one can never know.


** Another topic altogether, but valid in the scope of type-hints: PEP 526: Syntax for Variable Annotations is an effort to replace # type comments by introducing new syntax which allows users to annotate the type of variables in simple varname: type statements.
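In short, the comment-based example from earlier would be written as follows on Python 3.6+:

from typing import List

# The annotation moves into the syntax itself, replacing the
# older `a = []  # type: List[str]` comment form.
a: List[str] = []
primes: List[int] = [2, 3, 5]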

See What are variable annotations?, as previously mentioned, for a small introduction to these.

Answer #10

If whatever is assigned to your files variable is incorrect, use the following code to find the most recently created file in a directory.

import glob
import os

list_of_files = glob.glob("/path/to/folder/*")  # * matches all files; use e.g. *.csv for a specific format
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
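If by "latest" you mean most recently modified rather than created, a small pathlib-based variant (same placeholder path) would be:

from pathlib import Path

# st_mtime is the last modification time of each file.
files = Path("/path/to/folder").glob("*")
latest = max(files, key=lambda p: p.stat().st_mtime)
print(latest)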
