Python | os.path.isfile() method


os.path.isfile() in Python is used to check whether the specified path is an existing regular file.

Syntax: os.path.isfile(path)

Parameter:
path : A path-like object representing a file system path. A path-like object is either a string or bytes object representing a path.

Return Type: This method returns a Boolean value of class bool: True if the specified path is an existing regular file, otherwise False.

Code #1: Using the os.path.isfile() method

# Python program to explain the os.path.isfile() method

# importing the os module
import os

# Path pointing to a regular file
path = '/home/User/Desktop/file.txt'

# Check whether the specified path
# is an existing regular file
isFile = os.path.isfile(path)
print(isFile)

# Path pointing to a directory
path = '/home/User/Desktop/'

# Check whether the specified path
# is an existing regular file
isFile = os.path.isfile(path)
print(isFile)

Output:

True
False

Link: https://docs.python.org/3/library/os.path.html





Python | os.path.isfile() method: StackOverflow Questions

Answer #1

os.listdir() - list in the current directory

With listdir in the os module, you get the files and the folders in the current dir.

 import os
 arr = os.listdir()
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

Looking in a directory

arr = os.listdir(r"c:\files")

glob from glob

With glob you can specify a type of file to list, like this:

import glob

txtfiles = []
for file in glob.glob("*.txt"):
    txtfiles.append(file)

glob in a list comprehension

mylist = [f for f in glob.glob("*.txt")]

get the full path of only files in the current directory

import os
from os import listdir
from os.path import isfile, join

cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if 
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles) 

["G:\getfilesname\getfilesname.py", "G:\getfilesname\example.txt"]

Getting the full path name with os.path.abspath

You get the full path in return

 import os
 files_path = [os.path.abspath(x) for x in os.listdir()]
 print(files_path)
 
 ["F:\documentiapplications.txt", "F:\documenticollections.txt"]

Walk: going through sub directories

os.walk returns the root, the list of directories and the list of files; that is why I unpacked them into r, d, f in the for loop. It then looks for other files and directories in the subfolders of the root, and so on, until there are no more subfolders.

import os

# Getting the current work directory (cwd)
thisdir = os.getcwd()

# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
    for file in f:
        if file.endswith(".docx"):
            print(os.path.join(r, file))

os.listdir(): get files in the current directory (Python 2)

In Python 2, if you want the list of the files in the current directory, you have to give the argument as "." or os.getcwd() in the os.listdir method.

 import os
 arr = os.listdir(".")
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

To go up in the directory tree

# Method 1
x = os.listdir("..")

# Method 2
x = os.listdir("/")

Get files: os.listdir() in a particular directory (Python 2 and 3)

 import os
 arr = os.listdir("F:\python")
 print(arr)
 
 >>> ["$RECYCLE.BIN", "work.txt", "3ebooks.txt", "documents"]

Get files of a particular subdirectory with os.listdir()

import os

x = os.listdir("./content")

os.walk(".") - current directory

 import os
 arr = next(os.walk("."))[2]
 print(arr)
 
 >>> ["5bs_Turismo1.pdf", "5bs_Turismo1.pptx", "esperienza.txt"]

next(os.walk(".")) and os.path.join("dir", "file")

 import os
 arr = []
 r, d, f = next(os.walk("F:\_python"))
 for file in f:
     arr.append(os.path.join(r, file))

 for f in arr:
     print(f)

>>> F:\_python\dict_class.py
>>> F:\_python\programmi.txt

next(os.walk("F:\_python")) - get the full path - list comprehension

 [os.path.join(r, file) for r, d, f in [next(os.walk("F:\_python"))] for file in f]
 
 >>> ["F:\_python\dict_class.py", "F:\_python\programmi.txt"]

os.walk - get full path - all files in sub dirs

x = [os.path.join(r,file) for r,d,f in os.walk("F:\_python") for file in f]
print(x)

>>> ["F:\_python\dict.py", "F:\_python\progr.txt", "F:\_python\readl.py"]

os.listdir() - get only txt files

 arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
 print(arr_txt)
 
 >>> ["work.txt", "3ebooks.txt"]

Using glob to get the full path of the files

If I need the absolute path of the files:

from path import path  # third-party path.py module
from glob import glob
x = [path(f).abspath() for f in glob("F:\*.txt")]
for f in x:
    print(f)

>>> F:\acquistionline.txt
>>> F:\acquisti_2018.txt
>>> F:\bootstrap_jquery_ecc.txt

Using os.path.isfile to avoid directories in the list

import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)

>>> ["a simple game.py", "data.txt", "decorator.py"]

Using pathlib from Python 3.4

import pathlib

flist = []
for p in pathlib.Path(".").iterdir():
    if p.is_file():
        print(p)
        flist.append(p)

 >>> error.PNG
 >>> exemaker.bat
 >>> guiprova.mp3
 >>> setup.py
 >>> speak_gui2.py
 >>> thumb.PNG

With list comprehension:

flist = [p for p in pathlib.Path(".").iterdir() if p.is_file()]

Alternatively, use pathlib.Path() instead of pathlib.Path(".")

Use glob method in pathlib.Path()

import pathlib

py = pathlib.Path().glob("*.py")
for file in py:
    print(file)

>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py

Get all and only files with os.walk

import os
x = [i[2] for i in os.walk(".")]
y=[]
for t in x:
    for f in t:
        y.append(f)
print(y)

>>> ["append_to_list.py", "data.txt", "data1.txt", "data2.txt", "data_180617", "os_walk.py", "READ2.py", "read_data.py", "somma_defaltdic.py", "substitute_words.py", "sum_data.py", "data.txt", "data1.txt", "data_180617"]

Get only files with next and walk in a directory

 import os
 x = next(os.walk("F://python"))[2]
 print(x)
 
 >>> ["calculator.bat","calculator.py"]

Get only directories with next and walk in a directory

 import os
 next(os.walk("F://python"))[1] # for the current dir use (".")
 
 >>> ["python3","others"]

Get all the subdir names with walk

for r,d,f in os.walk("F:\_python"):
    for dirs in d:
        print(dirs)

>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints

os.scandir() from Python 3.5 and greater

import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)

>>> ["calculator.bat","calculator.py"]

# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.

import os
with os.scandir() as i:
    for entry in i:
        if entry.is_file():
            print(entry.name)

>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG

Examples:

Ex. 1: How many files are there in the subdirectories?

In this example, we count the files contained in a directory and all of its subdirectories.

import os

def count(dir, counter=0):
    "returns number of files in dir and subdirs"
    for pack in os.walk(dir):
        for f in pack[2]:
            counter += 1
    return dir + " : " + str(counter) + " files"

print(count("F:\python"))

>>> F:\python : 12057 files

Ex.2: How to copy all files from a directory to another?

A script to tidy up your computer by finding all files of a given type (default: pptx) and copying them into a new folder.

import os
import shutil

destination = r"F:\file_copied"
# os.makedirs(destination)

def copyfile(dir, filetype="pptx", counter=0):
    "Searches for pptx (or other - pptx is the default) files and copies them"
    for pack in os.walk(dir):
        for f in pack[2]:
            if f.endswith(filetype):
                fullpath = pack[0] + "\\" + f
                print(fullpath)
                shutil.copy(fullpath, destination)
                counter += 1
    if counter > 0:
        print("-" * 30)
        print("\t==> Found in: `" + dir + "` : " + str(counter) + " files\n")

for dir in os.listdir():
    "searches for folders that starts with `_`"
    if dir[0] == "_":
        # copyfile(dir, filetype="pdf")
        copyfile(dir, filetype="txt")


>>> _compiti18\Compito Contabilità 1\conti.txt
>>> _compiti18\Compito Contabilità 1\modula4.txt
>>> _compiti18\Compito Contabilità 1\moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files

Ex. 3: How to get all the files in a txt file

In case you want to create a txt file with all the file names:

import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
    for eachfile in os.listdir():
        mylist += eachfile + "\n"
    file.write(mylist)

Example: txt with all the files of a hard drive

"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""

import os

# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding="utf-8") as testo:
    for root, dirs, files in os.walk("D:\\"):
        for file in files:
            listafile.append(file)
            percorso.append(root + "\\" + file)
            testo.write(file + "\n")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
    for file in listafile:
        testo_ordinato.write(file + "\n")

with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
    for file in percorso:
        file_percorso.write(file + "\n")

os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")

All the files of C: in one text file

This is a shorter version of the previous code. Change the folder where you want to start finding the files if you need to start from another position. On my computer this code generates a text file of about 50 MB, with a little less than 500,000 lines containing files with their complete path.

import os

with open("file.txt", "w", encoding="utf-8") as filewrite:
    for r, d, f in os.walk("C:\\"):
        for file in f:
            filewrite.write(f"{r + file}\n")

How to write a file with all paths in a folder of a type

With this function you can create a txt file named after the type of file you look for (e.g. pngfile.txt), containing the full path of all the files of that type. It can be useful sometimes, I think.

import os

def searchfiles(extension=".ttf", folder="H:\\"):
    "Create a txt file with all the files of a type"
    with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
        for r, d, f in os.walk(folder):
            for file in f:
                if file.endswith(extension):
                    filewrite.write(f"{r + file}\n")

# looking for png files in the hard disk H:
searchfiles(".png", "H:\\")

>>> H:4bs_18Dolphins5.png
>>> H:4bs_18Dolphins6.png
>>> H:4bs_18Dolphins7.png
>>> H:5_18marketing htmlassetsimageslogo2.png
>>> H:7z001.png
>>> H:7z002.png

(New) Find all files and open them with tkinter GUI

I just wanted to add, in 2019, a little app to search for all files in a dir and to open them by double-clicking on the name of the file in the list.

import tkinter as tk
import os

def searchfiles(extension=".txt", folder="H:\\"):
    "insert all files in the listbox"
    for r, d, f in os.walk(folder):
        for file in f:
            if file.endswith(extension):
                lb.insert(0, r + "\\" + file)

def open_file():
    os.startfile(lb.get(lb.curselection()[0]))

root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda: searchfiles(".png", "H:\\"))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()

Answer #2

Python syntax to delete a file

import os
os.remove("/tmp/<file_name>.txt")

Or

import os
os.unlink("/tmp/<file_name>.txt")

Or

pathlib Library for Python version >= 3.4

import pathlib

file_to_rem = pathlib.Path("/tmp/<file_name>.txt")
file_to_rem.unlink()

Path.unlink(missing_ok=False)

The unlink method is used to remove the file or the symbolic link.

If missing_ok is false (the default), FileNotFoundError is raised if the path does not exist.
If missing_ok is true, FileNotFoundError exceptions will be ignored (same behavior as the POSIX rm -f command).
Changed in version 3.8: The missing_ok parameter was added.
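
For example, a minimal sketch (the placeholder path below is just for illustration) of how missing_ok changes the behavior on Python 3.8+:

import pathlib

target = pathlib.Path("/tmp/<file_name>.txt")

# Default (missing_ok=False): raises FileNotFoundError if the file does not exist
# target.unlink()

# Python 3.8+: silently does nothing if the file does not exist
target.unlink(missing_ok=True)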

Best practice

  1. First, check whether the file or folder exists, and only then delete that file. This can be achieved in two ways:
    a. os.path.isfile("/path/to/file")
    b. Use exception handling.

EXAMPLE for os.path.isfile

#!/usr/bin/python
import os
myfile="/tmp/foo.txt"

## If file exists, delete it ##
if os.path.isfile(myfile):
    os.remove(myfile)
else:    ## Show an error ##
    print("Error: %s file not found" % myfile)

Exception Handling

#!/usr/bin/python
import os

## Get input ##
myfile= raw_input("Enter file name to delete: ")

## Try to delete the file ##
try:
    os.remove(myfile)
except OSError as e:  ## if failed, report it back to the user ##
    print ("Error: %s - %s." % (e.filename, e.strerror))

RESPECTIVE OUTPUT

Enter file name to delete : demo.txt
Error: demo.txt - No such file or directory.

Enter file name to delete : rrr.txt
Error: rrr.txt - Operation not permitted.

Enter file name to delete : foo.txt

Python syntax to delete a folder

shutil.rmtree()

Example for shutil.rmtree()

#!/usr/bin/python
import os
import sys
import shutil

# Get directory name
mydir= raw_input("Enter directory name: ")

## Try to remove tree; if failed show an error using try...except on screen
try:
    shutil.rmtree(mydir)
except OSError as e:
    print ("Error: %s - %s." % (e.filename, e.strerror))

Answer #3

Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.

Note: every piece of Python standard library code that I'm going to post belongs to version 3.5.3.

Problem statement:

  1. Check file (arguable: also folder ("special" file) ?) existence
  2. Don't use try / except / else / finally blocks

Possible solutions:

  1. [Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)

    os.path.exists(path)
    

    Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

    All good, but if following the import tree:

    • os.path - posixpath.py (ntpath.py)

      • genericpath.py, line ~#20+

        def exists(path):
            """Test whether a path exists.  Returns False for broken symbolic links"""
            try:
                st = os.stat(path)
            except os.error:
                return False
            return True
        

    it's just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the framestack there's (at least) one such block. This also applies to other funcs (including os.path.isfile).

    1.1. [Python 3]: Path.is_file()

    • It's a fancier (and more pythonic) way of handling paths, but
    • Under the hood, it does exactly the same thing (pathlib.py, line ~#1330):

      def is_file(self):
          """
          Whether this path is a regular file (also True for symlinks pointing
          to regular files).
          """
          try:
              return S_ISREG(self.stat().st_mode)
          except OSError as e:
              if e.errno not in (ENOENT, ENOTDIR):
                  raise
              # Path doesn't exist or is a broken symlink
              # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
              return False
      
  2. [Python 3]: With Statement Context Managers. Either:

    • Create one:

      class Swallow:  # Dummy example
          swallowed_exceptions = (FileNotFoundError,)
      
          def __enter__(self):
              print("Entering...")
      
          def __exit__(self, exc_type, exc_value, exc_traceback):
              print("Exiting:", exc_type, exc_value, exc_traceback)
              return exc_type in Swallow.swallowed_exceptions  # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)
      
      • And its usage - I'll replicate the os.path.isfile behavior (note that this is just for demonstrating purposes, do not attempt to write such code for production):

        import os
        import stat
        
        
        def isfile_seaman(path):  # Dummy func
            result = False
            with Swallow():
                result = stat.S_ISREG(os.stat(path).st_mode)
            return result
        
    • Use [Python 3]: contextlib.suppress(*exceptions) - which was specifically designed for selectively suppressing exceptions


    But, they seem to be wrappers over try / except / else / finally blocks, as [Python 3]: The with statement states:

    This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.

  3. Filesystem traversal functions (and search the results for matching item(s))


    Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing - as @ShadowRanger pointed out; a small sketch follows below), so I'm not going to insist on them. Not to mention that in some cases, filename processing might be required.
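
    For illustration, a minimal sketch of that non wildcarded globbing exception (the path below is hypothetical): a pattern without wildcards can only match the literal path, so a non-empty result means the path exists:

    import glob

    # glob.glob returns a (possibly empty) list; with no wildcards there is at most one match
    exists = bool(glob.glob("/tmp/some_file.txt"))
    print(exists)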

  4. [Python 3]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True) whose behavior is close to os.path.exists (actually it's wider, mainly because of the 2nd argument)

    • user permissions might restrict the file "visibility" as the doc states:

      ...test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path...

    os.access("/tmp", os.F_OK)

    Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants. So, as @AaronHall rightly pointed out, don't use it unless you know what you're doing:

    Note: calling native APIs is also possible via [Python 3]: ctypes - A foreign function library for Python, but in most cases it's more complicated.

    (Win specific): Since vcruntime* (msvcr*) .dll exports a [MS.Docs]: _access, _waccess function family as well, here's an example:

    Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\Windows\System32\cmd.exe", os.F_OK)
    0
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\Windows\System32\cmd.exe.notexist", os.F_OK)
    -1
    

    Notes:

    • Although it's not a good practice, I'm using os.F_OK in the call, but that's just for clarity (its value is 0)
    • I'm using _waccess so that the same code works on Python3 and Python2 (in spite of unicode related differences between them)
    • Although this targets a very specific area, it was not mentioned in any of the previous answers


    The Lnx (Ubtu (16 x64)) counterpart as well:

    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
    0
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
    -1
    

    Notes:

    • Instead of hardcoding libc's path ("/lib/x86_64-linux-gnu/libc.so.6"), which may (and most likely will) vary across systems, None (or the empty string) can be passed to the CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [man7]: DLOPEN(3):

      If filename is NULL, then the returned handle is for the main program. When given to dlsym(), this handle causes a search for a symbol in the main program, followed by all shared objects loaded at program startup, and then all shared objects loaded by dlopen() with the flag RTLD_GLOBAL.

      • Main (current) program (python) is linked against libc, so its symbols (including access) will be loaded
      • This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
      • This doesn't apply to Win as well (but that's not such a big deal, since msvcrt.dll is located in "%SystemRoot%\System32" which is in %PATH% by default). I wanted to take things further and replicate this behavior on Win (and submit a patch), but as it turns out, the [MS.Docs]: GetProcAddress function only "sees" exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth would a regular person do that?), the main program is loadable but pretty much unusable
  5. Install some third-party module with filesystem capabilities

    Most likely, it will rely on one of the ways above (maybe with slight customizations).
    One example would be (again, Win specific) [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WINAPIs.

    But, since this is more like a workaround, I'm stopping here.

  6. Another (lame) workaround (gainarie) is (as I like to call it,) the sysadmin approach: use Python as a wrapper to execute shell commands

    • Win:

      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe\" > nul 2>&1'))"
      0

      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe.notexist\" > nul 2>&1'))"
      1
      
    • Nix (Lnx (Ubtu)):

      [user@host:~]> python3 -c "import os; print(os.system('ls \"/tmp\" > /dev/null 2>&1'))"
      0
      [user@host:~]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\" > /dev/null 2>&1'))"
      512
      

Bottom line:

  • Do use try / except / else / finally blocks, because they can prevent you from running into a series of nasty problems. A counter-example that I can think of is performance: such blocks are costly, so try not to place them in code that is supposed to run hundreds of thousands of times per second (but since (in most cases) it involves disk access, that won't be the case).

Final note(s):

  • I will try to keep it up to date; any suggestions are welcome, and I will incorporate anything useful that comes up into the answer

Answer #4

How do I check whether a file exists, using Python, without using a try statement?

Now available since Python 3.4, import and instantiate a Path object with the file name, and check the is_file method (note that this returns True for symlinks pointing to regular files as well):

>>> from pathlib import Path
>>> Path("/").is_file()
False
>>> Path("/initrd.img").is_file()
True
>>> Path("/doesnotexist").is_file()
False

If you're on Python 2, you can backport the pathlib module from pypi, pathlib2, or otherwise check isfile from the os.path module:

>>> import os
>>> os.path.isfile("/")
False
>>> os.path.isfile("/initrd.img")
True
>>> os.path.isfile("/doesnotexist")
False

Now the above is probably the best pragmatic direct answer here, but there's the possibility of a race condition (depending on what you're trying to accomplish), and the fact that the underlying implementation uses a try, but Python uses try everywhere in its implementation.

Because Python uses try everywhere, there's really no reason to avoid an implementation that uses it.

But the rest of this answer attempts to consider these caveats.

Longer, much more pedantic answer

Available since Python 3.4, use the new Path object in pathlib. Note that .exists is not quite right, because directories are not files (except in the unix sense that everything is a file).

>>> from pathlib import Path
>>> root = Path("/")
>>> root.exists()
True

So we need to use is_file:

>>> root.is_file()
False

Here's the help on is_file:

is_file(self)
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).

So let's get a file that we know is a file:

>>> import tempfile
>>> file = tempfile.NamedTemporaryFile()
>>> filepathobj = Path(file.name)
>>> filepathobj.is_file()
True
>>> filepathobj.exists()
True

By default, NamedTemporaryFile deletes the file when closed (and will automatically close when no more references exist to it).

>>> del file
>>> filepathobj.exists()
False
>>> filepathobj.is_file()
False

If you dig into the implementation, though, you'll see that is_file uses try:

def is_file(self):
    """
    Whether this path is a regular file (also True for symlinks pointing
    to regular files).
    """
    try:
        return S_ISREG(self.stat().st_mode)
    except OSError as e:
        if e.errno not in (ENOENT, ENOTDIR):
            raise
        # Path doesn't exist or is a broken symlink
        # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
        return False

Race Conditions: Why we like try

We like try because it avoids race conditions. With try, you simply attempt to read your file, expecting it to be there, and if not, you catch the exception and perform whatever fallback behavior makes sense.

If you want to check that a file exists before you attempt to read it, and you might be deleting it, and then you might be using multiple threads or processes, or another program knows about that file and could delete it - you risk the chance of a race condition if you check that it exists, because you are then racing to open it before its condition (its existence) changes.

Race conditions are very hard to debug because there's a very small window in which they can cause your program to fail.
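
For instance, a minimal EAFP sketch (the file name is hypothetical): read the file directly and fall back if it is missing, with no separate existence check to race against:

try:
    with open("settings.txt") as f:
        data = f.read()
except FileNotFoundError:
    data = "some default data"  # fallback if the file vanished or never existed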

But if this is your motivation, you can get the value of a try statement by using the suppress context manager.

Avoiding race conditions without a try statement: suppress

Python 3.4 gives us the suppress context manager (previously the ignore context manager), which does semantically exactly the same thing in fewer lines, while also (at least superficially) meeting the original ask to avoid a try statement:

from contextlib import suppress
from pathlib import Path

Usage:

>>> with suppress(OSError), Path("doesnotexist").open() as f:
...     for line in f:
...         print(line)
... 
>>>
>>> with suppress(OSError):
...     Path("doesnotexist").unlink()
... 
>>> 

For earlier Pythons, you could roll your own suppress, but without a try it will be more verbose than with one. I do believe this actually is the only answer that doesn't use try at any level in Python and that can be applied prior to Python 3.4, because it uses a context manager instead:

class suppress(object):
    def __init__(self, *exceptions):
        self.exceptions = exceptions
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is not None:
            return issubclass(exc_type, self.exceptions)

Perhaps easier with a try:

from contextlib import contextmanager

@contextmanager
def suppress(*exceptions):
    try:
        yield
    except exceptions:
        pass

Other options that don't meet the ask for "without try":

isfile

import os
os.path.isfile(path)

from the docs:

os.path.isfile(path)

Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
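
To illustrate that point, a small sketch (the file names are hypothetical, assuming link.txt is a symlink pointing at the regular file real.txt):

import os

print(os.path.isfile("real.txt"), os.path.islink("real.txt"))  # True False
print(os.path.isfile("link.txt"), os.path.islink("link.txt"))  # True True - both true for the same path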

But if you examine the source of this function, you'll see it actually does use a try statement:

# This follows symbolic links, so both islink() and isdir() can be true
# for the same path on systems that support symlinks
def isfile(path):
    """Test whether a path is a regular file"""
    try:
        st = os.stat(path)
    except os.error:
        return False
    return stat.S_ISREG(st.st_mode)
>>> OSError is os.error
True

All it's doing is using the given path to see if it can get stats on it, catching OSError and then checking if it's a file if it didn't raise the exception.

If you intend to do something with the file, I would suggest directly attempting it with a try-except to avoid a race condition:

try:
    with open(path) as f:
        f.read()
except OSError:
    pass

os.access

os.access is available for Unix and Windows, but to use it you must pass flags, and it does not differentiate between files and directories. It is more used to test if the real invoking user has access in an elevated privilege environment:

import os
os.access(path, os.F_OK)

It also suffers from the same race condition problems as isfile. From the docs:

Note: Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it. It’s preferable to use EAFP techniques. For example:

if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        return fp.read()
return "some default data"

is better written as:

try:
    fp = open("myfile")
except IOError as e:
    if e.errno == errno.EACCES:
        return "some default data"
    # Not a permission error.
    raise
else:
    with fp:
        return fp.read()

Avoid using os.access. It is a low level function that has more opportunities for user error than the higher level objects and functions discussed above.

Criticism of another answer:

Another answer says this about os.access:

Personally, I prefer this one because under the hood, it calls native APIs (via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants:

This answer says it prefers a non-Pythonic, error-prone method, with no justification. It seems to encourage users to use low-level APIs without understanding them.

It also creates a context manager which, by unconditionally returning True, allows all Exceptions (including KeyboardInterrupt and SystemExit!) to pass silently, which is a good way to hide bugs.

This seems to encourage users to adopt poor practices.

Answer #5

Testing for files and folders with os.path.isfile(), os.path.isdir() and os.path.exists()

Assuming that the "path" is a valid path, this table shows what is returned by each function for files and folders:

Function                 path is a file    path is a folder
os.path.isfile(path)     True              False
os.path.isdir(path)      False             True
os.path.exists(path)     True              True
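
A quick demonstration (assuming a Unix-like system where /etc/hosts is a regular file and /etc is a directory):

>>> import os
>>> os.path.isfile("/etc/hosts"), os.path.isdir("/etc/hosts"), os.path.exists("/etc/hosts")
(True, False, True)
>>> os.path.isfile("/etc"), os.path.isdir("/etc"), os.path.exists("/etc")
(False, True, True)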

You can also test if a file is a certain type of file using os.path.splitext() to get the extension (if you don't already know it)

>>> import os
>>> path = "path to a word document"
>>> os.path.isfile(path)
True
>>> os.path.splitext(path)[1] == ".docx" # test if the extension is .docx
True

Answer #6

import os


def convert_bytes(num):
    """
    this function will convert bytes to MB.... GB... etc
    """
    for x in ["bytes", "KB", "MB", "GB", "TB"]:
        if num < 1024.0:
            return "%3.1f %s" % (num, x)
        num /= 1024.0


def file_size(file_path):
    """
    this function will return the file size
    """
    if os.path.isfile(file_path):
        file_info = os.stat(file_path)
        return convert_bytes(file_info.st_size)


# Lets check the file size of MS Paint exe 
# or you can use any file path
file_path = r"C:\Windows\System32\mspaint.exe"
print(file_size(file_path))

Result:

6.1 MB

Answer #7

In 2016 the best way is still using os.path.isfile:

>>> os.path.isfile("/path/to/some/file.txt")

Or in Python 3 you can use pathlib:

import pathlib
path = pathlib.Path("/path/to/some/file.txt")
if path.is_file():
    ...

Answer #8

Preliminary notes

  • Although there's a clear differentiation between file and directory terms in the question text, some may argue that directories are actually special files
  • The statement: "all files of a directory" can be interpreted in two ways:
    1. All direct (or level 1) descendants only
    2. All descendants in the whole directory tree (including the ones in sub-directories)
  • When the question was asked, I imagine that Python 2 was the LTS version; however, the code samples will be run by Python 3(.5) (I'll keep them as Python 2 compliant as possible; also, any code belonging to Python that I'm going to post is from v3.5.4 - unless otherwise specified). That has consequences related to another keyword in the question: "add them into a list":

    • In pre Python 2.2 versions, sequences (iterables) were mostly represented by lists (tuples, sets, ...)
    • In Python 2.2, the concept of generator ([Python.Wiki]: Generators) - courtesy of [Python 3]: The yield statement - was introduced. As time passed, generator counterparts started to appear for functions that returned/worked with lists
    • In Python 3, generator is the default behavior
    • Not sure if returning a list is still mandatory (or a generator would do as well), but passing a generator to the list constructor will create a list out of it (and also consume it). The example below illustrates the differences on [Python 3]: map(function, iterable, ...)
    >>> import sys
    >>> sys.version
    "2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]"
    >>> m = map(lambda x: x, [1, 2, 3])  # Just a dummy lambda function
    >>> m, type(m)
    ([1, 2, 3], <type "list">)
    >>> len(m)
    3
    


    >>> import sys
    >>> sys.version
    "3.5.4 (v3.5.4:3f56838, Aug  8 2017, 02:17:05) [MSC v.1900 64 bit (AMD64)]"
    >>> m = map(lambda x: x, [1, 2, 3])
    >>> m, type(m)
    (<map object at 0x000001B4257342B0>, <class "map">)
    >>> len(m)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: object of type "map" has no len()
    >>> lm0 = list(m)  # Build a list from the generator
    >>> lm0, type(lm0)
    ([1, 2, 3], <class "list">)
    >>>
    >>> lm1 = list(m)  # Build a list from the same generator
    >>> lm1, type(lm1)  # Empty list now - generator already consumed
    ([], <class "list">)
    
  • The examples will be based on a directory called root_dir with the following structure (this example is for Win, but I'm using the same tree on Lnx as well):

    E:\Work\Dev\StackOverflow\q003207219>tree /f "root_dir"
    Folder PATH listing for volume Work
    Volume serial number is 00000029 3655:6FED
    E:\WORK\DEV\STACKOVERFLOW\Q003207219\ROOT_DIR
    ¦   file0
    ¦   file1
    ¦
    +---dir0
    ¦   +---dir00
    ¦   ¦   ¦   file000
    ¦   ¦   ¦
    ¦   ¦   +---dir000
    ¦   ¦           file0000
    ¦   ¦
    ¦   +---dir01
    ¦   ¦       file010
    ¦   ¦       file011
    ¦   ¦
    ¦   +---dir02
    ¦       +---dir020
    ¦           +---dir0200
    +---dir1
    ¦       file10
    ¦       file11
    ¦       file12
    ¦
    +---dir2
    ¦   ¦   file20
    ¦   ¦
    ¦   +---dir20
    ¦           file200
    ¦
    +---dir3
    


Solutions

Programmatic approaches:

  1. [Python 3]: os.listdir(path=".")

    Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order, and does not include the special entries "." and ".." ...


    >>> import os
    >>> root_dir = "root_dir"  # Path relative to current dir (os.getcwd())
    >>>
    >>> os.listdir(root_dir)  # List all the items in root_dir
    ["dir0", "dir1", "dir2", "dir3", "file0", "file1"]
    >>>
    >>> [item for item in os.listdir(root_dir) if os.path.isfile(os.path.join(root_dir, item))]  # Filter items and only keep files (strip out directories)
    ["file0", "file1"]
    

    A more elaborate example (code_os_listdir.py):

    import os
    from pprint import pformat
    
    
    def _get_dir_content(path, include_folders, recursive):
        entries = os.listdir(path)
        for entry in entries:
            entry_with_path = os.path.join(path, entry)
            if os.path.isdir(entry_with_path):
                if include_folders:
                    yield entry_with_path
                if recursive:
                    for sub_entry in _get_dir_content(entry_with_path, include_folders, recursive):
                        yield sub_entry
            else:
                yield entry_with_path
    
    
    def get_dir_content(path, include_folders=True, recursive=True, prepend_folder_name=True):
        path_len = len(path) + len(os.path.sep)
        for item in _get_dir_content(path, include_folders, recursive):
            yield item if prepend_folder_name else item[path_len:]
    
    
    def _get_dir_content_old(path, include_folders, recursive):
        entries = os.listdir(path)
        ret = list()
        for entry in entries:
            entry_with_path = os.path.join(path, entry)
            if os.path.isdir(entry_with_path):
                if include_folders:
                    ret.append(entry_with_path)
                if recursive:
                    ret.extend(_get_dir_content_old(entry_with_path, include_folders, recursive))
            else:
                ret.append(entry_with_path)
        return ret
    
    
    def get_dir_content_old(path, include_folders=True, recursive=True, prepend_folder_name=True):
        path_len = len(path) + len(os.path.sep)
        return [item if prepend_folder_name else item[path_len:] for item in _get_dir_content_old(path, include_folders, recursive)]
    
    
    def main():
        root_dir = "root_dir"
        ret0 = get_dir_content(root_dir, include_folders=True, recursive=True, prepend_folder_name=True)
        lret0 = list(ret0)
        print(ret0, len(lret0), pformat(lret0))
        ret1 = get_dir_content_old(root_dir, include_folders=False, recursive=True, prepend_folder_name=False)
        print(len(ret1), pformat(ret1))
    
    
    if __name__ == "__main__":
        main()
    

    Notes:

    • There are two implementations:
      • One that uses generators (of course here it seems useless, since I immediately convert the result to a list)
      • The classic one (function names ending in _old)
    • Recursion is used (to get into subdirectories)
    • For each implementation there are two functions:
      • One that starts with an underscore (_): "private" (should not be called directly) - that does all the work
      • The public one (wrapper over previous): it just strips off the initial path (if required) from the returned entries. It's an ugly implementation, but it's the only idea that I could come up with at this point
    • In terms of performance, generators are generally a little bit faster (considering both creation and iteration times), but I didn't test them in recursive functions, and I am also iterating inside the function over inner generators - I don't know how performance friendly that is
    • Play with the arguments to get different results


    Output:

    (py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" "code_os_listdir.py"
    <generator object get_dir_content at 0x000001BDDBB3DF10> 22 ["root_dir\dir0",
     "root_dir\dir0\dir00",
     "root_dir\dir0\dir00\dir000",
     "root_dir\dir0\dir00\dir000\file0000",
     "root_dir\dir0\dir00\file000",
     "root_dir\dir0\dir01",
     "root_dir\dir0\dir01\file010",
     "root_dir\dir0\dir01\file011",
     "root_dir\dir0\dir02",
     "root_dir\dir0\dir02\dir020",
     "root_dir\dir0\dir02\dir020\dir0200",
     "root_dir\dir1",
     "root_dir\dir1\file10",
     "root_dir\dir1\file11",
     "root_dir\dir1\file12",
     "root_dir\dir2",
     "root_dir\dir2\dir20",
     "root_dir\dir2\dir20\file200",
     "root_dir\dir2\file20",
     "root_dir\dir3",
     "root_dir\file0",
     "root_dir\file1"]
    11 ["dir0\dir00\dir000\file0000",
     "dir0\dir00\file000",
     "dir0\dir01\file010",
     "dir0\dir01\file011",
     "dir1\file10",
     "dir1\file11",
     "dir1\file12",
     "dir2\dir20\file200",
     "dir2\file20",
     "file0",
     "file1"]
    


  2. [Python 3]: os.scandir(path=".") (Python 3.5+, backport: [PyPI]: scandir)

    Return an iterator of os.DirEntry objects corresponding to the entries in the directory given by path. The entries are yielded in arbitrary order, and the special entries "." and ".." are not included.

    Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory. All os.DirEntry methods may perform a system call, but is_dir() and is_file() usually only require a system call for symbolic links; os.DirEntry.stat() always requires a system call on Unix but only requires one for symbolic links on Windows.


    >>> import os
    >>> root_dir = os.path.join(".", "root_dir")  # Explicitly prepending current directory
    >>> root_dir
    ".\root_dir"
    >>>
    >>> scandir_iterator = os.scandir(root_dir)
    >>> scandir_iterator
    <nt.ScandirIterator object at 0x00000268CF4BC140>
    >>> [item.path for item in scandir_iterator]
    [".\root_dir\dir0", ".\root_dir\dir1", ".\root_dir\dir2", ".\root_dir\dir3", ".\root_dir\file0", ".\root_dir\file1"]
    >>>
    >>> [item.path for item in scandir_iterator]  # Will yield an empty list as it was consumed by previous iteration (automatically performed by the list comprehension)
    []
    >>>
    >>> scandir_iterator = os.scandir(root_dir)  # Reinitialize the generator
    >>> for item in scandir_iterator :
    ...     if os.path.isfile(item.path):
    ...             print(item.name)
    ...
    file0
    file1
    

    Notes:

    • It's similar to os.listdir
    • But it's also more flexible (and offers more functionality), more Pythonic (and in some cases, faster)


  3. [Python 3]: os.walk(top, topdown=True, onerror=None, followlinks=False)

    Generate the file names in a directory tree by walking the tree either top-down or bottom-up. For each directory in the tree rooted at directory top (including top itself), it yields a 3-tuple (dirpath, dirnames, filenames).


    >>> import os
    >>> root_dir = os.path.join(os.getcwd(), "root_dir")  # Specify the full path
    >>> root_dir
    "E:\Work\Dev\StackOverflow\q003207219\root_dir"
    >>>
    >>> walk_generator = os.walk(root_dir)
    >>> root_dir_entry = next(walk_generator)  # First entry corresponds to the root dir (passed as an argument)
    >>> root_dir_entry
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir", ["dir0", "dir1", "dir2", "dir3"], ["file0", "file1"])
    >>>
    >>> root_dir_entry[1] + root_dir_entry[2]  # Display dirs and files (direct descendants) in a single list
    ["dir0", "dir1", "dir2", "dir3", "file0", "file1"]
    >>>
    >>> [os.path.join(root_dir_entry[0], item) for item in root_dir_entry[1] + root_dir_entry[2]]  # Display all the entries in the previous list by their full path
    ["E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0", "E:\Work\Dev\StackOverflow\q003207219\root_dir\dir1", "E:\Work\Dev\StackOverflow\q003207219\root_dir\dir2", "E:\Work\Dev\StackOverflow\q003207219\root_dir\dir3", "E:\Work\Dev\StackOverflow\q003207219\root_dir\file0", "E:\Work\Dev\StackOverflow\q003207219\root_dir\file1"]
    >>>
    >>> for entry in walk_generator:  # Display the rest of the elements (corresponding to every subdir)
    ...     print(entry)
    ...
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0", ["dir00", "dir01", "dir02"], [])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir00", ["dir000"], ["file000"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir00\dir000", [], ["file0000"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir01", [], ["file010", "file011"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir02", ["dir020"], [])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir02\dir020", ["dir0200"], [])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir0\dir02\dir020\dir0200", [], [])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir1", [], ["file10", "file11", "file12"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir2", ["dir20"], ["file20"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir2\dir20", [], ["file200"])
    ("E:\Work\Dev\StackOverflow\q003207219\root_dir\dir3", [], [])
    

    Notes:

    • Under the hood, it uses os.scandir (os.listdir on older Python versions)
    • It does the heavy lifting by recursing into subfolders


  4. [Python 3]: glob.glob(pathname, *, recursive=False) ([Python 3]: glob.iglob(pathname, *, recursive=False))

    Return a possibly-empty list of path names that match pathname, which must be a string containing a path specification. pathname can be either absolute (like /usr/src/Python-1.5/Makefile) or relative (like ../../Tools/*/*.gif), and can contain shell-style wildcards. Broken symlinks are included in the results (as in the shell).
    ...
    Changed in version 3.5: Support for recursive globs using “**”.


    >>> import glob, os
    >>> wildcard_pattern = "*"
    >>> root_dir = os.path.join("root_dir", wildcard_pattern)  # Match every file/dir name
    >>> root_dir
    "root_dir\*"
    >>>
    >>> glob_list = glob.glob(root_dir)
    >>> glob_list
    ["root_dir\dir0", "root_dir\dir1", "root_dir\dir2", "root_dir\dir3", "root_dir\file0", "root_dir\file1"]
    >>>
    >>> [item.replace("root_dir" + os.path.sep, "") for item in glob_list]  # Strip the dir name and the path separator from the beginning
    ["dir0", "dir1", "dir2", "dir3", "file0", "file1"]
    >>>
    >>> for entry in glob.iglob(root_dir + "*", recursive=True):
    ...     print(entry)
    ...
    root_dir
    root_dir\dir0
    root_dir\dir0\dir00
    root_dir\dir0\dir00\dir000
    root_dir\dir0\dir00\dir000\file0000
    root_dir\dir0\dir00\file000
    root_dir\dir0\dir01
    root_dir\dir0\dir01\file010
    root_dir\dir0\dir01\file011
    root_dir\dir0\dir02
    root_dir\dir0\dir02\dir020
    root_dir\dir0\dir02\dir020\dir0200
    root_dir\dir1
    root_dir\dir1\file10
    root_dir\dir1\file11
    root_dir\dir1\file12
    root_dir\dir2
    root_dir\dir2\dir20
    root_dir\dir2\dir20\file200
    root_dir\dir2\file20
    root_dir\dir3
    root_dir\file0
    root_dir\file1
    

    Notes:

    • Uses os.listdir
    • For large trees (especially if recursive is on), iglob is preferred
    • Allows advanced filtering based on name (due to the wildcard)


  5. [Python 3]: class pathlib.Path(*pathsegments) (Python 3.4+, backport: [PyPI]: pathlib2)

    >>> import pathlib
    >>> root_dir = "root_dir"
    >>> root_dir_instance = pathlib.Path(root_dir)
    >>> root_dir_instance
    WindowsPath("root_dir")
    >>> root_dir_instance.name
    "root_dir"
    >>> root_dir_instance.is_dir()
    True
    >>>
    >>> [item.name for item in root_dir_instance.glob("*")]  # Wildcard searching for all direct descendants
    ["dir0", "dir1", "dir2", "dir3", "file0", "file1"]
    >>>
    >>> [os.path.join(item.parent.name, item.name) for item in root_dir_instance.glob("*") if not item.is_dir()]  # Display paths (including parent) for files only
    ["root_dir\file0", "root_dir\file1"]
    

    Notes:

    • This is one way of achieving our goal
    • It's the OOP style of handling paths
    • Offers lots of functionalities


  6. [Python 2]: dircache.listdir(path) (Python 2 only)


    def listdir(path):
        """List directory contents, using cache."""
        try:
            cached_mtime, list = cache[path]
            del cache[path]
        except KeyError:
            cached_mtime, list = -1, []
        mtime = os.stat(path).st_mtime
        if mtime != cached_mtime:
            list = os.listdir(path)
            list.sort()
        cache[path] = mtime, list
        return list
    


  7. [man7]: OPENDIR(3) / [man7]: READDIR(3) / [man7]: CLOSEDIR(3) via [Python 3]: ctypes - A foreign function library for Python (POSIX specific)

    ctypes is a foreign function library for Python. It provides C compatible data types, and allows calling functions in DLLs or shared libraries. It can be used to wrap these libraries in pure Python.

    code_ctypes.py:

    #!/usr/bin/env python3
    
    import sys
    from ctypes import Structure, \
        c_ulonglong, c_longlong, c_ushort, c_ubyte, c_char, c_int, \
        CDLL, POINTER, \
        create_string_buffer, get_errno, set_errno, cast
    
    
    DT_DIR = 4
    DT_REG = 8
    
    char256 = c_char * 256
    
    
    class LinuxDirent64(Structure):
        _fields_ = [
            ("d_ino", c_ulonglong),
            ("d_off", c_longlong),
            ("d_reclen", c_ushort),
            ("d_type", c_ubyte),
            ("d_name", char256),
        ]
    
    LinuxDirent64Ptr = POINTER(LinuxDirent64)
    
    libc_dll = this_process = CDLL(None, use_errno=True)
    # ALWAYS set argtypes and restype for functions, otherwise it's UB!!!
    opendir = libc_dll.opendir
    readdir = libc_dll.readdir
    closedir = libc_dll.closedir
    
    
    def get_dir_content(path):
        ret = [path, list(), list()]
        dir_stream = opendir(create_string_buffer(path.encode()))
        if (dir_stream == 0):
            print("opendir returned NULL (errno: {:d})".format(get_errno()))
            return ret
        set_errno(0)
        dirent_addr = readdir(dir_stream)
        while dirent_addr:
            dirent_ptr = cast(dirent_addr, LinuxDirent64Ptr)
            dirent = dirent_ptr.contents
            name = dirent.d_name.decode()
            if dirent.d_type & DT_DIR:
                if name not in (".", ".."):
                    ret[1].append(name)
            elif dirent.d_type & DT_REG:
                ret[2].append(name)
            dirent_addr = readdir(dir_stream)
        if get_errno():
            print("readdir returned NULL (errno: {:d})".format(get_errno()))
        closedir(dir_stream)
        return ret
    
    
    def main():
        print("{:s} on {:s}\n".format(sys.version, sys.platform))
        root_dir = "root_dir"
        entries = get_dir_content(root_dir)
        print(entries)
    
    
    if __name__ == "__main__":
        main()
    

    Notes:

    • It loads the three functions from libc (loaded in the current process) and calls them (for more details check [SO]: How do I check whether a file exists without exceptions? (@CristiFati's answer) - last notes from item #4.). That would place this approach very close to the Python / C edge
    • LinuxDirent64 is the ctypes representation of struct dirent64 from [man7]: dirent.h(0P) (so are the DT_ constants) from my machine: Ubtu 16 x64 (4.10.0-40-generic and libc6-dev:amd64). On other flavors/versions, the struct definition might differ, and if so, the ctypes alias should be updated, otherwise it will yield Undefined Behavior
    • It returns data in os.walk's format. I didn't bother to make it recursive, but starting from the existing code, that would be a fairly trivial task
    • Everything is doable on Win as well, the data (libraries, functions, structs, constants, ...) differ


    Output:

    [user@host:~/Work/Dev/StackOverflow/q003207219]> ./code_ctypes.py
    3.5.2 (default, Nov 12 2018, 13:43:14)
    [GCC 5.4.0 20160609] on linux
    
    ["root_dir", ["dir2", "dir1", "dir3", "dir0"], ["file1", "file0"]]
    


  8. [ActiveState.Docs]: win32file.FindFilesW (Win specific)

    Retrieves a list of matching filenames, using the Windows Unicode API. An interface to the API FindFirstFileW/FindNextFileW/Find close functions.


    >>> import os, win32file, win32con
    >>> root_dir = "root_dir"
    >>> wildcard = "*"
    >>> root_dir_wildcard = os.path.join(root_dir, wildcard)
    >>> entry_list = win32file.FindFilesW(root_dir_wildcard)
    >>> len(entry_list)  # Don't display the whole content as it's too long
    8
    >>> [entry[-2] for entry in entry_list]  # Only display the entry names
    [".", "..", "dir0", "dir1", "dir2", "dir3", "file0", "file1"]
    >>>
    >>> [entry[-2] for entry in entry_list if entry[0] & win32con.FILE_ATTRIBUTE_DIRECTORY and entry[-2] not in (".", "..")]  # Filter entries and only display dir names (except self and parent)
    ["dir0", "dir1", "dir2", "dir3"]
    >>>
    >>> [os.path.join(root_dir, entry[-2]) for entry in entry_list if entry[0] & (win32con.FILE_ATTRIBUTE_NORMAL | win32con.FILE_ATTRIBUTE_ARCHIVE)]  # Only display file "full" names
    ["root_dir\file0", "root_dir\file1"]
    



  9. Install some (other) third-party package that does the trick
    • Most likely, it will rely on one (or more) of the above (maybe with slight customizations)


Notes:

  • Code is meant to be portable (except places that target a specific area - which are marked) or cross:

    • platform (Nix, Win, ...)
    • Python version (2, 3, ...)
  • Multiple path styles (absolute, relative) were used across the above variants, to illustrate the fact that the "tools" used are flexible in this direction

  • os.listdir and os.scandir use opendir / readdir / closedir ([MS.Docs]: FindFirstFileW function / [MS.Docs]: FindNextFileW function / [MS.Docs]: FindClose function) (via [GitHub]: python/cpython - (master) cpython/Modules/posixmodule.c)

  • win32file.FindFilesW uses those (Win specific) functions as well (via [GitHub]: mhammond/pywin32 - (master) pywin32/win32/src/win32file.i)

  • _get_dir_content (from point #1.) can be implemented using any of these approaches (some will require more work and some less)

    • Some advanced filtering (instead of just file vs. dir) could be done: e.g. the include_folders argument could be replaced by another one (e.g. filter_func) which would be a function that takes a path as an argument: filter_func=lambda x: True (this doesn't strip out anything), and inside _get_dir_content something like: if not filter_func(entry_with_path): continue (if the function fails for one entry, it will be skipped), but the more complex the code becomes, the longer it will take to execute (a small sketch follows after these notes)
  • Nota bene! Since recursion is used, I must mention that I did some tests on my laptop (Win 10 x64), totally unrelated to this problem, and when the recursion level was reaching values somewhere in the (990 .. 1000) range (recursionlimit - 1000 (default)), I got StackOverflow :). If the directory tree exceeds that limit (I am not an FS expert, so I don't know if that is even possible), that could be a problem.
    I must also mention that I didn't try to increase recursionlimit because I have no experience in the area (how much can I increase it before having to also increase the stack at OS level), but in theory there will always be the possibility of failure, if the dir depth is larger than the highest possible recursionlimit (on that machine)

  • The code samples are for demonstrative purposes only. That means that I didn't take into account error handling (I don't think there's any try / except / else / finally block), so the code is not robust (the reason is: to keep it as simple and short as possible). For production, error handling should be added as well
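
For illustration, a minimal sketch of that filter_func idea (the function name and the predicate below are hypothetical, not part of the original code_os_listdir.py); the predicate only filters what gets yielded, while recursion into subfolders still happens:

import os


def _get_dir_content_filtered(path, filter_func, recursive):
    for entry in os.listdir(path):
        entry_with_path = os.path.join(path, entry)
        if filter_func(entry_with_path):
            yield entry_with_path  # only yield entries accepted by the caller-supplied predicate
        if recursive and os.path.isdir(entry_with_path):
            for sub_entry in _get_dir_content_filtered(entry_with_path, filter_func, recursive):
                yield sub_entry


# Example: keep only regular files (strip out directories from the output)
# print(list(_get_dir_content_filtered("root_dir", os.path.isfile, recursive=True)))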

Other approaches:

  1. Use Python only as a wrapper

    • Everything is done using another technology
    • That technology is invoked from Python
    • The most famous flavor that I know is what I call the system administrator approach:

      • Use Python (or any programming language for that matter) in order to execute shell commands (and parse their outputs)
      • Some consider this a neat hack
      • I consider it more like a lame workaround (gainarie), as the action per se is performed from shell (cmd in this case), and thus doesn't have anything to do with Python.
      • Filtering (grep / findstr) or output formatting could be done on both sides, but I'm not going to insist on it. Also, I deliberately used os.system instead of subprocess.Popen.
      (py35x64_test) E:\Work\Dev\StackOverflow\q003207219>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os;os.system('dir /b root_dir')"
      dir0
      dir1
      dir2
      dir3
      file0
      file1
      

    In general this approach is to be avoided, since if some command output format differs slightly between OS versions/flavors, the parsing code should be adapted as well, not to mention differences between locales.
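
    For completeness, a minimal sketch (assuming a Windows shell, since "dir /b" is used above) of the subprocess.Popen alternative mentioned earlier; parsing the captured output is still the caller's job, which is the main drawback of this whole approach:

    import subprocess

    # Run the shell command and capture its output instead of letting it print directly
    proc = subprocess.Popen("dir /b root_dir", shell=True, stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    entries = out.decode(errors="replace").splitlines()
    print(entries)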

Answer #9

Here is a robust function that uses both os.remove and shutil.rmtree:

import os
import shutil


def remove(path):
    """ param <path> could either be relative or absolute. """
    if os.path.isfile(path) or os.path.islink(path):
        os.remove(path)  # remove the file
    elif os.path.isdir(path):
        shutil.rmtree(path)  # remove dir and all it contains
    else:
        raise ValueError("file {} is not a file or dir.".format(path))

Answer #10

For Python 3.3 and later:

import shutil

command = "ls"
shutil.which(command) is not None

As a one-liner version of Jan-Philip Gehrcke's answer:

cmd_exists = lambda x: shutil.which(x) is not None

As a def:

def cmd_exists(cmd):
    return shutil.which(cmd) is not None

For Python 3.2 and earlier:

my_command = "ls"
any(
    (
        os.access(os.path.join(path, my_command), os.X_OK) 
        and os.path.isfile(os.path.join(path, my_command))
    )
    for path in os.environ["PATH"].split(os.pathsep)
)

This is a one-liner version of Jay's answer, also here as a lambda func:

cmd_exists = lambda x: any((os.access(os.path.join(path, x), os.X_OK) and os.path.isfile(os.path.join(path, x))) for path in os.environ["PATH"].split(os.pathsep))
cmd_exists("ls")

Or lastly, indented as a function:

def cmd_exists(cmd, path=None):
    """ test if path contains an executable file with name
    """
    if path is None:
        path = os.environ["PATH"].split(os.pathsep)

    for prefix in path:
        filename = os.path.join(prefix, cmd)
        executable = os.access(filename, os.X_OK)
        is_not_directory = os.path.isfile(filename)
        if executable and is_not_directory:
            return True
    return False