Python | Create archives and search for files by name



The shutil module has two functions — make_archive() and unpack_archive() which might be just such a solution.

Code # 1:

import shutil

shutil.unpack_archive ( `Python-3.3.0.tgz` )

shutil.make_archive ( `py33` , `zip` , ` Python-3.3.0` )

Output:

 `/ Users / Dell / Downloads / py33.zip` 

Second argument for make_archive () — this is the desired output format. For a list of supported archive formats, use get_archive_formats () .

Code # 2:

shutil.get_archive_formats ()

Output:

 [(`bztar`," bzip2`ed tar-file "), (` gztar`, "gzip`ed tar-file"), (`tar`, `uncompressed tar file`), (` zip`, `ZIP file`)] 

Python has other library modules for working with low-level details of various archive formats (for example, tarfile, zipfile, gzip, bz2 and t.D.). However, there is no need to go so low to create or extract an archive. 
Instead, you can just use these high-level functions in shutil. The functions have many additional options for logging, dry runs, file permissions, and so on.

Let`s write a script that includes searching for files, such as a file rename script or a log archiving utility, but shouldn`t call shell utilities from a Python script, or provide specialized behavior that is not easy to get by “flaking”.

Use the os.walk () function to find files, giving it a top-level directory.

Code # 3: a function that finds a specific filename and prints the full path of all matches.

Save this script as abc.py and run it from the command line, abc.py start point and name as positional arguments as —

import os

 

def findfile (start, name):

  for relpath, dirs, files in os.walk (start):

  

  if name in files:

full_path = os.path.join (start, relpath, name)

print (os.path.normpath (os.path.abspath (full_path)))

 

if __ name__ = = `__main__` :

  findfile (sys.argv [ 1 ], sys.argv [ 2 ])

bash % . / abc.py .myfile.txt

How does it work?

  • Method os.walk () traverses the directory hierarchy for us, and for each directory it belongs to, it returns a 3-tuple containing the relative path to the directory it is checking for, a list containing all the names directories in that directory, and a list of filenames in that directory.
  • For each tuple, just check if the target filename is in the file list. If so, os.path.join () is used to compose the path.
  • To avoid the possibility of strange looking paths like ././foo//bar , two additional functions are used to correct the result.
  • The first is os.path.abspath () , which takes a relative path and forms an absolute path.
  • Second —  os.path.normpath () which normalizes the path, os.path.normpath () problems with double slashes, multiple references to the current directory, etc.

Although the code is quite simple compared to the capabilities of the search utility found on UNIX platforms, it is cross-platform. In addition, a lot of additional functionality can be added in a portable way without much more work.

Code # 4: A function that prints all files that are last modified

import os

import time

 

def modified_within (top, seconds):

now = time.time ()

  

  for path , dirs, files in os.walk (top):

for name in files:

fullpath = os.path.join ( path, name)

 

if os.path.exists (fullpath):

mtime = os.path.getmtime (fullpath)

if mtime & gt; (now - seconds):

print (fullpath)

 

if __ name__ = = `__main__` :

import sys

  

if len (sys.argv)! = 3 :

  print ( `Usage: {} dir secon ds` . format (sys.argv [ 0 ]))

raise SystemExit ( 1 )

 

modified_within ( sys.argv [ 1 ], float (sys.argv [ 2 ]))

It won`t take you long to build much more complex operations on top of this little function using various os, os functions .path, glob and similar modules.