Working with zip files in Python

What is a zip file?

ZIP is an archive file format that supports lossless data compression. Lossless compression means that the original data can be recovered exactly from the compressed data. A ZIP file is thus a single file containing one or more compressed files, which makes it a convenient way to reduce the size of large files and keep related files together.
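
To make "lossless" concrete, here is a minimal sketch that compresses some bytes into an in-memory archive and reads them back. The helper name roundtrip and the member name sample.txt are made up for illustration; only the zipfile and io calls come from the standard library.

```python
import io
import zipfile


def roundtrip(data):
    """Compress data into an in-memory ZIP archive, then read it back."""
    buffer = io.BytesIO()
    # ZIP_DEFLATED turns compression on; the default, ZIP_STORED, only stores the bytes.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("sample.txt", data)  # hypothetical member name
    with zipfile.ZipFile(buffer, "r") as archive:
        restored = archive.read("sample.txt")
        compressed_size = archive.getinfo("sample.txt").compress_size
    return restored, compressed_size


original = b"hello zip " * 1000
restored, compressed_size = roundtrip(original)
print(restored == original)  # the recovered bytes match the input exactly
print(len(original), "->", compressed_size, "bytes")
```

Highly repetitive input like this compresses very well; already-compressed data (images, videos) usually shrinks far less.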

Why do we need ZIP files?

  • To reduce storage requirements.
  • To improve transfer speed over standard connections.

To work with zip files in Python, we will use the built-in module zipfile.

1. Unzip the zip file

# import the required class
from zipfile import ZipFile

# name of the zip file to extract
file_name = "my_python_files.zip"

# open the zip file in read mode
with ZipFile(file_name, 'r') as zip:
    # print the archive's table of contents
    zip.printdir()

    # extract all files into the current working directory
    print('Extracting all the files now...')
    zip.extractall()
    print('Done!')

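The above program extracts a zip file named "my_python_files.zip", which it expects to find in the current working directory (typically the directory you run the script from), and writes the extracted files to that same directory.
It prints the archive's table of contents, followed by the two progress messages.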

Let’s go through the above code piece by piece:

  •  from zipfile import ZipFile

    ZipFile is the class in the zipfile module for reading and writing zip files. Here we import only the ZipFile class from the zipfile module.

  •  with ZipFile(file_name, 'r') as zip:

    Here a ZipFile object is created by calling the ZipFile constructor, which accepts the zip file name and a mode. We create the ZipFile object in READ mode and name it zip. Because it is opened in a with statement, the archive is closed automatically when the block ends.

  •  zip.printdir()

    The printdir() method prints a table of contents for the archive.

  •  zip.extractall()

    The extractall() method extracts the entire contents of the zip file to the current working directory. You can also call the extract() method to extract a single file by specifying its path inside the zip file.
    For example:

     zip.extract('python_files/python_wiki.txt')

    This will extract only the specified file.

    If you want to read a specific file without extracting it, you can do it like this:

     data = zip.read(name_of_file_to_read)

    The read() method returns the contents of the file as bytes.
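
The three read-side calls described above (listing, extracting one file, reading one file) can be sketched together as follows. This is an illustrative example, not part of the original program: it builds its own small archive in a temporary directory, the file contents and demo.zip are invented, and the variable is called archive rather than zip to avoid shadowing the built-in zip() function.

```python
import os
import tempfile
from zipfile import ZipFile

with tempfile.TemporaryDirectory() as workdir:
    archive_path = os.path.join(workdir, "demo.zip")  # hypothetical archive name

    # Build a small archive so the example is self-contained.
    with ZipFile(archive_path, "w") as archive:
        archive.writestr("python_files/python_wiki.txt", "Python is a programming language.")
        archive.writestr("python_files/notes.txt", "Some notes.")

    with ZipFile(archive_path, "r") as archive:
        # namelist() returns the member names in the order they were added.
        names = archive.namelist()
        # extract() writes one member to disk and returns the path it created.
        extracted_path = archive.extract("python_files/python_wiki.txt", path=workdir)
        extracted_exists = os.path.isfile(extracted_path)
        # read() returns the member's contents as bytes without writing to disk.
        text = archive.read("python_files/notes.txt").decode("utf-8")

print(names)
print(text)
```

Passing path= to extract() (or extractall()) sends the output to a directory of your choice instead of the current working directory.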

2. Writing to a zip file

Consider a directory (folder), here ./python_files, that contains files and subdirectories.

To archive it, we first need to scan the entire directory and its subdirectories to get a list of all file paths, and then write those files to the zip file.
The following program does this:

# import required modules
from zipfile import ZipFile
import os


def get_all_file_paths(directory):
    # initialize an empty list of file paths
    file_paths = []

    # walk through the directory and its subdirectories
    for root, directories, files in os.walk(directory):
        for filename in files:
            # join the two strings to form the full path to the file
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)

    # return all file paths
    return file_paths


def main():
    # path to the folder that should be archived
    directory = './python_files'

    # call the function to get all file paths in the directory
    file_paths = get_all_file_paths(directory)

    # print the list of all files to be archived
    print('The following files will be zipped:')
    for file_name in file_paths:
        print(file_name)

    # write the files to the zip file
    with ZipFile('my_python_files.zip', 'w') as zip:
        # write each file one at a time
        for file in file_paths:
            zip.write(file)

    print('All files zipped successfully!')


if __name__ == "__main__":
    main()

Running the program prints the list of files to be zipped, followed by a success message.

Let’s break the above code down into fragments:

  •  def get_all_file_paths(directory):
         file_paths = []
         for root, directories, files in os.walk(directory):
             for filename in files:
                 filepath = os.path.join(root, filename)
                 file_paths.append(filepath)
         return file_paths

    First of all, to get all the file paths in our directory, we define this function, which uses os.walk(). On each iteration, os.walk() yields one directory together with the files it contains, and the full path of each of those files is added to a list named file_paths.
    Finally, we return all the file paths.

  •  file_paths = get_all_file_paths(directory)

    Here we pass the directory to be archived to the get_all_file_paths() function and get back a list containing all file paths.

  •  with ZipFile('my_python_files.zip', 'w') as zip:

    Here we create a ZipFile object, this time in WRITE mode.

  •  for file in file_paths: zip.write(file)

    Here we write all the files to the zip file one by one using the write() method.
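
One detail worth knowing: unless told otherwise, ZipFile stores files without compressing them (ZIP_STORED). The sketch below is a variation on the program above, not the author's version. It passes compression=ZIP_DEFLATED to enable compression and uses the arcname argument of write() so that paths inside the archive are relative to the archived folder. The function name zip_directory and the sample files are assumptions made for the example.

```python
import os
import tempfile
from zipfile import ZIP_DEFLATED, ZipFile


def zip_directory(directory, archive_path):
    """Zip every file under directory, storing paths relative to it."""
    with ZipFile(archive_path, "w", compression=ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory):
            for filename in files:
                filepath = os.path.join(root, filename)
                # arcname controls the name stored inside the archive.
                arcname = os.path.relpath(filepath, start=directory)
                archive.write(filepath, arcname=arcname)


with tempfile.TemporaryDirectory() as workdir:
    # Create a throwaway folder with one nested file (made-up contents).
    source = os.path.join(workdir, "python_files")
    os.makedirs(os.path.join(source, "docs"))
    with open(os.path.join(source, "main.py"), "w") as handle:
        handle.write("print('hello')\n")
    with open(os.path.join(source, "docs", "readme.txt"), "w") as handle:
        handle.write("read me\n")

    # Keep the archive outside the source folder so it does not zip itself.
    archive_path = os.path.join(workdir, "my_python_files.zip")
    zip_directory(source, archive_path)

    with ZipFile(archive_path) as archive:
        stored_names = sorted(archive.namelist())

print(stored_names)
```

Names inside a zip archive always use forward slashes, so the stored names look the same regardless of the operating system the archive was created on.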

3. Get all information about a zip file

# import required modules
from zipfile import ZipFile
import datetime

# name of the zip file to inspect
file_name = "example.zip"

# open the zip file in read mode
with ZipFile(file_name, 'r') as zip:
    for info in zip.infolist():
        print(info.filename)
        print('Modified: ' + str(datetime.datetime(*info.date_time)))
        print('System: ' + str(info.create_system) + ' (0 = Windows, 3 = Unix)')
        print('ZIP version: ' + str(info.create_version))
        print('Compressed: ' + str(info.compress_size) + ' bytes')
        print('Uncompressed: ' + str(info.file_size) + ' bytes')

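For each file in the archive, the program prints its name, modification time, creating system, ZIP version, and compressed and uncompressed sizes.

 for info in zip.infolist():

Here the infolist() method returns a list of ZipInfo objects, one per file in the archive. Each ZipInfo object holds the information about that file.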
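
As a small illustration of working with ZipInfo objects as data rather than printing them directly, the sketch below collects a few fields into a list of tuples. The archive is built in memory, and its member names and contents are invented for the example.

```python
import datetime
import io
from zipfile import ZIP_DEFLATED, ZipFile

# Build a small in-memory archive so the example needs no files on disk.
buffer = io.BytesIO()
with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", "a" * 5000)  # hypothetical members
    archive.writestr("b.txt", "short")

summary = []
with ZipFile(buffer) as archive:
    for info in archive.infolist():
        # date_time is a 6-tuple (year, month, day, hour, minute, second).
        modified = datetime.datetime(*info.date_time)
        summary.append((info.filename, info.file_size, info.compress_size, modified))

for filename, file_size, compress_size, modified in summary:
    print(f"{filename}: {file_size} -> {compress_size} bytes, modified {modified}")
```

To look up a single member by name instead of iterating, ZipFile.getinfo(name) returns the same kind of ZipInfo object.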

This article was contributed by Nikhil Kumar.
