Working with CSV files in Python

First of all, what is CSV?
CSV (comma-separated values) is a simple file format used to store tabular data, such as a spreadsheet or a database table. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file represents one data record, and each record consists of one or more fields separated by commas. The use of the comma as a field separator is the source of the format's name.

Python has a built-in module for working with CSV files: csv.
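To make the record/field structure concrete, here is a minimal sketch that parses a small CSV text held in memory. The sample text and the names sample_text and records are made up for illustration; io.StringIO simply stands in for an open file.

```python
import csv
import io

# Hypothetical sample data: one header line and two records.
# The second record has a quoted field that itself contains a comma.
sample_text = 'Name,Branch,Year\nNikhil,COE,2\n"Kumar, Aditya",IT,2\n'

# csv.reader accepts any iterable of text lines, so StringIO works like a file.
records = list(csv.reader(io.StringIO(sample_text)))

for record in records:
    print(record)
```

Each record comes back as a list of strings. The quoted field shows why the csv module is usually a safer choice than splitting each line on commas by hand.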

Read CSV file

# import the csv module
import csv

# name of the csv file
filename = "aapl.csv"

# initialize the lists of field names and rows
fields = []
rows = []

# read the csv file
with open(filename, 'r', newline='') as csvfile:
    # create a csv reader object
    csvreader = csv.reader(csvfile)

    # extract the field names from the first row
    fields = next(csvreader)

    # fetch each data row one by one
    for row in csvreader:
        rows.append(row)

    # get the total number of lines read (including the header line)
    print("Total no. of rows: %d" % csvreader.line_num)

# print the field names
print('Field names are: ' + ', '.join(fields))

# print the first 5 rows
print('First 5 rows are:')
for row in rows[:5]:
    # print each column of the row, right-aligned in 10 characters
    for col in row:
        print("%10s" % col, end=' ')
    print()

The output of the above program looks like this:

The above example uses the CSV file aapl.csv.
Run this program with the aapl.csv file in the same directory.

Let’s try to understand this piece of code.

  •  with open(filename, 'r', newline='') as csvfile: csvreader = csv.reader(csvfile)

    Here we first open the CSV file in read mode; the csv documentation recommends passing newline='' when opening a file for the csv module. The file object is named csvfile. It is passed to csv.reader, which returns a reader object that we store as csvreader.

  •  fields = next(csvreader)

    csvreader is an iterator, so the built-in next() function returns the current row and advances the iterator to the following one. (The csvreader.next() method exists only in Python 2.) Since the first line of our CSV file contains the headers (field names), we store them in a list called fields.

  •  for row in csvreader: rows.append(row)

    We now iterate over the remaining rows using a for loop. Each row is appended to a list called rows. If you print a row, you will see that it is simply a list containing all the field values as strings.

  •  print("Total no. of rows: %d" % csvreader.line_num)

    csvreader.line_num is a counter of the number of lines read from the file so far. It includes the header line, so for a simple file like this one it is one more than the number of data rows.
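The steps above can be tried without aapl.csv. The sketch below runs the same sequence on a tiny in-memory sample; the column names and values are made up and are not taken from the real file.

```python
import csv
import io

# Hypothetical sample data (not real aapl.csv contents).
sample_text = "Date,Close\n2020-01-01,10.0\n2020-01-02,11.5\n"

with io.StringIO(sample_text) as csvfile:
    csvreader = csv.reader(csvfile)
    fields = next(csvreader)            # first row: the field names
    rows = [row for row in csvreader]   # remaining rows: the data
    lines_read = csvreader.line_num     # lines consumed, header included

print(fields)
print(len(rows), lines_read)
```

Here len(rows) is 2 while lines_read is 3, because line_num also counts the header line.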

Write to CSV file

# import the csv module
import csv

# field names
fields = ['Name', 'Branch', 'Year', 'CGPA']

# data rows of the csv file
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1'],
        ['Aditya', 'IT', '2', '9.3'],
        ['Sagar', 'SE', '1', '9.5'],
        ['Prateek', 'MCE', '3', '7.8'],
        ['Sahil', 'EP', '2', '9.1']]

# name of the csv file
filename = "university_records.csv"

# write to the csv file; newline='' avoids extra blank lines on Windows
with open(filename, 'w', newline='') as csvfile:
    # create a csv writer object
    csvwriter = csv.writer(csvfile)

    # write the field names
    csvwriter.writerow(fields)

    # write the data rows
    csvwriter.writerows(rows)

Let’s try to understand the above code piece by piece.

  • fields and rows are already defined. fields is a list containing all the field names. rows is a list of lists; each inner list holds the field values of one row.
  •  with open(filename, 'w', newline='') as csvfile: csvwriter = csv.writer(csvfile)

    Here we first open the CSV file in write mode. The file object is named csvfile. It is passed to csv.writer, which returns a writer object that we store as csvwriter.

  •  csvwriter.writerow(fields)

    We now use the writerow method to write the first line, which holds the field names.

  •  csvwriter.writerows(rows)

    We use the writerows method to write multiple rows in a single call.
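To see exactly what the writer produces, the following sketch writes to an in-memory buffer instead of a file. The io.StringIO buffer and the name written_text are illustrative choices, and only two of the rows are used to keep it short.

```python
import csv
import io

fields = ['Name', 'Branch', 'Year', 'CGPA']
rows = [['Nikhil', 'COE', '2', '9.0'],
        ['Sanchit', 'COE', '2', '9.1']]

buffer = io.StringIO()            # in-memory stand-in for the output file
csvwriter = csv.writer(buffer)
csvwriter.writerow(fields)        # one row
csvwriter.writerows(rows)         # several rows in one call

written_text = buffer.getvalue()
print(repr(written_text))
```

The repr shows that the default dialect ends every row with '\r\n'. That is why a real output file is best opened with newline='': otherwise the platform's newline translation can add an extra carriage return on Windows.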

Write dictionary to CSV file

# import the csv module
import csv

# data rows as dictionary objects
mydict = [{'branch': 'COE', 'cgpa': '9.0', 'name': 'Nikhil', 'year': '2'},
          {'branch': 'COE', 'cgpa': '9.1', 'name': 'Sanchit', 'year': '2'},
          {'branch': 'IT', 'cgpa': '9.3', 'name': 'Aditya', 'year': '2'},
          {'branch': 'SE', 'cgpa': '9.5', 'name': 'Sagar', 'year': '1'},
          {'branch': 'MCE', 'cgpa': '7.8', 'name': 'Prateek', 'year': '3'},
          {'branch': 'EP', 'cgpa': '9.1', 'name': 'Sahil', 'year': '2'}]

# field names
fields = ['name', 'branch', 'year', 'cgpa']

# name of the csv file
filename = "university_records.csv"

# write to the csv file; newline='' avoids extra blank lines on Windows
with open(filename, 'w', newline='') as csvfile:
    # create a csv dict writer object
    writer = csv.DictWriter(csvfile, fieldnames=fields)

    # write the headers (field names)
    writer.writeheader()

    # write the data rows
    writer.writerows(mydict)

In this example, we write the list of dictionaries mydict to a CSV file.

  •  with open(filename, 'w', newline='') as csvfile: writer = csv.DictWriter(csvfile, fieldnames=fields)

    Here the file object (csvfile) is passed to csv.DictWriter, which returns a DictWriter object.
    The field names are given through the fieldnames argument; they determine the column order.

  •  writer.writeheader()

    The writeheader method writes the first line of the CSV file using the field names given above.

  •  writer.writerows(mydict)

    The writerows method writes all the rows, but only the values (not the keys) of each dictionary, in the order given by fieldnames.
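The role of fieldnames is easiest to see on a small case. In this sketch the records list and the in-memory buffer are illustrative; the second dictionary deliberately has no 'cgpa' key to show what DictWriter does with a missing value.

```python
import csv
import io

# Keys appear in a different order than fieldnames, and the second
# record has no 'cgpa' key (a made-up case for illustration).
records = [{'branch': 'COE', 'cgpa': '9.0', 'name': 'Nikhil', 'year': '2'},
           {'year': '2', 'name': 'Aditya', 'branch': 'IT'}]
fields = ['name', 'branch', 'year', 'cgpa']

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fields)
writer.writeheader()
writer.writerows(records)

output_lines = buffer.getvalue().splitlines()
for line in output_lines:
    print(line)
```

The columns follow fieldnames rather than the key order of the dictionaries, and the missing key is written as an empty field (the default restval is an empty string).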

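So at the end our CSV file looks like this:

name,branch,year,cgpa
Nikhil,COE,2,9.0
Sanchit,COE,2,9.1
Aditya,IT,2,9.3
Sagar,SE,1,9.5
Prateek,MCE,3,7.8
Sahil,EP,2,9.1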

Important points:

  • The reader and writer objects of the csv module accept an optional dialect parameter, which names a set of formatting parameters specific to a particular CSV format. By default, the csv module uses the excel dialect, which makes files compatible with Excel spreadsheets. You can define your own dialect using the register_dialect function.
    Here’s an example:
      csv.register_dialect('mydialect', delimiter=',', quotechar='"', doublequote=True, skipinitialspace=True, lineterminator='\r\n', quoting=csv.QUOTE_MINIMAL)

    Now, when creating a csv.reader or csv.writer object, we can specify the dialect like this:

      csvreader = csv.reader(csvfile, dialect='mydialect')
  • Now let’s assume the CSV file looks like this:

    Note that the separator is not a comma but a semicolon. Also, the records are separated by two newlines instead of one. In such cases, we can specify the delimiter and line terminator as follows:

      csvreader = csv.reader(csvfile, delimiter=';', lineterminator='\n\n')

    Keep in mind that csv.reader ignores lineterminator: it always treats '\r' or '\n' as the end of a line and returns each blank line as an empty list. When reading, the delimiter is the setting that matters; lineterminator takes effect only when writing.
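Both points can be combined in one short sketch. The dialect name 'semicolons' and the sample text are made up for illustration, and skipping empty rows is just one possible way to handle the blank lines between records.

```python
import csv
import io

# 'semicolons' is a hypothetical dialect name chosen for this sketch.
csv.register_dialect('semicolons', delimiter=';', skipinitialspace=True)

# Hypothetical sample: semicolon-separated, with a blank line between records.
sample_text = "name;branch\n\nNikhil; COE\n\nAditya; IT\n"

csvreader = csv.reader(io.StringIO(sample_text), dialect='semicolons')

# The reader returns a blank line as an empty list, so filter those out.
rows = [row for row in csvreader if row]

for row in rows:
    print(row)
```

With skipinitialspace=True the space after each semicolon is dropped, so the fields come back as 'COE' and 'IT' rather than ' COE' and ' IT'.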

So this was a brief discussion of how to read and write CSV files in a Python program.

This blog is contributed by Nikhil Kumar.
