
Python | Pandas DataFrame.set_index()

Pandas set_index() is a method that sets one or more existing columns (or arrays such as a Series or Index of matching length) as the index of a DataFrame. The index can also be set when the data frame is created. Sometimes, however, a data frame is assembled from two or more data frames, and the index then needs to be changed afterwards; this method does that.
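The two situations described above can be sketched as follows. This is only an illustration on made-up data: the column names name and score and the variable names are assumptions, not part of the article's CSV file.

```python
import pandas as pd

# Case 1: the index is set when the data frame is created.
# (Hypothetical data; "name" and "score" are illustrative names.)
at_creation = pd.DataFrame(
    {"score": [1, 2]},
    index=pd.Index(["Ann", "Bob"], name="name"),
)

# Case 2: a data frame is assembled from two others, so the index
# is chosen afterwards with set_index().
part_a = pd.DataFrame({"name": ["Ann", "Bob"], "score": [1, 2]})
part_b = pd.DataFrame({"name": ["Cara"], "score": [3]})
combined = pd.concat([part_a, part_b], ignore_index=True)

# set_index() returns a new data frame by default (inplace=False).
combined_by_name = combined.set_index("name")

print(combined_by_name)
```

Here combined keeps its integer index, while combined_by_name is indexed by the name column.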

Syntax:

DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)

Parameters:

keys: Column name or list of column names.
drop: Boolean value; if True, the columns used for the index are removed from the data frame's columns.
append: If True, the columns are appended to the existing index instead of replacing it.
inplace: If True, the data frame is modified in place instead of a new one being returned.
verify_integrity: If True, the new index is checked for duplicates and an error is raised if any are found.
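As a rough illustration of how these parameters behave, here is a minimal sketch on a small in-memory data frame. The sample rows are invented; only the column names First Name and Gender follow the examples in this article, and Salary is an assumed extra column.

```python
import pandas as pd

# Hypothetical sample data, not the article's employees.csv.
df = pd.DataFrame({
    "First Name": ["Ann", "Bob", "Cara"],
    "Gender": ["F", "M", "F"],
    "Salary": [100, 200, 300],
})

# drop=True (default): the column moves into the index
# and no longer appears among the regular columns.
dropped = df.set_index("First Name")

# drop=False: the column becomes the index and is also kept as a column.
kept = df.set_index("First Name", drop=False)

# append=True: the existing integer index is kept as the outer level
# and Gender is added as an inner level.
appended = df.set_index("Gender", append=True)

# verify_integrity=True: Gender contains "F" twice, so pandas raises ValueError.
try:
    df.set_index("Gender", verify_integrity=True)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(list(dropped.columns))
print(list(kept.columns))
print(appended.index.nlevels)
print(duplicate_error)
```

Because inplace is left at its default of False, df itself is unchanged by every call above.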

The examples below read a CSV file named employees.csv.

Code #1: Changing the index column
In this example, the First Name column is made the index of the DataFrame.

 

# import the pandas package
import pandas as pd

# create a data frame from the CSV file
data = pd.read_csv("employees.csv")

# set the First Name column as the index
data.set_index("First Name", inplace=True)

# display the first rows
data.head()

Output:
Before the operation, the index was a default sequence of integers. After the operation, the First Name column serves as the index.

Code #2: Multi-level index
In this example, two columns are used for the index. The drop parameter controls whether those columns are removed from the data frame's columns, and the append parameter adds them to the already existing index instead of replacing it.

# import the pandas package
import pandas as pd

# create a data frame from the CSV file
data = pd.read_csv("employees.csv")

# add First Name and Gender to the existing index, keeping both as columns
data.set_index(["First Name", "Gender"], inplace=True, append=True, drop=False)

# display the first rows
data.head()

Output:
As the output shows, the data now has three index levels: the original integer index, First Name, and Gender.
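The resulting index can also be inspected in code rather than by eye. The sketch below repeats the same set_index() call on a small made-up data frame (the rows and the Salary column are assumptions, since the CSV contents are not shown here) and checks the number and names of the index levels.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data.
data = pd.DataFrame({
    "First Name": ["Ann", "Bob"],
    "Gender": ["F", "M"],
    "Salary": [100, 200],
})

# Same call as in Code #2: append to the existing index, keep the columns.
data.set_index(["First Name", "Gender"], inplace=True, append=True, drop=False)

# Number of index levels and their names; the original integer index has no name.
level_count = data.index.nlevels
level_names = list(data.index.names)

# A row is now addressed by a tuple with one value per index level.
ann_salary = data.loc[(0, "Ann", "F"), "Salary"]

print(level_count, level_names, ann_salary)
```

Since drop=False was used, First Name and Gender remain available as ordinary columns alongside the index levels.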
