Reindexing in Pandas DataFrame

Reindexing in pandas changes the row labels and column labels of a DataFrame, conforming the data to a new index. The new labels can come from the index of another pandas Series or DataFrame, which makes it possible to align several objects to the same index. Let's see how to reindex the rows and columns of a pandas DataFrame.

Reindex Rows

You can reindex one or more rows using reindex(). Labels in the new index that are not present in the DataFrame are assigned NaN by default.
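As a quick illustration of the rule above, here is a minimal sketch that uses fixed values instead of random ones, so the result is the same on every run. The frame and its labels are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Small frame with fixed values so the result is reproducible.
df = pd.DataFrame(
    np.arange(6).reshape(3, 2),
    index=["A", "B", "C"],
    columns=["a", "b"],
)

# "Z" is not an existing row label, so its row is filled with NaN.
reindexed = df.reindex(["C", "A", "Z"])
print(reindexed)

# reindex() returns a new object; the original frame keeps its labels.
print(df.index.tolist())
```

Note that the integer columns are converted to floating point in the result, because NaN cannot be stored in an integer column.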

Example #1:

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

print(df1)

# reorder the rows by passing the labels in the desired order
print('DataFrame after reindexing rows:\n',
      df1.reindex(['B', 'D', 'A', 'C', 'E']))

Output:

Example #2:

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create a new index for rows; 'U' and 'Z' are not in df1,
# so their rows are filled with NaN
new_index = ['U', 'A', 'B', 'C', 'Z']

print(df1.reindex(new_index))

Output:

Reindex Columns Using the axis Keyword

You can reindex one or more columns by calling reindex() and specifying the axis to reindex. Labels in the new index that are not present in the DataFrame are assigned NaN by default.
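The sketch below shows the same idea on a small frame with fixed values. It also compares the axis form with the columns= keyword of reindex(), which is an alternative spelling not used elsewhere in this article. The frame and labels are illustrative assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["A", "B"],
    columns=["a", "b", "c"],
)

new_columns = ["c", "a", "x"]  # "x" does not exist in df

# Pass the labels positionally and name the axis ...
by_axis = df.reindex(new_columns, axis="columns")
# ... or pass the same labels through the columns= keyword.
by_keyword = df.reindex(columns=new_columns)

print(by_axis)
print(by_axis.equals(by_keyword))
```

Both calls reorder the existing columns and add "x" as a column of NaN, so either form can be used depending on which reads better in context.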

Example #1:

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create a new index for columns (same labels, new order)
new_columns = ['e', 'a', 'b', 'c', 'd']

print(df1.reindex(new_columns, axis='columns'))

Output:

Example #2:

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create a new index for columns; 'g' and 'h' are not in df1,
# so those columns are filled with NaN
new_columns = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(new_columns, axis='columns'))

Output:

Replacing Missing Values

Code #1: You can fill in the values that reindexing would otherwise leave missing by passing a value to the fill_value keyword. The value is used for labels that are new in the index; NaN values that were already in the data are not replaced.

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create a new index for columns; 'g' and 'h' are new labels
new_columns = ['a', 'b', 'c', 'g', 'h']

# new columns are filled with 1.5 instead of NaN
print(df1.reindex(new_columns, axis='columns', fill_value=1.5))

Output:
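Because the frame above holds random values, its printed result changes from run to run. The following minimal sketch uses fixed, made-up data to show what fill_value does and does not touch: it fills the cells created for new labels, while a NaN that was already present stays in place. Using fillna() afterwards is one possible follow-up, not something the reindex() call does for you.

```python
import numpy as np
import pandas as pd

# Column "a" already contains a NaN before reindexing.
df = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]}, index=["A", "B"])

# "g" is a new column label, so it is filled with 0.0.
filled = df.reindex(["a", "b", "g"], axis="columns", fill_value=0.0)
print(filled)

# The pre-existing NaN in "a" is untouched; fillna() is a separate step.
print(filled.fillna(0.0))
```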

Code #2: Replace missing data with a string by passing the string as fill_value.

# import the numpy and pandas modules
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5),
                   columns=column, index=index)

# create a new index for columns; 'g' and 'h' are new labels
new_columns = ['a', 'b', 'c', 'g', 'h']

# new columns are filled with the string instead of NaN
print(df1.reindex(new_columns, axis='columns', fill_value='data missing'))

Output:
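One consequence of a string fill value is that the resulting frame mixes numeric and text columns. The sketch below, built on a small made-up frame, prints the column dtypes and then picks out the numeric columns with select_dtypes() before summing. That extra step is a suggestion for working with such a frame rather than part of reindexing itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.arange(4, dtype=float).reshape(2, 2),
    index=["A", "B"],
    columns=["a", "b"],
)

# "g" is a new column label, so it is filled with the string.
labelled = df.reindex(["a", "g"], axis="columns", fill_value="data missing")
print(labelled)
print(labelled.dtypes)  # "a" keeps its float dtype; "g" holds strings

# Keep only the numeric columns before doing arithmetic on the frame.
numeric_only = labelled.select_dtypes(include="number")
print(numeric_only.sum())
```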