Python | Pandas Series.factorize()

Pandas Series.factorize() encodes an object as an enumerated type or categorical variable. This method is useful for obtaining a numeric representation of an array when all that matters is identifying the distinct values.

Syntax: Series.factorize(sort=False, na_sentinel=-1)

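Parameters:
sort: Sort uniques and shuffle labels to maintain the relationship.
na_sentinel: Value used to mark "not found" (missing) entries; the default is -1. In pandas 2.0 and later this parameter has been replaced by use_na_sentinel.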

Returns:
labels: ndarray
uniques: ndarray, Index, or Categorical
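
As a minimal sketch of how the two return values fit together, the snippet below unpacks the result into codes and uniques and compares the default behaviour with sort=True. The series colors and its values are hypothetical sample data, not part of the examples in this article.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
colors = pd.Series(["red", "blue", "red", "green", "blue"])

# Default: codes follow the order in which values first appear
codes, uniques = colors.factorize()
print(codes)          # integer code for each element
print(list(uniques))  # distinct values, in order of first appearance

# sort=True: uniques are sorted and the codes are remapped to match
sorted_codes, sorted_uniques = colors.factorize(sort=True)
print(sorted_codes)
print(list(sorted_uniques))
```

In both cases uniques[code] gives back the original value, which is the relationship the sort parameter preserves.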

Example #1: Use Series.factorize() to encode the underlying data of a series object.

# Import pandas
import pandas as pd

# Create the series
sr = pd.Series(['New York', 'Chicago', 'Toronto', None, 'Rio'])

# Create the index
index_ = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']

# Set the index
sr.index = index_

# Print the series
print(sr)

Output:


Now we will use Series.factorize() to encode the underlying data of this series object.

# Encode the values
result = sr.factorize()

# Print the result
print(result)

Output:

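As the output shows, Series.factorize() has successfully encoded the underlying data of this series object. The result is a tuple of codes and uniques: each distinct city receives an integer code in order of first appearance (0 through 3), and the missing value is assigned the code -1.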
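
To illustrate how the codes relate to the unique values, here is a small sketch that maps each code back to its city and treats -1 as a missing entry. The variable name decoded is introduced only for this illustration, and the index labels are omitted because they do not affect the encoding.

```python
import pandas as pd

sr = pd.Series(["New York", "Chicago", "Toronto", None, "Rio"])

# factorize() returns a tuple: (codes, uniques)
codes, uniques = sr.factorize()

# Map every code back to its value; -1 marks a missing entry
decoded = [uniques[code] if code != -1 else None for code in codes]
print(decoded)
```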

Example #2: Use Series.factorize() to encode the underlying data of a series object that contains repeated integer values.

# Import pandas
import pandas as pd

# Create the series
sr = pd.Series([80, 25, 3, 80, 24, 25])

# Create the index
index_ = ['Coca Cola', 'Sprite', 'Coke', 'Fanta', 'Dew', 'ThumbsUp']

# Set the index
sr.index = index_

# Print the series
print(sr)

Output:

We will now use Series.factorize() to encode the underlying data of this series object.

# Encode the values
result = sr.factorize()

# Print the result
print(result)

Output:

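As the output shows, Series.factorize() has successfully encoded the underlying data of the given series object. The repeated values 80 and 25 each receive a single code, so the codes are [0, 1, 2, 0, 3, 1] and the uniques are [80, 25, 3, 24].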
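
One possible use of the integer codes is counting how often each distinct value occurs. The sketch below does this with numpy.bincount, assuming the series has no missing values so that every code is non-negative. The names counts and occurrences are hypothetical, introduced only for this illustration.

```python
import numpy as np
import pandas as pd

sr = pd.Series([80, 25, 3, 80, 24, 25])
codes, uniques = sr.factorize()

# Assumption: no missing values, so no code is -1 and bincount is safe
counts = np.bincount(codes)

# Pair each unique value with the number of times its code appears
occurrences = {int(value): int(count) for value, count in zip(uniques, counts)}
print(occurrences)
```

In everyday code Series.value_counts() covers the same need; the point here is only to show that the codes can be used as array indices.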




