Python | Pandas Series.searchsorted()


Pandas Series.searchsorted() is a method for sorted series. The user passes one or more values as a parameter, and the method returns an array of positions at which those values could be inserted so that the order of the series is preserved.

Syntax: Series.searchsorted(value, side='left', sorter=None)

Parameters:
value: Values to be inserted into self (the caller series)
side: 'left' or 'right'; returns the first or the last suitable position for value, respectively
sorter: Optional array of integer indices, of the same size as the series, that sorts the series into ascending order. If sorter is None, the caller series must already be in ascending order.

Return type: Array of indices (a single integer if value is a scalar)
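The sorter parameter is easiest to understand with a small example. The following is a minimal sketch, assuming an unsorted series with made-up values and using numpy.argsort to build the sorting indices; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Unsorted data (hypothetical values chosen for illustration)
unsorted = pd.Series([15, 3, 24, 0, 7])

# Indices that would sort the series into ascending order: 0, 3, 7, 15, 24
order = np.argsort(unsorted.to_numpy())

# The returned positions refer to the sorted view, not the original order
positions = unsorted.searchsorted([5, 20], sorter=order)
print(positions)  # [2 4]
```

In the sorted view, 5 fits between 3 and 7 (position 2) and 20 fits between 15 and 24 (position 4).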

Example #1:

In this example, the searchsorted() method is called on a sorted series and 3 values are passed as a parameter.

# import the pandas module
import pandas as pd

# create a sorted list (named data to avoid shadowing the built-in list)
data = [0, 2, 3, 7, 12, 12, 15, 24]

# create the series
series = pd.Series(data)

# values to insert
val = [1, 7, 14]

# call the searchsorted() method
result = series.searchsorted(value=val)

# display
result

Output:

array([1, 3, 6])

As shown in the output, an insertion index was returned for each value. Since 7 already exists in the series, index position 3 was returned for it because the default value of the side parameter is 'left'. For values equal to an existing element, the method therefore returns the leftmost suitable index.
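The effect of the side parameter can be seen by searching for values that are already present. This is a short sketch that reuses the series from Example #1 and compares both settings for 7 (which appears once) and 12 (which appears twice).

```python
import pandas as pd

series = pd.Series([0, 2, 3, 7, 12, 12, 15, 24])

# side="left": position of the first suitable slot (before equal values)
left = series.searchsorted([7, 12], side="left")

# side="right": position of the last suitable slot (after equal values)
right = series.searchsorted([7, 12], side="right")

print(left)   # [3 4]
print(right)  # [4 6]
```

The difference between the two results for a given value is the number of times that value occurs in the series.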

Example #2: searchsorted() on a series of strings.

In this example, a sorted series of strings is created from a Python list using the pandas Series constructor. A list of two strings is then passed as the value parameter of the searchsorted() method.

# import the pandas module
import pandas as pd

# create a list of strings in alphabetical order
data = ['apple', 'banana', 'mango', 'pineapple', 'pizza']

# create the series
series = pd.Series(data)

# values to insert
val = ['grapes', 'watermelon']

# call the searchsorted() method
result = series.searchsorted(value=val)

# display
result

Output:

array([2, 5])

As shown in the output, an index position is returned for each value in the passed list, so that the order of the series is preserved if the values are inserted at those positions.
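The claim that the order is preserved can be checked directly. The sketch below is one possible way to do it, assuming numpy.insert is used for the insertion (it interprets all positions relative to the original array, which matches what searchsorted() returns); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

series = pd.Series(["apple", "banana", "mango", "pineapple", "pizza"])
new_values = ["grapes", "watermelon"]

# Positions at which the new values keep the series sorted
positions = series.searchsorted(new_values)

# Insert every value at its position, relative to the original array
combined = pd.Series(np.insert(series.to_numpy(), positions, new_values))

print(combined.tolist())
# Compare against an explicitly sorted copy to confirm the order is intact
print(combined.tolist() == sorted(combined.tolist()))
```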





Python | Pandas Series.searchsorted(): StackOverflow Questions

Answer #1

You can use pandas.cut:

bins = [0, 1, 5, 10, 25, 50, 100]
df["binned"] = pd.cut(df["percentage"], bins)
print (df)
   percentage     binned
0       46.50   (25, 50]
1       44.20   (25, 50]
2      100.00  (50, 100]
3       42.12   (25, 50]

bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1,2,3,4,5,6]
df["binned"] = pd.cut(df["percentage"], bins=bins, labels=labels)
print (df)
   percentage binned
0       46.50      5
1       44.20      5
2      100.00      6
3       42.12      5

Or numpy.searchsorted:

bins = [0, 1, 5, 10, 25, 50, 100]
df["binned"] = np.searchsorted(bins, df["percentage"].values)
print (df)
   percentage  binned
0       46.50       5
1       44.20       5
2      100.00       6
3       42.12       5
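The snippets in this answer assume an existing DataFrame named df. As a self-contained sketch, the frame below is reconstructed from the printed output (four percentage values), and it checks that numpy.searchsorted with its default side='left' produces the same bin numbers as the 1-based pandas.cut labels for these values.

```python
import numpy as np
import pandas as pd

# DataFrame reconstructed from the printed output in the answer
df = pd.DataFrame({"percentage": [46.50, 44.20, 100.00, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

# labels=False returns 0-based bin codes; add 1 to match labels 1..6
cut_codes = pd.cut(df["percentage"], bins=bins, labels=False) + 1

# side="left" (the default) treats each bin as right-closed, like cut
search_idx = np.searchsorted(bins, df["percentage"].to_numpy(), side="left")

print(cut_codes.tolist())   # [5, 5, 6, 5]
print(search_idx.tolist())  # [5, 5, 6, 5]
```

The two approaches differ outside the bin range: pandas.cut yields NaN for values that fall in no bin (for example 0 or anything above 100 here), whereas numpy.searchsorted still returns an index (0 or 7 respectively), so out-of-range values may need separate handling.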

...and then value_counts or groupby and aggregate size:

s = pd.cut(df["percentage"], bins=bins).value_counts()
print (s)
(25, 50]     3
(50, 100]    1
(10, 25]     0
(5, 10]      0
(1, 5]       0
(0, 1]       0
Name: percentage, dtype: int64

s = df.groupby(pd.cut(df["percentage"], bins=bins)).size()
print (s)
percentage
(0, 1]       0
(1, 5]       0
(5, 10]      0
(10, 25]     0
(25, 50]     3
(50, 100]    1
dtype: int64

By default, cut returns a categorical.

Series methods like Series.value_counts() will use all categories, even if some categories are not present in the data; this is the standard behavior of operations on categorical data.
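To illustrate the categorical behavior, the following sketch (again using a DataFrame reconstructed from the output shown in the answer) compares value_counts() on the categorical result of cut with value_counts() on the same data converted to plain strings.

```python
import pandas as pd

# DataFrame reconstructed from the printed output in the answer
df = pd.DataFrame({"percentage": [46.50, 44.20, 100.00, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

binned = pd.cut(df["percentage"], bins=bins)

# Categorical: every bin is reported, including those with zero rows
categorical_counts = binned.value_counts()

# Plain strings: only the bins that actually occur are reported
string_counts = binned.astype(str).value_counts()

print(len(categorical_counts))  # 6
print(len(string_counts))       # 2
```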
