Python | Pandas Index.searchsorted()

Pandas Index.searchsorted() finds the indices at which elements should be inserted to maintain order. Given a sorted Index (self), it returns positions such that, if the corresponding elements of value were inserted before those positions, the order of self would be preserved.

Syntax: Index.searchsorted(value, side='left', sorter=None)

Parameters:
value: Values to insert into self.
side: If 'left', the index of the first suitable location found is given. If 'right', return the last such index. If there is no suitable index, return either 0 or N (where N is the length of self).
sorter: Optional array of integer indices that sort self into ascending order. They are typically the result of np.argsort.

Returns: [indices: array of ints] Array of insertion points with the same shape as value.
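The effect of the side and sorter parameters can be illustrated with a small sketch. The sample values below are made up for illustration and are not part of the original examples.

```python
import numpy as np
import pandas as pd

# A sorted index containing a duplicate value (8 appears twice)
idx = pd.Index([1, 5, 8, 8, 11])

# side='left' returns the position before the first 8,
# side='right' returns the position after the last 8
left = idx.searchsorted(8, side="left")
right = idx.searchsorted(8, side="right")

# Values outside the range map to 0 or N (the length of the index)
low = idx.searchsorted(0)
high = idx.searchsorted(99)

# For an unsorted index, pass the permutation that sorts it as `sorter`
unsorted = pd.Index([8, 1, 11, 5])
order = np.argsort(unsorted.to_numpy())
pos = unsorted.searchsorted(9, sorter=order)

print(left, right, low, high, pos)
```

The position returned when sorter is used refers to the sorted order (1, 5, 8, 11), not to the original layout of the unsorted index.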

Example #1: Use Index.searchsorted() to find the correct insertion position to keep the index sorted.

# Import pandas
import pandas as pd

# Create the index
idx = pd.Index([1, 5, 8, 9, 11, 24, 56, 81])

# Print the index
idx

Output:

Let's find the position at which the element 10 should be inserted.

# Find the insertion position
idx.searchsorted(10)

Output:

As we can see in the output, the function returned 4, indicating that the correct position to insert 10 into the index is 4 if the order is to be maintained.
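To check that position 4 really does keep the index sorted, the returned position can be passed to Index.insert. This is a minimal sketch that reuses the index from the example; the names pos and new_idx are introduced here for illustration only.

```python
import pandas as pd

idx = pd.Index([1, 5, 8, 9, 11, 24, 56, 81])

# Position at which 10 must go to keep the index sorted
pos = int(idx.searchsorted(10))

# Index objects are immutable, so insert() returns a new Index
new_idx = idx.insert(pos, 10)

print(list(new_idx))
print(new_idx.is_monotonic_increasing)
```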

Example #2: Use Index.searchsorted() to find the correct insertion positions for more than one item in the index. The insertion must be done so that the order is maintained.

# Import pandas
import pandas as pd

# Create the index
idx = pd.Index([1, 5, 8, 9, 11, 24, 56, 81])

# Print the index
idx

Output:

Let's find the positions at which the elements 7 and 29 should be inserted.

# Find the insertion positions
idx.searchsorted([7, 29])

Output:

As we can see in the output, the function returned 2 and 6, indicating that 7 and 29 must be inserted at positions 2 and 6 (zero-based, relative to the original index) to keep the order.
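Because both positions refer to the original index, they can be applied in a single step with numpy.insert, which interprets a list of positions relative to the input array. The following is a sketch under that assumption; new_items, positions, and merged are illustrative names.

```python
import numpy as np
import pandas as pd

idx = pd.Index([1, 5, 8, 9, 11, 24, 56, 81])
new_items = [7, 29]

# Insertion points, relative to the original index
positions = idx.searchsorted(new_items)

# np.insert places each item before the given position of the original array
merged = pd.Index(np.insert(idx.to_numpy(), positions, new_items))

print(list(positions))
print(list(merged))
```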





Python | Pandas Index.searchsorted(): StackOverflow Questions

Answer #1

You can use pandas.cut:

bins = [0, 1, 5, 10, 25, 50, 100]
df["binned"] = pd.cut(df["percentage"], bins)
print(df)
   percentage     binned
0       46.50   (25, 50]
1       44.20   (25, 50]
2      100.00  (50, 100]
3       42.12   (25, 50]

bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1, 2, 3, 4, 5, 6]
df["binned"] = pd.cut(df["percentage"], bins=bins, labels=labels)
print(df)
   percentage binned
0       46.50      5
1       44.20      5
2      100.00      6
3       42.12      5

Or numpy.searchsorted:

import numpy as np

bins = [0, 1, 5, 10, 25, 50, 100]
df["binned"] = np.searchsorted(bins, df["percentage"].values)
print(df)
   percentage  binned
0       46.50       5
1       44.20       5
2      100.00       6
3       42.12       5
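The two approaches give the same bin numbers here because pandas.cut uses right-closed intervals by default and numpy.searchsorted defaults to side='left'. The sketch below rebuilds the sample frame from the printed output (the original df is not shown in the answer, so its construction is an assumption) and compares the results.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the printed output above
df = pd.DataFrame({"percentage": [46.50, 44.20, 100.00, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1, 2, 3, 4, 5, 6]

# Bin numbers via pd.cut with integer labels (intervals are right-closed)
from_cut = pd.cut(df["percentage"], bins=bins, labels=labels)

# Bin numbers via np.searchsorted; side='left' sends a value equal to an
# edge into the bin that ends at that edge, matching right-closed intervals
from_search = np.searchsorted(bins, df["percentage"].to_numpy(), side="left")

print(from_cut.astype(int).tolist())
print(from_search.tolist())
```

The two differ outside the interior of the bins: for a value equal to the lowest edge (0 here), pd.cut returns NaN unless include_lowest=True is passed, while np.searchsorted returns 0.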

...and then value_counts or groupby and aggregate size:

s = pd.cut(df["percentage"], bins=bins).value_counts()
print(s)
(25, 50]     3
(50, 100]    1
(10, 25]     0
(5, 10]      0
(1, 5]       0
(0, 1]       0
Name: percentage, dtype: int64

# observed=False keeps empty bins in the result
s = df.groupby(pd.cut(df["percentage"], bins=bins), observed=False).size()
print(s)
percentage
(0, 1]       0
(1, 5]       0
(5, 10]      0
(10, 25]     0
(25, 50]     3
(50, 100]    1
dtype: int64

By default, cut returns a categorical.

Series methods like Series.value_counts() use all categories, even if some categories are not present in the data (see "operations" in the pandas categorical data documentation).
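This behaviour can be seen by comparing the counts on the categorical result with the counts on the same data converted to plain strings. The sketch below uses the sample percentages from the answer; binned, counts, and plain_counts are illustrative names.

```python
import pandas as pd

s = pd.Series([46.50, 44.20, 100.00, 42.12], name="percentage")
binned = pd.cut(s, bins=[0, 1, 5, 10, 25, 50, 100])

# On a categorical, value_counts reports every category, including empty bins
counts = binned.value_counts()

# After converting to plain strings, only the observed bins remain
plain_counts = binned.astype(str).value_counts()

print(len(counts), len(plain_counts))
```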
