numpy.argsort () in Python

argsort | Arrays | NumPy | Python Methods and Functions

numpy.argsort() is used to perform indirect sorting on a given axis using the algorithm given by the keyword kind. It returns an array of indices in the same shape as arr, which will sort the array.

Syntax: numpy.argsort (arr, axis = -1, kind = ` quicksort `, order = None)

Parameters:
arr: [array_like] Input array.
axis : [int or None] Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.
kind: [`quicksort`, `mergesort`, `heapsort`] Selection algorithm. Default is `quicksort`.
order: [str or list of str] When arr is an array with fields defined, this argument specifies which fields to compare first, second, etc.

Return: [index_array, ndarray] Array of indices that sort arr along the specified axis.If arr is one-dimensional then arr [index_array] returns a sorted arr.

Code # 1:

# Python program explaining
# argpartition () function

 

import numpy as geek

 
# input array

in_arr = geek.array ([ 2 , 0 1 , 5 , 4 , 1 , 9 ])

print ( "Input unsorted array:" , in_arr) 

 

out_arr = geek.argsort (in_arr)

print ( "Output sorted array indices:" , out_arr)

print ( "Output sorted array: " , in_arr [out_arr])

Output :

 Input unsorted array: [2 0 1 5 4 1 9] Output sorted array indices: [1 2 5 0 4 3 6] Output sorted array: [0 1 1 2 4 5 9] 

Code # 2:

# Python program explaining
# argpartition () function

 

import numpy as geek

 
# input 2d array

in_arr = geek.array ([[ 2 , 0 , 1 ], [ 5 , 4 , 3 ]])

print ( "Input array:" , in_arr) 

  
# output sorted index arrays

out_arr1 = geek.argsort (in_arr, kind = `mergesort` , axis = 0 )

print ( "Output sorteded array indices along axis 0:" , out_arr1)

out_arr2 = geek.argsort (in_arr, kind = `heapsort` , axis = 1 )

print ( " Output sorteded array indices along axis 1: " , out_arr2)

Output:

 Input array: [[2 0 1] [5 4 3 ]] Output sorteded array indices along axis 0: [[0 0 0] [1 1 1]] Output sorteded array indices along axis 1: [[1 2 0] [2 1 0]] 




numpy.argsort () in Python: StackOverflow Questions

Is it possible to use argsort in descending order?

Consider the following code:

avgDists = np.array([1, 8, 6, 9, 4])
ids = avgDists.argsort()[:n]

This gives me indices of the n smallest elements. Is it possible to use this same argsort in descending order to get the indices of n highest elements?

Numpy argsort - what is it doing?

Why is numpy giving this result:

x = numpy.array([1.48,1.41,0.0,0.1])
print x.argsort()

>[2 3 1 0]

when I"d expect it to do this:

[3 2 0 1]

Clearly my understanding of the function is lacking.

Answer #1

To sort by the second column of a:

a[a[:, 1].argsort()]

Answer #2

Newer NumPy versions (1.8 and up) have a function called argpartition for this. To get the indices of the four largest elements, do

>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])
>>> a[ind]
array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need that too, sort them afterwards:

>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])

To get the top-k elements in sorted order in this way takes O(n + k log k) time.

Answer #3

The simplest I"ve been able to come up with is:

In [1]: import numpy as np

In [2]: arr = np.array([1, 3, 2, 4, 5])

In [3]: arr.argsort()[-3:][::-1]
Out[3]: array([4, 3, 1])

This involves a complete sort of the array. I wonder if numpy provides a built-in way to do a partial sort; so far I haven"t been able to find one.

If this solution turns out to be too slow (especially for small n), it may be worth looking at coding something up in Cython.

Answer #4

If you negate an array, the lowest elements become the highest elements and vice-versa. Therefore, the indices of the n highest elements are:

(-avgDists).argsort()[:n]

Another way to reason about this, as mentioned in the comments, is to observe that the big elements are coming last in the argsort. So, you can read from the tail of the argsort to find the n highest elements:

avgDists.argsort()[::-1][:n]

Both methods are O(n log n) in time complexity, because the argsort call is the dominant term here. But the second approach has a nice advantage: it replaces an O(n) negation of the array with an O(1) slice. If you"re working with small arrays inside loops then you may get some performance gains from avoiding that negation, and if you"re working with huge arrays then you can save on memory usage because the negation creates a copy of the entire array.

Note that these methods do not always give equivalent results: if a stable sort implementation is requested to argsort, e.g. by passing the keyword argument kind="mergesort", then the first strategy will preserve the sorting stability, but the second strategy will break stability (i.e. the positions of equal items will get reversed).

Example timings:

Using a small array of 100 floats and a length 30 tail, the view method was about 15% faster

>>> avgDists = np.random.rand(100)
>>> n = 30
>>> timeit (-avgDists).argsort()[:n]
1.93 µs ± 6.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
1.64 µs ± 3.39 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
1.64 µs ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

For larger arrays, the argsort is dominant and there is no significant timing difference

>>> avgDists = np.random.rand(1000)
>>> n = 300
>>> timeit (-avgDists).argsort()[:n]
21.9 µs ± 51.2 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
21.7 µs ± 33.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
21.9 µs ± 37.1 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Please note that the comment from nedim below is incorrect. Whether to truncate before or after reversing makes no difference in efficiency, since both of these operations are only striding a view of the array differently and not actually copying data.

Answer #5

If you are using numpy, you have the argsort() function available:

>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

This returns the arguments that would sort the array or list.

Answer #6

Nikie"s answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie"s mathematica code. And also, I couldn"t find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)

1. Image PreProcessing ( closing operation )

import cv2
import numpy as np

img = cv2.imread("dave.jpg")
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))

close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)

Result :

Result of closing

2. Finding Sudoku Square and Creating Mask Image

thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

max_area = 0
best_cnt = None
for cnt in contour:
    area = cv2.contourArea(cnt)
    if area > 1000:
        if area > max_area:
            max_area = area
            best_cnt = cnt

cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)

res = cv2.bitwise_and(res,mask)

Result :

enter image description here

3. Finding Vertical lines

kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))

dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)

contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if h/w > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()

Result :

enter image description here

4. Finding Horizontal Lines

kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)

contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
    x,y,w,h = cv2.boundingRect(cnt)
    if w/h > 5:
        cv2.drawContours(close,[cnt],0,255,-1)
    else:
        cv2.drawContours(close,[cnt],0,0,-1)

close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()

Result :

enter image description here

Of course, this one is not so good.

5. Finding Grid Points

res = cv2.bitwise_and(closex,closey)

Result :

enter image description here

6. Correcting the defects

Here, nikie does some kind of interpolation, about which I don"t have much knowledge. And i couldn"t find any corresponding function for this OpenCV. (may be it is there, i don"t know).

Check out this SOF which explains how to do this using SciPy, which I don"t want to use : Image transformation in OpenCV

So, here I took 4 corners of each sub-square and applied warp Perspective to each.

For that, first we find the centroids.

contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
    mom = cv2.moments(cnt)
    (x,y) = int(mom["m10"]/mom["m00"]), int(mom["m01"]/mom["m00"])
    cv2.circle(img,(x,y),4,(0,255,0),-1)
    centroids.append((x,y))

But resulting centroids won"t be sorted. Check out below image to see their order:

enter image description here

So we sort them from left to right, top to bottom.

centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]

b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))

Now see below their order :

enter image description here

Finally we apply the transformation and create a new image of size 450x450.

output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
    ri = i/10
    ci = i%10
    if ci != 9 and ri!=9:
        src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
        dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
        retval = cv2.getPerspectiveTransform(src,dst)
        warp = cv2.warpPerspective(res2,retval,(450,450))
        output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()

Result :

enter image description here

The result is almost same as nikie"s, but code length is large. May be, better methods are available out there, but until then, this works OK.

Regards ARK.

Answer #7

Use numpy.argsort. It returns the indices one would use to sort the array.

import numpy as np
import numpy.linalg as linalg

A = np.random.random((3,3))
eigenValues, eigenVectors = linalg.eig(A)

idx = eigenValues.argsort()[::-1]   
eigenValues = eigenValues[idx]
eigenVectors = eigenVectors[:,idx]

If the eigenvalues are complex, the sort order is lexicographic (that is, complex numbers are sorted according to their real part first, with ties broken by their imaginary part).

Answer #8

According to the documentation

Returns the indices that would sort an array.

  • 2 is the index of 0.0.
  • 3 is the index of 0.1.
  • 1 is the index of 1.41.
  • 0 is the index of 1.48.

Get Solution for free from DataCamp guru