argsort | Arrays | NumPy | Python Methods and Functions

** **** **

** **

Record arrays allow you to access fields as elements of an array using ` arr.a and arr.b `

** numpy.recarray.argsort () returns the indices that will sort this array. **

Syntax:`numpy.recarray.argsort (arr, axis = -1, kind = `quicksort` , order = None)`

Parameters:

arr:Input Record array.

axis :[int or None] Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.

kind:[`quicksort`, `mergesort`, `heapsort`] Selection algorithm. Default is `quicksort`.

order:[str or list of str] When arr is an array with fields defined, this argument specifies which fields to compare first, second, etc.

Return:[index_array, ndarray] Array of indices that sort arr along the specified axis.

** Code # 1: **

` ` |

** Output: **

Input array: [[(5.0, 2) (3.0, -4) (6.0, 9)] [(9.0, 1) (5.0, 4) (-12.0, -7)]] Record array of float: [[5. 3. 6.] [9. 5. -12.]] Record array of int: [[2 -4 9] [1 4 -7]] Output sorted array indices along axis 1 : [[1 0 2] [2 1 0]] Output sorted array indices array along axis 0: [[1 0 1] [0 1 0]]

** Code no. 2: **

We are applying ` numpy.recarray.argsort () `

to the entire array of records.

` ` |

** Output:**

Input array: [[(5.0, 2 ) (3.0, 4) (6.0, -7)] [(9.0, 1) (6.0, 4) (-2.0, -7)]] Output sorted array indices: [[1 0 2] [2 1 0]]

Consider the following code:

```
avgDists = np.array([1, 8, 6, 9, 4])
ids = avgDists.argsort()[:n]
```

This gives me indices of the `n`

smallest elements. Is it possible to use this same `argsort`

in descending order to get the indices of `n`

highest elements?

Why is numpy giving this result:

```
x = numpy.array([1.48,1.41,0.0,0.1])
print x.argsort()
>[2 3 1 0]
```

when I"d expect it to do this:

[3 2 0 1]

Clearly my understanding of the function is lacking.

To sort by the second column of `a`

:

```
a[a[:, 1].argsort()]
```

Newer NumPy versions (1.8 and up) have a function called `argpartition`

for this. To get the indices of the four largest elements, do

```
>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> a
array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])
>>> a[ind]
array([4, 9, 6, 9])
```

Unlike `argsort`

, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating `a[ind]`

. If you need that too, sort them afterwards:

```
>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])
```

To get the top-*k* elements in sorted order in this way takes O(*n* + *k* log *k*) time.

The simplest I"ve been able to come up with is:

```
In [1]: import numpy as np
In [2]: arr = np.array([1, 3, 2, 4, 5])
In [3]: arr.argsort()[-3:][::-1]
Out[3]: array([4, 3, 1])
```

This involves a complete sort of the array. I wonder if `numpy`

provides a built-in way to do a partial sort; so far I haven"t been able to find one.

If this solution turns out to be too slow (especially for small `n`

), it may be worth looking at coding something up in Cython.

If you negate an array, the lowest elements become the highest elements and vice-versa. Therefore, the indices of the `n`

highest elements are:

```
(-avgDists).argsort()[:n]
```

Another way to reason about this, as mentioned in the comments, is to observe that the big elements are coming *last* in the argsort. So, you can read from the tail of the argsort to find the `n`

highest elements:

```
avgDists.argsort()[::-1][:n]
```

Both methods are *O(n log n)* in time complexity, because the `argsort`

call is the dominant term here. But the second approach has a nice advantage: it replaces an *O(n)* negation of the array with an *O(1)* slice. If you"re working with small arrays inside loops then you may get some performance gains from avoiding that negation, and if you"re working with huge arrays then you can save on memory usage because the negation creates a copy of the entire array.

Note that these methods do not always give equivalent results: if a stable sort implementation is requested to `argsort`

, e.g. by passing the keyword argument `kind="mergesort"`

, then the first strategy will preserve the sorting stability, but the second strategy will break stability (i.e. the positions of equal items will get reversed).

*Example timings:*

Using a small array of 100 floats and a length 30 tail, the view method was about 15% faster

```
>>> avgDists = np.random.rand(100)
>>> n = 30
>>> timeit (-avgDists).argsort()[:n]
1.93 ¬µs ¬± 6.68 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
1.64 ¬µs ¬± 3.39 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
1.64 ¬µs ¬± 3.66 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each)
```

For larger arrays, the argsort is dominant and there is no significant timing difference

```
>>> avgDists = np.random.rand(1000)
>>> n = 300
>>> timeit (-avgDists).argsort()[:n]
21.9 ¬µs ¬± 51.2 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[::-1][:n]
21.7 ¬µs ¬± 33.3 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each)
>>> timeit avgDists.argsort()[-n:][::-1]
21.9 ¬µs ¬± 37.1 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each)
```

Please note that the comment from nedim below is incorrect. Whether to truncate before or after reversing makes no difference in efficiency, since both of these operations are only striding a view of the array differently and not actually copying data.

If you are using numpy, you have the argsort() function available:

```
>>> import numpy
>>> numpy.argsort(myList)
array([0, 1, 2, 4, 3])
```

http://docs.scipy.org/doc/numpy/reference/generated/numpy.argsort.html

This returns the arguments that would sort the array or list.

Nikie"s answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie"s mathematica code. And also, I couldn"t find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)

**1. Image PreProcessing ( closing operation )**

```
import cv2
import numpy as np
img = cv2.imread("dave.jpg")
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
```

Result :

**2. Finding Sudoku Square and Creating Mask Image**

```
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
```

Result :

**3. Finding Vertical lines**

```
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
```

Result :

**4. Finding Horizontal Lines**

```
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
```

Result :

Of course, this one is not so good.

**5. Finding Grid Points**

```
res = cv2.bitwise_and(closex,closey)
```

Result :

**6. Correcting the defects**

Here, nikie does some kind of interpolation, about which I don"t have much knowledge. And i couldn"t find any corresponding function for this OpenCV. (may be it is there, i don"t know).

Check out this SOF which explains how to do this using SciPy, which I don"t want to use : Image transformation in OpenCV

So, here I took 4 corners of each sub-square and applied warp Perspective to each.

For that, first we find the centroids.

```
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom["m10"]/mom["m00"]), int(mom["m01"]/mom["m00"])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
```

But resulting centroids won"t be sorted. Check out below image to see their order:

So we sort them from left to right, top to bottom.

```
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))
```

Now see below their order :

Finally we apply the transformation and create a new image of size 450x450.

```
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = i/10
ci = i%10
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
```

Result :

The result is almost same as nikie"s, but code length is large. May be, better methods are available out there, but until then, this works OK.

Regards ARK.

Use numpy.argsort. It returns the indices one would use to sort the array.

```
import numpy as np
import numpy.linalg as linalg
A = np.random.random((3,3))
eigenValues, eigenVectors = linalg.eig(A)
idx = eigenValues.argsort()[::-1]
eigenValues = eigenValues[idx]
eigenVectors = eigenVectors[:,idx]
```

If the eigenvalues are complex, the sort order is lexicographic (that is, complex numbers are sorted according to their real part first, with ties broken by their imaginary part).

According to the documentation

Returns the indices that would sort an array.

`2`

is the index of`0.0`

.`3`

is the index of`0.1`

.`1`

is the index of`1.41`

.`0`

is the index of`1.48`

.

Managing and analyzing data have always offered the greatest benefits and the greatest challenges for organizations of all sizes and across all industries. Businesses have long struggled with finding ...

10/07/2020

Learn how data literacy is changing the world and giving you a better understanding of life's biggest problems in this "Important and Comprehensive" Guide to Statistical Thinking (New York). The bi...

28/08/2021

This book introduces machine learning methods in finance. It features a unified treatment of machine learn...

12/08/2021

Systems programming provides the basis for global calculation. Developing performance-sensitive code requires a programming language that allows programmers to control the use of memory, processor tim...

23/09/2021

X
# Submit new EBook