numpy.argsort() is used to perform indirect sorting on a given axis using the algorithm given by the keyword kind. It returns an array of indices in the same shape as arr, which will sort the array.
Syntax: numpy.argsort (arr, axis = -1, kind = ` quicksort `, order = None)
arr: [array_like] Input array.
axis : [int or None] Axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis.
kind: [`quicksort`, `mergesort`, `heapsort`] Selection algorithm. Default is `quicksort`.
order: [str or list of str] When arr is an array with fields defined, this argument specifies which fields to compare first, second, etc. p>
Return: [index_array, ndarray] Array of indices that sort arr along the specified axis.If arr is one-dimensional then arr [index_array] returns a sorted arr.
Code # 1:
Input unsorted array: [2 0 1 5 4 1 9] Output sorted array indices: [1 2 5 0 4 3 6] Output sorted array: [0 1 1 2 4 5 9]
Code # 2:
Input array: [[2 0 1] [5 4 3 ]] Output sorteded array indices along axis 0: [[0 0 0] [1 1 1]] Output sorteded array indices along axis 1: [[1 2 0] [2 1 0]]
Consider the following code:
avgDists = np.array([1, 8, 6, 9, 4]) ids = avgDists.argsort()[:n]
This gives me indices of the
n smallest elements. Is it possible to use this same
argsort in descending order to get the indices of
n highest elements?
Why is numpy giving this result:
x = numpy.array([1.48,1.41,0.0,0.1]) print x.argsort() >[2 3 1 0]
when I"d expect it to do this:
[3 2 0 1]
Clearly my understanding of the function is lacking.
To sort by the second column of
Newer NumPy versions (1.8 and up) have a function called
argpartition for this. To get the indices of the four largest elements, do
>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0]) >>> a array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0]) >>> ind = np.argpartition(a, -4)[-4:] >>> ind array([1, 5, 8, 0]) >>> a[ind] array([4, 9, 6, 9])
argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating
a[ind]. If you need that too, sort them afterwards:
>>> ind[np.argsort(a[ind])] array([1, 8, 5, 0])
To get the top-k elements in sorted order in this way takes O(n + k log k) time.
The simplest I"ve been able to come up with is:
In : import numpy as np In : arr = np.array([1, 3, 2, 4, 5]) In : arr.argsort()[-3:][::-1] Out: array([4, 3, 1])
This involves a complete sort of the array. I wonder if
numpy provides a built-in way to do a partial sort; so far I haven"t been able to find one.
If this solution turns out to be too slow (especially for small
n), it may be worth looking at coding something up in Cython.
If you negate an array, the lowest elements become the highest elements and vice-versa. Therefore, the indices of the
n highest elements are:
Another way to reason about this, as mentioned in the comments, is to observe that the big elements are coming last in the argsort. So, you can read from the tail of the argsort to find the
n highest elements:
Both methods are O(n log n) in time complexity, because the
argsort call is the dominant term here. But the second approach has a nice advantage: it replaces an O(n) negation of the array with an O(1) slice. If you"re working with small arrays inside loops then you may get some performance gains from avoiding that negation, and if you"re working with huge arrays then you can save on memory usage because the negation creates a copy of the entire array.
Note that these methods do not always give equivalent results: if a stable sort implementation is requested to
argsort, e.g. by passing the keyword argument
kind="mergesort", then the first strategy will preserve the sorting stability, but the second strategy will break stability (i.e. the positions of equal items will get reversed).
Using a small array of 100 floats and a length 30 tail, the view method was about 15% faster
>>> avgDists = np.random.rand(100) >>> n = 30 >>> timeit (-avgDists).argsort()[:n] 1.93 ¬µs ¬± 6.68 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each) >>> timeit avgDists.argsort()[::-1][:n] 1.64 ¬µs ¬± 3.39 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each) >>> timeit avgDists.argsort()[-n:][::-1] 1.64 ¬µs ¬± 3.66 ns per loop (mean ¬± std. dev. of 7 runs, 1000000 loops each)
For larger arrays, the argsort is dominant and there is no significant timing difference
>>> avgDists = np.random.rand(1000) >>> n = 300 >>> timeit (-avgDists).argsort()[:n] 21.9 ¬µs ¬± 51.2 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each) >>> timeit avgDists.argsort()[::-1][:n] 21.7 ¬µs ¬± 33.3 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each) >>> timeit avgDists.argsort()[-n:][::-1] 21.9 ¬µs ¬± 37.1 ns per loop (mean ¬± std. dev. of 7 runs, 10000 loops each)
Please note that the comment from nedim below is incorrect. Whether to truncate before or after reversing makes no difference in efficiency, since both of these operations are only striding a view of the array differently and not actually copying data.
If you are using numpy, you have the argsort() function available:
>>> import numpy >>> numpy.argsort(myList) array([0, 1, 2, 4, 3])
This returns the arguments that would sort the array or list.
Nikie"s answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie"s mathematica code. And also, I couldn"t find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)
1. Image PreProcessing ( closing operation )
import cv2 import numpy as np img = cv2.imread("dave.jpg") img = cv2.GaussianBlur(img,(5,5),0) gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) mask = np.zeros((gray.shape),np.uint8) kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11)) close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1) div = np.float32(gray)/(close) res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX)) res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
2. Finding Sudoku Square and Creating Mask Image
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2) contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE) max_area = 0 best_cnt = None for cnt in contour: area = cv2.contourArea(cnt) if area > 1000: if area > max_area: max_area = area best_cnt = cnt cv2.drawContours(mask,[best_cnt],0,255,-1) cv2.drawContours(mask,[best_cnt],0,0,2) res = cv2.bitwise_and(res,mask)
3. Finding Vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10)) dx = cv2.Sobel(res,cv2.CV_16S,1,0) dx = cv2.convertScaleAbs(dx) cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX) ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU) close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1) contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE) for cnt in contour: x,y,w,h = cv2.boundingRect(cnt) if h/w > 5: cv2.drawContours(close,[cnt],0,255,-1) else: cv2.drawContours(close,[cnt],0,0,-1) close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2) closex = close.copy()
4. Finding Horizontal Lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2)) dy = cv2.Sobel(res,cv2.CV_16S,0,2) dy = cv2.convertScaleAbs(dy) cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX) ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU) close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely) contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE) for cnt in contour: x,y,w,h = cv2.boundingRect(cnt) if w/h > 5: cv2.drawContours(close,[cnt],0,255,-1) else: cv2.drawContours(close,[cnt],0,0,-1) close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2) closey = close.copy()
Of course, this one is not so good.
5. Finding Grid Points
res = cv2.bitwise_and(closex,closey)
6. Correcting the defects
Here, nikie does some kind of interpolation, about which I don"t have much knowledge. And i couldn"t find any corresponding function for this OpenCV. (may be it is there, i don"t know).
Check out this SOF which explains how to do this using SciPy, which I don"t want to use : Image transformation in OpenCV
So, here I took 4 corners of each sub-square and applied warp Perspective to each.
For that, first we find the centroids.
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE) centroids =  for cnt in contour: mom = cv2.moments(cnt) (x,y) = int(mom["m10"]/mom["m00"]), int(mom["m01"]/mom["m00"]) cv2.circle(img,(x,y),4,(0,255,0),-1) centroids.append((x,y))
But resulting centroids won"t be sorted. Check out below image to see their order:
So we sort them from left to right, top to bottom.
centroids = np.array(centroids,dtype = np.float32) c = centroids.reshape((100,2)) c2 = c[np.argsort(c[:,1])] b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)]) bm = b.reshape((10,10,2))
Now see below their order :
Finally we apply the transformation and create a new image of size 450x450.
output = np.zeros((450,450,3),np.uint8) for i,j in enumerate(b): ri = i/10 ci = i%10 if ci != 9 and ri!=9: src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2)) dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32) retval = cv2.getPerspectiveTransform(src,dst) warp = cv2.warpPerspective(res2,retval,(450,450)) output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
The result is almost same as nikie"s, but code length is large. May be, better methods are available out there, but until then, this works OK.
Use numpy.argsort. It returns the indices one would use to sort the array.
import numpy as np import numpy.linalg as linalg A = np.random.random((3,3)) eigenValues, eigenVectors = linalg.eig(A) idx = eigenValues.argsort()[::-1] eigenValues = eigenValues[idx] eigenVectors = eigenVectors[:,idx]
If the eigenvalues are complex, the sort order is lexicographic (that is, complex numbers are sorted according to their real part first, with ties broken by their imaginary part).
According to the documentation
Returns the indices that would sort an array.
2is the index of
3is the index of
1is the index of
0is the index of
In Learn Python 3 the Hard Way PDF, you'll learn Python by working through 52 brilliantly crafted exercises. Read them. Enter your code exactly. (No copying and pasting!) Correct your mistakes. ...
This book project was first presented to me during my first week in my current role of managing the data mining development at SAS. Writ- ing a book has always been a bucket‐list item, and I was ver...
I have developed this book to investigate Mesos-based cluster development and integration. I found that data center operating system (DCOS; and it’s command-line interface [CLI]) was a natural progr...
Executing Data Quality Projects: Ten Steps to Quality Data and Trusted Information (TM), 2nd Edition. Execute Data Quality Projects, Second Edition presents a structured yet flexible approach to cr...