 # Analyzing Test Data Using K-Means Clustering in Python

Let's first plot the test data, which has multiple features, using Matplotlib.


```python
# Import the required tools
import numpy as np
from matplotlib import pyplot as plt

# Create two sets of test data
X = np.random.randint(10, 35, (25, 2))
Y = np.random.randint(55, 70, (25, 2))
Z = np.vstack((X, Y))
Z = Z.reshape((50, 2))

# Convert to np.float32
Z = np.float32(Z)

plt.xlabel('Test Data')
plt.ylabel('Z samples')

plt.hist(Z, 256, [0, 256])
plt.show()
```

Here `Z` is an array of 100 values (50 samples with 2 features each), all of which fall inside the 0 to 255 range used for the histogram. `Z` is reshaped so that each feature forms one column, which becomes more useful when more than one feature is present. The data is then converted to type `np.float32`.
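To make the shape and type conversion concrete, the following minimal sketch rebuilds `Z` the same way and inspects it. The fixed seed is an assumption added only so that the run is repeatable; it is not part of the original example.

```python
import numpy as np

# Seed chosen arbitrarily for repeatability (an assumption, not in the original).
np.random.seed(0)

X = np.random.randint(10, 35, (25, 2))
Y = np.random.randint(55, 70, (25, 2))
Z = np.float32(np.vstack((X, Y)))

print(Z.shape)  # (50, 2): 50 samples, 2 features
print(Z.size)   # 100 values in total
print(Z.dtype)  # float32
```

Stacking the two 25 x 2 arrays already yields a 50 x 2 array, so the explicit `reshape` mainly documents the intended layout.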

Now apply the k-means clustering algorithm to the same kind of test data as above and observe how it behaves.
The steps are:
1) First, set up the test data.
2) Define the criteria and apply `kmeans()`.
3) Split the data by cluster label.
4) Finally, plot the data.

```python
import numpy as np
import cv2
from matplotlib import pyplot as plt

X = np.random.randint(10, 45, (25, 2))
Y = np.random.randint(55, 70, (25, 2))
Z = np.vstack((X, Y))

# Convert to np.float32
Z = np.float32(Z)

# Define the criteria and apply kmeans()
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
ret, label, center = cv2.kmeans(Z, 2, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# Now split the data by cluster label
A = Z[label.ravel() == 0]
B = Z[label.ravel() == 1]

# Plot the data
plt.scatter(A[:, 0], A[:, 1])
plt.scatter(B[:, 0], B[:, 1], c='r')
plt.scatter(center[:, 0], center[:, 1], s=80, c='y', marker='s')
plt.xlabel('Test Data')
plt.ylabel('Z samples')
plt.show()
```

The resulting scatter plot shows the two clusters and, as yellow squares, their centers. This example illustrates a case where k-means finds the clusters one would intuitively expect.
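To show roughly what happens inside a k-means call, here is a NumPy-only sketch of the basic iteration: assign each sample to its nearest center, then move each center to the mean of its samples. The helpers `nearest_center` and `kmeans_sketch`, the `init` and `seed` parameters, and the fixed seed are hypothetical names introduced for illustration. The stopping rule only loosely mirrors the criteria tuple used above (at most 10 iterations, or centers moving less than 1.0), and this is not OpenCV's implementation.

```python
import numpy as np


def nearest_center(points, centers):
    """Return, for each sample, the index of its closest center."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans_sketch(points, k, max_iter=10, eps=1.0, seed=0, init=None):
    """Hypothetical helper: plain assign-and-update iterations, single run."""
    if init is None:
        # Start from k samples picked at random (without repeating an index).
        rng = np.random.default_rng(seed)
        centers = points[rng.choice(len(points), size=k, replace=False)]
    else:
        centers = np.asarray(init, dtype=points.dtype)
    for _ in range(max_iter):
        labels = nearest_center(points, centers)
        # Move each center to the mean of its samples; keep it if the cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < eps:  # centers barely moved, so stop early
            break
    # Label once more so the labels match the returned centers.
    return nearest_center(points, centers), centers


# Same kind of test data as in the example above (seed is an assumption).
np.random.seed(0)
X = np.random.randint(10, 45, (25, 2))
Y = np.random.randint(55, 70, (25, 2))
Z = np.float32(np.vstack((X, Y)))

labels, centers = kmeans_sketch(Z, 2)
print(centers)                           # one row per cluster center
print(np.bincount(labels, minlength=2))  # number of samples in each cluster
```

A single random start like this can settle on a poor split. `cv2.kmeans` addresses that with its attempts argument (10 in the example above), running the algorithm several times and returning the most compact result.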

Applications:
1) Identifying cancerous data.
2) Predicting student progress.
3) Prediction of drug activity.