Change language

Analyzing Test Data Using K-Means Clustering in Python

| | |

matplot -lib
Let’s first render the test data with Multiple Features using the matplot-lib tool.

# import of required tools

import numpy as np

from matplotlib import pyplot as plt

 
# create two test data

X = np.random.randint ( 10 , 35 , ( 25 , 2 ))

Y = np.random.randint ( 55 , 70 , ( 25 , 2 ))

Z = np.vstack ((X, Y))

Z = Z.reshape (( 50 , 2 ))

  
# convert to np.float32

Z = np.float32 (Z)

 

plt .xlabel ( ’Test Data’ )

plt.ylabel ( ’Z samples’ )

  

plt.hist (Z, 256 , [ 0 , 256 ])

 
plt.show ()

Here & # 39; Z & # 39; — it is an array of size 100 and values ​​in the range 0 to 255. Now the shape of & # 39; z & # 39; per column vector. It will be more useful when more than one function is present. Then change the data to type np.float32.

Output:

Now apply the k-Means clustering algorithm to the same example as in the test data above and see its behavior. 
Steps included:
1) First, we need to install the test data. 
2) Define the criteria and apply kmeans (). 
3) Now split the data. 
4) Finally, fill in the data.

import numpy as np

import cv2

from matplotlib import pyplot as plt

 

X = np.random.randint ( 10 , 45 , ( 25 , 2 ))

Y = np.random.randint ( 55 70 , ( 25 , 2 ))

Z = np.vstack ((X, Y))

 
# convert to np.float32

Z = np.float32 (Z)

 
# define criteria and apply kmeans ()

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10 , 1.0 )

ret, lab el, center = cv2.kmeans (Z, 2 , None , criteria, 10 , cv2.KMEANS_RANDOM_CENTERS)

 
# Now strip the data

A = Z [label.ravel () = = 0 ]

B = Z [label.ravel () = = 1 ]

  
# Data plot

plt.scatter (A [:,  0 ], A [:, 1 ])

plt.scatter (B [:, 0 ], B [:, 1 ], c = ’r’ )

plt.scatter (center [:, 0 ], center [:, 1 ], s = 80 , c = ’y’ , marker = ’s’ )

plt.xlabel ( ’Test Data’ ), plt.ylabel ( ’Z sample s’ )

plt.show ()

Output:

This example is intended to illustrate where k-means creates intuitively possible clusters.

Applications :
1) Identification of cancer data. 
2) Predicting student progress. 
3) Prediction of drug activity.

Shop

Learn programming in R: courses

$

Best Python online courses for 2022

$

Best laptop for Fortnite

$

Best laptop for Excel

$

Best laptop for Solidworks

$

Best laptop for Roblox

$

Best computer for crypto mining

$

Best laptop for Sims 4

$

Latest questions

NUMPYNUMPY

psycopg2: insert multiple rows with one query

12 answers

NUMPYNUMPY

How to convert Nonetype to int or string?

12 answers

NUMPYNUMPY

How to specify multiple return types using type-hints

12 answers

NUMPYNUMPY

Javascript Error: IPython is not defined in JupyterLab

12 answers

News


Wiki

Python OpenCV | cv2.putText () method

numpy.arctan2 () in Python

Python | os.path.realpath () method

Python OpenCV | cv2.circle () method

Python OpenCV cv2.cvtColor () method

Python - Move item to the end of the list

time.perf_counter () function in Python

Check if one list is a subset of another in Python

Python os.path.join () method