Change language

ML | Clustering mean shift

Shift Meanshift falls into under the clustering algorithm category, as opposed to unsupervised learning, which iteratively assigns data points to clusters by shifting points to mode (mode — highest data point density in a region in context of Meanshift). As such, it is also known as the mode search algorithm . The mean shift algorithm finds applications in image processing and computer vision.

Given a set of data points, the algorithm iteratively assigns each data point towards the closest cluster centroid and direction to the closest cluster centroid is determined by where most of the points nearby are at. So each iteration each data point will move closer to where the most points are at, which is or will lead to the cluster center. When the algorithm stops, each point is assigned to a cluster.

Unlike the popular K-Means clustering algorithm, the average bias does not require specifying the number of clusters in advance. The number of clusters is determined by the algorithm in relation to the data.

Note . The disadvantage of Mean Shift is that it is computationally expensive O (n²).

Estimating kernel density —

The first step when applying mean shift clustering algorithms is to represent your data in a mathematical way which mean s representing your data as dots like the set below.

The average bias is based on the concept of kernel density estimation, this is the kind of KDE. Imagine the above data were sampled from a probability distribution. KDE — it is a technique for estimating the underlying distribution, also called the probability density function for a dataset.

It works by placing a kernel at each point in the dataset. The core — it is a fancy mathematical word for a weight function commonly used in convolution. There are many different types of kernels, but the most popular is the Gaussian kernel. Adding all the individual kernels generates an example of the density of the probability surface function. Depending on the kernel bandwidth used, the resulting density function will be different.

Below is the KDE surface for our points using a Gaussian kernel with kernel bandwidth 2.

Site area :

Outline plot:

Below is the Python implementation:

import numpy as np

import pandas as pd

from sklearn.cluster import MeanShift

from sklearn .datasets.samples_generator import make_blobs

from matplotlib import pyplot as plt

from mpl_toolkits.mplot3d import Axes3D

# We’ll use the make_blobs method
# to generate our own data.


clusters = [[ 2 , 2  , 2 ], [ 7 , 7 , 7 ], [ 5 , 13 , 13 ]]


X, _ = make_blobs (n_samples = 150 , centers = clusters,

cluster_std = 0.60 )

# After training the model, we store
# coordinates for cluster centers

ms = MeanShift () (X)

cluster_centers = ms.cluster_centers_

# Finally, we plot the data points
# and centroids in 3D.

fig = plt.figure ()


ax = fig.add_subplot ( 111 , projection = ’ 3d’ )


ax.scatter (X [:, 0 ], X [:, 1 ], X [:, 2 ], marker = ’o’ )


ax.scatter (cluster_centers [:, 0 ], cluster_centers [:, 1 ],

cluster_centers [:, 2 ], marker = ’ x’ , color = ’red’ ,

s = 300 , linewidth = 5 , zorder = 10 ) ()

Try the code here

Exit :

To illustrate this, suppose we are given a dataset {ui} of points in d-dimensional space, sampled from some larger population, and that we have chosen the kernel K, which has the bandwidth parameter h. Together, this data and the kernel function return the following kernel density estimator for the full population density function.

The kernel function here must satisfy the following two conditions:

-" The first requirement is needed to ensure that our estimate is normalized.
-" The second is associated with the symmetry of our space.

Two popular kernel functions that satisfy these conditions are defined

Below is an example in one dimension using a Gaussian kernel to estimate the density of a population along the x-axis. We can see that each sample point adds a small Gaussian to our estimate centered around it, and the above equations may seem a little intimidating, but the figure here should make it clear that the concept is pretty simple.

Search in iterative mode —

 1. Initialize random seed and window W. 2. Calculate the center of gravity ( mean ) of W. 3. Shift the search window to the  mean . 4. Repeat Step 2 until convergence. 

General scheme of the algorithm —

for p in copied_points:

while not at_kde_peak:

p = shift (p, original_points)

The shift function looks like this —

def shift (p, origin al_points):

shift_x = float ( 0 )

shift_y = float ( 0 )

scale_factor = float ( 0 )


for p_temp in original_points:

  # numerator

dist = euclidean_dist (p, p_temp)

weight = kernel (dist, kernel_bandwidth)

shift_x + = p_temp [ 0 ] * weight

shift_y + = p_temp [ 1 ] * weight

  # denominator

  scale_factor + = weight


shift_x = shift_x / scale_factor

shift_y = shift_y / scale_factor

  return [shift_x, shift_y]

Pros :

  • Finds a variable number of modes
  • Durable to emissions
  • Generic, application-independent tool
  • Without a model, does not take any preliminary shape like spherical, elliptical, etc. In data clusters
  • Just one parameter ( window size h), where h has physical mean ing (as opposed to k- mean s)

Cons :

  • Output depends from window size
  • Selecting window size (bandwidth) is not trivial
  • Computationally (relatively) expensive (about 2s / image)
  • Does not scale with dimensions object spaces.


Gifts for programmers

Learn programming in R: courses

Gifts for programmers

Best Python online courses for 2022

Gifts for programmers

Best laptop for Fortnite

Gifts for programmers

Best laptop for Excel

Gifts for programmers

Best laptop for Solidworks

Gifts for programmers

Best laptop for Roblox

Gifts for programmers

Best computer for crypto mining

Gifts for programmers

Best laptop for Sims 4


Latest questions


Common xlabel/ylabel for matplotlib subplots

1947 answers


Check if one list is a subset of another in Python

1173 answers


How to specify multiple return types using type-hints

1002 answers


Printing words vertically in Python

909 answers


Python Extract words from a given string

798 answers


Why do I get "Pickle - EOFError: Ran out of input" reading an empty file?

606 answers


Python os.path.join () method

384 answers


Flake8: Ignore specific warning for entire file

360 answers



Python | How to copy data from one Excel sheet to another

Common xlabel/ylabel for matplotlib subplots

Check if one list is a subset of another in Python

How to specify multiple return types using type-hints

Printing words vertically in Python

Python Extract words from a given string

Cyclic redundancy check in Python

Finding mean, median, mode in Python without libraries

Python add suffix / add prefix to strings in a list

Why do I get "Pickle - EOFError: Ran out of input" reading an empty file?

Python - Move item to the end of the list

Python - Print list vertically