# Python | Binning method for data smoothing

Binning is a method for smoothing noisy data. The data is first sorted, and the sorted values are then distributed into a number of bins (also called buckets). Because each value is adjusted using only the other values in its bin, that is, its neighborhood, binning performs local smoothing.

There are three approaches to performing smoothing:

- Smoothing by bin means: each value in a bin is replaced by the mean value of the bin.
- Smoothing by bin median: each value in a bin is replaced by the median value of the bin.
- Smoothing by bin boundaries: the minimum and maximum values in a bin are identified as the bin boundaries, and each value in the bin is replaced by the closest boundary value.
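The three approaches can be sketched for a single bin as follows. This is a minimal illustration, not a reference implementation: the function names and the three-value `example_bin` are hypothetical, and sending ties to the upper boundary is an assumed tie-breaking rule.

```python
from statistics import mean, median


def smooth_by_mean(bin_values):
    """Replace every value in the bin with the bin's mean."""
    m = mean(bin_values)
    return [m] * len(bin_values)


def smooth_by_median(bin_values):
    """Replace every value in the bin with the bin's median."""
    m = median(bin_values)
    return [m] * len(bin_values)


def smooth_by_boundaries(bin_values):
    """Replace every value with the closer of the bin's min and max."""
    lo, hi = min(bin_values), max(bin_values)
    # Assumption: a value exactly halfway between the boundaries goes to the upper one.
    return [lo if (v - lo) < (hi - v) else hi for v in bin_values]


example_bin = [2, 6, 7]  # hypothetical bin, already sorted
print("mean:      ", smooth_by_mean(example_bin))
print("median:    ", smooth_by_median(example_bin))
print("boundaries:", smooth_by_boundaries(example_bin))
```

For this bin, 6 lies closer to the maximum (7) than to the minimum (2), so boundary smoothing moves it to 7 while the two boundary values stay where they are.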

Approach:

1. Sort the values of the given dataset.
2. Divide the sorted values into N bins, each containing approximately the same number of samples (equal-depth partitioning).
3. Replace the values in each bin with the bin's mean, median, or nearest boundary.

Example:

Sorted data for price (in dollars): 4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34

Partition into equal-depth bins of four values each:

- Bin 1: 4, 8, 9, 15
- Bin 2: 21, 21, 24, 25
- Bin 3: 26, 28, 29, 34

Smoothing by bin means (rounded to the nearest dollar):

- Bin 1: 9, 9, 9, 9
- Bin 2: 23, 23, 23, 23
- Bin 3: 29, 29, 29, 29

Smoothing by bin boundaries:

- Bin 1: 4, 4, 4, 15
- Bin 2: 21, 21, 25, 25
- Bin 3: 26, 26, 26, 34

Smoothing by bin median (taking the upper of the two middle values, since each bin holds an even number of values):

- Bin 1: 9, 9, 9, 9
- Bin 2: 24, 24, 24, 24
- Bin 3: 29, 29, 29, 29
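The price example above can be reproduced with a few lines of NumPy. This is a sketch under stated assumptions: `np.array_split` stands in for the equal-depth partitioning step, the helper `to_nearest_boundary` is a hypothetical name, and ties between the two boundaries go to the upper one. The means are printed before rounding.

```python
import numpy as np

prices = np.array([4, 8, 9, 15, 21, 21, 24, 25, 26, 28, 29, 34])
n_bins = 3  # three equal-depth bins of four values each

# Steps 1 and 2: sort, then split into n_bins chunks of (near-)equal size.
bins = np.array_split(np.sort(prices), n_bins)

# Step 3a: mean of each bin (unrounded).
bin_means = [float(b.mean()) for b in bins]


def to_nearest_boundary(b):
    """Map each value of a sorted bin to the closer of its min and max."""
    lo, hi = b[0], b[-1]
    # Assumption: ties go to the upper boundary.
    return np.where((b - lo) < (hi - b), lo, hi)


# Step 3b: boundary smoothing of each bin.
by_boundaries = [to_nearest_boundary(b).tolist() for b in bins]

for idx, b in enumerate(bins, start=1):
    print(f"Bin {idx}: values={b.tolist()}, mean={bin_means[idx - 1]:.2f}, "
          f"boundaries={by_boundaries[idx - 1]}")
```

The unrounded means are 9.00, 22.75 and 29.25, which round to the 9, 23 and 29 shown in the example. `np.array_split` also accepts lengths that do not divide evenly, in which case the leading bins receive one extra value each.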

Below is a Python implementation of the above algorithm, applied to one feature column of the iris dataset (150 values split into 30 bins of 5):

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the iris dataset
dataset = load_iris()
a = dataset.data

# Take the column at index 1 (sepal width) of the 4 feature columns and sort it
b = np.sort(a[:, 1])

# Create the output arrays: 30 bins of 5 values each
bin1 = np.zeros((30, 5))
bin2 = np.zeros((30, 5))
bin3 = np.zeros((30, 5))

# Smoothing by bin mean
for i in range(0, 150, 5):
    k = i // 5
    mean = (b[i] + b[i + 1] + b[i + 2] + b[i + 3] + b[i + 4]) / 5
    for j in range(5):
        bin1[k, j] = mean
print("Bin Mean:", bin1)

# Smoothing by bin boundaries
for i in range(0, 150, 5):
    k = i // 5
    for j in range(5):
        # b[i] is the bin minimum and b[i + 4] the bin maximum
        if (b[i + j] - b[i]) < (b[i + 4] - b[i + j]):
            bin2[k, j] = b[i]
        else:
            bin2[k, j] = b[i + 4]
print("Bin Boundaries:", bin2)

# Smoothing by bin median (the middle of 5 sorted values)
for i in range(0, 150, 5):
    k = i // 5
    for j in range(5):
        bin3[k, j] = b[i + 2]
print("Bin Median:", bin3)
```
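The explicit loops above can also be expressed with NumPy reshaping and broadcasting. The following is a sketch, not the article's method: `smooth_bins` is a hypothetical helper, it assumes the number of values is an exact multiple of the bin size, and it uses the same tie rule as the loop version (ties go to the upper boundary). The `sample` values are made up for illustration.

```python
import numpy as np


def smooth_bins(values, bin_size):
    """Return (by_mean, by_boundary, by_median), each shaped (n_bins, bin_size)."""
    data = np.sort(np.asarray(values, dtype=float))
    if data.size % bin_size != 0:
        raise ValueError("len(values) must be a multiple of bin_size")

    # Each row is one equal-depth bin of sorted values.
    bins = data.reshape(-1, bin_size)

    lo = bins[:, :1]   # bin minimum, shape (n_bins, 1)
    hi = bins[:, -1:]  # bin maximum, shape (n_bins, 1)

    by_mean = np.repeat(bins.mean(axis=1, keepdims=True), bin_size, axis=1)
    by_median = np.repeat(np.median(bins, axis=1, keepdims=True), bin_size, axis=1)
    # Values strictly closer to the minimum go to lo; everything else goes to hi.
    by_boundary = np.where((bins - lo) < (hi - bins), lo, hi)
    return by_mean, by_boundary, by_median


sample = [3.0, 1.0, 2.0, 10.0, 4.0, 6.0]  # illustrative values only
by_mean, by_boundary, by_median = smooth_bins(sample, bin_size=3)
print("Bin Mean:\n", by_mean)
print("Bin Boundaries:\n", by_boundary)
print("Bin Median:\n", by_median)
```

Passing the sorted iris column with `bin_size=5` should give the same three 30 x 5 arrays as the loops, since five sorted values have their median at the middle position.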
