Interquartile range and quartile deviation using NumPy and SciPy

Quartile search algorithm:
Quartiles are calculated using the median. If the number of records is even, i.e. 2n, then the first quartile (Q1) is the median of the n smallest records, and the third quartile (Q3) is the median of the n largest records.

If the number of records is odd, that is, in the form (2n + 1), then

  • the first quartile (Q1) is the median of the n smallest records
  • the third quartile (Q3) is the median of the n largest records
  • the second quartile (Q2) is the ordinary median of all the records.
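The rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a library routine: the helper name `quartiles_by_median` is hypothetical, and it assumes the input holds at least two records.

```python
import numpy as np


def quartiles_by_median(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    Hypothetical helper for illustration; assumes len(values) >= 2.
    """
    ordered = sorted(values)
    count = len(ordered)
    half = count // 2  # n for both 2n and 2n + 1 records

    lower = ordered[:half]          # n smallest records
    upper = ordered[count - half:]  # n largest records

    q1 = float(np.median(lower))
    q2 = float(np.median(ordered))
    q3 = float(np.median(upper))
    return q1, q2, q3


# Even number of records (2n = 6)
print(quartiles_by_median([1, 2, 3, 4, 5, 6]))     # (2.0, 3.5, 5.0)
# Odd number of records (2n + 1 = 7): the middle record is left out of both halves
print(quartiles_by_median([1, 2, 3, 4, 5, 6, 7]))  # (2.0, 4.0, 6.0)
```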

Range: the difference between the largest value and the smallest value in a given dataset.
Interquartile range:
The interquartile range (IQR), also called the midspread or middle 50%, or technically the H-spread, is the difference between the third quartile (Q3) and the first quartile (Q1). It covers the center of the distribution and contains 50% of the observations.  IQR = Q3 - Q1

Uses:

  • The interquartile range has a breakdown point of 25%, which is why it is often preferred over the total range.
  • IQR is used to draw box plots, simple graphical representations of a probability distribution.
  • IQR can also be used to identify outliers in a given dataset.
  • IQR measures the spread of the central 50% of the data.
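As a rough illustration of the outlier use case, the sketch below flags values that fall more than 1.5 × IQR outside the quartiles. The 1.5 multiplier is a widely used convention and is an assumption here, not something the text above prescribes; the helper name `iqr_outliers` and the sample data are made up for the example.

```python
import numpy as np


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper; k = 1.5 is a common convention, not a fixed rule.
    """
    q1, q3 = np.percentile(values, [25, 75])  # default linear interpolation
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


sample = [10, 12, 13, 14, 15, 16, 18, 95]
print(iqr_outliers(sample))  # [95]
```

A different multiplier or a different quartile method would move the fences, so the flagged values should be treated as candidates for inspection rather than as errors.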

Decision making

  • A dataset with a higher interquartile range (IQR) has more variability.
  • A dataset with a lower interquartile range (IQR) is more consistent and is usually preferred.

Suppose we have two datasets whose interquartile ranges are IR1 and IR2. If IR1 > IR2, the dataset with IR1 has more variability than the dataset with IR2, and the dataset with IR2 is preferable.
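A small sketch of this comparison, using two made-up datasets and a hypothetical helper `iqr` built on NumPy's default (linear) percentile interpolation:

```python
import numpy as np


def iqr(values):
    """IQR with NumPy's default (linear) percentile interpolation."""
    q1, q3 = np.percentile(values, [25, 75])
    return float(q3 - q1)


scores_a = [48, 49, 50, 51, 52]  # tightly clustered (made-up data)
scores_b = [10, 30, 50, 70, 90]  # widely spread (made-up data)

ir1 = iqr(scores_b)
ir2 = iqr(scores_a)

if ir1 > ir2:
    print("scores_b has more variability than scores_a")
```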

Example:

  • Below is the number of candidates enrolled each day over the last 20 days for the course
    Data Structures and Algorithms, DSA Online 3, on Python.Engineering:
    75, 69, 56, 46, 47, 79, 92, 97, 89, 88, 36, 96, 105, 32, 116, 101, 79, 93, 91, 112
  • After sorting the above dataset:
    32, 36, 46, 47, 56, 69, 75, 79, 79, 88, 89, 91, 92, 93, 96, 97, 101, 105, 112, 116
  • The total number of terms here is 20, so n = 10.
  • The second quartile (Q2), or median, of the above data is (88 + 89) / 2 = 88.5
  • The first quartile (Q1) is the median of the first n terms, i.e. the 10 smallest values: (56 + 69) / 2 = 62.5
  • The third quartile (Q3) is the median of the last n terms, i.e. the 10 largest values: (96 + 97) / 2 = 96.5
  • Then IQR = Q3 - Q1 = 96.5 - 62.5 = 34.0

Interquartile range using numpy.median

# Import numpy library as np
import numpy as np

# Enrollment data, already sorted in ascending order
data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# First quartile (Q1): median of the 10 smallest values
Q1 = np.median(data[:10])

# Third quartile (Q3): median of the 10 largest values
Q3 = np.median(data[10:])

# Interquartile range (IQR)
IQR = Q3 - Q1

print(IQR)

  Output:  34.0 

Interquartile range using numpy.percentile

# Import numpy library as np
import numpy as np

data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# Note: NumPy 1.22+ deprecates `interpolation` in favor of method='midpoint'

# First quartile (Q1)
Q1 = np.percentile(data, 25, interpolation='midpoint')

# Third quartile (Q3)
Q3 = np.percentile(data, 75, interpolation='midpoint')

# Interquartile range (IQR)
IQR = Q3 - Q1

print(IQR)

  Output:  34.0 
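The 'midpoint' setting matters for this result. As a hedged illustration, the sketch below calls `np.percentile` without any interpolation keyword, so NumPy uses its default linear interpolation, and the quartiles no longer match the median-of-halves values from the worked example.

```python
import numpy as np

data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# No interpolation keyword: NumPy falls back to linear interpolation
q1_linear, q3_linear = np.percentile(data, [25, 75])
iqr_linear = q3_linear - q1_linear

print(q1_linear, q3_linear)  # about 65.75 and 96.25
print(iqr_linear)            # about 30.5, not 34.0
```

Both answers are valid under their own quartile definition, so it helps to state which method was used when reporting an IQR.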

Interquartile range using scipy.stats.iqr

# Import the stats module from the SciPy library
from scipy import stats

data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# Interquartile range (IQR)
IQR = stats.iqr(data, interpolation='midpoint')

print(IQR)

  Output:  34.0 

Quartile Deviation
Quartile deviation is half the difference between the third quartile (Q3) and the first quartile (Q1), i.e. half of the interquartile range (IQR).  (Q3 - Q1) / 2 = IQR / 2

Decision making
A dataset with a higher quartile deviation has higher variability.

Quartile deviation using numpy.median

# Import numpy library as np
import numpy as np

# Enrollment data, already sorted in ascending order
data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# First quartile (Q1): median of the 10 smallest values
Q1 = np.median(data[:10])

# Third quartile (Q3): median of the 10 largest values
Q3 = np.median(data[10:])

# Interquartile range (IQR)
IQR = Q3 - Q1

# Quartile deviation
qd = IQR / 2

print(qd)

  Output:  17.0 
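The same value can also be obtained from `scipy.stats.iqr`. The following is a minimal sketch that reuses the `interpolation='midpoint'` call shown earlier and simply halves its result; the variable name `qd` is illustrative.

```python
from scipy import stats

data = [32, 36, 46, 47, 56, 69, 75, 79, 79, 88,
        89, 91, 92, 93, 96, 97, 101, 105, 112, 116]

# Quartile deviation: half of the IQR returned by scipy.stats.iqr
qd = stats.iqr(data, interpolation='midpoint') / 2

print(qd)  # 17.0
```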
