Algorithm for finding quartiles:
Quartiles are calculated using the median. If the number of records is even, i.e. 2n, then the first quartile (Q1) is the median of the n smallest records, and the third quartile (Q3) is the median of the n largest records.
If the number of records is odd, i.e. of the form (2n + 1), then
- the first quartile (Q1) is the median of the n smallest records
- the third quartile (Q3) is the median of the n largest records
- the second quartile (Q2) is the ordinary median of all records.
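The rule above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `median` and `quartiles` are chosen here for the example, and the input is assumed to contain at least two numeric records.

```python
def median(sorted_values):
    """Median of an already sorted, non-empty list."""
    count = len(sorted_values)
    mid = count // 2
    if count % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    # Size of each half; for an odd number of records the middle
    # record belongs to neither half.
    n = len(ordered) // 2
    q1 = median(ordered[:n])
    q2 = median(ordered)
    q3 = median(ordered[len(ordered) - n:])
    return q1, q2, q3


print(quartiles([1, 2, 3, 4, 5, 6, 7]))     # (2, 4, 6)
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 4.5, 6.5)
```

Other quartile conventions exist (for example, interpolating between neighbouring records), so results from library functions may differ from this rule unless they are configured to match it.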
Range: the difference between the largest value and the smallest value in a given dataset.
Interquartile range:
The interquartile range (IQR), also called the midspread or middle 50%, or technically the H-spread, is the difference between the third quartile (Q3) and the first quartile (Q1). It covers the center of the distribution and contains 50% of the observations. IQR = Q3 - Q1
Uses:
- The interquartile range has a breakdown point of 25%, which is why it is often preferred over the total range.
- IQR is used to draw box plots, simple graphical representations of probability distributions.
- IQR can also be used to identify outliers in a given dataset.
- IQR describes the spread of the central 50% of the data.
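As an illustration of the outlier use case, the sketch below applies a common convention (not stated in this article) that flags values lying more than 1.5 × IQR below Q1 or above Q3. The function name `iqr_outliers`, the multiplier `k` and the sample numbers are hypothetical, and the quartiles are taken as medians of the lower and upper halves.

```python
from statistics import median


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])                  # median of the lower half
    q3 = median(ordered[len(ordered) - half:])   # median of the upper half
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in ordered if v < lower or v > upper]


# Hypothetical sample, made up for this illustration.
sample = [10, 12, 13, 14, 15, 16, 18, 60]
print(iqr_outliers(sample))  # [60]
```

Here Q1 = 12.5 and Q3 = 17.0, so the fences are 5.75 and 23.75 and only 60 falls outside them.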
Decision making
- A dataset with a higher interquartile range (IQR) has more variability.
- A dataset with a lower interquartile range (IQR) is preferred.
Suppose two datasets have interquartile ranges IR1 and IR2. If IR1 > IR2, the data behind IR1 has more variability than the data behind IR2, and the dataset with IR2 is preferable.
Example:
- Below is the number of candidates enrolled each day in the last 20 days for the course Data Structures and Algorithms (DSA Online 3) at Python.Engineering:
75, 69, 56, 46, 47, 79, 92, 97, 89, 88, 36, 96, 105, 32, 116, 101, 79, 93, 91, 112
- After sorting the above dataset:
32, 36, 46, 47, 56, 69, 75, 79, 79, 88, 89, 91, 92, 93, 96, 97, 101, 105, 112, 116
- The total number of terms here is 20, so n = 10.
- The second quartile (Q2), or median, of the above data is (88 + 89) / 2 = 88.5
- The first quartile (Q1) is the median of the first n terms, i.e. the 10 smallest values: (56 + 69) / 2 = 62.5
- The third quartile (Q3) is the median of the last n terms, i.e. the 10 largest values: (96 + 97) / 2 = 96.5
- Then IQR = Q3 - Q1 = 96.5 - 62.5 = 34.0
Interquartile range using numpy.median
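A minimal sketch of this approach, assuming NumPy is installed and the 20 enrollment counts from the example are stored in a list. The variable names are illustrative; the quartiles are computed as medians of the lower and upper halves of the sorted data.

```python
import numpy as np

data = [75, 69, 56, 46, 47, 79, 92, 97, 89, 88,
        36, 96, 105, 32, 116, 101, 79, 93, 91, 112]

ordered = np.sort(data)
half = len(ordered) // 2

# Q1: median of the smallest half, Q3: median of the largest half.
q1 = np.median(ordered[:half])
q3 = np.median(ordered[len(ordered) - half:])

iqr = q3 - q1
print(iqr)
```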
Output: 34.0
Interquartile range using numpy.percentile
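A minimal sketch of this approach, assuming NumPy 1.22 or later, where the interpolation rule is selected with the `method` keyword (older releases name it `interpolation`). The `'midpoint'` rule is used here because it reproduces the median-of-halves quartiles for this dataset; with the default `'linear'` rule the same data gives 30.5 instead.

```python
import numpy as np

data = [75, 69, 56, 46, 47, 79, 92, 97, 89, 88,
        36, 96, 105, 32, 116, 101, 79, 93, 91, 112]

# 25th and 75th percentiles, taking the midpoint of the two
# neighbouring records when the position falls between them.
q1 = np.percentile(data, 25, method='midpoint')
q3 = np.percentile(data, 75, method='midpoint')

iqr = q3 - q1
print(iqr)
```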
Output: 34.0
Interquartile range using scipy.stats.iqr
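A minimal sketch of this approach, assuming SciPy is installed. `scipy.stats.iqr` interpolates linearly by default, so `interpolation='midpoint'` is passed here to match the median-of-halves quartiles for this dataset.

```python
from scipy import stats

data = [75, 69, 56, 46, 47, 79, 92, 97, 89, 88,
        36, 96, 105, 32, 116, 101, 79, 93, 91, 112]

# Difference between the 75th and 25th percentiles.
iqr = stats.iqr(data, interpolation='midpoint')
print(iqr)
```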
Output: 34.0
Quartile Deviation
Quartile deviation is half the difference between the third quartile (Q3) and the first quartile (Q1), i.e. half of the interquartile range (IQR). (Q3 - Q1) / 2 = IQR / 2
Decision making
A dataset with a higher quartile deviation has higher variability.
Quartile deviation using numpy.median
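A minimal sketch of this calculation, assuming NumPy is installed and reusing the 20 enrollment counts from the example. The variable names are illustrative; the quartiles are the medians of the lower and upper halves of the sorted data.

```python
import numpy as np

data = [75, 69, 56, 46, 47, 79, 92, 97, 89, 88,
        36, 96, 105, 32, 116, 101, 79, 93, 91, 112]

ordered = np.sort(data)
half = len(ordered) // 2

q1 = np.median(ordered[:half])                 # median of the 10 smallest values
q3 = np.median(ordered[len(ordered) - half:])  # median of the 10 largest values

# Quartile deviation is half of the interquartile range.
quartile_deviation = (q3 - q1) / 2
print(quartile_deviation)
```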
Output: 17.0