-> Deviation -> Variance -> Standard Deviation -> Mean Absolute Deviation -> Median Absolute Deviation -> Order Statistics -> Range -> Percentile -> Inter-quartile Range
Median absolute deviation: Mean absolute deviation, variance and standard deviation (discussed in the previous section) are not robust to extreme values and outliers. The median absolute deviation is more robust: we take the median of the absolute deviations from the median.
For comparison, the mean absolute deviation of a small sequence:
Sequence: [2, 4, 6, 8]
Mean = 5
Deviation around mean = [-3, -1, 1, 3]
Mean Absolute Deviation = (3 + 1 + 1 + 3) / 4 = 2
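The arithmetic for the sequence [2, 4, 6, 8] can be checked with a few lines of NumPy. This is a minimal sketch; the helper name `mean_abs_dev` is chosen here for illustration and is not a NumPy function.

```python
import numpy as np

def mean_abs_dev(data):
    # Hypothetical helper: average of the absolute deviations from the mean
    data = np.asarray(data, dtype=float)
    return np.mean(np.abs(data - np.mean(data)))

sequence = [2, 4, 6, 8]
print("Mean Absolute Deviation:", mean_abs_dev(sequence))  # 2.0
```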
# Median Absolute Deviation
import numpy as np

def mad(data):
    # Median of the absolute deviations from the median
    data = np.asarray(data)
    return np.median(np.absolute(data - np.median(data)))

Sequence = [2, 4, 10, 6, 8, 11]
print("Median Absolute Deviation:", mad(Sequence))
Median Absolute Deviation: 3.0
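To illustrate the robustness claim, the sketch below replaces the largest value of the example sequence with an artificial outlier and compares the standard deviation with the median absolute deviation. The outlier value 1000 and the helper name `median_abs_dev` are assumptions made for this illustration only.

```python
import numpy as np

def median_abs_dev(data):
    # Hypothetical helper: median of the absolute deviations from the median
    data = np.asarray(data, dtype=float)
    return np.median(np.abs(data - np.median(data)))

clean = [2, 4, 10, 6, 8, 11]
with_outlier = [2, 4, 10, 6, 8, 1000]  # 11 replaced by an artificial outlier

for name, seq in [("clean", clean), ("with outlier", with_outlier)]:
    print(name,
          "| std:", round(float(np.std(seq)), 2),
          "| median abs dev:", median_abs_dev(seq))
```

In this example the median absolute deviation stays at 3.0 for both sequences, while the standard deviation grows by roughly two orders of magnitude.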
Order statistics: This approach to measuring variability is based on the spread of ranked (sorted) data.
Range: This is the most basic measure based on order statistics. It is the difference between the largest and smallest values in the dataset. It gives a quick sense of the spread of the data, but it is very sensitive to outliers. We can do better by dropping the extremes. Example:
Sequence: [2, 30, 50, 46, 37, 91]
Here, 2 and 91 are outliers
Range = 91 - 2 = 89
Range without outliers = 50 - 30 = 20
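The same calculation can be sketched in Python. Dropping exactly one value from each end of the sorted data is an assumption that matches this example; real data may call for a different trimming rule.

```python
sequence = [2, 30, 50, 46, 37, 91]

# Full range: largest value minus smallest value
full_range = max(sequence) - min(sequence)

# Assumption for this example: treat the single smallest and single
# largest values as the outliers and drop them before taking the range
trimmed = sorted(sequence)[1:-1]
trimmed_range = max(trimmed) - min(trimmed)

print("Range:", full_range)                      # 89
print("Range without outliers:", trimmed_range)  # 20
```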
Percentile: This is a good way to measure data variability while limiting the influence of outliers. The P-th percentile of the data is a value such that at least P% of the values are less than or equal to it, and at least (100 - P)% of the values are greater than or equal to it. The median is the 50th percentile of the data. Example:
Sequence = [2, 30, 50, 46, 37, 91]
print("50th Percentile:", np.percentile(Sequence, 50))
print("60th Percentile:", np.percentile(Sequence, 60))

50th Percentile: 41.5
60th Percentile: 46.0
Interquartile range (IQR): This also works on ranked (sorted) data. Three quartiles divide the data into four parts: Q1 (25th percentile), Q2 (50th percentile) and Q3 (75th percentile). The interquartile range is the difference between Q3 and Q1.
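As a rough sketch, the interquartile range can be computed with `np.percentile`, reusing the sequence from the range and percentile examples. The values below rely on NumPy's default linear interpolation between data points; other quartile conventions can give slightly different numbers.

```python
import numpy as np

sequence = [2, 30, 50, 46, 37, 91]

# Q1 and Q3 are the 25th and 75th percentiles
# (np.percentile sorts the data internally)
q1, q3 = np.percentile(sequence, [25, 75])
iqr = q3 - q1

print("Q1:", q1)    # 31.75
print("Q3:", q3)    # 49.0
print("IQR:", iqr)  # 17.25
```

Because Q1 and Q3 ignore the lowest and highest quarters of the data, the outliers 2 and 91 have little effect on the result.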