Python | Mean Squared Error



Steps to find the MSE

  1. Find the equation for the regression line.

    (1)  Y' = a + bX

  2. Insert the X values into the equation found in step 1 to get the corresponding predicted Y values, i.e.

    (2)  Y'_i = a + b*X_i,  i = 1, ..., n

  3. Now subtract the new Y values (i.e. Y') from the original values of Y. The resulting values are the error terms. This is also known as the vertical distance of a given point from the regression line.

    (3)  e_i = Y_i - Y'_i

  4. Square the errors found in step 3.

    (4)  e_i^2 = (Y_i - Y'_i)^2

  5. Add up all the squared errors.

    (5)  Σ (Y_i - Y'_i)^2

  6. Divide the value found in step 5 by the total number of observations, n (a short plain-Python sketch of steps 3-6 follows this list).

    (6)  MSE = (1/n) Σ (Y_i - Y'_i)^2
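
Steps 3 to 6 translate directly into a few lines of plain Python. The sketch below is only an illustration (the helper name mse is ours); it assumes the original Y values and the predicted Y' values from step 2 are already available as lists:

def mse(y_true, y_pred):
    # Step 3: error terms (vertical distances of each point from the regression line)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    # Step 4: square each error
    squared_errors = [e ** 2 for e in errors]
    # Steps 5 and 6: sum the squares and divide by the number of observations
    return sum(squared_errors) / len(y_true)

Calling mse([1, 1, 2, 2, 4], [0.6, 1.29, 1.99, 2.69, 3.4]) with the example values from the table below gives 0.21606, the same result obtained with scikit-learn and NumPy at the end of the article.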

Example:
Consider these points: (1, 1), (2, 1), (3, 2), (4, 2), (5, 4)
You can use an online calculator to find the equation of the regression line, or compute it in code as sketched below.

Regression line equation: Y = 0.7X - 0.1
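
If you would rather not rely on an online calculator, the same least-squares line can be fitted in code. The snippet below is just a sketch using numpy.polyfit; the variable names x and y simply hold the example points above:

import numpy as np

# Example points from above
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 1, 2, 2, 4])

# Least-squares fit of a degree-1 polynomial: returns (slope, intercept)
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)   # approximately 0.7 and -0.1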

X    Y    Y' (predicted)
1    1    0.6
2    1    1.29
3    2    1.99
4    2    2.69
5    4    3.4

Now, applying the MSE formula from step 6 above to these values, we get MSE = 0.21606.
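
Working through steps 3 to 6 with the values in the table:

Error terms (Y - Y'):   0.4, -0.29, 0.01, -0.69, 0.6
Squared errors:         0.16, 0.0841, 0.0001, 0.4761, 0.36
Sum of squared errors:  1.0803
MSE = 1.0803 / 5 = 0.21606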

MSE using scikit-learn:

from sklearn.metrics import mean_squared_error

# Given values
Y_true = [1, 1, 2, 2, 4]  # Y_true = Y (original values)

# Calculated values
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]  # Y_pred = Y' (predicted values)

# Calculation of Mean Squared Error (MSE)
mean_squared_error(Y_true, Y_pred)

  Output:  0.21606

MSE using Numpy module:

import numpy as np

# Given values
Y_true = [1, 1, 2, 2, 4]  # Y_true = Y (original values)

# Calculated values
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]  # Y_pred = Y' (predicted values)

# Calculation of Mean Squared Error (MSE)
MSE = np.square(np.subtract(Y_true, Y_pred)).mean()
MSE

  Output:  0.21606
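
As a quick sanity check, the two approaches can be compared directly; this is only a sketch combining the snippets above in one script:

import numpy as np
from sklearn.metrics import mean_squared_error

Y_true = [1, 1, 2, 2, 4]
Y_pred = [0.6, 1.29, 1.99, 2.69, 3.4]

# Both computations should agree to floating-point precision
sk_mse = mean_squared_error(Y_true, Y_pred)
np_mse = np.square(np.subtract(Y_true, Y_pred)).mean()
print(np.isclose(sk_mse, np_mse))   # True
print(sk_mse)                       # ~0.21606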