ML | Mathematical explanation of RMSE and R-squared error

RMSE (root mean square error) is a measure of how well the regression line fits the data points. RMSE can also be interpreted as the standard deviation of the residuals.
Consider these points: (1, 1), (2, 2), (2, 3), (3, 6).
We split the above data points into two lists, one for the x values and one for the y values.
Input:

x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

Code: Regression Plot

import math
import matplotlib.pyplot as plt

# same data points as in the input above
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

# plot the points
plt.plot(x, y)

# x-axis label
plt.xlabel('x - axis')

# y-axis label
plt.ylabel('y - axis')

# give the graph a title
plt.title('Regression Graph')

# show the plot
plt.show()


Code: Mean calculation

# in the next step we will find the equation of the best-fit line
# we use the slope-intercept form from linear algebra for the regression line
# the slope-intercept form is y = mx + c
# where m is the slope, (change in y) / (change in x)
# c is a constant; it is the point where the line crosses the y-axis
# the slope m can be formulated like this:
"""
m = Σ (xi - x_mean) * (yi - y_mean) / Σ (xi - x_mean) ^ 2

with both sums running over i = 1 .. N
"""

# calculate x_mean and y_mean
ct = len(x)
sum_x = 0
sum_y = 0

for i in x:
    sum_x = sum_x + i
x_mean = sum_x / ct
print('Value of X mean', x_mean)

for i in y:
    sum_y = sum_y + i
y_mean = sum_y / ct
print('value of Y mean', y_mean)

# we now have x_mean and y_mean

Output:

Value of X mean 2.0
value of Y mean 3.0
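
The slope formula quoted in the comments above can be evaluated directly from the data. The following is a minimal sketch rather than part of the original walkthrough: the names `numerator` and `denominator` are illustrative, and the means are recomputed so that the snippet runs on its own.

```python
# data points from the article
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# numerator of the slope formula: sum of (xi - x_mean) * (yi - y_mean)
numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# denominator of the slope formula: sum of (xi - x_mean) ** 2
denominator = sum((xi - x_mean) ** 2 for xi in x)

m = numerator / denominator
print('Slope', m)
```

For these four points the numerator is 5.0 and the denominator is 2.0, which gives the slope of 2.5 used in the next step.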

Code: Linear equation

# below is the process of finding the linear equation in mathematical terms
# the slope of our line is 2.5
# evaluate c to complete the equation

m = 2.5
c = y_mean - m * x_mean
print('Intercept', c)

Output:

 Intercept -2.0 
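
With the slope and intercept known, the model's prediction for every x can be generated instead of being typed in by hand. This is a minimal sketch that assumes the values m = 2.5 and c = -2.0 found above; the helper name `predict` is hypothetical.

```python
x = [1, 2, 2, 3]

m = 2.5   # slope found above
c = -2.0  # intercept found above


def predict(xi):
    """Return the value of the regression line y = m * x + c at xi."""
    return m * xi + c


# one prediction per data point
y_pred = [predict(xi) for xi in x]
print(y_pred)
```

This prints [0.5, 3.0, 3.0, 5.5], the same predictions that the error calculations work with.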

Code: Mean squared error

# our regression line equation looks like this:
# y_pred = 2.5x - 2.0
# we name the list of predictions y_pred

from sklearn.metrics import mean_squared_error

# y_pred for each of our data points, as shown below
y = [1, 2, 3, 6]
y_pred = [0.5, 3, 3, 5.5]

# sklearn returns the mean squared error; its square root is the RMSE
rmse = math.sqrt(mean_squared_error(y, y_pred))
print('Root mean square error', rmse)

Output:

Root mean square error 0.6123724356957945

Code: RMSE Calculation

# let's see how the root mean square error is calculated mathematically
# first we introduce a term called residuals
# a residual is the vertical distance from a data point to the regression line
# the residuals are indicated by the red lines in the graph
# the RMSE is calculated from the residuals as shown below
# we have 4 data points
"""
for i = 1 .. 4:  ri = yi - y_pred_i
y_pred_i is m * xi + c
ri = yi - (m * xi + c)
e.g. for x = 1 the observed y value is 1
we want to evaluate what our model predicted for x = 1
"""
# (1, 1): x = 1, y = 1
# r1 = 1 - (2.5 * 1 - 2.0) = 0.5
r1 = 1 - (2.5 * 1 - 2.0)

# (2, 2): x = 2, y = 2
# r2 = 2 - (2.5 * 2 - 2.0) = -1
r2 = 2 - (2.5 * 2 - 2.0)

# (2, 3): x = 2, y = 3
# r3 = 3 - (2.5 * 2 - 2.0) = 0
r3 = 3 - (2.5 * 2 - 2.0)

# (3, 6): x = 3, y = 6
# r4 = 6 - (2.5 * 3 - 2.0) = 0.5
r4 = 6 - (2.5 * 3 - 2.0)

# from the calculations above we have the residual values
residuals = [0.5, -1, 0, 0.5]

# now calculate the root mean square error
# N = 4 data points
N = 4
rmse = math.sqrt((r1 ** 2 + r2 ** 2 + r3 ** 2 + r4 ** 2) / N)
print('Root Mean square error using maths', rmse)

# this is the RMSE calculated by hand
# it matches the RMSE calculated with sklearn

Output:

 Root Mean square error using maths 0.6123724356957945 
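
The hand calculation above generalizes to any number of points. The sketch below wraps it in a hypothetical helper, `rmse`, and also illustrates the earlier remark that RMSE can be read as the standard deviation of the residuals: for this fit the residuals average to zero, so their population standard deviation (`statistics.pstdev` from the standard library) coincides with the RMSE.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error of the predictions (illustrative helper)."""
    squared = [(yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


y = [1, 2, 3, 6]
y_pred = [0.5, 3, 3, 5.5]

# residual = observed value - predicted value
residuals = [yt - yp for yt, yp in zip(y, y_pred)]

print('Residuals', residuals)
print('RMSE', rmse(y, y_pred))
print('Std of residuals', statistics.pstdev(residuals))
```

The two printed error values agree only because the residuals of this line have mean zero; for predictions with a systematic offset the standard deviation of the residuals would be smaller than the RMSE.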

R-squared error or coefficient of determination
The R-squared (R2) score answers the following question:
How much of the variation in y is explained by the variation in x? In other words, it is the fraction of the total variation in y that the regression line describes.

Code: R-Squared Error

# SE_line = (y1 - (m*x1 + b))**2 + (y2 - (m*x2 + b))**2 + ... + (yn - (m*xn + b))**2
# SE_line = (1 - (2.5*1 + (-2)))**2 + (2 - (2.5*2 + (-2)))**2 + (3 - (2.5*2 + (-2)))**2 + (6 - (2.5*3 + (-2)))**2

val1 = (1 - (2.5 * 1 + (-2))) ** 2
val2 = (2 - (2.5 * 2 + (-2))) ** 2
val3 = (3 - (2.5 * 2 + (-2))) ** 2
val4 = (6 - (2.5 * 3 + (-2))) ** 2
SE_line = val1 + val2 + val3 + val4
print('val', val1, val2, val3, val4)

# next, the total variation of y around its mean
# the variation in y is calculated as
# y_var = (y1 - y_mean)**2 + (y2 - y_mean)**2 + ... + (yn - y_mean)**2

y = [1, 2, 3, 6]

y_var = (1 - 3) ** 2 + (2 - 3) ** 2 + (3 - 3) ** 2 + (6 - 3) ** 2
SE_mean = y_var

# by computing y_var, we measure the squared distance
# between the data points y and the mean of y
# so the answer to our question, the % of the total variation
# in y that is explained by x, is given below:

r_squared = 1 - (SE_line / SE_mean)

# SE_line / SE_mean -> tells us what % of the variation
# is not described by the regression line
# 1 - (SE_line / SE_mean) -> gives us
# what % of the variation in y is described by x

print('Rsquared error', r_squared)

Output:

 Rsquared error 0.8928571428571429 
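
The same calculation can be written once for arbitrary data. This is a minimal sketch with a hypothetical helper name, `r_squared_score`, that follows the steps above: the squared error around the line divided by the squared error around the mean, subtracted from 1.

```python
def r_squared_score(y_true, y_pred):
    """Coefficient of determination, 1 - SE_line / SE_mean (illustrative helper)."""
    y_mean = sum(y_true) / len(y_true)
    se_line = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    se_mean = sum((yt - y_mean) ** 2 for yt in y_true)
    return 1 - se_line / se_mean


y = [1, 2, 3, 6]
y_pred = [0.5, 3, 3, 5.5]

print('R-squared', r_squared_score(y, y_pred))

# predicting the mean of y for every point explains none of the variation
print('R-squared of the mean', r_squared_score(y, [3, 3, 3, 3]))
```

The second call returns 0.0, which gives a reference point for reading the score: a model scores above zero only to the extent that it beats always predicting the mean of y.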

Code: R-Squared Error with sklearn

from sklearn.metrics import r2_score

# the r2 calculated by sklearn matches
# the r2 we calculated by hand
# compute the r2 score with sklearn
print(r2_score(y, y_pred))

Output:

0.8928571428571429 
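
As a further cross-check that does not need sklearn: for a straight-line least-squares fit with an intercept, R-squared equals the square of the Pearson correlation between x and y. The sketch below is an addition to the original walkthrough, and its variable names are illustrative.

```python
import math

x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# sums of products and squares of the deviations from the means
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_yy = sum((yi - y_mean) ** 2 for yi in y)

# Pearson correlation coefficient between x and y
r = s_xy / math.sqrt(s_xx * s_yy)

print('r squared', r ** 2)
```

The printed value agrees with the R-squared computed above up to floating-point rounding.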
