RMSE: root mean square error. It measures how well the regression line fits the data points, and it can also be interpreted as the standard deviation of the residuals. Consider these points: (1, 1), (2, 2), (2, 3), (3, 6). We split these data points into two lists, one for the x values and one for the y values:
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]
Code: Regression Plot
import matplotlib.pyplot as plt
import math

# plot the points
plt.plot(x, y)
# x-axis label
plt.xlabel('x - axis')
# y-axis label
plt.ylabel('y - axis')
# give the graph a title
plt.title('Regression Graph')
# display the plot
plt.show()
Code: Mean and Line of Best Fit
# In the next step we find the equation of the line of best fit.
# We use the slope-intercept form of a line to write the regression line:
#     y = mx + c
# where m is the slope, (change in y) / (change in x),
# and c is a constant: the value at which the line crosses the y-axis.
# With N data points, the slope m is given by:
#
#     m = Σ (xi - x_mean) * (yi - y_mean) / Σ (xi - x_mean)^2
#
# where both sums run over i = 1 ... N.
# The intercept then follows from c = y_mean - m * x_mean.
# For our data this gives m = 2.5 and c = -2.0.

# Now let's see how the root mean square error is calculated mathematically.
# First we introduce a term called the residual.
# A residual is the vertical distance from a data point to the regression line
# (drawn as red lines in the accompanying graph).
# We have 4 data points, and for each one:
#     ri = yi - y_pred_i, where y_pred_i = m * xi + c
#     so ri = yi - (m * xi + c)
# For example, take the point (1, 1): at x = 1 the observed y value is 1.
# The model predicts 2.5 * 1 - 2.0 = 0.5 for x = 1, so
#     r1 = 1 - (2.5 * 1 - 2.0) = 0.5
# RMSE is the square root of the mean of the squared residuals:
#     rmse = sqrt((r1^2 + r2^2 + ... + rN^2) / N)
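The slope formula above can be sketched in plain Python. This is a minimal illustration, not necessarily the article's original code: the variable names `x_mean`, `y_mean`, `numerator`, `denominator` and `y_pred` are assumptions chosen for readability, and the intercept uses the standard least-squares identity c = y_mean - m * x_mean.

```python
# Data points from the example: (1, 1), (2, 2), (2, 3), (3, 6)
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# slope: sum of cross-deviations divided by sum of squared x-deviations
numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
denominator = sum((xi - x_mean) ** 2 for xi in x)
m = numerator / denominator

# intercept: the least-squares line passes through (x_mean, y_mean)
c = y_mean - m * x_mean

# predicted y value for every x
y_pred = [m * xi + c for xi in x]

print('slope:', m, 'intercept:', c)
print('predictions:', y_pred)
```

For this data the slope comes out as 2.5 and the intercept as -2.0, which are the values used in the worked residual example.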
print('Root Mean square error using maths', rmse)
# this is the RMSE computed directly from the formula
# computing it with a library gives the same value
Output:
Root Mean square error using maths 0.6123724356957945
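The value printed above can be reproduced with a short sketch. It assumes the fitted line y = 2.5x - 2.0 from the worked example; the names `residuals` and `mse` are illustrative and do not come from the original code.

```python
import math

x = [1, 2, 2, 3]
y = [1, 2, 3, 6]
m, c = 2.5, -2.0  # slope and intercept from the worked example (assumed here)

# prediction and residual for each data point
y_pred = [m * xi + c for xi in x]
residuals = [yi - yp for yi, yp in zip(y, y_pred)]

# mean of the squared residuals, then its square root
mse = sum(r ** 2 for r in residuals) / len(residuals)
rmse = math.sqrt(mse)

print('Root Mean square error using maths', rmse)
```

The four residuals are 0.5, -1.0, 0.0 and 0.5, so the mean squared error is 1.5 / 4 = 0.375 and the RMSE is its square root.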
R-squared error, or coefficient of determination: R-squared answers the following question: how much of the variation in y is explained by the change in x? In other words, it is the fraction of the total variation in y that is accounted for by x.
# first compute the total variation of y about its mean
# the variation in y is calculated as
# y_var = (y1 - y_mean)**2 + (y2 - y_mean)**2 + ... + (yn - y_mean)**2
# y_var (also written SE_mean) sums the squared distances
# between each y value and the mean of y
# SE_line is the sum of the squared residuals about the regression line
# the answer to our question, the % of the total variation
# of y explained by x, is then expressed as follows:
# [SE_line / SE_mean] -> the fraction of the variation in y
#     that is not described by the regression line
# 1 - (SE_line / SE_mean) -> the fraction of the variation in y
#     that is described by the regression line
print('Rsquared error', r_squared)
Output:
Rsquared error 0.8928571428571429
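A minimal sketch of this calculation follows, again assuming the fitted line y = 2.5x - 2.0 from the worked example. The names `se_line` and `se_mean` mirror the SE_line and SE_mean terms used in the comments; everything else is illustrative.

```python
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]
m, c = 2.5, -2.0  # slope and intercept from the worked example (assumed here)

y_mean = sum(y) / len(y)

# squared error about the regression line (sum of squared residuals)
se_line = sum((yi - (m * xi + c)) ** 2 for xi, yi in zip(x, y))

# squared error about the mean of y (total variation in y)
se_mean = sum((yi - y_mean) ** 2 for yi in y)

# fraction of the variation in y described by the regression line
r_squared = 1 - se_line / se_mean

print('Rsquared error', r_squared)
```

Here SE_line is 1.5 and SE_mean is 14, so roughly 89% of the variation in y is described by the line.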
Code: R-Squared Error with sklearn
from sklearn.metrics import r2_score
# the R-squared calculated by sklearn matches
# the R-squared we calculated mathematically
# compute R-squared with sklearn
r2_score(y, y_pred)
Output:
0.8928571428571429