# ML | Mathematical explanation of RMSE and R-squared error

RMSE (root mean square error) is a measure of how well the regression line fits the data points. RMSE can also be interpreted as the standard deviation of the residuals.
Consider these points: (1, 1), (2, 2), (2, 3), (3, 6).
We split these data points into two lists, one for the x-coordinates and one for the y-coordinates.

```python
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]
```

Code: Regression Plot

```python
import matplotlib.pyplot as plt
import math

# plot the data points
plt.plot(x, y)

# x-axis label
plt.xlabel('x - axis')

# y-axis label
plt.ylabel('y - axis')

# give the graph a title
plt.title('Regression Graph')

# display the plot
plt.show()
```

Code: Mean Calculation

```python
# Next we find the equation of the line of best fit.
# The regression line has the form y = mx + c
# where m is the slope: (change in y) / (change in x)
# and c is a constant, the point where the line crosses the y-axis.
# The slope m is given by:
"""
m = sum((xi - x_mean) * (yi - y_mean)) / sum((xi - x_mean) ** 2)
for i = 1 .. N
"""

# calculate x_mean and y_mean
ct = len(x)
sum_x = 0
sum_y = 0

for i in x:
    sum_x = sum_x + i
x_mean = sum_x / ct
print('Value of X mean', x_mean)

for i in y:
    sum_y = sum_y + i
y_mean = sum_y / ct
print('Value of Y mean', y_mean)

# we now have x_mean and y_mean
```

Output:

```
Value of X mean 2.0
Value of Y mean 3.0
```
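The article uses a slope of 2.5 in the next step; as a sketch, the slope formula above can also be evaluated directly in Python for these points (variable names are assumptions, not from the original code):

```python
x = [1, 2, 2, 3]
y = [1, 2, 3, 6]

x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# m = sum((xi - x_mean)(yi - y_mean)) / sum((xi - x_mean)^2)
num = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
den = sum((xi - x_mean) ** 2 for xi in x)
m = num / den
print('Slope', m)  # → 2.5
```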

Code: Linear Equation

```python
# Below is the process of finding the linear equation mathematically.
# Our line's slope is 2.5.
# Evaluate c to complete the equation.

m = 2.5
c = y_mean - m * x_mean
print('Intercept', c)
```

Output:

```
Intercept -2.0
```

Code: Root Mean Square Error

```python
# Our regression line equation is:
# y_pred = 2.5x - 2.0
# We call the predicted values y_pred.

from sklearn.metrics import mean_squared_error

# y_pred for each of our data points, as shown below
y = [1, 2, 3, 6]
y_pred = [0.5, 3, 3, 5.5]

# root mean square error using sklearn
rmse = math.sqrt(mean_squared_error(y, y_pred))
print('Root mean square error', rmse)
```

Output:

```
Root mean square error 0.6123724356957945
```

Code: RMSE Calculation

```python
# Let's see how the root mean square error is calculated mathematically.
# First, a term called residual:
# residual - the distance from a data point to the regression line.
# RMSE is calculated from the residuals as shown below.
# We have 4 data points.
"""
ri = yi - y_pred_i,  where y_pred_i = m * xi + c
so ri = yi - (m * xi + c)
e.g. for x = 1, the y value is 1; we evaluate what the
model predicted for x = 1 and subtract it from y.
"""

# (1, 1): r1 = 1 - (2.5 * 1 - 2.0) = 0.5
r1 = 1 - (2.5 * 1 - 2.0)

# (2, 2): r2 = 2 - (2.5 * 2 - 2.0) = -1
r2 = 2 - (2.5 * 2 - 2.0)

# (2, 3): r3 = 3 - (2.5 * 2 - 2.0) = 0
r3 = 3 - (2.5 * 2 - 2.0)

# (3, 6): r4 = 6 - (2.5 * 3 - 2.0) = 0.5
r4 = 6 - (2.5 * 3 - 2.0)

# residual values from the calculations above
residuals = [0.5, -1, 0, 0.5]

# now calculate the root mean square error
# N = number of data points
N = 4
rmse = math.sqrt((r1 ** 2 + r2 ** 2 + r3 ** 2 + r4 ** 2) / N)
print('Root mean square error using maths', rmse)

# The RMSE calculated by hand matches the sklearn value.
```

Output:

```
Root mean square error using maths 0.6123724356957945
```
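The introduction notes that RMSE can be read as the standard deviation of the residuals. Since the residuals here average to zero, this can be checked directly; a small sketch using the standard-library `statistics` module:

```python
import math
import statistics

residuals = [0.5, -1, 0, 0.5]

# population standard deviation of the residuals
sd = statistics.pstdev(residuals)

# RMSE computed directly from the residuals
rmse = math.sqrt(sum(r ** 2 for r in residuals) / len(residuals))

# because the residuals sum to zero, the two values agree
print(sd, rmse)  # → 0.6123724356957945 0.6123724356957945
```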

R-squared error or coefficient of determination
The R² error answers the following question:
how much of the variation in y is explained by the variation in x? In other words, it is the proportion of the variance in y that the regression line accounts for.
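As a sketch of how R² could be computed for this data, using the standard definition R² = 1 - SS_res / SS_tot (the variable names below are assumptions, not from the original code):

```python
y = [1, 2, 3, 6]
y_pred = [0.5, 3, 3, 5.5]

y_mean = sum(y) / len(y)

# residual sum of squares: unexplained variation
ss_res = sum((yi - yp) ** 2 for yi, yp in zip(y, y_pred))

# total sum of squares: total variation in y
ss_tot = sum((yi - y_mean) ** 2 for yi in y)

r_squared = 1 - ss_res / ss_tot
print('R-squared', r_squared)
```

For these points SS_res = 1.5 and SS_tot = 14, so roughly 89% of the variation in y is explained by the regression line.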