# Python | Linear regression using sklearn


Linear regression is a supervised machine learning algorithm that performs a regression task: it models a target (dependent) variable as a function of one or more explanatory variables. It is mainly used to find relationships between variables and to forecast. Regression models differ in the kind of relationship they assume between the dependent and explanatory variables and in the number of explanatory variables they use.
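To make the idea concrete, here is a minimal sketch of what "fitting a line" means, using only NumPy so the arithmetic stays visible. The data is synthetic (an assumption for illustration, generated from y = 2x + 1 plus a little noise), and the fit is an ordinary least-squares solve; it is not taken from the dataset used later in this article.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2x + 1 plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Design matrix with a column of ones so the intercept is estimated too.
A = np.column_stack([np.ones_like(x), x])

# Least-squares solution of A @ [intercept, slope] ~ y.
(intercept, slope), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
```

The recovered slope and intercept should land close to the values used to generate the data, which is the relationship a linear regressor is trying to estimate.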

This article demonstrates how to use various Python libraries to implement linear regression on a given dataset. We will fit a model with just two variables (one explanatory variable and one target), as it is easier to visualize.

`import numpy as np`

`import pandas as pd`

`import seaborn as sns`

`import matplotlib.pyplot as plt`

`from sklearn import preprocessing, svm`

`from sklearn.model_selection import train_test_split`

`from sklearn.linear_model import LinearRegression`

Step 3: Examining data scatter

`# Change the file reading location to match the dataset location (IPython/Jupyter command)`

`cd C:\Users\Dev\Desktop\Kaggle\Salinity`

`df = pd.read_csv('bottle.csv')`

`# Take only two selected attributes from the dataset (copied so later in-place edits are safe)`

`df_binary = df[['Salnty', 'T_degC']].copy()`

`# Rename the columns for easier coding`

`df_binary.columns = ['Sal', 'Temp']`

`# Display only the first rows along with the column names`

`df_binary.head()`

`# Plot the data scatter`

`sns.lmplot(x="Sal", y="Temp", data=df_binary, order=2, ci=None)`
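The `order=2` argument asks `sns.lmplot` for a second-degree polynomial fit rather than a straight line; seaborn's documentation describes this as a polynomial regression estimated with `numpy.polyfit`. The sketch below shows the difference between a degree-1 and a degree-2 fit on synthetic, deliberately curved data (the numbers are invented, not values from `bottle.csv`), and `sse` is a hypothetical helper defined only for this example.

```python
import numpy as np

# Synthetic, clearly curved data (illustration only, not the bottle.csv values).
x = np.linspace(-3.0, 3.0, 61)
y = 0.5 * x**2 - x + 2.0

# Degree-1 fit: a straight line. Degree-2 fit: a parabola, as with order=2.
line = np.polyfit(x, y, deg=1)
parabola = np.polyfit(x, y, deg=2)

def sse(coeffs):
    """Sum of squared residuals of a polynomial fit on (x, y)."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

print("line SSE:", sse(line))
print("parabola SSE:", sse(parabola))
```

If the curve drawn by `order=2` bends noticeably, that is an early hint that a straight-line model may fit the data poorly.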

Step 4: Clean up the data

`# Eliminate NaNs or missing input numbers by carrying the last valid value forward`

`df_binary.ffill(inplace=True)`
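Forward filling replaces each missing value with the last valid value above it in the same column. A small sketch on a made-up frame (the values are assumptions, not rows from `bottle.csv`) shows the effect, and also why a later `dropna` can still be needed: a missing value in the very first row has nothing above it to copy.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame (not the real bottle.csv data) with gaps in both columns.
demo = pd.DataFrame({
    "Sal": [np.nan, 33.4, np.nan, 33.6],
    "Temp": [10.5, np.nan, np.nan, 9.8],
})

filled = demo.ffill()  # each NaN takes the last valid value above it

print(filled)
print("NaNs left after ffill:", int(filled.isna().sum().sum()))

# A leading NaN has nothing above it, so it survives forward fill;
# dropping the remaining incomplete rows removes it.
cleaned = filled.dropna()
print("rows kept:", len(cleaned))
```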

Step 5: Train Our Model

`# Delete any rows that still contain NaN values (done before building X and y so it takes effect)`

`df_binary.dropna(inplace=True)`

`# Separate the data into independent and dependent variables;`

`# each column is converted to a NumPy array and reshaped into a single column`

`X = np.array(df_binary['Sal']).reshape(-1, 1)`

`y = np.array(df_binary['Temp']).reshape(-1, 1)`

`# Divide the data into training and test data`

`X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)`

`regr = LinearRegression()`

`regr.fit(X_train, y_train)`

`print(regr.score(X_test, y_test))`

Step 6: Examine our results

`y_pred = regr.predict(X_test)`

`# Scatter of the test data, with the predicted values drawn as a line`

`plt.scatter(X_test, y_test, color='b')`

`plt.plot(X_test, y_pred, color='k')`

`plt.show()`

The low score of our model (`regr.score` returns the coefficient of determination, R², not a classification accuracy) indicates that our regression model did not fit the existing data very well. This suggests that our data is not suitable for linear regression. But sometimes a dataset can accept a linear regressor if we only consider a part of it. Let's check it out.
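For reference, the number printed by `regr.score` is the coefficient of determination, R². A minimal sketch of how it is computed is shown here with made-up numbers; `r_squared` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)         # error left after the fit
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of always predicting the mean
    return 1.0 - ss_res / ss_tot

# Made-up numbers for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
print(r_squared(y_true, y_true))                # perfect predictions
print(r_squared(y_true, [6.0, 6.0, 6.0, 6.0]))  # always predicting the mean
```

A value near 1 means the line explains most of the variation in the target, a value near 0 means it does no better than predicting the mean, and the value can even be negative for a model that does worse than the mean.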

Step 7: Working with a smaller dataset

`# Select the first 500 rows of data (copied so in-place edits do not touch df_binary)`

`df_binary500 = df_binary[:500].copy()`

`sns.lmplot(x="Sal", y="Temp", data=df_binary500, order=2, ci=None)`

We can already see that the first 500 rows follow a roughly linear trend. Continue with the same steps as before.

`# Fill gaps forward, then drop any rows that still contain NaN values`

`df_binary500.ffill(inplace=True)`

`df_binary500.dropna(inplace=True)`

`X = np.array(df_binary500['Sal']).reshape(-1, 1)`

`y = np.array(df_binary500['Temp']).reshape(-1, 1)`

`X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)`

`regr = LinearRegression()`

`regr.fit(X_train, y_train)`

`print(regr.score(X_test, y_test))`

`y_pred = regr.predict(X_test)`

`plt.scatter(X_test, y_test, color='b')`

`plt.plot(X_test, y_pred, color='k')`

`plt.show()`
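Because `train_test_split` is called without a fixed `random_state`, the printed score changes from run to run, and with only 500 rows a single split can be misleading. One possible check, sketched here under stated assumptions, is to average R² over several splits with `cross_val_score`. The data below is a synthetic stand-in for the 500 rows (the slope, intercept and noise level are invented), so the printed scores say nothing about the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for 500 (Sal, Temp) rows; the slope, intercept and
# noise level are invented for illustration.
rng = np.random.default_rng(42)
X = rng.uniform(33.0, 35.0, size=(500, 1))
y = -4.0 * X[:, 0] + 150.0 + rng.normal(scale=0.5, size=500)

# Five different train/test partitions instead of a single random split.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")

print("R^2 per fold:", np.round(scores, 3))
print("mean R^2:", round(float(scores.mean()), 3))
```

With the real data, `X` and `y` would be the arrays built from `df_binary500` in the step above.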
