
ML | Implementing L1 and L2 regularization using Sklearn


This article implements L2 and L1 regularization for linear regression using the Ridge and Lasso classes from Python's Sklearn library.
Dataset: House Prices dataset (kc_house_data.csv).
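As a quick illustration of what the two penalties measure, here is a minimal sketch that computes the L1 and L2 penalty terms for a coefficient vector. The helper names `l1_penalty` and `l2_penalty` and the example weights are hypothetical and only serve the illustration; in Sklearn, the `alpha` argument of Ridge and Lasso plays the role of the penalty strength.

```python
import numpy as np

def l1_penalty(coef, alpha):
    """Lasso-style penalty: alpha times the sum of absolute coefficients."""
    return alpha * np.sum(np.abs(coef))

def l2_penalty(coef, alpha):
    """Ridge-style penalty: alpha times the sum of squared coefficients."""
    return alpha * np.sum(np.square(coef))

# Hypothetical coefficient vector, chosen only for easy arithmetic
w = np.array([3.0, -4.0, 0.0])

print(l1_penalty(w, alpha=0.5))  # 0.5 * (3 + 4 + 0) = 3.5
print(l2_penalty(w, alpha=0.5))  # 0.5 * (9 + 16 + 0) = 12.5
```

The L2 penalty grows quadratically with large coefficients, while the L1 penalty grows linearly, which is why the two methods shrink coefficients differently.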

Step 1: Import required libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split, cross_val_score
from statistics import mean

Step 2: Loading and cleaning data

# Change the working directory to the location of the data (IPython/Jupyter command)
cd C:\Users\Dev\Desktop\Kaggle\House Prices

 
# Loading the data into a Pandas DataFrame
data = pd.read_csv('kc_house_data.csv')

# Discarding numerically meaningless variables
dropColumns = ['id', 'date', 'zipcode']
data = data.drop(dropColumns, axis=1)

# Separating the dependent and independent variables
y = data['price']
X = data.drop('price', axis=1)

# Splitting the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
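Because train_test_split shuffles the rows randomly, the scores in the following steps change from run to run. A minimal sketch of a reproducible split is shown below. It uses a small synthetic DataFrame as a stand-in for the house-price data; the values and the `random_state=42` seed are assumptions made for illustration, not part of the original workflow.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the house-price data (values are made up)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "sqft_living": rng.uniform(500, 4000, size=100),
    "bedrooms": rng.integers(1, 6, size=100),
})
demo["price"] = 150 * demo["sqft_living"] + 10000 * demo["bedrooms"]

y_demo = demo["price"]
X_demo = demo.drop("price", axis=1)

# random_state fixes the shuffle, so the same rows land in each set on every run
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42
)
print(X_tr.shape, X_te.shape)  # (75, 2) (25, 2)
```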

Step 3: Build and evaluate different models

a) Linear regression:

# Building and fitting the linear regression model
linearModel = LinearRegression()
linearModel.fit(X_train, y_train)

# Evaluating the linear regression model
print(linearModel.score(X_test, y_test))
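For a regressor, `score` returns the coefficient of determination (R²) on the data passed to it. The sketch below, run on synthetic data with made-up coefficients, checks this by computing R² by hand and comparing it with the value from `score`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: a linear signal plus a little noise (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Fit on the first 150 rows, evaluate on the remaining 50
model = LinearRegression().fit(X_demo[:150], y_demo[:150])
X_te, y_te = X_demo[150:], y_demo[150:]

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
pred = model.predict(X_te)
ss_res = np.sum((y_te - pred) ** 2)
ss_tot = np.sum((y_te - y_te.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(np.isclose(r2_manual, model.score(X_te, y_te)))
```

A value close to 1 means the model explains most of the variance in the target; a value near 0 means it does no better than predicting the mean.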

b) Ridge (L2) regression:

# List to maintain the different cross-validation scores
cross_val_scores_ridge = []

# List to maintain the different alpha values
alpha = []

# Loop to compute the cross-validation score for each alpha value
for i in range(1, 9):
    ridgeModel = Ridge(alpha=i * 0.25)
    ridgeModel.fit(X_train, y_train)
    scores = cross_val_score(ridgeModel, X, y, cv=10)
    avg_cross_val_score = mean(scores) * 100
    cross_val_scores_ridge.append(avg_cross_val_score)
    alpha.append(i * 0.25)

# Loop to print the different cross-validation scores
for i in range(0, len(alpha)):
    print(str(alpha[i]) + ':' + str(cross_val_scores_ridge[i]))

From the above output, we can conclude that the best alpha value for the data is 2.
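Instead of reading the best value off the printed list, the selection can be done in code. The following is a minimal sketch on synthetic data generated with `make_regression`; the data and the names `mean_scores` and `best_alpha` are assumptions for illustration, so the value it picks will not necessarily match the one found on the house-price data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression problem standing in for the real dataset
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, noise=10.0, random_state=0
)

# Same grid as in the loop above: 0.25, 0.5, ..., 2.0
alphas = [i * 0.25 for i in range(1, 9)]

# Mean 10-fold cross-validation score (R^2) for each candidate alpha
mean_scores = [
    cross_val_score(Ridge(alpha=a), X_demo, y_demo, cv=10).mean()
    for a in alphas
]

# Pick the alpha with the highest mean score
best_index = int(np.argmax(mean_scores))
best_alpha = alphas[best_index]
print(best_alpha, mean_scores[best_index])
```

Note that cross_val_score clones and refits the estimator on each fold, so no separate call to `fit` is needed before it.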

# Building and fitting the Ridge regression model
ridgeModelChosen = Ridge(alpha=2)
ridgeModelChosen.fit(X_train, y_train)

# Evaluating the Ridge regression model
print(ridgeModelChosen.score(X_test, y_test))

c) Lasso (L1) regression:

# List to maintain the cross-validation scores
cross_val_scores_lasso = []

# List to maintain the different lambda values
Lambda = []

# Loop to compute the cross-validation score for each lambda value
for i in range(1, 9):
    lassoModel = Lasso(alpha=i * 0.25, tol=0.0925)
    lassoModel.fit(X_train, y_train)
    scores = cross_val_score(lassoModel, X, y, cv=10)
    avg_cross_val_score = mean(scores) * 100
    cross_val_scores_lasso.append(avg_cross_val_score)
    Lambda.append(i * 0.25)

# Loop to print the different cross-validation scores
for i in range(0, len(Lambda)):
    print(str(Lambda[i]) + ':' + str(cross_val_scores_lasso[i]))

From the above output, we can conclude that the best lambda value is 2.
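A practical difference between the two penalties is that Lasso can set some coefficients exactly to zero, while Ridge only shrinks them. The sketch below illustrates this on synthetic data in which only 3 of the 10 features carry signal; the data, the `alpha=2.0` setting, and the variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 10 features, only 3 of which influence the target
X_demo, y_demo = make_regression(
    n_samples=200, n_features=10, n_informative=3, noise=5.0, random_state=0
)

lasso = Lasso(alpha=2.0).fit(X_demo, y_demo)
ridge = Ridge(alpha=2.0).fit(X_demo, y_demo)

# Count the coefficients that are exactly zero in each fitted model
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
print("Lasso zero coefficients:", n_zero_lasso)
print("Ridge zero coefficients:", n_zero_ridge)
```

Inspecting `coef_` in this way on the fitted house-price models would show which variables Lasso effectively discards.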

# Building and fitting the Lasso regression model
lassoModelChosen = Lasso(alpha=2, tol=0.0925)
lassoModelChosen.fit(X_train, y_train)

# Evaluating the Lasso regression model
print(lassoModelChosen.score(X_test, y_test))

Step 4: Compare and visualize the results

# Building the two lists for visualization
models = ['Linear Regression', 'Ridge Regression', 'Lasso Regression']
scores = [linearModel.score(X_test, y_test),
          ridgeModelChosen.score(X_test, y_test),
          lassoModelChosen.score(X_test, y_test)]

# Building the dictionary to compare the scores
mapping = {}
mapping['Linear Regression'] = linearModel.score(X_test, y_test)
mapping['Ridge Regression'] = ridgeModelChosen.score(X_test, y_test)
mapping['Lasso Regression'] = lassoModelChosen.score(X_test, y_test)

# Printing the scores for the different models
for key, val in mapping.items():
    print(str(key) + ': ' + str(val))

# Plotting the scores
plt.bar(models, scores)
plt.xlabel('Regression Models')
plt.ylabel('Score')
plt.show()
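The comparison step can also be written more compactly by keeping the estimators in a dictionary and scoring them in one pass. This is a sketch on synthetic data, not the article's dataset; the names `candidates` and `results` are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the house-price dataset
X_demo, y_demo = make_regression(
    n_samples=300, n_features=8, noise=15.0, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

# One entry per model to compare
candidates = {
    "Linear Regression": LinearRegression(),
    "Ridge Regression": Ridge(alpha=2),
    "Lasso Regression": Lasso(alpha=2),
}

# Fit each model once and store its test-set R^2
results = {
    name: est.fit(X_tr, y_tr).score(X_te, y_te)
    for name, est in candidates.items()
}

# Print the models from best to worst score
for name, r2 in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {r2:.4f}")
```

The keys and values of `results` can be passed straight to `plt.bar` in place of the two separate lists.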
