
Setting SVM Hyperparameter Using GridSearchCV | ML

SVM has hyperparameters (for example, which C or gamma values to use), and finding the optimal hyperparameters is very difficult. But you can find them by trying all the combinations and seeing which parameters work best. The basic idea is to create a grid of hyperparameters and try all combinations of them (hence this method is called GridSearch). Don’t worry, we don’t have to do it manually: Scikit-learn provides this functionality built in as GridSearchCV.

GridSearchCV takes a dictionary describing the parameters that should be tried on the model. The parameter grid is defined as a dictionary where the keys are the parameter names and the values are the settings to be tested.
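For example, a grid with three C values and two kernels produces six candidate combinations to evaluate (the values below are hypothetical, chosen only for illustration):

```python
# A minimal example of a parameter grid: keys are hyperparameter
# names, values are the candidate settings to try.
example_grid = {
    "C": [0.1, 1, 10],            # regularization strength
    "kernel": ["linear", "rbf"],  # kernel type
}

# GridSearchCV will evaluate every combination: 3 * 2 = 6 candidates.
n_candidates = len(example_grid["C"]) * len(example_grid["kernel"])
print(n_candidates)  # 6
```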

This article shows how to use GridSearchCV to find the optimal hyperparameters and thereby improve the accuracy of the predictions.

Import the required libraries and get the data —

We will use the built-in breast cancer dataset from Scikit-learn. We can get it with the load_breast_cancer function:

import pandas as pd

import numpy as np

from sklearn.metrics import classification_report, confusion_matrix

from sklearn.datasets import load_breast_cancer

from sklearn.svm import SVC


cancer = load_breast_cancer()

# The dataset is represented as a dictionary:

print(cancer.keys())

dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])

Now we will extract all the features into a new dataframe and our target variable into a separate dataframe.

df_feat = pd.DataFrame(cancer['data'],
                       columns=cancer['feature_names'])

# the cancer column is our target

df_target = pd.DataFrame(cancer['target'],
                         columns=['Cancer'])


print("Feature Variables:")

print(df_feat.info())

print("Dataframe looks like:")

print(df_feat.head())

Train Test Split

Now we will split our data into training and test sets with a 70:30 ratio.

from sklearn.model_selection import train_test_split


X_train, X_test, y_train, y_test = train_test_split(
    df_feat, np.ravel(df_target),
    test_size=0.30, random_state=101)

Train the support vector classifier without tweaking the hyperparameters —

First, we will train our model by calling the standard SVC() constructor without setting any hyperparameters, and look at its classification report and confusion matrix.

# train the model on the training set

model = SVC()

model.fit(X_train, y_train)

# print prediction results

predictions = model.predict(X_test)

print(classification_report(y_test, predictions))

We got 61% accuracy, but did you notice something strange?
Notice that the recall and precision for class 0 are always 0. This means the classifier is assigning everything to a single class, class 1! Our model needs its parameters tuned.

This is where GridSearch comes into the picture. We can search for the best parameters using GridSearch!

Use GridSearchCV

One of the great things about GridSearchCV is that it is a meta-estimator. It takes an estimator like SVC and creates a new estimator that behaves exactly the same way, in this case as a classifier. You should pass refit=True and choose a verbose value: the larger the number, the more verbose the text output describing the search process.

from sklearn.model_selection import GridSearchCV

# define the parameter grid

param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel': ['rbf']}


grid = GridSearchCV(SVC(), param_grid, refit=True, verbose=3)

# fitting the model for grid search

grid.fit(X_train, y_train)

What fit does here is a little more involved than usual. First, it runs the usual loop with cross-validation to find the best parameter combination. Having obtained the best combination, it fits again on all the data passed to fit (this time without cross-validation) to build a single new model using the best parameter settings.
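This two-stage behavior can be sketched on a tiny synthetic dataset (the dataset and grid values here are assumptions for illustration, not the breast cancer data from above):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small synthetic binary classification problem.
X, y = make_classification(n_samples=100, random_state=0)

# Stage 1: cross-validate every candidate in the grid.
# Stage 2 (refit=True, the default): refit the best candidate on all of X, y.
grid = GridSearchCV(SVC(), {"C": [0.1, 1]}, refit=True, cv=3)
grid.fit(X, y)

print(grid.best_score_)                     # mean CV score of the best candidate
print(grid.cv_results_["mean_test_score"])  # one mean CV score per candidate
```

Because of the refit step, the returned grid object is itself a trained classifier built with the winning parameters.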

You can inspect the best parameters found by GridSearchCV in the best_params_ attribute, and the best estimator in the best_estimator_ attribute:

# display the best parameters after tuning

print(grid.best_params_)

# print how our model looks after hyperparameter tuning

print(grid.best_estimator_)

Then you can rerun the predictions and view a classification report on this grid object just as you would with a normal model.

grid_predictions = grid.predict(X_test)

# print classification report

print(classification_report(y_test, grid_predictions))

We got almost 95% prediction accuracy.

