
# ML | Mini-Batch Gradient Descent with Python

Depending on the number of training examples considered when updating the model parameters, there are three types of gradient descent:

1. Batch gradient descent: parameters are updated after the error gradient is computed across the entire training set.
2. Stochastic gradient descent: parameters are updated after the error gradient is computed with respect to a single training example.
3. Mini-batch gradient descent: parameters are updated after the error gradient is computed with respect to a subset of the training set.
| Batch Gradient Descent | Stochastic Gradient Descent | Mini-Batch Gradient Descent |
| --- | --- | --- |
| The entire training set is considered before taking a step in the direction of the gradient, so a single update takes a long time. | Only a single training example is considered before taking a step in the direction of the gradient, so we are forced to loop over the training set and cannot exploit the speed of vectorized code. | A subset of training examples is considered, so updates to the model parameters are quick and the code can still exploit the speed of vectorization. |
| It makes smooth updates to the model parameters. | It makes very noisy updates to the parameters. | Depending on the batch size, the updates can be made less noisy: the greater the batch size, the less noisy the update. |

So mini-batch gradient descent makes a trade-off between fast convergence and the noise associated with gradient updates, making it a more flexible and robust algorithm.
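This trade-off is easy to see numerically. The sketch below is illustrative only and uses toy data, not the article's: it measures how far mini-batch gradients of a least-squares loss stray from the full-batch gradient at different batch sizes (the names `grad`, `true_w`, and the data are assumptions for this demo).

```python
import numpy as np

# toy regression data (assumed for this demo)
rng = np.random.default_rng(0)
X = rng.normal(size=(8000, 2))
true_w = np.array([1.0, -2.0])
y = X @ true_w + rng.normal(scale=0.5, size=8000)
w = np.zeros(2)  # evaluate gradients at an arbitrary parameter point

def grad(Xb, yb, w):
    # least-squares gradient, averaged over the batch
    return Xb.T @ (Xb @ w - yb) / len(yb)

full = grad(X, y, w)  # the "true" full-batch gradient
for bs in (1, 32, 1024):
    # average distance of mini-batch gradients from the full-batch gradient
    devs = [np.linalg.norm(grad(X[i:i + bs], y[i:i + bs], w) - full)
            for i in range(0, 4096, bs)]
    print(bs, np.mean(devs))
```

The printed deviation shrinks as the batch size grows, which is exactly the "greater the batch size, less noisy the update" behavior in the table above.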

Algorithm:

Let theta = model parameters and max_iters = number of epochs.

```
for itr = 1, 2, 3, ..., max_iters:
    for each mini-batch (X_mini, y_mini) of the training set:
        Forward pass on the batch X_mini:
            1. Make predictions on the mini-batch
            2. Compute the error in the predictions (J(theta)) with the current values of the parameters
        Backward pass:
            Compute gradient(theta) = partial derivative of J(theta) w.r.t. theta
        Update parameters:
            theta = theta - learning_rate * gradient(theta)
```
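The pseudocode above maps to a short generic loop. The sketch below is a minimal illustration under assumed names (`minibatch_gd`, `lsq_grad` and the toy data are invented here), not the article's implementation, which follows in Step #2:

```python
import numpy as np

def minibatch_gd(X, y, compute_gradient, learning_rate=0.01,
                 batch_size=32, max_iters=10, seed=0):
    # theta starts at zero: one parameter per column of X
    theta = np.zeros((X.shape[1], 1))
    rng = np.random.default_rng(seed)
    for _ in range(max_iters):                    # one outer pass = one epoch
        idx = rng.permutation(len(X))             # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            theta -= learning_rate * compute_gradient(X[batch], y[batch], theta)
    return theta

# least-squares gradient, averaged over the batch, as the "backward pass"
def lsq_grad(Xb, yb, th):
    return Xb.T @ (Xb @ th - yb) / len(yb)

# usage on a noiseless toy line y = 2 + 3x
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=(200, 1))
X = np.hstack((np.ones((200, 1)), x))
y = 2.0 + 3.0 * x
theta = minibatch_gd(X, y, lsq_grad, learning_rate=0.1,
                     batch_size=32, max_iters=200)
print(theta.ravel())  # close to [2, 3]
```

Reshuffling the index permutation each epoch is what makes the batches "mini-batches" rather than a fixed partition.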

Below is the Python implementation:

Step #1: First, we import dependencies, generate data for linear regression, and visualize the generated data. We create 8,000 data samples, each with 2 attributes/features. The samples are further divided into a training set (X_train, y_train) and a test set (X_test, y_test), with 7,200 and 800 examples respectively.

```python
# importing dependencies
import numpy as np
import matplotlib.pyplot as plt

# creating data
mean = np.array([5.0, 6.0])
cov = np.array([[1.0, 0.95], [0.95, 1.2]])
data = np.random.multivariate_normal(mean, cov, 8000)

# visualising data
plt.scatter(data[:500, 0], data[:500, 1], marker='.')
plt.show()

# train-test split
data = np.hstack((np.ones((data.shape[0], 1)), data))

split_factor = 0.90
split = int(split_factor * data.shape[0])

X_train = data[:split, :-1]
y_train = data[:split, -1].reshape((-1, 1))
X_test = data[split:, :-1]
y_test = data[split:, -1].reshape((-1, 1))

print("Number of examples in training set = %d" % (X_train.shape[0]))
print("Number of examples in testing set = %d" % (X_test.shape[0]))
```

Output:

Number of examples in training set = 7200
Number of examples in testing set = 800

Step #2: Next, we write the code to implement linear regression using mini-batch gradient descent.
`gradientDescent()` is the main driver function; the others are helper functions used for making predictions — `hypothesis()`, computing gradients — `gradient()`, computing the error — `cost()`, and creating the mini-batches — `create_mini_batches()`. The driver function initializes the parameters, computes the best set of parameters for the model, and returns them along with a list containing the history of the error as the parameters were updated.


```python
# linear regression using "mini-batch" gradient descent

# function to compute hypothesis / predictions
def hypothesis(X, theta):
    return np.dot(X, theta)

# function to compute the gradient of the error function w.r.t. theta
def gradient(X, y, theta):
    h = hypothesis(X, theta)
    grad = np.dot(X.transpose(), (h - y))
    return grad

# function to compute the error for the current values of theta
def cost(X, y, theta):
    h = hypothesis(X, theta)
    J = np.dot((h - y).transpose(), (h - y))
    J /= 2
    return J[0]

# function to create a list containing mini-batches
def create_mini_batches(X, y, batch_size):
    mini_batches = []
    data = np.hstack((X, y))
    np.random.shuffle(data)
    n_minibatches = data.shape[0] // batch_size

    for i in range(n_minibatches):
        mini_batch = data[i * batch_size:(i + 1) * batch_size, :]
        X_mini = mini_batch[:, :-1]
        Y_mini = mini_batch[:, -1].reshape((-1, 1))
        mini_batches.append((X_mini, Y_mini))
    # last, smaller batch if batch_size does not divide the data evenly
    if data.shape[0] % batch_size != 0:
        mini_batch = data[n_minibatches * batch_size:]
        X_mini = mini_batch[:, :-1]
        Y_mini = mini_batch[:, -1].reshape((-1, 1))
        mini_batches.append((X_mini, Y_mini))
    return mini_batches

# function to perform mini-batch gradient descent
def gradientDescent(X, y, learning_rate=0.001, batch_size=32):
    theta = np.zeros((X.shape[1], 1))
    error_list = []
    max_iters = 3
    for itr in range(max_iters):
        mini_batches = create_mini_batches(X, y, batch_size)
        for mini_batch in mini_batches:
            X_mini, y_mini = mini_batch
            theta = theta - learning_rate * gradient(X_mini, y_mini, theta)
            error_list.append(cost(X_mini, y_mini, theta))
    return theta, error_list
```

Call the `gradientDescent()` function to compute the model parameters (theta) and visualize the change in the error function.

```python
theta, error_list = gradientDescent(X_train, y_train)
print("Bias =", theta[0])
print("Coefficients =", theta[1:])

# visualising gradient descent
plt.plot(error_list)
plt.xlabel("Number of iterations")
plt.ylabel("Cost")
plt.show()
```

Output:

Bias = [0.81830471]
Coefficients = [[1.04586595]]
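As a quick sanity check (not part of the original article), the parameters found by mini-batch gradient descent should land close to the closed-form least-squares solution of the same problem. The sketch below uses stand-in toy data, since the article's `X_train`/`y_train` are randomly generated:

```python
import numpy as np

# assumed toy data standing in for the article's (X_train, y_train)
rng = np.random.default_rng(1)
x = rng.normal(5.0, 1.0, size=(1000, 1))
X_train = np.hstack((np.ones((1000, 1)), x))   # bias column + feature
y_train = 6.0 + 1.05 * x + rng.normal(scale=0.3, size=(1000, 1))

# closed-form (normal-equation) solution via least squares
theta_exact, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
print("Bias =", theta_exact[0])
print("Coefficients =", theta_exact[1:])
```

If the iterative solver's output differs wildly from `lstsq`, the learning rate or number of epochs likely needs tuning.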

Step #3: Finally, we make predictions on the test set and compute the mean absolute error of the predictions.


```python
# predicting output for X_test
y_pred = hypothesis(X_test, theta)
plt.scatter(X_test[:, 1], y_test, marker='.')
plt.plot(X_test[:, 1], y_pred, color='orange')
plt.show()

# calculating error in predictions
error = np.sum(np.abs(y_test - y_pred) / y_test.shape[0])
print("Mean absolute error =", error)
```

Output:

Mean absolute error = 0.4366644295854125
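Note that the error expression used above, `np.sum(np.abs(y_test - y_pred) / y_test.shape[0])`, is just the mean absolute error written with an explicit division by the number of examples. A tiny check with made-up stand-in arrays (the article's `y_test`/`y_pred` are not reused here):

```python
import numpy as np

# toy true values and predictions (assumed for this check)
y_true = np.array([[3.0], [5.0], [7.0]])
y_hat = np.array([[2.5], [5.5], [6.0]])

# the article's formula ...
mae_sum = np.sum(np.abs(y_true - y_hat) / y_true.shape[0])
# ... equals the usual mean absolute error
mae_mean = np.mean(np.abs(y_true - y_hat))
print(mae_sum, mae_mean)
```

Both expressions give the same value; `np.mean` is simply the more direct way to write it.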

The orange line represents the final hypothesis function: y_pred = theta[0] + theta[1] * X_test[:, 1], i.e. the straight line fitted to the test data.
