
Handwritten Equation Solver in Python


Retrieving training data

  • Loading dataset
      Load the dataset using this link. Unzip the zip file. There will be different folders with images for different mathematical symbols. For simplicity, use only the images for the digits 0-9, +, - and times in our equation solver. Observing the dataset, we can see that it is biased toward some digits / characters: it contains 12000 images for some characters and 3000 images for others. To correct this imbalance, reduce the number of images in each folder to approximately 4000.
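      The balancing step can be sketched as follows. This is a minimal sketch, not the article's own code: the function name cap_per_class, the dictionary layout (folder name mapped to a list of file names) and the fixed random seed are assumptions made for illustration.

```python
import random


def cap_per_class(files_by_class, cap=4000, seed=0):
    """Keep at most `cap` randomly chosen files for every class.

    `files_by_class` maps a folder name (hypothetical layout) to the list of
    image file names found in that folder.
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    capped = {}
    for label, files in files_by_class.items():
        files = sorted(files)
        if len(files) > cap:
            # Oversized class: draw `cap` files without replacement
            files = sorted(rng.sample(files, cap))
        capped[label] = files
    return capped


# Example with made-up file names: one oversized class, one small class
example = {
    "7": ["7_%d.jpg" % i for i in range(12000)],
    "+": ["plus_%d.jpg" % i for i in range(3000)],
}
balanced = cap_per_class(example)
print({label: len(files) for label, files in balanced.items()})
```

      Classes that already have fewer images than the cap are left unchanged, so the result is only approximately balanced.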
  • Feature Extraction
      We can use contour extraction to get the features.

    1. Invert the image and then convert it to a binary image, because contour extraction works best when the object is white and the background is black.
    2. Use the findContours function to find the contours. For each contour, get the bounding rectangle using the boundingRect function (the bounding rectangle is the smallest upright rectangle that encloses the entire contour).
    3. Since each image in our dataset contains only one character / digit, only the bounding rectangle of maximum size is needed. To find it, calculate the area of the bounding rectangle of each contour and select the rectangle with the maximum area.
    4. Now resize the maximum-area bounding rectangle to 28 by 28, then reshape it to 784 by 1. This gives 784 pixel values, or features. Now assign a label to it (for the images of 0-9, the same label as their digit; for -, label 10; for +, label 11; for times, label 12). So now our dataset contains 784 feature columns and one label column. After extracting the features, save the data to a CSV file.
  • Train the data using a convolutional neural network

      A convolutional neural network operates on 2D data, but each row of our dataset has a shape of 785 by 1, so we need to reshape it. First, assign the label column of our dataset to the y_train variable. Then drop the label column from the dataset and reshape each remaining row of 784 values to 28 by 28. Our dataset is now ready for the CNN.
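      A minimal NumPy sketch of this reshaping step is shown below. It assumes that the label is stored in the last of the 785 columns and that the images are wanted in channels-first layout (1, 28, 28); the function name to_cnn_input is hypothetical.

```python
import numpy as np


def to_cnn_input(table):
    """Split an (n, 785) table into (n, 1, 28, 28) images and n labels.

    Assumption: the first 784 columns are pixels, the last column is the label.
    """
    table = np.asarray(table)
    labels = table[:, -1].astype(int)               # y_train
    images = table[:, :-1].reshape(-1, 1, 28, 28)   # drop label, reshape rows
    return images, labels


# Tiny example: three blank "images" labelled 0, 10 and 12
table = np.zeros((3, 785))
table[:, -1] = [0, 10, 12]
images, labels = to_cnn_input(table)
print(images.shape, labels)
```

      The reshape is row-major, so flat pixel index 28 * r + c ends up at row r, column c of the 28 by 28 image.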
  • Building a convolutional neural network
      To create a CNN, import all required libraries.

      Convert the y_train data to categorical data using the to_categorical function. Use the following lines of code to create the model.

    import numpy as np
    import pandas as pd

    np.random.seed(1212)

    from keras import backend as K
    from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
    from keras.models import Sequential, model_from_json
    from keras.utils import to_categorical

    # Use channels-first ordering so that each image has the shape (1, 28, 28).
    # Older Keras releases used K.set_image_dim_ordering('th') for this.
    K.set_image_data_format('channels_first')

    model = Sequential()
    # Two convolution + max-pooling stages extract features from the image
    model.add(Conv2D(30, (5, 5), input_shape=(1, 28, 28), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(15, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    # Flatten the feature maps and classify them with dense layers
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    # 13 output classes: the digits 0-9, -, + and times
    model.add(Dense(13, activation='softmax'))

    # Compile the model
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam', metrics=['accuracy'])
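      To see how the layer sizes of this model fit together, the arithmetic can be traced by hand. The sketch below assumes the Keras defaults for these layers (stride 1 and no padding for the convolutions, stride equal to the pool size for max pooling); the helper names are made up for illustration.

```python
def conv_out(size, kernel):
    """Side length after an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Side length after max pooling with stride equal to the pool size."""
    return size // pool


side = 28
side = conv_out(side, 5)   # 5x5 convolution, 30 filters
side = pool_out(side, 2)   # 2x2 max pooling
side = conv_out(side, 3)   # 3x3 convolution, 15 filters
side = pool_out(side, 2)   # 2x2 max pooling
flat = 15 * side * side    # length of the vector produced by Flatten

# Trainable parameters per layer: weights + biases
params = {
    "conv1": 30 * (5 * 5 * 1) + 30,
    "conv2": 15 * (3 * 3 * 30) + 15,
    "dense1": flat * 128 + 128,
    "dense2": 128 * 50 + 50,
    "output": 50 * 13 + 13,
}
total = sum(params.values())
print(side, flat, total)
```

      Under these assumptions the feature maps shrink from 28 to 24, 12, 10 and finally 5 pixels per side, so Flatten produces 375 values and the whole network has 60086 trainable parameters.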

  • Fitting the model to the data
      Use the following line of code to fit the CNN to the data, where l holds the training images and cat holds the categorical labels.

    model.fit(np.array(l), cat, epochs=10, batch_size=200,
              shuffle=True, verbose=1)

      Training the model takes about three hours and reaches an accuracy of 98.46%. After training, we can save the model as a JSON file for future use, so that we don't have to retrain it and wait three hours each time. To save the model, we can use the following lines of code.

    model_json = model.to_json()
    with open("model_final.json", "w") as json_file:
        json_file.write(model_json)

    # serialize weights to HDF5
    model.save_weights("model_final.h5")

  • Testing our model or solving an equation with it

      First, load the saved model using the following lines of code.

    json_file = open('model_final.json', 'r')
    loaded_model_json = json_file.read()
    json_file.close()
    loaded_model = model_from_json(loaded_model_json)

    # load weights into new model
    loaded_model.load_weights("model_final.h5")

  • Now take an input image containing a handwritten equation. Convert the image to binary and then invert it (if the digits / characters are in black).
  • Now find the contours of the image. The symbols must be read from left to right, so if the contours are not already in that order, sort them by the x coordinate of their bounding rectangles.
  • Get the bounding rectangle for each contour.
  • Sometimes this results in two or more contours for the same digit / character. To avoid this, check whether the bounding rectangles of two contours overlap. If they overlap, discard the smaller rectangle.
  • Now resize each remaining bounding rectangle to 28 by 28.
  • Using the model, predict the corresponding digit / symbol for each bounding rectangle and save the predictions as a string.
  • Then use the 'eval' function on the string to solve the equation.
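      The post-processing in the steps above (dropping overlapping rectangles, ordering them, and evaluating the predicted string) can be sketched in plain Python. This is a sketch under assumptions: rectangles are (x, y, w, h) tuples as returned by boundingRect, the label order follows the labelling described earlier (10 for -, 11 for +, 12 for times), and all function names are hypothetical. The model prediction itself is replaced by a hard-coded list of labels.

```python
def overlaps(a, b):
    """True if two (x, y, w, h) rectangles share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def drop_smaller_overlaps(rects):
    """Keep the larger rectangle of each overlapping group, sorted by x."""
    kept = []
    for rect in sorted(rects, key=lambda r: r[2] * r[3], reverse=True):
        if not any(overlaps(rect, other) for other in kept):
            kept.append(rect)
    return sorted(kept, key=lambda r: r[0])  # left to right


# Index = class label: 0-9 are digits, 10 is -, 11 is +, 12 is times
SYMBOLS = [str(d) for d in range(10)] + ["-", "+", "*"]


def labels_to_expression(labels):
    """Join the predicted class labels into an expression string."""
    return "".join(SYMBOLS[label] for label in labels)


def solve(expression):
    """Evaluate the expression after checking it only uses known symbols."""
    if not expression or set(expression) - set("0123456789+-*"):
        raise ValueError("unexpected character in %r" % expression)
    return eval(expression)


rects = [(50, 5, 20, 30), (0, 0, 20, 30), (5, 5, 5, 5), (25, 10, 15, 15)]
print(drop_smaller_overlaps(rects))
expression = labels_to_expression([2, 11, 3, 12, 4])  # stand-in predictions
print(expression, "=", solve(expression))
```

      Note that eval follows Python syntax, so a predicted string such as "07+1" (a number with a leading zero) raises a SyntaxError and would need to be handled separately.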
    Download the complete code for solving handwritten equations here.

