
ML | Training an Image Classifier Using the Tensorflow Object Detection API



  • Python Programming
  • Machine Learning Fundamentals
  • Neural Network Fundamentals (optional)
  • Enthusiasm to build a cool project (required) :p

Even if you don’t have the first three essentials, welcome to the adventure. Don’t worry about getting lost, I’ll guide you along the way!

What is Object Detection?
Object detection is the process of finding instances of real-world objects, such as faces, buildings, and bicycles, in images or videos. Object detection algorithms typically use extracted features and learning algorithms to recognize instances of an object category. It is commonly used in applications such as image search, security, surveillance, and advanced driver assistance systems (including self-driving cars). Personally, I’ve used object detection to prototype an image-based search engine.

What is the Tensorflow Object Detection API?
Tensorflow is an open source deep learning framework built by Google Brain. The Tensorflow Object Detection API is a powerful tool that lets anyone create their own image classifiers. No coding or programming knowledge is required to use the Tensorflow Object Detection API, but to understand how it works, it helps to know Python programming and machine learning fundamentals.

Before starting the adventure, make sure you have Python 3 installed on your system.

For Python and pip installation instructions, refer to this site.

First things first! Make sure the below packages are installed on your system. This is essential in your adventure.

 pip install protobuf
 pip install pillow
 pip install lxml
 pip install Cython
 pip install jupyter
 pip install matplotlib
 pip install pandas
 pip install opencv-python
 pip install tensorflow
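A quick way to confirm that the installs worked is to ask Python whether each package can be found. The sketch below is only one way to do this, not part of the original setup: `missing_modules` is a hypothetical helper, and the import names in `REQUIRED` are assumptions that differ from the pip names for some packages (for example `PIL` for pillow and `cv2` for opencv-python).

```python
import importlib.util

# pip name -> import name (assumed mapping; the two differ for some packages)
REQUIRED = {
    "protobuf": "google.protobuf",
    "pillow": "PIL",
    "lxml": "lxml",
    "Cython": "Cython",
    "jupyter": "jupyter_core",
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "opencv-python": "cv2",
    "tensorflow": "tensorflow",
}


def missing_modules(required):
    """Return the pip names whose import name cannot be located."""
    missing = []
    for pip_name, import_name in required.items():
        try:
            found = importlib.util.find_spec(import_name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package (e.g. "google") is absent.
            found = False
        if not found:
            missing.append(pip_name)
    return missing


if __name__ == "__main__":
    print("Missing packages:", missing_modules(REQUIRED) or "none")
```

If the printed list is not empty, rerun the matching `pip install` command before continuing.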

To start the adventure, we need to get a vehicle and set it up.
Tensorflow Object Detection API

  1. We can get the Tensorflow Object Detection API from GitHub.
  2. Visit the provided link: Download here

After downloading the models folder, unzip it into your project directory. The object_detection directory is inside

 models-master/research/
  • Create a PYTHONPATH variable:
    You need to create a PYTHONPATH variable that points to the /models, /models/research and /models/research/slim directories. Run the following command from any directory. In my case,
     set PYTHONPATH=F:\Programming\pythonengineering_project\models-master;F:\Programming\pythonengineering_project\models-master\research;F:\Programming\pythonengineering_project\models-master\research\slim
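The three entries follow a fixed pattern relative to the models-master folder, so the value can be derived from that one location. The following is a minimal sketch under that assumption; `pythonpath_value` is a hypothetical helper, and it uses `PureWindowsPath` so that it produces Windows-style paths on any operating system.

```python
from pathlib import PureWindowsPath


def pythonpath_value(models_root):
    """Build the PYTHONPATH value for the three directories the API needs."""
    root = PureWindowsPath(models_root)
    entries = [root, root / "research", root / "research" / "slim"]
    # Windows separates PYTHONPATH entries with a semicolon.
    return ";".join(str(entry) for entry in entries)


if __name__ == "__main__":
    root = r"F:\Programming\pythonengineering_project\models-master"
    print("set PYTHONPATH=" + pythonpath_value(root))
```

Note that `set` only affects the current command prompt session, so the variable has to be set again in each new window.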
  • Compile the Protobuf files:
    We need to compile the Protobuf files that TensorFlow uses to configure the model and training parameters.
    To compile the .proto files, we first need the protobuf compiler. You can download it here. Download the ZIP file appropriate for your operating system (Windows or otherwise) and extract the bin folder to your research directory.
    Copy the code below and save it as a Python script in your research directory.

 import os
 import sys

 # Usage: python <script> <directory with .proto files> <path to protoc>
 args = sys.argv
 directory = args[1]
 protoc_path = args[2]

 # Compile every .proto file in the directory into a Python module.
 for file in os.listdir(directory):
     if file.endswith(".proto"):
         os.system(protoc_path + " " + directory + "/" + file + " --python_out=.")

Go to the research directory on the command line and run the script you just saved (its file name is shown here as a placeholder):

 python <your_script>.py .\object_detection\protos .\bin\protoc

This compiles all the protobuf files and creates a name_pb2.py file from each name.proto file in the /object_detection/protos folder.
Finally, run the following commands from the models-master/research directory:

 python setup.py build
 python setup.py install

This completes the installation and installs a package named object-detection.

  • API Testing:
    To test the Object Detection API, go to object_detection and enter the following command:
     jupyter notebook object_detection_tutorial.ipynb 

    This opens Jupyter notebook in a browser.
    Note: if the first cell of your notebook contains the line sys.path.append(".."), remove this line.

    Run all cells in the notebook and check if you get output similar to the image below:

  • With this, we have successfully set up our vehicle.

    Let’s start our journey!

    To reach our destination, we need to cross 6 checkpoints:

    1. Preparing the dataset
    2. Labeling the dataset
    3. Creating records for training
    4. Setting up training
    5. Training the model
    6. Exporting an output graph

    Plan what objects you want to detect using the classifier.

    • Checkpoint 1: preparing the dataset:

      In this adventure, I’m going to build a classifier that detects shoes and water bottles. Remember, the dataset is the most important thing in building a classifier: it is the basis on which your classifier learns to detect objects. Collect as many different and varied images of the objects as possible. Create a directory called images inside the object_detection directory, since the later commands are run from there. Save 80% of the images to the train directory and 20% of the images to the test directory inside the images directory. I collected 50 images in the train directory and 10 images in the test directory. In general, the more images you have, the higher the accuracy of your classifier.
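The 80/20 split can be done by hand, but a small script makes it repeatable. This is a minimal sketch rather than the author's method: `split_dataset` is a hypothetical helper, the list of image extensions is an assumption, and the source folder of unsorted images is whatever you collected your pictures into.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image types


def split_dataset(source_dir, images_dir, train_fraction=0.8, seed=0):
    """Copy images from source_dir into images_dir/train and images_dir/test."""
    files = sorted(
        p for p in Path(source_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(files)
    n_train = round(len(files) * train_fraction)
    splits = {"train": files[:n_train], "test": files[n_train:]}
    for name, members in splits.items():
        target = Path(images_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        for path in members:
            shutil.copy2(path, target / path.name)
    return {name: len(members) for name, members in splits.items()}
```

Calling `split_dataset("raw_images", "images")` on a folder of 10 images would copy 8 into images/train and 2 into images/test; the folder names here are placeholders.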

      Images in the train directory

      Images in the test directory

    • Checkpoint 2: labeling the dataset:
      To cross this checkpoint, we need a tool called labelimg. You can get it from: labelimg download

      Open the labelimg app and start drawing rectangles on the image wherever the object is located. Label them with the appropriate name, as shown in the picture:

      Save each image after labeling; this generates an XML file with the name of the corresponding image, as shown in the image below.
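To get a feel for what these XML files contain, the sketch below reads the bounding boxes out of one annotation. It assumes the Pascal VOC layout that labelimg writes by default; the sample annotation and the `read_boxes` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the Pascal VOC layout (assumed labelimg output).
SAMPLE_XML = """<annotation>
  <filename>shoe_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>shoe</name>
    <bndbox><xmin>48</xmin><ymin>120</ymin><xmax>310</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) for every labeled object."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(box.findtext("xmin")),
            int(box.findtext("ymin")),
            int(box.findtext("xmax")),
            int(box.findtext("ymax")),
        ))
    return boxes


if __name__ == "__main__":
    print(read_boxes(SAMPLE_XML))
```

Each rectangle you draw becomes one `object` element, so an image with a shoe and a bottle would yield two tuples.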

    • Checkpoint 3: creating records for training:
      To cross this checkpoint, we need to create TFRecords that can serve as input for training the object detector. To create the TFRecords, we will use two scripts from the Raccoon Detector by Dat Tran, namely the xml_to_csv.py and generate_tfrecord.py files. Download them and save them in the object_detection folder.

      Replace the main() method of the xml_to_csv.py file with the following code:

       def main():
           # Convert the XML annotations of each split into one CSV file.
           for folder in ['train', 'test']:
               image_path = os.path.join(os.getcwd(), ('images/' + folder))
               xml_df = xml_to_csv(image_path)
               xml_df.to_csv(('images/' + folder + '_labels.csv'), index=None)
               print('Successfully converted xml to csv.')

      Also add the following lines of code to the xml_to_csv() method before the return statement, as shown in the picture below.

       # Append the .jpg extension to every file name.
       names = []
       for i in xml_df['filename']:
           names.append(i + '.jpg')
       xml_df['filename'] = names

      First, let’s convert all XML files to CSV files by running the xml_to_csv.py file with the following command in the object_detection directory:

       python xml_to_csv.py

      This creates test_labels.csv and train_labels.csv files in the images folder.
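Before generating TFRecords, it can help to check that the generated CSV files contain the labels you expect. The sketch below counts the boxes per class. It assumes the CSV has a `class` column alongside `filename` and the box coordinates, which is my understanding of what the conversion script writes; `count_labels` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

# Made-up rows in the assumed column layout.
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
shoe_01.jpg,640,480,shoe,48,120,310,400
shoe_02.jpg,640,480,shoe,10,30,200,220
bottle_01.jpg,640,480,bottle,90,40,180,420
"""


def count_labels(csv_file):
    """Count how many bounding boxes each class has in a labels CSV."""
    reader = csv.DictReader(csv_file)
    return Counter(row["class"] for row in reader)


if __name__ == "__main__":
    print(count_labels(io.StringIO(SAMPLE_CSV)))
```

A misspelled label typed in labelimg shows up here as an unexpected extra class, which is much easier to fix now than after training.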

      Then open the generate_tfrecord.py file in a text editor and modify the class_text_to_int() method, which can be found on line 30, as shown in the image below.
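The edited method is not reproduced in the text. As a sketch only, assuming the two classes of this project and the IDs 1 and 2 that the label map uses for them, it might look like this:

```python
def class_text_to_int(row_label):
    """Map a class name from the CSV to its integer ID."""
    if row_label == 'shoe':
        return 1
    elif row_label == 'bottle':
        return 2
    else:
        # Unknown label: most likely a typo made while labeling.
        return None
```

The strings must match the names typed in labelimg exactly, including case.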

      Then generate the TFRecord files by running these commands from the /object_detection folder:

       python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record
       python generate_tfrecord.py --csv_input=images/test_labels.csv --image_dir=images/test --output_path=test.record

      This creates test.record and train.record files in the object_detection directory.

    • Checkpoint 4: setting up training:

      To cross this checkpoint, you first need to create a label map.

      Create a new directory named training in the object_detection directory.

      Use a text editor to create a new file and save it as labelmap.pbtxt in the training directory. The label map tells the trainer what each object is by mapping class names to class ID numbers.
      Now add content to your labelmap.pbtxt file in the following format to create a label map for your classifier.

       item {
         id: 1
         name: 'shoe'
       }

       item {
         id: 2
         name: 'bottle'
       }

      The label map ID numbers must match those defined in the class_text_to_int() method.
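With more than a handful of classes, writing the label map by hand gets error-prone. The sketch below generates the same format from a list of class names, numbering them from 1 in list order. `build_label_map` is a hypothetical helper, not part of the API.

```python
def build_label_map(class_names):
    """Return labelmap.pbtxt content with IDs assigned from 1 in list order."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


if __name__ == "__main__":
    # Print the content; save it as training/labelmap.pbtxt.
    print(build_label_map(["shoe", "bottle"]))
```

Keeping the class list in one place makes it easier to keep the IDs here in step with the ones returned by class_text_to_int().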

      Now let’s start configuring the training!

      We need a model, i.e. the training algorithm for our classifier. In this project we are going to use the faster_rcnn_inception model. The Tensorflow Object Detection API comes with a huge number of models. Go to object_detection/samples/configs.
      Here you can find many config files for all the models provided by the API. You can download the model from this link. Download faster_rcnn_inception_v2_coco. Once the download is complete, extract the faster_rcnn_inception_v2_coco_2018_01_28 folder to the object_detection directory. To understand how the model works, see this article.

      Since we are using the faster_rcnn_inception_v2_coco model in this project, copy the faster_rcnn_inception_v2_coco.config file from object_detection/samples/configs and paste it into the previously created training directory.
      Using a text editor, open the faster_rcnn_inception_v2_coco.config file and make the following changes.
      Note: paths must be entered with single forward slashes (NOT backslashes), otherwise TensorFlow will give a file path error when trying to train the model! Also, paths must be in double quotes ("), not single quotes (').

      • Line 10: set num_classes to the number of objects that your classifier detects. In my case, since I am classifying shoes and bottles, it will be num_classes: 2.
      • Line 107: give the absolute path to the model.ckpt file for the fine_tune_checkpoint parameter. The model.ckpt file is located in object_detection/faster_rcnn_inception_v2_coco_2018_01_28. In my case,

        fine_tune_checkpoint: "F:/Programming/pythonengineering_project/models-master/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"

      • train_input_reader section: you can find this section on line 120. In this section, set the input_path parameter to your train.record file. In my case, this is
        input_path: "F:/Programming/pythonengineering_project/models-master/research/object_detection/train.record"

        Set the label_map_path parameter to the labelmap.pbtxt file. In my case, this is:
        label_map_path: "F:/Programming/pythonengineering_project/models-master/research/object_detection/training/labelmap.pbtxt"

      • eval_config section: this section is on line 128. Set the num_examples parameter to the number of images in the test directory. In my case,
        num_examples: 10
      • eval_input_reader section: you can find this section on line 134. Similar to the train_input_reader section, specify the paths to the test.record and labelmap.pbtxt files. In my case,
        input_path: "F:/Programming/pythonengineering_project/models-master/research/object_detection/test.record"

        label_map_path: "F:/Programming/pythonengineering_project/models-master/research/object_detection/training/labelmap.pbtxt"
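Because the config file needs forward slashes and double quotes, copying a path straight from Windows Explorer will not work. As a small sketch, a path can be converted with `pathlib`; `config_path` is a hypothetical helper and not part of the API.

```python
from pathlib import PureWindowsPath


def config_path(windows_path):
    """Convert a Windows path to the quoted, forward-slash form used in the config."""
    return '"%s"' % PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    print("input_path:", config_path(
        r"F:\Programming\pythonengineering_project\models-master"
        r"\research\object_detection\train.record"))
```

The printed value can be pasted directly after the parameter name in the config file.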

      This completes all the settings, and we can move on to the next checkpoint.

    • Checkpoint 5: training the model:
      It’s finally time to train our model. You can find a file named train.py in the object_detection/legacy/ folder.

      Copy the train.py file and paste it into the object_detection directory.
      Go to the object_detection directory and run the following command to start training your model!

       python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config

      It takes about 1 minute to initialize the setup before training starts. When training starts, it looks like this:

      Tensorflow creates a checkpoint every 5 minutes and saves it. You can see that all the checkpoints are stored in the training directory.

      You can view the progress of the training using TensorBoard. To do this, open a new command line, change to the object_detection directory and enter the following command:

       tensorboard --logdir=training

      TensorBoard looks like this:

      Continue the training process until the loss is less than or equal to 0.1.

    • Checkpoint 6: exporting the output graph:
      This is the last checkpoint you need to cross to reach your destination.
      Now that we have a trained model, we need to generate an output graph that can be used to run the model. To do this, we first need to find the highest step number that was saved: go to the training directory and find the model.ckpt file with the highest index.

      We can then create the output graph by entering the following command at the command line.

       python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/faster_rcnn_inception_v2_coco.config --trained_checkpoint_prefix training/model.ckpt-XXXX --output_directory inference_graph

      XXXX should be replaced with the highest checkpoint number.
      This creates a frozen_inference_graph.pb file in the /object_detection/inference_graph folder. The .pb file contains the object detection classifier.
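Picking the highest checkpoint index by eye is easy to get wrong when the training directory holds many files. The sketch below scans the directory for it. It assumes checkpoint files are named like model.ckpt-1500.index or model.ckpt-1500.meta; `latest_checkpoint_step` is a hypothetical helper.

```python
import re
from pathlib import Path

# Matches e.g. "model.ckpt-1500.index" and captures the step number.
CKPT_PATTERN = re.compile(r"^model\.ckpt-(\d+)\.")


def latest_checkpoint_step(training_dir):
    """Return the highest checkpoint step found in training_dir, or None."""
    steps = []
    for path in Path(training_dir).iterdir():
        match = CKPT_PATTERN.match(path.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


if __name__ == "__main__":
    print("Use model.ckpt-%s" % latest_checkpoint_step("training"))
```

The returned number is what replaces XXXX in the export command.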

    This completes the construction of our classifier. All that’s left to end our adventure is to use our model to detect objects.

    Create a Python file in the object_detection directory with the following code:

    # Import packages
    import os
    import sys

    import cv2
    import numpy as np
    import tensorflow as tf

    # This is required because the script is stored in the object_detection folder.
    sys.path.append("..")

    # Import utilities
    from utils import label_map_util
    from utils import visualization_utils as vis_util

    # Name of the directory where the frozen_inference_graph is stored
    MODEL_NAME = 'inference_graph'
    # Name of the image in which objects should be detected
    IMAGE_NAME = '11man.jpg'

    # Get the path to the current working directory
    CWD_PATH = os.getcwd()

    # Path to the frozen .pb detection graph file that contains the model to use
    # for object detection.
    PATH_TO_CKPT = os.path.join(CWD_PATH, MODEL_NAME, 'frozen_inference_graph.pb')

    # Path to the label map file
    PATH_TO_LABELS = os.path.join(CWD_PATH, 'training', 'labelmap.pbtxt')

    # Path to the image
    PATH_TO_IMAGE = os.path.join(CWD_PATH, IMAGE_NAME)

    # Number of classes the object detector can identify (shoes and bottles)
    NUM_CLASSES = 2

    # Load the label map.
    # The label map maps indices to category names, so that when our convolution
    # network predicts '5', we know that this corresponds to 'king'.
    # Here we use internal utility functions, but anything that returns a
    # dictionary mapping integers to the appropriate string labels would be fine.
    label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)

    # Load the Tensorflow model into memory (this uses the TensorFlow 1.x API).
    detection_graph = tf.Graph()
    with detection_graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
            serialized_graph = fid.read()
            od_graph_def.ParseFromString(serialized_graph)
            tf.import_graph_def(od_graph_def, name='')

        sess = tf.Session(graph=detection_graph)

    # Define the input and output tensors (i.e. data) for the object detection classifier

    # The input tensor is the image
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

    # The output tensors are the detection boxes, scores, and classes
    # Each box represents a part of the image where a particular object was detected
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

    # Each score represents the level of confidence for each of the objects.
    # The score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')

    # Number of objects detected
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')

    # Load the image using OpenCV and
    # expand the image dimensions to have shape: [1, None, None, 3]
    # i.e. a batch containing a single image with three color channels
    image = cv2.imread(PATH_TO_IMAGE)
    image_expanded = np.expand_dims(image, axis=0)

    # Perform the actual detection by running the model with the image as input
    (boxes, scores, classes, num) = sess.run(
        [detection_boxes, detection_scores, detection_classes, num_detections],
        feed_dict={image_tensor: image_expanded})

    # Draw the results of the detection (aka "visualize the results")
    vis_util.visualize_boxes_and_labels_on_image_array(
        image,
        np.squeeze(boxes),
        np.squeeze(classes).astype(np.int32),
        np.squeeze(scores),
        category_index,
        use_normalized_coordinates=True,
        line_thickness=8,
        min_score_thresh=0.60)

    # All the results have been drawn on the image. Now display the image.
    cv2.imshow('Object detector', image)

    # Press any key to close the image
    cv2.waitKey(0)

    # Clean up
    cv2.destroyAllWindows()

    Set IMAGE_NAME in the code above to the path of the image in which objects should be detected.
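The script draws the boxes for you, but if you want to use the detections elsewhere (for example to crop the objects), the normalized coordinates have to be converted to pixels first. The sketch below assumes each box is ordered as [ymin, xmin, ymax, xmax] with values between 0 and 1, and reuses the 0.60 score threshold from the script. `to_pixel_box` and `confident_boxes` are hypothetical helpers working on plain Python sequences.

```python
def to_pixel_box(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box
    return (
        round(xmin * image_width),
        round(ymin * image_height),
        round(xmax * image_width),
        round(ymax * image_height),
    )


def confident_boxes(boxes, scores, image_width, image_height, min_score=0.60):
    """Keep only detections scoring at least min_score, in pixel coordinates."""
    return [
        to_pixel_box(box, image_width, image_height)
        for box, score in zip(boxes, scores)
        if score >= min_score
    ]
```

With the arrays returned by the script, the inputs would be `np.squeeze(boxes)` and `np.squeeze(scores)`, and the image size comes from `image.shape`.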

    Below are some of the results from my model.

    So, finally, our model is ready. This model has also been used to create an image-based search engine that searches using image input by locating objects in the image.

