In this article, we will learn how to use template matching to detect related fields in a document image.
This task can be solved with template matching: crop the field images out of a document, then match those cropped field images against the document image. The algorithm is simple, but it can be extended into more complex versions that detect and localize fields in document images from a specific domain.
- Cut / crop field images from the main document and use them as separate templates.
- Define / adjust thresholds for different fields.
- Apply template matching for each cropped field template using OpenCV's template matching function.
- Draw bounding boxes using the coordinates of the rectangles returned by template matching.
- Optional: Extend the set of field templates and fine-tune the thresholds to improve results across different document images.
Below is the Python code:
Benefits of using template matching:
- Computationally inexpensive.
- Easy to use and customizable for different use cases.
- Gives good results when little document data is available.
Limitations of using template matching:
- The results are less accurate than those of deep learning segmentation methods.
- It offers no solution for overlapping patterns.