Explain Codes LogoExplain Codes Logo

Simple Digit Recognition OCR in OpenCV-Python

python
image-processing
machine-learning
data-preprocessing
Alex KataevbyAlex Kataev·Jan 17, 2025
TLDR

For effective digit recognition in OpenCV, utilize k-Nearest Neighbors (k-NN) to classify digit contours. Begin by pre-processing the image to a binary format, finding contours in the binary image using cv2.findContours(), and classify each contour using a trained k-NN model.

import cv2 from sklearn.neighbors import KNeighborsClassifier # Note: your k-NN is as good as its training. Just like you, back in school! 😁 knn = KNeighborsClassifier(n_neighbors=1) # Let's load and preprocess our image image = cv2.imread('digits.png', cv2.IMREAD_GRAYSCALE) _, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY_INV) # Roll the dice by finding and iterating through contours for prediction contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) for contour in contours: x, y, w, h = cv2.boundingRect(contour) roi = binary_image[y:y+h, x:x+w] digit = knn.predict(roi.flatten().reshape(1, -1))[0] # This little snippet gives you your digit. Magic? No, it's science! 😎

Ensure your k-NN classifier is trained on digit images that are both normalized and flattened before they're predicted. This example is a quick preview of the actual process—genuine applications will require more specific pre- and post-processing steps, as well as better data normalization.

The OCR Ladder: One Step at a Time

First Step: Gathering and Labeling Data

Training requires good, representative data. For this, OpenCV FileStorage is helpful for storing your labeled TrainingData.yml and LabelData.yml files. This draws a good roadmap for your OCR journey. Net, gather a good dataset.

Second Step: Isolating Characters

With the dataset ready, step into the shoes of a contour artist. Extract digit contours using cv2.findContours(). When dealing with multiple characters, sort the contours to retain the sequence of digits. Exceptional cases during user interaction should be handled elegantly to ensure OCR stability.

Third Step: Embracing Consistency

OCR demands consistency. Instigate this by standardizing the size of Regions of Interest (ROIs)—the digits in our case. In addition, preprocess your images with Gaussian blur and adaptive thresholding to improve digit isolation.

Fourth Step: Reading Stored Data

Whenever your OCT application boots up, reading in your TrainingData.yml and LabelData.yml files keeps you from starting from scratch again. Efficiency is key here!

Final Step: Rotation, Scaling, and the End Goal

Some images just aren't “picture-perfect”. In such cases, consider rotating the image and extending its borders to make sure no digits get clipped. After all, every digit deserves to be seen.

Optimal Accuracy: Fruit of Fine-tuning

Inclusion of Diverse Conditions

Training on digit samples from varying conditions such as different fonts, sizes, and lighting conditions results in better versatility.

Using Additional Features

Explore additional features like aspect ratio and digit area for enhancing the predictive power of your classifier.

Experiment with Preprocessing

Consider complex image preprocessing techniques, like morphological operations, to refine digit isolation.

Combining Methodologies

While k-NN serves as an effective methodology, complementing it with SVM or neural networks might boost recognition accuracy for complex challenges.