Simple Digit Recognition OCR in OpenCV-Python

by Alex Kataev · Jan 17, 2025

For digit recognition in OpenCV, you can classify digit contours with k-Nearest Neighbors (k-NN). Start by converting the image to binary, find the contours in it with cv2.findContours(), and classify the region inside each contour with a trained k-NN model.

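import cv2
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

ROI_SIZE = (10, 10)  # every digit is resized to this size so all feature vectors match

# Note: your k-NN is only as good as its training. Just like you, back in school! 😁
# Assumption: train_samples (N x 100, flattened 10x10 digits) and train_labels (N,)
# have already been loaded from your labeled data.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(train_samples, train_labels)

# Let's load and preprocess our image
image = cv2.imread('digits.png', cv2.IMREAD_GRAYSCALE)
_, binary_image = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY_INV)

# Roll the dice by finding and iterating through contours for prediction
contours, _ = cv2.findContours(binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    roi = cv2.resize(binary_image[y:y+h, x:x+w], ROI_SIZE)  # same size as the training samples
    features = roi.reshape(1, -1).astype(np.float32)
    digit = knn.predict(features)[0]  # This little snippet gives you your digit. Magic? No, it's science! 😎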

Make sure your k-NN classifier is trained on digit images that were resized to one fixed size and flattened in the same way as the regions you pass to predict(). This example is only a quick preview of the process: real applications need more careful pre- and post-processing, as well as better data normalization.

The OCR Ladder: One Step at a Time

First Step: Gathering and Labeling Data

Training requires good, representative data, so start by gathering and labeling a solid dataset. OpenCV's cv2.FileStorage is helpful for saving the labeled samples to TrainingData.yml and LabelData.yml files. This sets up a good roadmap for your OCR journey.

Second Step: Isolating Characters

With the dataset ready, step into the shoes of a contour artist. Extract digit contours using cv2.findContours(). When an image contains multiple characters, sort the contours (for example, left to right by the x-coordinate of each bounding box) so the digits keep their reading order. Handle exceptional cases during user interaction gracefully so the OCR stays stable.

Third Step: Embracing Consistency

OCR demands consistency. Achieve it by resizing every Region of Interest (ROI), here each digit, to one standard size. In addition, preprocess your images with Gaussian blur and adaptive thresholding to improve digit isolation.

Fourth Step: Reading Stored Data

Whenever your OCR application boots up, reading in your TrainingData.yml and LabelData.yml files saves you from starting from scratch again. Efficiency is key here!

Final Step: Rotation, Scaling, and the End Goal

Some images just aren't “picture-perfect”. In such cases, consider rotating the image and extending its borders to make sure no digits get clipped. After all, every digit deserves to be seen.

Optimal Accuracy: Fruit of Fine-tuning

Inclusion of Diverse Conditions

Training on digit samples captured under varying conditions, such as different fonts, sizes, and lighting, makes the classifier more versatile.

Using Additional Features

Explore additional features, like aspect ratio and digit area, to enhance the predictive power of your classifier.

Experiment with Preprocessing

Consider complex image preprocessing techniques, like morphological operations, to refine digit isolation.

Combining Methodologies

While k-NN serves as an effective methodology, complementing it with SVM or neural networks might boost recognition accuracy for complex challenges.
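
To get a feel for the comparison, the sketch below trains k-NN and an SVM side by side. It assumes scikit-learn's bundled 8x8 digits dataset as a stand-in for your own training data; the split ratio, random seed, and SVM settings are arbitrary example values.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumption: the bundled 8x8 digits dataset stands in for real training data.
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

models = {
    "k-NN (k=1)": KNeighborsClassifier(n_neighbors=1),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale"),
}

# Fit each model on the same split and record its held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Results on this clean dataset say little about noisy real-world images, so rerun the comparison on samples extracted by your own pipeline before choosing a model.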
