Real-time Object Detection + OCR using YOLO v7 in Google Colab Free GPU (ANPR / ALPR Tutorial 2023)

In today’s article, we’ll explain how you can use Theos AI to take the outputs of an object detection model such as YOLOv7 (bounding boxes surrounding text) and pass them through a state-of-the-art transformer-based Optical Character Recognition (OCR) model that reads them in real time, all with a free GPU from Google Colab.

Automatic License Plate Recognition (ALPR) Example

In this example we will use the same license plate recognition model (ALPR / ANPR) we trained in our previous blog post, but the same approach works for any OCR use case, such as document processing.

Introduction

Let’s briefly look at the video we will use to test our model and also remember what our license plate detection model looked like.

Okay, great! Our license plate detection model is working well.

Now you can open the Google Colab Notebook to follow along while we explain how the code works.

As you can see, our Colab notebook is very simple: we only have to follow six steps.

Installation

If you plan to use the small OCR model size, you will first have to install the Tesseract text recognition engine. For this example, however, we will use the large transformer-based model.
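In case you do need it, installing Tesseract in Colab is a one-liner; this is a minimal sketch, and the exact cell in the notebook may differ:

```python
# Only needed when OCR_MODEL_SIZE is "small": install the Tesseract engine in
# the Colab VM. The "!" prefix runs a shell command from a notebook cell.
!apt-get install -y tesseract-ocr
```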

Now, let’s install all the Python dependencies.

Let’s run the Theos setup.

Let’s log in with our Theos account.

Now we have to set the project key by replacing <project_key> with our Theos project key, which can be found in the settings overview of our Theos project.

Finally, we are now able to install the neural network code and weights by replacing the following parameters with our own values:

  • <algorithm>: the object detection algorithm name, in our case yolov7.

  • <algorithm_version>: the algorithm version, in our case tiny.

  • <weights_tag>: the weights tag, in our case license-plates:experiment-1:best.

The Code

Let’s import some dependencies.
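As a rough sketch, the imports cell brings in OpenCV for video handling and PyTorch for the GPU checks used below; the Theos SDK import itself is taken from the notebook and is not shown here:

```python
# Sketch of the imports cell. The Theos SDK import comes from the notebook.
import gc     # to free Python objects before loading the models
import cv2    # OpenCV, for reading, drawing on and writing video frames
import torch  # PyTorch, for the CUDA availability check and cache cleanup
```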

We now have to configure the following settings (a sketch of this configuration cell follows the list):

  • ALGORITHM: the object detection algorithm name, in our case yolov7.

  • ALGORITHM_VERSION: the algorithm version, in our case tiny.

  • WEIGHTS: the weights tag, in our case license-plates:experiment-1:best.

  • OCR_MODEL_SIZE: the OCR model size, possible values are small, medium and large.

  • OCR_MODEL_TYPE: the OCR model type if using the large OCR_MODEL_SIZE, possible values are str, printed and handwritten.

  • OCR_MODEL_ACCURACY: the OCR model accuracy if using the large OCR_MODEL_SIZE, possible values are base, medium and best.

  • OCR_CLASSES: a list of the classes we want our OCR model to read from, in our case just license-plate.

  • INPUT_VIDEO: the input video file name.

  • OUTPUT_VIDEO: the output video file name.
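As a sketch, the configuration cell could look like the following; the video file names and the specific type and accuracy values chosen for the large model are assumptions, so use the values from your own project:

```python
ALGORITHM = 'yolov7'                          # object detection algorithm name
ALGORITHM_VERSION = 'tiny'                    # algorithm version
WEIGHTS = 'license-plates:experiment-1:best'  # weights tag from Theos
OCR_MODEL_SIZE = 'large'                      # small, medium or large
OCR_MODEL_TYPE = 'printed'                    # str, printed or handwritten (assumed choice, large model only)
OCR_MODEL_ACCURACY = 'best'                   # base, medium or best (assumed choice, large model only)
OCR_CLASSES = ['license-plate']               # classes whose boxes are sent to the OCR model
INPUT_VIDEO = 'input.mp4'                     # assumed file name
OUTPUT_VIDEO = 'output.mp4'                   # assumed file name
```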

Now, let’s check if CUDA is available and clean up memory to ensure the models fit on the GPU.
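With PyTorch, this check and cleanup is typically just a few lines:

```python
# Use the GPU if Colab gave us one, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Free unreferenced Python objects and clear the CUDA cache so that both the
# detector and the OCR model fit in GPU memory.
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```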

Here we create a Theos client and use it to load the license plate detection model to the GPU.

Now, let’s load the OCR model.
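The exact SDK calls are in the notebook; purely to illustrate the shape of these two steps, here is a placeholder sketch in which every class and method name is hypothetical rather than the real Theos API:

```python
# Placeholder sketch only: the class and method names are NOT the real Theos
# API -- use the calls from the Colab notebook instead.
client = TheosClient()  # hypothetical client object from the Theos SDK

# Load the YOLOv7-tiny license plate detector onto the GPU.
detector = client.load_detector(ALGORITHM, ALGORITHM_VERSION, WEIGHTS, device=device)

# Load the transformer-based OCR model with the settings configured above.
ocr = client.load_ocr(OCR_MODEL_SIZE, OCR_MODEL_TYPE, OCR_MODEL_ACCURACY, device=device)
```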

Let’s load the input video and open the output video.
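With OpenCV this step usually looks like the sketch below; the mp4v codec is an assumption and the notebook may use a different one:

```python
# Open the input video and create an output writer with the same resolution and frame rate.
video = cv2.VideoCapture(INPUT_VIDEO)
fps = video.get(cv2.CAP_PROP_FPS)
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
output = cv2.VideoWriter(OUTPUT_VIDEO, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height))
```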

Now we iterate over each frame of the input video, pass it through our object detection and OCR models, draw the predictions onto the frame, and save it to the output video.
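Sketched out, the loop looks roughly like this; detector.detect and ocr.read are placeholders for the real Theos SDK calls, and the structure of the detection results is an assumption:

```python
while video.isOpened():
    ok, frame = video.read()
    if not ok:
        break  # no more frames
    # Detect license plates in the frame (placeholder call).
    detections = detector.detect(frame)
    for detection in detections:
        if detection['class'] not in OCR_CLASSES:
            continue  # only read text from the classes we configured
        x1, y1, x2, y2 = detection['box']
        # Read the text inside the detected box (placeholder call).
        text = ocr.read(frame[y1:y2, x1:x2])
        # Draw the bounding box and the recognized text on the frame.
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, text, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    # Save the annotated frame to the output video.
    output.write(frame)
```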

Finally, we release the videos, unload the model and download the output video to our machine so we can see the results.
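Roughly, that looks like the following; the unload call is a placeholder for the Theos SDK, while files.download is the standard Colab helper:

```python
# Close the input and output video files.
video.release()
output.release()

# Free the GPU (placeholder call for the Theos SDK unload step).
detector.unload()
torch.cuda.empty_cache()

# Download the annotated video from Colab to our machine.
from google.colab import files
files.download(OUTPUT_VIDEO)
```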

Now let’s run the script!

If you want to use this locally with your own webcam or security camera, it’s very simple to modify this script to do that.

You just have to replace the line where we load the input video with the following.
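For example, assuming OpenCV is used to read the video, a webcam or an RTSP security camera can be opened like this (the RTSP URL below is just a placeholder):

```python
# Webcam: device index 0 is usually the default camera.
video = cv2.VideoCapture(0)

# Security camera: pass the camera's RTSP stream URL instead (placeholder URL).
# video = cv2.VideoCapture('rtsp://user:password@192.168.1.10:554/stream')
```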

The Result

Let’s see how our models performed on the input video.

It worked great! Impressive accuracy.

The End

We did it!

I hope you enjoyed reading this as much as we enjoyed making it!

If you found this cool or helpful, please consider sharing it with a friend; it will help us know whether you want us to make more of these!

Consider joining our Discord Server where we can personally help you make your computer vision project successful!

We would love to see you make this ALPR / ANPR system work with license plates in other countries, so let us know at contact@theos.ai if you do!
