Cash Counting App using React Native and Computer Vision (YOLO v7 Tutorial 2023)
Have you ever wondered how to make a cash counting machine?
Well, we at Theos AI certainly have. Join us today on our fantastic journey of creating a simple Cash Counting App with React Native and Computer Vision.
If you're in a hurry and need to count those riches before the feds show up, just go ahead and clone the GitHub repo.
If you’ve never created your own Artificial Intelligence model, don’t worry, you will fully understand how to do it by the end of this tutorial. For the React Native part, you’ll need a bit of JavaScript knowledge to follow along, but it’s also quite simple: only 300 lines of code.
The Plan
Okay, so let’s think for a minute about how our app and AI should work before we get started.
The user opens the app.
Points the camera to a bunch of cash.
Takes a picture.
The AI detects all the bills in the image, sums them up and shows the amount to the user.
There are many subfields of Computer Vision, such as Image Classification, Instance Segmentation and Text-to-Image Generation (like the amazing DALL·E 2, Midjourney and Stable Diffusion). But as we specified in our plan, the cash counting app will need to detect all the different types of bills in the image, and that can be done with a computer vision task called Object Detection.
Object detection models receive an image as input and return a list of detections as output. Each detection is made up of the following information:
Class: the class name of the detected object, in our case it will be the type of bill (“100” for example).
Confidence: this is an estimation of how confident our AI model is at making this prediction. Confidence scores range from 0 to 1, zero meaning no confidence at all, and one meaning absolute certainty.
X: the x position of the object within the image.
Y: the y position of the object within the image.
Width: the width of the object.
Height: the height of the object.
Later, when we make the App, we will use this detection information to count the bills.
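To make this concrete, here’s a hypothetical sketch in plain JavaScript of what a detection list could look like and how we could sum it. The field names follow the list above, but the exact response format of your deployment may differ.

```javascript
// Hypothetical detections for an image containing a 100 and a 500 peso bill.
const detections = [
  { class: '100', confidence: 0.97, x: 12, y: 34, width: 220, height: 110 },
  { class: '500', confidence: 0.91, x: 240, y: 30, width: 215, height: 108 },
];

// Each class name is the bill's face value, so counting the cash is just a reduce.
const total = detections.reduce((sum, d) => sum + parseInt(d.class, 10), 0);
console.log(total); // 600
```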
The best object detection model at the time of this writing is YOLOv7, so that’s the one we’ll use for our cash counting app.
In AI and Machine Learning, larger neural networks (ones with many millions of artificial neurons) generally perform better. These models require quite a bit of computing power, but since we want our cash counting AI to be as accurate as possible, we’ll use one of the largest versions of YOLOv7.
Deploying this big AI model directly on mobile phones doesn’t make much sense due to its high compute requirements. Since most phones won’t be able to handle it, we’ll deploy it to the cloud as a REST API and call it from our app via HTTP requests.
Let’s do it. Following are the steps we’ll have to take to make our app.
Collect example images similar to the ones that our AI will see in production when used by people.
Label all the objects of interest (bills) in all the example images with bounding boxes.
Train the model.
Deploy the model.
Use this model in our app.
Collecting Images
We don’t need that many images to get a good working model. You should always start with just 100 images, train your AI and test it.
If it’s not working very well, upload a few hundred more images, label them and retrain. Repeat this process until you’re satisfied with your model. Think of it like an AI MVP (minimum viable product).
I’m from Argentina, so we’re going to collect images of Argentinian Pesos, but this will also work if you do it with US dollar bills or any other currency.
I took them with my iPhone and also collected a few images from the internet. Here are some that I took.
We should take images from as many angles, on as many surfaces, and in as many lighting conditions as possible so that our AI model works well in all the real-world situations where people may use it.
Labeling
To tell our AI what we want it to detect, we need to draw bounding boxes around the bills in all our example images.
We’re going to use Theos AI to label our images.
Following are the steps we have to take.
Create a free Theos AI account.
Create a new project.
Create a new dataset.
Upload our images to our dataset.
Create all the bill classes we want to detect.
Start labeling our images.
Let’s go ahead and upload our images to our dataset.
Now let’s create all our bill classes and start labeling. It should take us only a couple of hours to label 128 images.
If you want to use the same dataset I used here, here’s the download link.
We’re now ready to train our AI model.
Training
In order to train our dataset with YOLOv7, we’ll need to follow these three simple steps.
Connect a Google Colab instance to Theos in order to use a free GPU for training.
Create a new training session with our desired neural network algorithm, in our case YOLOv7 W6 (the largest YOLOv7 version on Theos), our dataset and the Google Colab machine that will do the training.
Click the Start Training button and wait for our AI to finish training.
Let’s connect a new Google Colab instance to Theos.
Now, let’s create a new training session and start the training.
The training completed successfully!
We’re now ready to deploy our model to the cloud as a REST API.
Deployment
To deploy our trained object detection model we’ll follow these steps.
Go to the deploy section of Theos.
Create a new deployment by selecting the algorithm we used, YOLOv7 W6, and selecting the best weights (the weights file encodes the knowledge of our AI model).
Let’s create a new deployment and test it inside the Playground.
Our AI model is working!
Finally, it’s time to use it in our React Native app.
The Real Cash App
This will be our app.
We’ll build it with Expo for simplicity. Make sure to clone the GitHub repo so you can follow along while we explain how the code works.
First, let’s import some dependencies.
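The exact import list lives in the repo; as a rough sketch, an Expo camera app like this one would pull in something along these lines (the module selection here is an assumption, not the repo’s exact code):

```javascript
// Assumed dependencies for an Expo camera app; check the repo for the real list.
import React, { useState, useRef } from 'react';
import { View, Text, Image, TouchableOpacity, StyleSheet } from 'react-native';
import { Camera } from 'expo-camera';
```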
Now, let’s create an object to store all the colors in which we’ll draw the predicted bounding boxes of each bill.
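Something along these lines works. The keys must match the bill classes created in Theos; these particular hex values are arbitrary choices, not the repo’s exact palette:

```javascript
// One drawing color per bill class (Argentinian Peso denominations).
// These hex values are placeholder choices; pick any palette you like.
const CLASS_COLORS = {
  '10': '#8a2be2',
  '20': '#dc143c',
  '50': '#ff8c00',
  '100': '#228b22',
  '200': '#1e90ff',
  '500': '#20b2aa',
  '1000': '#ff1493',
};
```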
Let’s copy the URL from our deployment and paste it here.
This is a sleep function we’ll use in between our request retries in case they fail.
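A minimal version of such a helper:

```javascript
// Resolve after the given number of milliseconds, so callers can
// `await sleep(1000)` between failed request retries.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
```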
This is the detect function that will send the image to our AI and get back the detections. For more information on its parameters take a look at our Docs.
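As a sketch of what such a function could look like, assuming the deployment accepts a multipart POST with the image and returns JSON detections. The URL placeholder, form field names and retry policy here are assumptions; the real parameter names are in the Theos docs.

```javascript
const DEPLOYMENT_URL = 'https://...'; // paste your deployment URL here

// Small helper used between retries (same idea as the sleep function above).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Send the image to the deployed model and return its detections.
// confThres / iouThres and the form field names are assumed values.
async function detect(imageUri, { confThres = 0.25, iouThres = 0.45, retries = 3 } = {}) {
  for (let attempt = 0; attempt < retries; attempt++) {
    try {
      const body = new FormData();
      body.append('image', { uri: imageUri, name: 'image.jpg', type: 'image/jpeg' });
      body.append('conf_thres', String(confThres));
      body.append('iou_thres', String(iouThres));
      const response = await fetch(DEPLOYMENT_URL, { method: 'POST', body });
      if (response.ok) return await response.json();
    } catch (err) {
      // network error: fall through to the retry below
    }
    await sleep(1000); // wait a second before retrying
  }
  return []; // give up after all retries and report no detections
}
```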
Here’s the rest of the code, where we lay out all the components and handle the app’s functionality. We’re not going to go through each line because it’s self-explanatory if you know a bit of JavaScript.
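For orientation, the core flow could be sketched like this: take a picture, send it to the detect function, sum up the bills, and store the result in state. The handler name and state setters here are assumptions, not the repo’s exact code.

```javascript
// Hypothetical press handler for the shutter button.
async function onTakePicture(cameraRef, setDetections, setTotal) {
  const photo = await cameraRef.current.takePictureAsync({ quality: 0.5 });
  const detections = await detect(photo.uri); // call the deployed model
  const total = detections.reduce((sum, d) => sum + parseInt(d.class, 10), 0);
  setDetections(detections); // used to draw the colored bounding boxes
  setTotal(total);           // shown to the user as the counted amount
}
```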
The End
We did it!
I hope you enjoyed reading this as much as we enjoyed making it!
If you found this cool or helpful, please consider sharing it with a friend and smashing the star button for the GitHub algorithm; it will help us know if you want us to make more of these!
Also, feel free to fork it and modify it to make it your own.
Consider joining our Discord Server where we can personally help you make your computer vision project successful!
We would love to see you make this app work in other countries, so let us know at contact@theos.ai if you do!