Car Make & Model Recognition

Based on a fast neural network architecture, our car make and model recognition module can be easily integrated into applications that require accurate tagging of car images. It is robust to different lighting conditions and viewing angles. It can run on embedded hardware or on-premises servers, or be deployed as a cloud API.

Many industries could benefit from our car recognition module, including security, marketing, and law enforcement. Some sample use cases are:

Intelligent Video Surveillance
Smart Billboards
Traffic Analytics
Tagging of Video and Images
License Plate Verification


Technical Details

The car make and model classifiers that we offer are provided purely as binary neural network model files. The classification models are delivered in the following formats: TensorFlow protobuf, TensorFlow saved_model, ONNX, MNN, TFLite, and OpenVINO. No object detector is included; developers can use any object detector they prefer, such as YOLO or SSD, to find the cars in each frame. The detected cars must be cropped and resized to 224x224 pixels, which is the input image size of the classifier.

The car classifier is based on the MobileNetV3 neural network architecture. It is very fast and runs in real time on the CPU of a regular PC: classifying one car image takes 35 milliseconds on an Intel Core i5 CPU. For faster inference, an NVIDIA GPU is recommended. The speedup of a GPU over a CPU depends on the type of graphics card, the type of CPU, and the batch size (the number of images processed simultaneously). With a multi-core CPU, very high recognition speed is possible even without a GPU. Because object detection is the most computationally intensive task, it is possible to run the detector on the GPU and do the classification on the CPU.

There are many ways to integrate the car classifier into your software. Runtime libraries that can be used include TensorFlow (as a standalone library or as a model server), Microsoft ONNX Runtime, NVIDIA TensorRT, the Alibaba MNN lightweight deep learning framework, the TFLite library, and the Intel OpenVINO toolkit. The classifier can be run from C++ or Python. Another option is TensorFlow Serving, a high-performance serving system for machine learning models designed for production environments. It exposes a RESTful API (on port 8501) and a gRPC interface (on port 8500). The model server can be packaged in a Docker container and hosted in the cloud or on on-premises servers.
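
As an illustration of the TensorFlow Serving option, the following minimal sketch builds a request for the REST interface on port 8501 using only the Python standard library. The model name `car_mmr`, the host, and the pixel value range are assumptions made for the example; the actual values depend on how the model server is started and on the preprocessing the delivered model expects.

```python
import json
import urllib.request

INPUT_SIZE = 224  # classifier input size stated in the text


def build_predict_request(pixels, host="localhost", port=8501, model_name="car_mmr"):
    """Build (but do not send) a TensorFlow Serving REST predict request.

    `pixels` is one cropped car image as nested lists shaped [224][224][3].
    `model_name` is a hypothetical name; use the one your server was started with.
    """
    if len(pixels) != INPUT_SIZE or any(len(row) != INPUT_SIZE for row in pixels):
        raise ValueError(f"expected a {INPUT_SIZE}x{INPUT_SIZE} image")
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    # TensorFlow Serving's row format: one entry in "instances" per image.
    body = json.dumps({"instances": [pixels]}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def classify(pixels, **kwargs):
    """Send the request and return the raw "predictions" list from the server."""
    request = build_predict_request(pixels, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["predictions"]
```

Calling `classify(crop)` requires a running model server; `build_predict_request(crop)` can be used on its own to inspect the URL and payload first.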

The choice of inference engine is important for getting optimal results. On Intel CPUs, the best performance is achieved with the OpenVINO runtime library. For ARM processors, TFLite or MNN is more suitable. If inference is done on an NVIDIA GPU, the optimized NVIDIA TensorRT library gives the best performance. Besides the hardware platform, many other factors have to be considered, such as the mode of operation (optimized for throughput or for latency), the batch size, the architecture of the software, and the image processing pipeline. Quantizing the models is a common way to boost performance at the expense of a slight decrease in accuracy. Other factors to consider include whether the model runs on an edge device or on a server and whether the software needs to scale in the cloud. Using a model server like TensorFlow Serving might be the best option for some use cases.
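
The hardware recommendations above can be captured in a small lookup table, for example to pick a model format at deployment time. This is only a sketch: the hardware labels are illustrative names invented for the example, and the final choice still depends on the other factors listed above.

```python
# Recommendations taken from the paragraph above; the keys are illustrative labels.
RECOMMENDED_RUNTIMES = {
    "intel_cpu": ["OpenVINO"],
    "arm_cpu": ["TFLite", "MNN"],
    "nvidia_gpu": ["TensorRT"],
}


def recommend_runtimes(hardware):
    """Return the runtime libraries suggested for a hardware label."""
    try:
        return RECOMMENDED_RUNTIMES[hardware]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDED_RUNTIMES))
        raise ValueError(f"unknown hardware {hardware!r}; expected one of: {known}")
```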

Specifications:

Number of supported car brands: 400
Number of supported car models: 7000
Minimum vehicle size: 30x30 pixels
Accuracy: 75% - 95%, depending on the dataset
Speed: 35 milliseconds on Intel Core i5 CPU
View angles: front, rear, side view
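
The minimum vehicle size can be enforced before classification by discarding detections that are too small. The sketch below assumes the object detector returns boxes as `(x_min, y_min, x_max, y_max)` pixel tuples; that box format and the function name are assumptions for the example, since detector output formats differ.

```python
MIN_VEHICLE_SIZE = 30  # pixels, from the specifications above


def usable_crop_boxes(detections, frame_width, frame_height, min_size=MIN_VEHICLE_SIZE):
    """Clamp detector boxes to the frame and drop those below the minimum size.

    `detections` is a list of (x_min, y_min, x_max, y_max) tuples in pixels.
    Returns integer (left, top, right, bottom) boxes ready for cropping.
    """
    boxes = []
    for x_min, y_min, x_max, y_max in detections:
        left = max(0, int(x_min))
        top = max(0, int(y_min))
        right = min(frame_width, int(x_max))
        bottom = min(frame_height, int(y_max))
        # Keep only crops that are at least min_size pixels in both dimensions.
        if right - left >= min_size and bottom - top >= min_size:
            boxes.append((left, top, right, bottom))
    return boxes
```

Each remaining box can then be cropped from the frame and resized to 224x224 pixels with the image library of your choice.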

Recommended open-source object detectors with state-of-the-art accuracy:

SSD detector with MobileNet V2 feature extractor
Single Shot Multibox Detector (SSD) with Inception V2 feature extractor
Faster R-CNN with Inception Resnet v2 feature extractor
YOLOv4 - Real-Time Object Detection
RetinaNet - Focal Loss for Dense Object Detection

Business Applications

Intelligent Video Analytics

Public safety and security organizations can incorporate advanced search and car analytics functionality into their software to find or redact relevant information in video records.

Traffic Analytics

Cities are getting smarter: by using the big data supplied by traffic cameras, transportation systems can be managed more efficiently.

Digital Asset Management

Car recognition helps with organizing, storing, and retrieving multimedia content such as photos and videos, and with building searchable car image databases for video and image archives.