LeNet-5-CMSIS-M4

Implementation of Caffe LeNet-5 on STM32F446RE board with Arm Cortex-M4 core.

0. Prerequisites

Hardware

STM32 NUCLEO-F446RE board
Desktop Computer (GPU is optional)

Software

Jupyter notebook - https://jupyter.org/
Python - https://www.python.org/
Caffe - https://caffe.berkeleyvision.org/
STM32CubeIDE - https://www.st.com/en/development-tools/stm32cubeide.html
PuTTY - https://www.putty.org/

Note: Make sure all of the software above is installed before proceeding to the next steps.

1. Dataset & Model Preparation

Model

LeNet-5 Model Definition: Model/lenet_train_test.prototxt (for training & testing), Model/lenet_deploy.prototxt (for real classification on desktop)
Pre-trained LeNet-5 model: Model/lenet_iter_10000.caffemodel

Dataset

MNIST Dataset in LMDB format: Dataset/mnist_test_lmdb & Dataset/mnist_train_lmdb (for training & testing purpose).
MNIST Dataset in jpg format: https://github.com/teavanist/MNIST-JPG (for real classification; download it and place it in a Test_Dataset dir that you create).

Full Training

(Optional) If you don't want to use the pre-trained LeNet-5 model.

<caffe> train -solver Model/lenet_solver.prototxt

Fine-Tuning

(Optional) If you wish to fine-tune the pre-trained LeNet-5 model.

<caffe> train -solver Model/lenet_solver.prototxt -weights Model/lenet_iter_10000.caffemodel

Note:

To enable the GPU for full training/fine-tuning, add the -gpu 0 argument.
Remember to change the variables in the prototxt files if needed, e.g. the dataset path (lmdb).
<caffe> is the path to your Caffe executable, for example on Windows: C:\Caffe\caffe-master\Build\x64\Release\caffe.exe.
For more information on data preparation and model training, refer to https://caffe.berkeleyvision.org/gathered/examples/mnist.html.
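
The two training commands differ only in the optional -weights and -gpu arguments. As a minimal sketch, the helper below assembles either command line from Python. The function name caffe_train_command is hypothetical and not part of this repository; the executable path "caffe" stands in for your own <caffe> path.

```python
import shlex


def caffe_train_command(caffe_bin, solver, weights=None, gpu=None):
    """Assemble the argument list for `caffe train` (hypothetical helper)."""
    cmd = [caffe_bin, "train", "-solver", solver]
    if weights is not None:
        # Fine-tuning: start from an existing .caffemodel.
        cmd += ["-weights", weights]
    if gpu is not None:
        # Optional GPU id, e.g. 0.
        cmd += ["-gpu", str(gpu)]
    return cmd


if __name__ == "__main__":
    full = caffe_train_command("caffe", "Model/lenet_solver.prototxt")
    tune = caffe_train_command(
        "caffe",
        "Model/lenet_solver.prototxt",
        weights="Model/lenet_iter_10000.caffemodel",
        gpu=0,
    )
    print(shlex.join(full))
    print(shlex.join(tune))
```

The returned list can be passed directly to subprocess.run if you prefer to launch training from a script.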

2. Inference via CPU/GPU

Open Scripts/LeNet5_classification.ipynb via Jupyter Notebook.
Follow and execute the instructions in the Jupyter Notebook.
Remember to change the paths for the following variables: caffe_root, root, model_def, model_weights, labels_file.
You can choose to run inference on the CPU or GPU by calling caffe.set_mode_cpu() or caffe.set_mode_gpu().
The notebook lets you run image classification on a single image or on a group of test images.
The notebook displays the accuracy and inference speed.
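
As a rough illustration of how accuracy and inference speed can be measured over a group of test images, the sketch below times a classifier callable and counts correct predictions. The names evaluate, classify and samples are placeholders and not taken from the notebook; in practice the prediction would come from a Caffe forward pass.

```python
import time


def evaluate(classify, samples):
    """Return (accuracy, mean seconds per image) for a classifier callable.

    `classify` and `samples` are placeholders: `samples` is a list of
    (image, label) pairs and `classify(image)` returns a predicted label.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    correct = 0
    start = time.perf_counter()
    for image, label in samples:
        if classify(image) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    n = len(samples)
    return correct / n, elapsed / n


if __name__ == "__main__":
    # Toy stand-in: "images" are integers, the classifier returns the last digit.
    toy_samples = [(13, 3), (27, 7), (40, 0), (58, 9)]
    acc, sec_per_image = evaluate(lambda x: x % 10, toy_samples)
    print(f"accuracy = {acc:.2%}, {sec_per_image * 1e6:.1f} us/image")
```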

3. Inference via STM32 NUCLEO-F446RE Board

Quantize the weights & biases

nn_quantizer.py: Takes the Caffe model definition (.prototxt) used for training/testing, which must contain valid paths to the datasets (lmdb), and the trained model file (.caffemodel). It parses the network graph connectivity and quantizes the caffemodel to 8-bit weights/activations layer by layer, incrementally, with minimal loss in accuracy on the test dataset. It then dumps the network graph connectivity and the quantization parameters into a pickle file.
Run nn_quantizer.py to parse and quantize the network. This step takes a while on a CPU because it quantizes the network layer by layer while validating the accuracy on the test dataset. To enable the GPU for the quantization sweeps, use the --gpu argument.

python nn_quantizer.py --model ../Model/lenet_train_test.prototxt --weights ../Model/lenet_iter_10000.caffemodel --save lenet_quantize.pkl
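
To give a feel for what 8-bit quantization of a layer involves, here is a minimal sketch of power-of-two fixed-point (Qm.n) quantization, a scheme often used with 8-bit kernels on Cortex-M. Whether nn_quantizer.py uses exactly this rule is an assumption; the function names below are hypothetical and the accuracy-driven sweep described above is not reproduced.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick the fractional bit count so the largest magnitude fits (assumed rule)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    int_bits = math.ceil(math.log2(max_abs))
    return (total_bits - 1) - int_bits


def quantize(values, frac_bits, total_bits=8):
    """Scale by 2**frac_bits, round, and saturate to the signed integer range."""
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    scale = 2.0 ** frac_bits
    # Python's round() rounds exact halves to the nearest even integer.
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantize(q_values, frac_bits):
    """Map quantized integers back to floating point for error checks."""
    scale = 2.0 ** frac_bits
    return [q / scale for q in q_values]


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.12, -0.9]  # made-up example values
    n = choose_frac_bits(weights)
    q = quantize(weights, n)
    print(n, q, dequantize(q, n))
```

With the made-up weights above, all magnitudes are below 1, so 7 fractional bits are chosen and each weight is stored as round(w * 128).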

Convert model into code

code_gen.py: Reads the quantization parameters and network graph connectivity from the previous step and generates code consisting of NN function calls. Supported layers: convolution, innerproduct, pooling (max/average) and relu. It generates (a) weights.h, (b) parameter.h, which contains the quantization ranges, and (c) main.cpp, the network code.
Run code_gen.py to generate the code that runs on Arm Cortex-M CPUs.

python code_gen.py --model lenet_quantize.pkl --out_dir ../Code
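
The following sketch shows one way quantized weights could be written out as a C header entry. The macro name CONV1_WT and the layout are hypothetical; the actual weights.h produced by code_gen.py may be organized differently.

```python
def format_c_array_macro(name, values, per_line=8):
    """Render int8 values as a C macro holding a brace-enclosed initializer."""
    for v in values:
        if not -128 <= v <= 127:
            raise ValueError(f"{v} does not fit in int8")
    rows = [
        ", ".join(str(v) for v in values[i:i + per_line])
        for i in range(0, len(values), per_line)
    ]
    # Each row ends with a backslash so the macro continues on the next line.
    body = ", \\\n    ".join(rows)
    return f"#define {name} {{ \\\n    {body} \\\n}}\n"


if __name__ == "__main__":
    # Made-up values; CONV1_WT is a hypothetical macro name.
    print(format_c_array_macro("CONV1_WT", [64, -32, 15, -115], per_line=2))
```

In C, such a macro could then initialize an array, for example `static const int8_t conv1_wt[] = CONV1_WT;`, again with hypothetical names.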

Convert MNIST Test Images into array format

convert_image.py: Takes a group of MNIST images in jpg format and converts them into signed-int8 arrays. The image arrays are split across several input_x.h files, each containing at most 80 images (due to the memory limitation of the NUCLEO-F446RE board).
All the input_x.h files are listed in an include_list.h file. Comment or uncomment the entries so that only one input_x.h is included and uploaded to the board at a time.

python convert_image.py --image_dir ../Test_Dataset --out_dir ../Code
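
The splitting logic described above can be sketched as follows. The pixel mapping (halving 0..255 to 0..127), the file numbering starting at 0, and all function names are assumptions for illustration; convert_image.py may scale pixels or name files differently.

```python
def to_int8(pixels):
    """Map unsigned 8-bit pixels (0..255) into the signed int8 range.

    Assumption: values are halved to 0..127 so they fit a signed byte.
    """
    return [p >> 1 for p in pixels]


def chunk(items, size=80):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan_headers(num_images, per_file=80):
    """Return {header name: list of image indices} (numbering is assumed)."""
    groups = chunk(list(range(num_images)), per_file)
    return {f"input_{n}.h": g for n, g in enumerate(groups)}


def include_list(header_names, active=0):
    """Build include_list.h text with only one header left uncommented."""
    lines = []
    for i, name in enumerate(header_names):
        prefix = "" if i == active else "// "
        lines.append(f'{prefix}#include "{name}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    plan = plan_headers(200)
    for name, indices in plan.items():
        print(name, len(indices), "images")
    print(include_list(list(plan)))
```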

Build & Run the project via STM32CubeIDE

Create a new project in STM32CubeIDE.
In Board Selector, enter NUCLEO-F446RE as the Commercial Part Number.
Download the CMSIS-NN & CMSIS-DSP packages from https://github.com/ARM-software/CMSIS_5 and add them to your project.
Remember to include both the DSP/Include and NN/Include dirs via Project > Properties > C/C++ General > Paths and Symbols > Includes.
Add the NN/Source dir via Project > Properties > C/C++ General > Paths and Symbols > Source Location.
Open your project's ioc file, and under Pinout & Configuration, expand Timers, select TIM10, and tick 'Activated' to enable the timer.
Copy the content of main.cpp into Core/Src/main.c, and move the generated weights.h, parameter.h, input_x.h, and include_list.h into the Core/Inc dir.
'Build' and 'Run' the project to upload the program to the NUCLEO-F446RE board.
STM32CubeIDE reports the memory utilization after the build.
To view the output messages, open a PuTTY terminal, select 'Serial', enter your Serial Line (e.g. COM3) and Speed (e.g. 115200), and click 'Open'.
Messages such as the classification result, inference cycle count, and accuracy are displayed in the PuTTY terminal.

Additional

The final STM32CubeIDE project for the LeNet-5 implementation has been compressed as LeNet-5-Project.zip.
You should be able to run this project directly on your board to carry out image classification on the MNIST image arrays located in input_x.h.
