Authors: 娄辰飞, 朱正平, 陈禹池
- `nikolaos/`
  - `bof_utils.py`: the data structure of BoF (Bag-of-Features)
- `global_param.py`: network initializations, the training and evaluation hyperparameters, the hardware configurations and so on
- `inference_new.py`: inference of trained neural networks
- `main_new.py`: the main program
- `models_new.py`: the different types of neural network models
- `train_new.py`: training of neural networks
- `utils_new.py`: some general functions
- `power_management_api/`
  - `api.py`: the APIs that deal with the hardware on the embedded board Jetson TX2
  - `README.md`: the detailed introductions for the api functions
- `scripts/`
  - `inference_with_beta.sh`: the shell script that runs the inference tasks with different `beta` parameters
  - `testcases.sh`: the shell script that runs all the tests specified in our report
  - `train_inference.sh`: the shell script that runs the training and then the inference task
- `models_new/`: the directory that stores all the models
- `create_custom_dataloader`: the functions to create customized dataloaders
- `generate_scripts.py`: the Python program that generates shell scripts for tests automatically. Except for a very few cases, this file is useless; feel free to delete it
- `README.md`: this file
The command looks like the following:
python main_new.py --model_name MODEL_NAME --pretrained_file PRETRAINED_FILE_NAME --optimizer OPTIM_NAME --train_mode TRAIN_MODE --stat_each_layer STAT --evaluate_mode EVALUATE_MODE --task TASK --device DEVICE --trained_file_suffix SUFFIX --beta BETA --save SAVE --baseline BASELINE --core_num CORE_NUM --cpu_freq_level C_FREQ_L --gpu_freq_level G_FREQ_L --scene SCENE --baseline_time B_TIME --sleep_time S_TIME
Their meanings are specified below. Note that not all arguments are meaningful at the same time; for example, when `TASK` is 'train', `EVALUATE_MODE` is meaningless. You do not actually need to include all of these arguments in the command: every option has a default value, specified in `main_new.py`, and when an option is missing from the command, the default value is used.
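For reference, here is a minimal sketch of how such defaults are typically declared with `argparse`. The option names match the command above, but the default values and the exact parsing code in `main_new.py` are assumptions, not copies of the real file.

```python
# Hypothetical sketch of the argument parsing in main_new.py.
# The option names match the command above; the default values shown
# here are illustrative, not the project's actual defaults.
import argparse

parser = argparse.ArgumentParser(description="Early-exit training / inference")
parser.add_argument("--model_name", default="cifar")      # 'cifar', 'vgg' or 'vgg19'
parser.add_argument("--task", default="train")            # 'train' or 'evaluate'
parser.add_argument("--train_mode", default="normal")     # 'normal', 'original' or 'exits'
parser.add_argument("--evaluate_mode", default="normal")  # 'normal' or 'exits'
parser.add_argument("--device", default="cpu")            # 'cpu' or 'cuda'
parser.add_argument("--beta", type=float, default=1.0)    # accuracy / speed-up tradeoff
parser.add_argument("--save", type=int, default=0)        # non-zero: save the trained model
args = parser.parse_args()                                # missing options fall back to defaults
```

With defaults declared like this, a short command such as `python main_new.py --task evaluate` would simply fall back to the defaults for every option that is not given.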
- `MODEL_NAME`: the name of the model, either 'cifar', 'vgg' or 'vgg19'. 'cifar' is a smaller model with only 4 conv layers, 'vgg' is a deeper model with 6 conv layers, and 'vgg19' is an extremely deep convolutional network with 16 layers. 'vgg19' achieves the best accuracy, and we hope that early exit can show its advantage on this model.
- `PRETRAINED_FILE_NAME`: the file that stores the pretrained model. All pretrained files are stored under the directory 'models_new':
  - '[model_name]_normal_default.pt' is the model from vanilla training.
  - '[model_name]_original_default.pt' is the model with early-exit layers implemented but only the original layers pre-trained.
  - '[model_name]_exits_default.pt' is the model with early-exit layers implemented and fully trained.
- `OPTIM_NAME`: the name of the optimizer, either 'sgd' or 'adam'. Training is usually much more efficient with 'adam'.
- `TRAIN_MODE`: the mode for training.
  - 'normal': train the model in the most vanilla setting, without any early-exit scheme.
  - 'original': train the model with early-exit layers implemented, but only train the original layers and ignore the early-exit layers. This mode stores a pre-trained file that is required for the following 'exits' mode.
  - 'exits': train the early-exit layers of the model, assuming that the original layers have already been trained.
- `STAT`: whether to collect the statistics of each layer, 1 for True and 0 for False.
- `EVALUATE_MODE`: the mode for evaluation.
  - 'normal': evaluate the model without any early-exit layers.
  - 'exits': evaluate the model with early-exit layers and adaptive inference methods.
- `TASK`: the task type, either 'train' or 'evaluate'. 'train' trains and stores a model on the training dataset, while 'evaluate' runs inference with a model on the testing dataset.
- `DEVICE`: the device where training and inference are executed, either 'cpu' or 'cuda'.
- `SUFFIX`: when `SAVE` is non-zero, the model is saved in a .pt file whose name consists of some setting parameters and the string `SUFFIX`. This option is mainly used to differentiate files trained under the same hyperparameters, so that they do not overwrite each other.
- `BETA`: the hyperparameter that controls the tradeoff between accuracy and speed-up in the early-exit scheme (see the conceptual sketch after this list). Refer to the paper or our report for more details.
- `SAVE`: if non-zero, save the model; otherwise do not save it.
- `BASELINE`: an integer specifying whether or not the current test is the baseline (1 for true, 0 for false).
- `CORE_NUM`: the number of gpu cores used for the inference task. It can only be 2 or 4.
- `C_FREQ_L`: the frequency level of the cpu cores used at inference time. The value for each level is specified in the file `global_param.py`. The value can only be among [4, 8, 12].
- `G_FREQ_L`: the frequency level of the gpu cores used at inference time. The value for each level is specified in the file `global_param.py`. The value can only be among [2, 5, 8].
- `SCENE`: a string specifying the application scene, either "continuous" or "periodical". Refer to our report for more details of what they mean.
- `B_TIME`: the time it takes for the baseline to finish in the periodical scene. Refer to our report for more details.
- `S_TIME`: the time we want the board to sleep in the baseline setting in the periodical scene. Refer to our report for more details.
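To make the role of `BETA` more concrete, here is a purely conceptual sketch of an entropy-threshold early-exit rule. This is an illustration only: the function, the `blocks` / `exit_heads` structure, and the `threshold = 1 / beta` mapping are hypothetical and are not taken from `models_new.py` or `inference_new.py`; see the paper and our report for the criterion actually used.

```python
# Conceptual sketch only: NOT the project's actual exit criterion.
# Larger beta -> smaller entropy threshold -> fewer (more confident) early exits.
import torch.nn.functional as F

def adaptive_forward(blocks, exit_heads, final_head, x, beta):
    """Return the first early prediction whose softmax entropy is low enough."""
    threshold = 1.0 / beta                     # illustrative mapping from beta to a threshold
    for block, head in zip(blocks, exit_heads):
        x = block(x)
        logits = head(x)
        probs = F.softmax(logits, dim=1)
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
        if entropy < threshold:                # confident enough: exit early, skip later layers
            return logits
    return final_head(x)                       # otherwise fall through to the full network
```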
We can change the hyperparameter settings by making the corresponding changes in `global_param.py`. For example, `cifar_normal_train_hyper` (line 233 in `global_param.py`) specifies the hyperparameters for normal training of the cifar model. By changing the epoch number, batch size and so on there, the hyperparameters for that task change accordingly.
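As a purely illustrative sketch of what such an entry might look like (the key names and values below are hypothetical; only the epoch number and batch size are mentioned above, so check line 233 of `global_param.py` for the real definition):

```python
# Hypothetical shape of a training-hyperparameter entry in global_param.py.
# Key names and values are illustrative, not the project's actual settings.
cifar_normal_train_hyper = {
    "epoch_num": 50,        # number of training epochs
    "batch_size": 128,      # mini-batch size for the dataloader
    "learning_rate": 1e-3,  # passed to the optimizer ('sgd' or 'adam')
}
```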
Unfortunately, if you want to modify the structure of a neural network model (e.g. add a layer or enlarge a layer) or define a new structure, you'll need to do that yourself by directly modifying the file `models_new.py`. You will probably also need to modify (or create) the corresponding layer name lists, initialization and hyperparameter dictionaries in `global_param.py`. That can be somewhat frustrating, so be prepared.
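To give an idea of the kind of edit involved, the hypothetical PyTorch module below shows where a newly added convolutional layer has to be wired in. The class and layer names are illustrative only and do not reproduce the actual models in `models_new.py`.

```python
# Hypothetical example of enlarging a model. This class is NOT the project's
# actual cifar/vgg definition; it only illustrates where a new layer must be
# wired in (and why global_param.py has to be kept in sync afterwards).
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)  # newly added layer
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, num_classes)                      # out_features of conv2 must match here

    def forward(self, x):
        x = self.conv1(x).relu()
        x = self.conv2(x).relu()        # the new layer must also appear in the forward pass
        x = self.pool(x).flatten(1)
        return self.fc(x)
```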
!!The training should be done on a computer / server. The embedded board does not have the computational power to train large convolutional neural networks!!
The first thing you'll need to do is train a network. Say you want to train the model `vgg19` with early-exit layers; you first train the normal layers by executing:
python main_new.py --model_name vgg19 --optimizer adam --train_mode original --task train --device cuda --trained_file_suffix round_1 --save 1
After that, you fix (freeze) all the normal layers and train the early-exit layers by executing the command:
python main_new.py --model_name vgg19 --optimizer adam --train_mode exits --task train --device cuda --pretrained_file models_new/vgg19_train_original_round_1.pt --trained_file_suffix round_1 --save 1
That completes the training, and the trained model is stored in `models_new/vgg19_train_exits_round_1.pt`. From the command output you'll be able to see the accuracy after training. If you are not satisfied with the accuracy, you can modify the model structure or the hyperparameters and then repeat the steps above.
After training the model, you may want to test the early-exit results by running inference with different values of the parameter `beta`. If you want to do that on a computer / server, you will need to modify the code in `inference_new.py` by removing (or otherwise disabling) all the commands regarding the hardware. Then the command you type in the shell looks like:
python3 main_new.py --model_name vgg19 --optimizer adam --train_mode exits --task evaluate --device cuda --pretrained_file models_new/vgg19_train_normal_round_1.pt --trained_file_suffix round_1 --beta 5 --save 0 --evaluate_mode normal
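As an alternative to deleting the hardware-related code from `inference_new.py`, you could guard it behind a flag so the same file runs on both the server and the board. The sketch below is only an assumption about how such a guard could look: `ON_JETSON` and `maybe_configure_hardware` are hypothetical names, and it assumes `power_management_api` is importable as a package.

```python
# Hypothetical guard around board-specific code in inference_new.py.
# ON_JETSON and maybe_configure_hardware are illustrative names, not the project's API.
import platform

ON_JETSON = platform.machine() == "aarch64"  # crude check: Jetson TX2 is an ARM64 board

def maybe_configure_hardware(cpu_freq_level, gpu_freq_level):
    if not ON_JETSON:
        return  # on a desktop / server, silently skip the power-management calls
    from power_management_api import api  # only import on the board
    # ... call the real functions documented in power_management_api/README.md here ...
```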
There is a shell script, `scripts/inference_with_beta.sh`, that can run such tests with `beta` ranging from 1 to 10.5. Before running the script, make sure you create a directory named `experimental_results_new/` under the current directory (the directory where this README.md is located).
If you finally want to test the inference tasks on embedded boards, you will need to run commands like:
sudo python3 main_new.py --pretrained_file models_new/vgg19_train_exits_round_1.pt --stat_each_layer 0 --evaluate_mode exit --beta 6 --baseline 0 --core_num 2 --cpu_freq_level 4 --gpu_freq_level 2 --scene continuous --baseline_time 30 --sleep_time 30
There is a shell script, `scripts/testcases.sh`, that can run all 74 tests specified in our report. Before running the script, make sure you create a directory named `experimental_results_new/` under the current directory (the directory where this README.md is located).