
Intel: SSD-MobileNet accuracy mAP=15.280% #27

Open
psyhtest opened this issue Apr 14, 2020 · 17 comments

psyhtest commented Apr 14, 2020

We've meticulously reconstructed all components of Intel's MLPerf Inference v0.5 submission.

Unfortunately, the reached accuracy (15.280%) is much lower than expected (22.627%):

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.153
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.236
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.167
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.101
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.361
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.150
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.178
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.178
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.014
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.114
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.423
mAP=15.280%

To reproduce, follow the instructions for the prebuilt Docker image.

You can also reproduce the same in a native Debian environment by following the Collective Knowledge steps in the source Dockerfile.


psyhtest commented Apr 15, 2020

If I remove the --reverse_input_channels flag from the model conversion, I get a much higher accuracy (20.798%):

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.208
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.320
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.145
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.472
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.192
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.237
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.238
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.167
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.536
mAP=20.798%
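
For intuition, here is a minimal numpy sketch (an illustration only, not submission code) of the suspected channel-order interaction: if the benchmark's preprocessing already supplies the channel order the model expects, --reverse_input_channels bakes a second swap into the IR, the two swaps cancel, and the network sees the wrong order.

import numpy as np

# Illustration only: a double BGR<->RGB swap. Assumes the harness already
# converts the decoded (BGR) image to the order the model expects.
bgr = np.random.randint(0, 256, (300, 300, 3), dtype=np.uint8)  # decoded frame
rgb = bgr[..., ::-1]                    # swap done by the preprocessing code

# --reverse_input_channels adds a second swap inside the converted model,
# so the network effectively sees BGR again:
seen_by_network = rgb[..., ::-1]
assert (seen_by_network == bgr).all()   # double swap = wrong channel order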

psyhtest (Author) commented:

Using the other quantized model without --reverse_input_channels gives worse accuracy (20.150%):

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.201
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.312
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.219
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.138
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.463                            
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.187
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.231                                                
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.232                                   
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.159
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.530
mAP=20.150% 

psyhtest (Author) commented:

Using the other quantized model with --reverse_input_channels gives even worse accuracy (14.827%):

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.148
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.228
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.161
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.095
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.343
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.146
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.174
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.174
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.109
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.407
mAP=14.827%

psyhtest (Author) commented:

Reverse input channels?   Quantized model   Accuracy (mAP)
yes                       asymmetric        14.827%
yes                       symmetric         15.280%
no                        asymmetric        20.150%
no                        symmetric         20.798%

psyhtest (Author) commented:

Here's the full conversion log:

/usr/bin/python3.6 /home/anton/CK_TOOLS/lib-openvino-gcc-8.4.0-2019_R3.1-linux-64/dldt/model-optimizer/mo.py \
  --model_name converted_model \
  --input_model /home/anton/CK_TOOLS/model-tf-mlperf-ssd-mobilenet-quantized-finetuned-for.openvino/ssd_mobilenet_v1_quant_ft_no_zero_point_frozen_inference_graph.pb \
  --input_shape [1,300,300,3] \
  --tensorflow_object_detection_api_pipeline_config /home/anton/CK_TOOLS/model-tf-mlperf-ssd-mobilenet-quantized-finetuned-for.openvino/pipeline.config \
  --tensorflow_use_custom_operations_config /home/anton/CK_TOOLS/lib-openvino-gcc-8.4.0-2019_R3.1-linux-64/dldt/model-optimizer/extensions/front/tf/ssd_v2_support.json
Model Optimizer arguments:                     
Common parameters:
        - Path to the Input Model:      /home/anton/CK_TOOLS/model-tf-mlperf-ssd-mobilenet-quantized-finetuned-for.openvino/ssd_mobilenet_v1_quant_ft_no_zero_point_frozen_inference_graph.pb
        - Path for generated IR:        /home/anton/CK_TOOLS/model-openvino-converted-from-tf-ssd-mobilenet/.
        - IR output name:       converted_model            
        - Log level:    ERROR
        - Batch:        Not specified, inherited from the model     
        - Input layers:         Not specified, inherited from the model
        - Output layers:        Not specified, inherited from the model
        - Input shapes:         [1,300,300,3]
        - Mean values:  Not specified                              
        - Scale values:         Not specified
        - Scale factor:         Not specified                    
        - Precision of IR:      FP32
        - Enable fusing:        True                             
        - Enable grouped convolutions fusing:   True
        - Move mean values to preprocess section:       False                
        - Reverse input channels:       False
TensorFlow specific parameters:                                    
        - Input model in text protobuf format:  False
        - Path to model dump for TensorBoard:   None                 
        - List of shared libraries with TensorFlow custom layers implementation:        None
        - Update the configuration file with input/output node names:   None
        - Use configuration file used to generate the model with Object Detection API:  /home/anton/CK_TOOLS/model-tf-mlperf-ssd-mobilenet-quantized-finetuned-for.openvino/pipeline.config
        - Operations to offload:        None
        - Patterns to offload:  None       
        - Use the config file:  /home/anton/CK_TOOLS/lib-openvino-gcc-8.4.0-2019_R3.1-linux-64/dldt/model-optimizer/extensions/front/tf/ssd_v2_support.json
Model Optimizer version:        unknown version                                                                      
The Preprocessor block has been removed. Only nodes performing mean value subtraction and scaling (if applicable) are kept.
                                                                                      
[ SUCCESS ] Generated IR model.
[ SUCCESS ] XML file: /home/anton/CK_TOOLS/model-openvino-converted-from-tf-ssd-mobilenet/./converted_model.xml
[ SUCCESS ] BIN file: /home/anton/CK_TOOLS/model-openvino-converted-from-tf-ssd-mobilenet/./converted_model.bin
[ SUCCESS ] Total execution time: 38.25 seconds.


psyhtest commented Apr 21, 2020

After my blunder with #29, where I didn't think of running the ImageNet accuracy script with a different data type parameter, I decided to check the COCO accuracy script on the data from the v0.5 submission.

I discovered that for the clx_9282-2s_openvino-linux system the reported accuracy is mAP=22.627%, while the script computes mAP=25.484%:

anton@velociti:/data/anton/inference_results_v0.5/closed/Intel/results$ for accuracy_txt in \
  ./clx_9282-2s_openvino-linux/ssd-small/Server/accuracy/accuracy.txt \
  ./clx_9282-2s_openvino-linux/ssd-small/Offline/accuracy/accuracy.txt \
  ./clx_9282-2s_openvino-linux/ssd-small/SingleStream/accuracy/accuracy.txt \
; do \
  echo "$accuracy_txt"; \
  tail -1 $accuracy_txt; \
  echo "" \
; done
./clx_9282-2s_openvino-linux/ssd-small/Server/accuracy/accuracy.txt
mAP=22.627%

./clx_9282-2s_openvino-linux/ssd-small/Offline/accuracy/accuracy.txt
mAP=22.627%

./clx_9282-2s_openvino-linux/ssd-small/SingleStream/accuracy/accuracy.txt
mAP=22.627%

anton@velociti:/data/anton/inference_results_v0.5/closed/Intel/results$ for mlperf_log_accuracy_json in \
  ./clx_9282-2s_openvino-linux/ssd-small/Server/accuracy/mlperf_log_accuracy.json \
  ./clx_9282-2s_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json \
  ./clx_9282-2s_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json \
; do \
  echo "$mlperf_log_accuracy_json"; \
  wc -l "$mlperf_log_accuracy_json"; \
  /usr/bin/python3.6 \
  /home/anton/CK_TOOLS/mlperf-inference-dividiti.v0.5-intel/inference/v0.5/classification_and_detection/tools/accuracy-coco.py \
  --coco-dir /datasets/dataset-coco-2017-val/ \
  --mlperf-accuracy-file $mlperf_log_accuracy_json; \
  echo "" \
; done

./clx_9282-2s_openvino-linux/ssd-small/Server/accuracy/mlperf_log_accuracy.json
31754 ./clx_9282-2s_openvino-linux/ssd-small/Server/accuracy/mlperf_log_accuracy.json
loading annotations into memory...
Done (t=0.54s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.68s).
Accumulating evaluation results...
DONE (t=0.47s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.369
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.286
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.580
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.194
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.616
mAP=25.484%

./clx_9282-2s_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
66074 ./clx_9282-2s_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
loading annotations into memory...
Done (t=0.55s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.60s).
Accumulating evaluation results...
DONE (t=0.45s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.369
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.286
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.580
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.194
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.616
mAP=25.484%

./clx_9282-2s_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
4255 ./clx_9282-2s_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
loading annotations into memory...
Done (t=0.46s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.61s).
Accumulating evaluation results...
DONE (t=0.46s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.255
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.369
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.286
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.010
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.186
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.580
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.218
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.012
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.194
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.616
mAP=25.484%

Note that the JSON logs for the Server, Offline and SingleStream scenarios have 31754, 66074 and 4255 lines, respectively, not the expected 5002.
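
A quick way to check whether the extra lines correspond to repeated samples (a minimal sketch, assuming the usual loadgen log format: a JSON list of records with seq_id, qsl_idx and a hex-encoded data payload):

import json
import sys

# Sketch: compare the total number of results against the number of unique
# sample indices in a loadgen accuracy log.
with open(sys.argv[1]) as f:
    results = json.load(f)

idxs = [r["qsl_idx"] for r in results]
print(len(idxs), "results,", len(set(idxs)), "unique qsl_idx values")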

psyhtest (Author) commented:

For the ICL-I3-1005G1_OpenVINO-Windows system, I get the expected accuracy and line counts, but also 21248 error lines like ERROR: loadgen(800) and payload(2583339204608) disagree on image_idx in the SingleStream log:

anton@velociti:/data/anton/inference_results_v0.5/closed/Intel/results$ for mlperf_log_accuracy_json in \
  ./ICL-I3-1005G1_OpenVINO-Windows/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json \
  ./ICL-I3-1005G1_OpenVINO-Windows/ssd-small/Offline/accuracy/mlperf_log_accuracy.json \
; do \
  echo "$mlperf_log_accuracy_json"; \
  wc -l "$mlperf_log_accuracy_json"; \
  /usr/bin/python3.6 \
  /home/anton/CK_TOOLS/mlperf-inference-dividiti.v0.5-intel/inference/v0.5/classification_and_detection/tools/accuracy-coco.py \
  --coco-dir /datasets/dataset-coco-2017-val/ \
  --mlperf-accuracy-file $mlperf_log_accuracy_json; \
  echo "" \
; done
...
ERROR: loadgen(1871) and payload(2583339204608) disagree on image_idx
ERROR: loadgen(1871) and payload(2583339204608) disagree on image_idx
ERROR: loadgen(800) and payload(2583339204608) disagree on image_idx
Loading and preparing results...
DONE (t=0.13s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=12.38s).
Accumulating evaluation results...
DONE (t=2.13s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.347
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.246
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.158
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.521
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.205
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.257
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.182
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.594
mAP=22.627%

./ICL-I3-1005G1_OpenVINO-Windows/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
5002 ./ICL-I3-1005G1_OpenVINO-Windows/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
loading annotations into memory...
Done (t=0.51s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.10s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=12.58s).
Accumulating evaluation results...
DONE (t=2.11s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.347
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.246
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.158
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.521
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.205
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.257
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.182
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.594
mAP=22.627%
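
For context on this error: accuracy-coco.py unpacks each response payload as a sequence of 7-float32 detection records and compares the image index stored in the payload against loadgen's qsl_idx. A rough sketch of that decoding, based on my reading of the v0.5 reference tool:

import struct

# Each detection record in the payload is 7 float32 values:
# [image_idx, ymin, xmin, ymax, xmax, score, class].
def decode_detections(hex_data):
    raw = bytes.fromhex(hex_data)
    floats = struct.unpack("%df" % (len(raw) // 4), raw)
    for i in range(0, len(floats), 7):
        image_idx, ymin, xmin, ymax, xmax, score, cls = floats[i:i + 7]
        yield int(image_idx), (ymin, xmin, ymax, xmax), score, int(cls)

# The "disagree on image_idx" error fires when int(image_idx) decoded here
# does not match the qsl_idx that loadgen recorded for the same response.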


psyhtest commented Apr 21, 2020

No problems with DellEMC's results (including OpenVINO and TensorRT):

anton@velociti:/data/anton/inference_results_v0.5/closed/DellEMC/results$ for mlperf_log_accuracy_json in \
  ./DELLEMC_R740xd6248_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json \
  ./DELLEMC_R740xd6248_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json \
  ./DELLEMC_R740xd8276_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json \
  ./DELLEMC_R740xd8276_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json \
  ./R740_T4x4_tensorrt/ssd-small/Server/accuracy/mlperf_log_accuracy.json \
  ./R740_T4x4_tensorrt/ssd-small/Offline/accuracy/mlperf_log_accuracy.json \
; do \
  echo "$mlperf_log_accuracy_json"; \
  wc -l "$mlperf_log_accuracy_json"; \
  /usr/bin/python3.6 \
  /home/anton/CK_TOOLS/mlperf-inference-dividiti.v0.5-intel/inference/v0.5/classification_and_detection/tools/accuracy-coco.py \
  --coco-dir /datasets/dataset-coco-2017-val/ \
  --mlperf-accuracy-file $mlperf_log_accuracy_json | tail -1; \
  echo "" \
; done
./DELLEMC_R740xd6248_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
5002 ./DELLEMC_R740xd6248_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
mAP=22.627%

./DELLEMC_R740xd6248_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
5002 ./DELLEMC_R740xd6248_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
mAP=22.627%

./DELLEMC_R740xd8276_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
5002 ./DELLEMC_R740xd8276_openvino-linux/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
mAP=22.627%

./DELLEMC_R740xd8276_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
5002 ./DELLEMC_R740xd8276_openvino-linux/ssd-small/SingleStream/accuracy/mlperf_log_accuracy.json
mAP=22.627%

./R740_T4x4_tensorrt/ssd-small/Server/accuracy/mlperf_log_accuracy.json
5002 ./R740_T4x4_tensorrt/ssd-small/Server/accuracy/mlperf_log_accuracy.json
mAP=22.911%

./R740_T4x4_tensorrt/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
5002 ./R740_T4x4_tensorrt/ssd-small/Offline/accuracy/mlperf_log_accuracy.json
mAP=22.912%


psyhtest commented Apr 22, 2020

So the resolution seems to have two aspects:

  • use the pre-release branch of OpenVINO, last committed to on 7 October 2019, just before the v0.5 deadline;
  • do not use the reverse_input_channels flag during conversion.

This can be confirmed via a new Docker image (tagged mlperf_inference_results_v0.5_issue_27_resolved on Docker Hub).

Alternatively, you can run this natively:

$ ck install package --tags=lib,openvino,pre-release
$ ck install package --tags=model,openvino,ssd-mobilenet
$ export NPROCS=`grep -c processor /proc/cpuinfo`
$ ck benchmark program:mlperf-inference-v0.5 --cmd_key=object-detection --repetitions=1 --skip_print_timers \
--env.CK_LOADGEN_MODE=Accuracy --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_OPENVINO_NIREQ=$NPROCS --env.CK_OPENVINO_NTHREADS=$NPROCS --env.CK_OPENVINO_NSTREAMS=$NPROCS \
--dep_add_tags.openvino=pre-release
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.347
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.246
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.018
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.158
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.520
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.205
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.257
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.258
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.023
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.183
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.594
mAP=22.617%

psyhtest (Author) commented:

The 2019_R3 release is no good either:

$ ck install package --tags=lib,openvino,2019_R3
$ export NPROCS=`grep -c processor /proc/cpuinfo`
$ ck benchmark program:mlperf-inference-v0.5 --cmd_key=object-detection --repetitions=1 --skip_print_timers \
--env.CK_LOADGEN_MODE=Accuracy --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_OPENVINO_NIREQ=$NPROCS --env.CK_OPENVINO_NTHREADS=$NPROCS --env.CK_OPENVINO_NSTREAMS=$NPROCS \
--dep_add_tags.openvino=2019_R3 --dep_add_tags.loadgen=for.openvino --dep_add_tags.mlperf-inference-src=dividiti.v0.5-intel \
--dep_add_tags.compiler=v8 --dep_add_tags.cmake=v3.14 --dep_add_tags.opencv=v3.4.3
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.208                                                                                                                             
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.320                                                                                                                             
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.226                                                                                                                             
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.016                                                                                                                             
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.145                                                                                                                             
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.472                                                                                                                             
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.192                                                                                                                             
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.237                                                                                                                             
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.238                                                                                                                             
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.021
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.167
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.536
mAP=20.798%

psyhtest (Author) commented:

... while the benchmark code doesn't build with 2020.1:

$ ck install package --tags=lib,openvino,2020.1
$ export NPROCS=`grep -c processor /proc/cpuinfo`
$ ck benchmark program:mlperf-inference-v0.5 --cmd_key=object-detection --repetitions=1 --skip_print_timers \
--env.CK_LOADGEN_MODE=Accuracy --env.CK_LOADGEN_SCENARIO=Offline \
--env.CK_OPENVINO_NIREQ=$NPROCS --env.CK_OPENVINO_NTHREADS=$NPROCS --env.CK_OPENVINO_NSTREAMS=$NPROCS \
--dep_add_tags.openvino=2020.1 --dep_add_tags.loadgen=for.openvino --dep_add_tags.mlperf-inference-src=dividiti.v0.5-intel \
--dep_add_tags.compiler=v8 --dep_add_tags.cmake=v3.14 --dep_add_tags.opencv=v3.4.3
...
[ 50%] Building CXX object CMakeFiles/ov_mlperf.dir/main_ov.cc.o               
/usr/bin/g++-8 -I/home/anton/CK_TOOLS/lib-boost-1.67.0-gcc-8.4.0-compiler.python-3.6.10-linux-64/install/include -I/home/anton/CK_TOOLS/lib-mlperf-loadgen-static-gcc-8.4.0-compiler.python-3.6.10-for.openvino-linux-64/include -I/home/anton/CK_TOOLS/lib-openvino-gcc-8.4.0-2020.1-linux-64/include -I/home/anton/CK_TOOLS/lib-openvino-gcc-8.4.0-2020.1-linux-64/dldt/inference-engine/src/extension -isystem /home/anton/CK_TOOLS/lib-opencv-3.4.3-gcc-8.4.0-linux-64/install/include -isystem /home/anton/CK_TOOLS/lib-opencv-3.4.3-gcc-8.4.0-linux-64/install/include/opencv -fPIE -fstack-protector-strong -Wno-error -fPIC -fno-operator-names -Wformat -Wformat-security -Wall -O2 -std=c++14 -pthread -USE_OPENCV -DBOOST_ERROR_CODE_HEADER_ONLY -O3 -DNDEBUG -o CMakeFiles/ov_mlperf.dir/main_ov.cc.o -c /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/main_ov.cc
In file included from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/sut_ov.h:12,
                 from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/main_ov.cc:13:
/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h:5:10: fatal error: ext_list.hpp: No such file or directory
 #include <ext_list.hpp>                                                       
          ^~~~~~~~~~~~~~
compilation terminated.                         
CMakeFiles/ov_mlperf.dir/build.make:62: recipe for target 'CMakeFiles/ov_mlperf.dir/main_ov.cc.o' failed
make[2]: *** [CMakeFiles/ov_mlperf.dir/main_ov.cc.o] Error 1
make[2]: Leaving directory '/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/tmp'
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/ov_mlperf.dir/all' failed
make[1]: *** [CMakeFiles/ov_mlperf.dir/all] Error 2             
make[1]: Leaving directory '/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/tmp'
Makefile:83: recipe for target 'all' failed                     
make: *** [all] Error 2

psyhtest (Author) commented:

I'm guessing new include paths need to be set for 2020.1:

diff --git a/program/mlperf-inference-v0.5/ov_mlperf_cpu/CMakeLists.txt b/program/mlperf-inference-v0.5/ov_mlperf_cpu/CMakeLists.txt
index ab50451..005fd67 100644
--- a/program/mlperf-inference-v0.5/ov_mlperf_cpu/CMakeLists.txt
+++ b/program/mlperf-inference-v0.5/ov_mlperf_cpu/CMakeLists.txt
@@ -42,6 +42,9 @@ set(OPENVINO_LIBRARY               "${OPENVINO_LIB_DIR}/libinference_engine.so")
 set(OPENVINO_CPU_EXTENSION_LIBRARY "${OPENVINO_LIB_DIR}/libcpu_extension.so")
 set(OPENVINO_INCLUDE_DIR           "${OPENVINO_DIR}/include")
 set(OPENVINO_EXTENSION_DIR         "${OPENVINO_DIR}/openvino/inference-engine/src/extension")
+set(OPENVINO_MKLDNN_NODES_DIR      "${OPENVINO_DIR}/openvino/inference-engine/src/mkldnn_plugin/nodes")^M
+set(OPENVINO_THIRDPARTY_MKLDNN_SRC_CPU "${OPENVINO_DIR}/openvino/inference-engine/thirdparty/mkl-dnn/src/cpu")^M
+set(OPENVINO_THIRDPARTY_MKLDNN_SRC_COMMON "${OPENVINO_DIR}/openvino/inference-engine/thirdparty/mkl-dnn/src/common")^M
 
 MESSAGE(STATUS "OPENVINO_DIR=${OPENVINO_DIR}")
 MESSAGE(STATUS "OPENVINO_LIB_DIR=${OPENVINO_LIB_DIR}")
@@ -83,6 +86,9 @@ include_directories(
     ${LOADGEN_DIR}
     ${OPENVINO_INCLUDE_DIR}
     ${OPENVINO_EXTENSION_DIR}
+    ${OPENVINO_MKLDNN_NODES_DIR}^M
+    ${OPENVINO_THIRDPARTY_MKLDNN_SRC_CPU}^M
+    ${OPENVINO_THIRDPARTY_MKLDNN_SRC_COMMON}^M
 )
 
 set(SOURCE_FILES backend_ov.h dataset_ov.h sut_ov.h infer_request_wrap.h item_ov.h main_ov.cc)

as well as modifying the program:

diff --git a/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h b/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h
index 8ea1de2..b0d2207 100644
--- a/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h
+++ b/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h
@@ -2,7 +2,7 @@
 #define BACKENDOV_H__
 
 #include <inference_engine.hpp>
-#include <ext_list.hpp>
+#include <list.hpp>
 
 #include "infer_request_wrap.h"
 
@@ -79,7 +79,7 @@ public:
         Core ie;
         const std::string device { "CPU" };
         if (device == "CPU") {
-            ie.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>(),
+            ie.AddExtension(std::make_shared<Extensions::Cpu::MKLDNNExtensions>(),
                     "CPU");
             if (settings_.scenario == mlperf::TestScenario::SingleStream) {
                 ie.SetConfig(

but still doesn't build:

In file included from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/sut_ov.h:12,
                 from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/main_ov.cc:13:
/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h: In member function ‘void BackendOV::load(std::__cxx11::string)’:
/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h:82:81: error: no matching function for call to ‘make_shared<template<mkldnn::impl::cpu::cpu_isa_t Type> class InferenceEngine::Extensions::Cpu::MKLDNNExtensions>()’
             ie.AddExtension(std::make_shared<Extensions::Cpu::MKLDNNExtensions>(),
                                                                                 ^

psyhtest (Author) commented:

Trouble is also expected with 2020.2:

In file included from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/backend_ov.h:7,
                 from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/sut_ov.h:12,
                 from /home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/main_ov.cc:13:
/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/infer_request_wrap.h: In member function ‘void InferReqWrap::postProcessImagenet(std::vector<Item>&, std::vector<unsigned int>&, std::vector<long unsigned int>&)’:
/home/anton/CK_REPOS/ck-openvino/program/mlperf-inference-v0.5/ov_mlperf_cpu/infer_request_wrap.h:129:42: warning: ‘void InferenceEngine::TopResults(unsigned int, InferenceEngine::Blob&, std::vector<unsigned int>&)’ is deprecated: InferenceEngine utility functions are not a part of public API. Will be removed in 2020 R2 [-Wdeprecated-declarations]
             TopResults(1, *(b.blob_), res);
                                          ^


psyhtest commented Jun 3, 2020

Still waiting for Intel's response on these issues.

cc: @christ1ne


fenz commented May 10, 2021

@psyhtest I tried a "parallel" reproduction exercise and found similar results. I saw you have a working example in CK: https://github.com/ctuning/ck-openvino#accuracy-on-the-coco-2017-validation-set but I couldn't tell whether you had figured out the "reverse_input_channels" question.
I converted the models using the official OpenVINO Docker container from Intel:

FROM openvino/ubuntu18_dev:2019_R3.1 as builder

WORKDIR /tmp

RUN curl -O https://zenodo.org/record/3401714/files/ssd_mobilenet_v1_quant_ft_no_zero_point_frozen_inference_graph.pb && \
    curl -O https://zenodo.org/record/3252084/files/mobilenet_v1_ssd_8bit_finetuned.tar.gz && \
    tar xf mobilenet_v1_ssd_8bit_finetuned.tar.gz && \
    rm mobilenet_v1_ssd_8bit_finetuned.tar.gz && \
    cp mobilenet_v1_ssd_finetuned/pipeline.config . && \
    rm -rf mobilenet_v1_ssd_finetuned && \
    python3 /opt/intel/openvino/deployment_tools/model_optimizer/mo.py \
        --input_model /tmp/ssd_mobilenet_v1_quant_ft_no_zero_point_frozen_inference_graph.pb \
        --input_shape [1,300,300,3] \
        --reverse_input_channels \
        --tensorflow_use_custom_operations_config /opt/intel/openvino/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
        --tensorflow_object_detection_api_pipeline_config /tmp/pipeline.config

I tried to generate both FP32 (default) and FP16 precision models, with and without the "reverse" option, but I get a "good enough" mAP only when not using the option (which differs from what is suggested in Intel's submission).

Regarding OpenVINO 2020.x, it seems cpu_extension is now part of the library itself and doesn't need to be mentioned separately (openvinotoolkit/openvino#916).
Based on this and on an OpenVINO tutorial (the same approach seems to be used in the newer Intel submissions as well), I changed some of the lines.
From this folder: https://github.com/mlcommons/inference_results_v0.5/tree/master/closed/Intel/code/ssd-small/openvino-linux
I ran:

sed -i '/IE::ie_cpu_extension/d' ./CMakeLists.txt && \
sed -i \
    -e 's/ext_list.hpp/ie_core.hpp/g' \
    -e 's/network_reader_.getNetwork().setBatchSize(batch_size)/network_.setBatchSize(batch_size)/g' \
    -e 's/network_reader_.ReadNetwork(input_model)/Core ie/g' \
    -e '/network_reader_.ReadWeights(fileNameNoExt(input_model) + ".bin");/d' \
    -e 's/network_ = network_reader_.getNetwork();/network_ = ie.ReadNetwork(input_model, fileNameNoExt(input_model) + ".bin");/g' \
    -e '/Core ie;/{$!N;/\n.*const std::string device { "CPU" };/!P;D}' \
    -e '/ie.AddExtension(std::make_shared<Extensions::Cpu::CpuExtensions>(),/,/"CPU");/d' ./backend_ov.h

I tried to keep the changes minimal, not borrowing much from the newer submissions, only (see the Python sketch after this list for the analogous API change):

  • change the network creation call
  • remove the cpu extension
  • cleanup: move the "Core ie" declaration before it is used.
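
For comparison, the analogous migration in OpenVINO's Python API (a sketch assuming a 2020.x install, where IECore.read_network replaces the deprecated CNNNetReader):

from openvino.inference_engine import IECore

# Sketch: in 2020.x, reading the network moved from CNNNetReader to the Core
# object itself; the CPU extension no longer needs to be added explicitly.
ie = IECore()
net = ie.read_network(model="converted_model.xml",
                      weights="converted_model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")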

Regarding the issue: ERROR: loadgen(1871) and payload(2583339204608) disagree on image_idx
It seems the "windows" version creates the "Item" for the "ssd-mobilenet" model differently: linux vs windows. Based on item_ov.h (which looks the same for both the Linux and Windows versions), the Linux version appears to be the correct one; using the same field order in the Windows version makes the error disappear.

I know this is the first submission round, but it is interesting to compare the older and newer versions; that's why clarifying these doubts matters to me.

psyhtest (Author) commented:

@fenz Only a year has passed since I looked into this, wow. Feels more like forever :).

I would have looked at the v1.0 or v0.7 code, but alas Intel only submitted SSD-MobileNet to v0.5.


fenz commented May 10, 2021

So, as far as I understand, the current recommendation is not to use the "--reverse_input_channels" option when converting to the OpenVINO model representation. By the way, I started looking at this a long time ago as well; I'm back on it now since I thought I could understand it a bit better. Thanks for your answer.
