N2D2 1.2.0
💡 Highlights
We are delighted to announce the new release of N2D2, with exciting new features and a little tidying up in the codebase 🧹
We would like to thank everyone who contributed to this release!
- Improvements and bug fixes in the Python API: the API keeps getting more pleasant to use. This release integrates multiple new features, such as a summary function, and fixes several bugs. Don't hesitate to give us your feedback about it!
- Renamed the interoperability modules: you can now use our interops under their new names, pytorch_to_n2d2 and keras_to_n2d2.
- Fixed bugs in the generation of the N2D2 exports and improved the user experience: several bugs have been fixed in the export code as well as in the structure of the exports. Try them out and tell us what you think of the improvements.
Moreover, more features are already planned for the next releases, such as other QAT techniques (the SAT technique, for example).
So don't go too far!
🐛 Bug Fixes
Python API
- Fixed a bug where, when N2D2 is not compiled with JsonCpp, the Python API would fail to load because some C++ classes were not defined.
- When generating a network from ONNX, the first cell is linked to a dummy provider; fixed a bug where, when calibrating the network, the first cell still pointed to this dummy provider instead of the new one.
- Fixed an issue in LSQ quantization when re-initializing weights and diff weights.
- Fixed a bug where it was not possible to export a float32 model.
- Fixed a bug where the ONNX import would fail to fuse the bias into a fully connected/conv layer.
- `Interface` instances can now be used as an input for a cell without coming from another cell themselves.
- Updated torch -> n2d2 tensor conversion to handle views.
C++
- Added back the former CPP export; the current one has been moved to `CPP_Quantization`.
- The Memory Manager has been fixed for the Float32 export
- DeepNet::removeCell now updates the cell pointed by the Target if the removed cell is the last one
- Fixed the ONNX import to fuse Mul and Add layers into a BatchNorm layer.
- Fixed ONNX import padding for opset version < 11 (see onnx changelog).
- Versions < 11 have pads as a parameter of the node.
- Versions >= 11 have pads as an input of the node.
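To make the opset difference concrete, here is a minimal sketch of both Pad encodings using the onnx Python helpers (tensor names and pad values are arbitrary examples):

```python
# Minimal sketch of the two ONNX "Pad" encodings (illustrative only).
from onnx import helper, TensorProto

# Opset < 11: pads are an *attribute* of the node.
pad_opset2 = helper.make_node(
    "Pad", inputs=["x"], outputs=["y"],
    mode="constant", pads=[0, 0, 1, 1, 0, 0, 1, 1],
)

# Opset >= 11: pads are an *input* of the node (an int64 tensor).
pads_tensor = helper.make_tensor(
    "pads", TensorProto.INT64, dims=[8], vals=[0, 0, 1, 1, 0, 0, 1, 1],
)
pad_opset11 = helper.make_node(
    "Pad", inputs=["x", "pads"], outputs=["y"], mode="constant",
)
```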
⚙️ Refactor
Python API
- Global variable is now an enum; `n2d2.global_variables.cuda_compiled` has been renamed to `cuda_available`.
- Introduced the use of single method dispatch for code readability (code specific to the datatype of the first parameter of a method; see the sketch after this list).
- Conv, Fc and Deconv layers use a Linear activation by default instead of None, to fix an error during post-training quantization.
- Added protobuf as a dependency of the Keras interop and removed the dependency on the onnx package in requirements.txt.
- Renamed pytorch_interoperability -> pytorch_to_n2d2 and keras_interoperability -> keras_to_n2d2
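For illustration, Python's functools.singledispatchmethod implements this pattern: the executed code is selected from the type of the first (non-self) argument. A minimal sketch, not the actual N2D2 code:

```python
# Single dispatch on the first parameter's type (illustrative sketch).
from functools import singledispatchmethod

class Converter:
    @singledispatchmethod
    def convert(self, value):
        raise NotImplementedError(f"Unsupported type: {type(value)}")

    @convert.register
    def _(self, value: int):
        # Code specific to int inputs.
        return float(value)

    @convert.register
    def _(self, value: list):
        # Code specific to list inputs.
        return [float(v) for v in value]

c = Converter()
assert c.convert(3) == 3.0
assert c.convert([1, 2]) == [1.0, 2.0]
```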
C++
- Refactored quantization and export functions to remove the use of the DeepNet referenced by cells.
- Renamed the N2D2 namespace to `N2D2_Export` in the CPP export, to avoid conflicts with N2D2 when using both N2D2 and the CPP export.
- Renamed the ONNX constant to `EXPORT_ONNX` in the CPP export, to avoid a conflict with the ONNX constant in N2D2.
- The binaries generated by the exports are now located in `./bin/run_export`.
- Multiple files in the CPP export have been improved, such as the `Makefile` and `cpp_utils.cpp`.
🚀 Improvements
Python API
- Added a decorator to check input types in the Python API (see the sketch after this list).
- Added default models ResNetv1-[18,34,50,101,152].
- Improved the export tests by compiling and running the exports.
- Added an option to allow the user to choose the opset version used for PyTorch interoperability.
- Added a summary method, a new display system to summarize a neural network.
The implemented function draws a table with all the NN layers.
For each layer, multiple pieces of information are displayed, such as the output dimensions, the number of parameters, the computing demand (in MACs), some extra information depending on the layer type, and whether the layer is trainable.
Here is an example of what summary can do with a small network (a note on the Param # and MAC # columns follows the table).
--------------------------------------------------------------------------------------------------------------
Layer (type) Output Shape Param # MAC # Connected to Extra Grad
==============================================================================================================
Image1 (input) (1, 3, 256, 256) 0 0k -
Conv_0 (Conv 3x3) (1, 64, 256, 256) 1,792 112,657k Image1 Act: ReLu True
Pool_0 (Pool) (1, 64, 129, 129) 0 0k Conv_0 size: 2x2, pad: 1 -
Conv_1 (PointWise) (1, 128, 131, 131) 8,320 136,323k Pool_0 Act: ReLu True
Fc_0 (Fc) (1, 100, 1, 1) 0 219,660k Conv_1 True
Fc_1 (Fc) (1, 10, 1, 1) 0 1k Fc_0 True
Softmax_0 (Softmax) (1, 10, 1, 1) 0 0k Fc_1 -
Features (output) (1, 10, 1, 1) 0 0k Softmax_0 -
==============================================================================================================
Total params: 10,112
Total computing: 468,642,326 MAC
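For reference, the Param # and MAC # columns follow the usual convolution formulas; a minimal sketch of those formulas is below (N2D2's exact counting conventions may differ slightly):

```python
# Standard formulas for a 2D convolution layer (illustrative sketch;
# the summary method may count MACs with slightly different conventions).
def conv2d_params(c_in: int, c_out: int, k_h: int, k_w: int) -> int:
    # One k_h x k_w x c_in kernel per output channel, plus one bias each.
    return k_h * k_w * c_in * c_out + c_out

def conv2d_macs(c_in: int, c_out: int, k_h: int, k_w: int,
                h_out: int, w_out: int) -> int:
    # Each output value requires k_h * k_w * c_in multiply-accumulates.
    return k_h * k_w * c_in * h_out * w_out * c_out

# Conv_0 from the table above: 3x3 kernel, 3 -> 64 channels.
print(conv2d_params(3, 64, 3, 3))   # 1792 parameters, as reported
```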
- Added the `list_exportable_cell` function to the Python API export module.
Usage example:
>>> import n2d2
>>> n2d2.export.list_exportable_cell("CPP_TensorRT")
+----------------+-----------+
| Cell Name | Available |
+================+===========+
| Activation | Yes |
+----------------+-----------+
| BatchNorm2d | Yes |
+----------------+-----------+
| Conv | Yes |
+----------------+-----------+
| Deconv | Yes |
+----------------+-----------+
| Dropout | No |
+----------------+-----------+
| ElemWise | Yes |
+----------------+-----------+
| Fc | Yes |
+----------------+-----------+
| Padding | Yes |
+----------------+-----------+
| Pool | Yes |
+----------------+-----------+
| Reshape | Yes |
+----------------+-----------+
| Resize | Yes |
+----------------+-----------+
| Scaling | Yes |
+----------------+-----------+
| Softmax | Yes |
+----------------+-----------+
| Transformation | No |
+----------------+-----------+
| Transpose | Yes |
+----------------+-----------+
- Added the list of exportable cells to the documentation.
- Created a decorator to template documentation.
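The input type-checking decorator mentioned above could look roughly like the following annotation-based sketch (illustrative only, not the actual N2D2 implementation):

```python
# Minimal sketch of an input type-checking decorator (illustrative;
# the actual N2D2 decorator may differ).
import functools
import inspect

def check_types(func):
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if (expected is not inspect.Parameter.empty
                    and isinstance(expected, type)
                    and not isinstance(value, expected)):
                raise TypeError(
                    f"'{name}' should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)
    return wrapper

@check_types
def scale(tensor: list, factor: float) -> list:
    return [x * factor for x in tensor]

scale([1.0, 2.0], 2.0)   # OK
# scale([1.0, 2.0], "2")  # raises TypeError
```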
C++
- Added broadcasting support for the ElemWise cell ((1 × 1) Prod (N × N) = (N × N); see the illustration below).
- Added a CPU implementation for the LSQ quantization method.
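For comparison, the new ElemWise broadcasting behaves like the familiar NumPy rule; a quick NumPy illustration of the (1 × 1) Prod (N × N) case (not N2D2 code):

```python
import numpy as np

scalar_map = np.full((1, 1), 2.0)            # shape (1, 1)
feature_map = np.arange(9.0).reshape(3, 3)   # shape (N, N) with N = 3

# (1, 1) * (N, N) broadcasts to (N, N), matching the new ElemWise rule.
product = scalar_map * feature_map
assert product.shape == (3, 3)
```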