Hi, I am trying to convert the network powering KataGo (https://github.com/lightvector/KataGo), a computer Go implementation, to C code, so that I can have an extremely lightweight Go engine for embedded platforms. I am using an older, smaller network, and I have successfully modified https://github.com/isty2e/KataGoONNX to produce an ONNX file for one of the older and smaller models. The final model size is about 4.3 MB (attached). When trying to run this network through onnx2c, I run into the issue that the "ReduceSum" node is not implemented. While I am an experienced programmer, I have very little experience with ML frameworks in general.
Thanks a lot!
BTW, I am also looking into quantization or even binarization (i.e., multiplies -> xors), but that is still a bit further away. From what I've gathered so far, it seems like a plausible approach would be to do the quantization directly in torch or ONNX, and then use onnx2c on the quantized version.
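For what it's worth, the core idea behind post-training weight quantization is small enough to sketch in plain numpy. This is a generic illustration, not the scheme of any particular framework; the function names and the symmetric per-tensor scheme are my own choices for the sketch:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to float for comparison against the original."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.01, 1.0], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# per-element reconstruction error is bounded by scale / 2
print(np.max(np.abs(w - w_hat)))
```

Frameworks layer a lot on top of this (per-channel scales, zero points, calibration of activations), but the weights-to-int8 step is essentially the above.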
Hi fishcu,

First, the quantization in onnx2c was done at a time when that general idea was just transitioning from academic papers to available tools. At the time, it was simpler to write my own quantization than to try to get other tools working. I haven't looked into quantization since I wrote that MNIST-AVR demo, but I am under the impression that quantization is much more user-friendly in the high-level ML frameworks today. So I would guess the best way of doing quantization with onnx2c today is to compile an already quantized .onnx file.

And to your other questions :)

1. I never looked into the onnxruntime implementation for a reference. There were two reasons for this: I wanted to build an understanding of what is happening in the nodes, but also because different implementations are very much tied to the frameworks they are implemented in, i.e. porting is as much effort as writing from scratch. All of the onnx2c implementation is written against the ONNX documentation (https://github.com/onnx/onnx/blob/main/docs/Operators.md) and the backend unit tests (https://github.com/onnx/onnx/tree/main/onnx/backend/test/data, especially the small ones under the …). This approach is not always the best: the ONNX backend tests do not cover 100% of the specification, and the specification itself is many times a bit too "high level" (case in point: the …).

2. The attributes that need to be parsed are defined in the Operators document: https://github.com/onnx/onnx/blob/main/docs/Operators.md#ReduceSum

3. onnx2c has several layers of testing. Benchmarking was added fairly recently and is still a bit of a work in progress. The main testing uses the ONNX backend node tests as unit tests. Each of these tests consists of a network as an .onnx file, plus all inputs and outputs stored as protobuf files. To use these backend tests, it suffices to add the ONNX tests for ReduceSum as … If your ReduceSum implementation passes all the ONNX backend tests for this operand, that is good enough (for me) to merge in.
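As an aside on reading the spec: the ReduceSum semantics in the Operators document map almost directly onto numpy, which can serve as a mental reference before writing the C loops. This is my own sketch of the opset 13 behavior (where `axes` is an input and `keepdims`/`noop_with_empty_axes` are attributes), not code from onnx2c:

```python
import numpy as np

def reduce_sum(data, axes=None, keepdims=1, noop_with_empty_axes=0):
    """Reference ReduceSum following the ONNX operator spec (opset >= 13)."""
    if axes is None or len(axes) == 0:
        if noop_with_empty_axes:
            return data  # empty axes + noop flag: identity
        axes = tuple(range(data.ndim))  # default: reduce over all axes
    # negative axis values count from the end, as in the ONNX spec
    axes = tuple(a % data.ndim for a in axes)
    return np.sum(data, axis=axes, keepdims=bool(keepdims))

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
print(reduce_sum(x, axes=[1]).shape)               # (2, 1, 4)
print(reduce_sum(x, axes=[-1], keepdims=0).shape)  # (2, 3)
```

The `keepdims` handling (reduced axes kept as size-1 dimensions vs. dropped) is what shapes the output tensor dimensions in the generated C.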
If the ONNX backend unit tests are very few or lack some particularly useful corner cases, there are some local (to onnx2c) unit tests that I've generated using scailableonnx (the Python files for this can be found somewhere under the …). I'm leaving it up to you to decide if extra tests are needed for ReduceSum. Writing my own was needed with the LSTM and Conv operands, but those are the most complex operands anyway.

**End notes**

This answer became a bit long... Hopefully it answers your questions :) One thing to be aware of is that onnx2c stops instantly when it discovers an unimplemented operand, i.e. ReduceSum might not be the only one missing... But looking forward to your contributions :)
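If you do end up writing extra tests, the check they perform is simple: run the explicit-loop version (the kind of code onnx2c emits) and compare its output against a trusted reference within a tolerance. A hypothetical sketch, with the function name and tolerance values chosen for illustration rather than taken from the onnx2c test harness:

```python
import numpy as np

def reduce_sum_axis0_loops(data):
    """ReduceSum over axis 0 with keepdims=1, written with explicit loops
    in the style of generated C code (hypothetical sketch)."""
    d0, d1 = data.shape
    out = np.zeros((1, d1), dtype=data.dtype)
    for j in range(d1):
        acc = 0.0
        for i in range(d0):
            acc += data[i, j]
        out[0, j] = acc
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 7)).astype(np.float32)
got = reduce_sum_axis0_loops(x)
want = np.sum(x, axis=0, keepdims=True)
# backend-test-style comparison; the tolerances here are illustrative
assert np.allclose(got, want, rtol=1e-3, atol=1e-5)
print("match")
```

The ONNX backend tests do the same thing, except the reference inputs and expected outputs come from the stored protobuf files rather than being computed on the fly.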