Allow inline optimizer to spit out a single variable #83

janjongboom · 2019-07-23T07:09:03Z

Weight updates are much easier if a single block of memory holds all the weights. One way to do this would be to concatenate all the variables that the inline optimizer spits out and reference that memory directly from the trained.cpp file. Bonus points if you can pass a block of memory into trained.cpp, so the weights can be placed in memory-mapped flash (on many ST boards the QSPI is mapped at 0x90000000).

I figured at first that using the existing file system functions would help with the latter, but LittleFS on (at least my) QSPI boards is extremely slow, and inferencing time goes from 320 ms to 2000 ms on DISCO-L475VG-IOT01A because of the slow fopen() calls (see ARMmbed/mbed-os#11085).

neil-tan · 2019-07-26T08:53:55Z

@janjongboom This is interesting.
Is the intention to optimize the firmware for delta updates?

Please help me envision how the latter should be done. Currently, the weights are compiled into .text and injected into flash as part of the application.
Declaring a tensor pointing to a specific memory region should be straightforward. How are you planning on injecting the weights into the QSPI, though?

tagging @Knight-X

janjongboom · 2019-07-26T11:48:50Z

@neil-tan It's in .text but split over many variables. So if I want to update them through some sort of delta update process, I need to get the location in flash of every variable and overwrite that portion. If it's just a single blob and uTensor knows the offset of each file in that blob, you only need to do one update and don't have to do any bookkeeping yourself.
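
A minimal sketch of the single-blob idea, written in Python since that is the host-side language suggested in this thread. It concatenates named float32 tensors into one byte blob and records an offset table, so a reader only needs the blob's base address plus the table. The tensor names, the little-endian float32 encoding, and the 4-byte alignment are assumptions for illustration; none of this is uTensor's actual format or API.

```python
import struct


def pack_weights(tensors, align=4):
    """Concatenate named float32 tensors into one blob.

    Returns (blob, table), where table maps name -> (byte offset, element count).
    """
    blob = bytearray()
    table = {}
    for name, values in tensors.items():
        # Pad with zeros so every tensor starts on an aligned boundary.
        blob.extend(b"\x00" * (-len(blob) % align))
        table[name] = (len(blob), len(values))
        blob.extend(struct.pack("<%df" % len(values), *values))
    return bytes(blob), table


def read_tensor(blob, table, name):
    """Recover one tensor from the blob using only its table entry."""
    offset, count = table[name]
    return list(struct.unpack_from("<%df" % count, blob, offset))


# Hypothetical tensor names and values, for illustration only.
tensors = {"fc1_weight": [0.5, -1.25, 2.0], "fc1_bias": [0.125]}
blob, table = pack_weights(tensors)
```

With a layout like this, the firmware side would only need one base pointer (for example into memory-mapped QSPI) and the offset table, rather than the flash location of every individual variable.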

mbartling · 2019-07-30T13:19:56Z

@janjongboom What's the benefit of doing this over just forcing the individual constants into a specified region of ROM? Given they are in the same translation unit they probably end up contiguous anyways.

janjongboom · 2019-07-31T10:10:30Z

@mbartling Because then I need to keep track of 1) all variables and 2) the location of these variables in ROM, which means I need a compiler to figure this out. If the location of the weights is abstracted away by uTensor, I don't need a compiler. I just need some Python code to generate the new weights definition and send it to the board.

> Given they are in the same translation unit they probably end up contiguous anyways.

Yes, but no guarantees here.
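
To illustrate why a single blob keeps the host-side tooling simple, here is a rough Python sketch of a delta step: given the old and new blobs with an identical layout, it finds the byte runs that changed, and those runs are all that would need to be sent to the board. The function names and the sample bytes are hypothetical, and the sketch ignores flash specifics (real flash usually has to be erased in sectors, so an actual updater would likely round the runs to sector boundaries).

```python
def changed_ranges(old, new):
    """Return a list of (offset, data) runs where new differs from old."""
    if len(old) != len(new):
        raise ValueError("layout changed: blobs must have the same length")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(old, new)):
        if a != b and start is None:
            start = i  # a differing run begins here
        elif a == b and start is not None:
            runs.append((start, new[start:i]))  # run ended at i - 1
            start = None
    if start is not None:
        runs.append((start, new[start:]))  # run extends to the end
    return runs


def apply_ranges(old, runs):
    """Apply the runs to a copy of old, as the board-side updater would."""
    patched = bytearray(old)
    for offset, data in runs:
        patched[offset:offset + len(data)] = data
    return bytes(patched)


# Hypothetical blob contents, for illustration only.
old_blob = bytes([1, 2, 3, 4, 5, 6, 7, 8])
new_blob = bytes([1, 2, 9, 9, 5, 6, 7, 0])
runs = changed_ranges(old_blob, new_blob)
```

Because every offset is relative to the start of the one blob, no linker map or compiler output is needed to work out where each run belongs.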
