mcuboot/mcumgr: board not upgradable after "Not enough free space to run swap upgrade" #58103

Closed
fabiobaltieri opened this issue May 20, 2023 · 9 comments · Fixed by #64586
Labels
area: mcumgr bug The issue is a bug, or the PR is fixing a bug priority: medium Medium impact/importance bug

Comments

@fabiobaltieri
Member

Describe the bug
Hi, I ran into this issue with an application configured with two partitions for swap upgrade. I reproduced it on an nrf52dk, but it should apply to any board with this setup.

It appears that if a large enough image is loaded into slot2 (in my case it's 217820 bytes out of 220 KB, 96.69% used), mcuboot may fail to swap with something like:

W: Not enough free space to run swap upgrade
W: required 225280 bytes but only 221184 are available

and runs the previous image. This is expected and documented in https://developer.nordicsemi.com/nRF_Connect_SDK/doc/latest/mcuboot/design.html#swap-without-using-scratch. The problem is that once the board enters this condition, there appears to be no way to recover it with mcumgr:

$ mcumgr -c acm0 image list
Images:
 image=0 slot=0
    version: 0.0.0
    bootable: true
    flags: active confirmed
    hash: 13a9251314c31970101ec3ff4ca441f5c90d36f247644cc0cdc8e30b4c8bf6f1
 image=0 slot=1
    version: 0.0.0
    bootable: true
    flags: pending
    hash: ef1bcc89678e7c523bb25b78d10c946742adafa60694df40b26c35fc16e7d0de
Split status: N/A (0)
$ /home/fabiobaltieri/go/bin/mcumgr -c acm0 image erase
Error: 6
$ /home/fabiobaltieri/go/bin/mcumgr -c acm0 image upload build/zephyr/zephyr.signed.bin
 0 B / 213.04 KiB [-------------------------------------------------------]   0.00% <- stuck, no progress
$ mcumgr -c acm0 image test 13a9251314c31970101ec3ff4ca441f5c90d36f247644cc0cdc8e30b4c8bf6f1
Error: 6

So the board is effectively stuck in upgrade limbo, as it looks like those calls check the "upgrade pending" bit and error out. The only way I've found to recover is to issue a "mcuboot erase 2" on the shell; if that's not available, it's debugger time I guess.

This has apparently previously been reported in apache/mynewt-mcumgr#157.

I think that one should be able to cancel the upgrade by erasing the second slot image. Is that something we can fix in the Zephyr fork, or should it be fixed upstream?

To Reproduce
Just try to upgrade a board configured for swap with a big enough image. In my tests I noticed that even bigger images trigger an "E: Image in the secondary slot is not valid!" error and cause the bootloader to erase the second slot automatically, so this may only trigger on a range of sizes.
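
To put numbers on the warning quoted above, here is a small arithmetic sketch in Python. The three byte counts are copied from the report; the 4096-byte sector size is an assumption (it is the flash page size on the nRF52 parts, and is not stated in the log). This only checks the figures against each other and does not reproduce MCUboot's actual size check.

```python
# Figures copied from the MCUboot warning and the report above.
REQUIRED = 225280     # "required 225280 bytes"
AVAILABLE = 221184    # "but only 221184 are available"
IMAGE_SIZE = 217820   # signed image size quoted in the report
SECTOR_SIZE = 4096    # assumption: nRF52 flash page size, not from the log


def sectors(nbytes: int, sector_size: int = SECTOR_SIZE) -> int:
    """Number of whole sectors needed to hold nbytes (rounded up)."""
    return -(-nbytes // sector_size)


shortfall = REQUIRED - AVAILABLE
print(f"shortfall: {shortfall} bytes = {shortfall // SECTOR_SIZE} sector(s)")
print(f"image alone occupies {sectors(IMAGE_SIZE)} sector(s)")
print(f"required {sectors(REQUIRED)} sector(s), available {sectors(AVAILABLE)}")
```

Under that assumption the shortfall is exactly one sector, which is consistent with the documented need of the swap-without-scratch mode for spare space in the slot. An image that is only slightly smaller would leave that sector free, which fits the observation that only a range of sizes triggers the problem.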

Expected behavior
Either mcuboot erases the second image when failing (seems to do it for other cases) or mcumgr allows erasing a pending image.
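
The second option can be illustrated with a toy model. This is a hypothetical sketch, not Zephyr's mcumgr code: `erase_secondary` and `allow_pending_erase` are names invented here, and the flag set just mirrors what `mcumgr image list` prints. The only value taken from mcumgr itself is return code 6, which, as far as I can tell, is `MGMT_ERR_EBADSTATE`.

```python
from enum import IntEnum


class MgmtErr(IntEnum):
    """Subset of mcumgr return codes relevant to this report."""
    EOK = 0
    EBADSTATE = 6  # matches the "Error: 6" seen in the transcript


def erase_secondary(flags: set, allow_pending_erase: bool) -> MgmtErr:
    """Hypothetical erase policy for the secondary slot.

    flags: image flags as printed by `mcumgr image list`, e.g. {"pending"}.
    allow_pending_erase: False models the behaviour reported here,
    True models the proposed fix.
    """
    if "pending" in flags and not allow_pending_erase:
        return MgmtErr.EBADSTATE  # refuse: an upgrade is still marked pending
    flags.clear()  # slot erased, so no flags remain
    return MgmtErr.EOK


stuck = {"pending"}
print(erase_secondary(stuck, allow_pending_erase=False).name)
print(erase_secondary(stuck, allow_pending_erase=True).name)
```

With the pending check in place the slot can never be cleared from mcumgr once the swap has been refused, because nothing else resets the flag. Letting erase through for a pending image gives the device a way back to a clean secondary slot.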

Impact
Seems pretty serious to me; it may lead to devices being lost and non-upgradable in the field.

Environment (please complete the following information):
Linux, Zephyr SDK, d01780f

@fabiobaltieri fabiobaltieri added bug The issue is a bug, or the PR is fixing a bug area: mcumgr labels May 20, 2023
@fabiobaltieri
Member Author

cc @d3zd3z @nordicjm @de-nordic

@nordicjm
Collaborator

From memory, I believe @de-nordic has been working on an improvement so that images that are too big to be built should error out during the build instead of generating a file that cannot actually be loaded.

@fabiobaltieri
Member Author

From memory, I believe @de-nordic has been working on an improvement so that images that are too big to be built should error out during the build instead of generating a file that cannot actually be loaded.

That would be ideal, though I think it should still be possible to delete a pending image as a last resort.

@jgl-meta jgl-meta added the priority: medium Medium impact/importance bug label May 23, 2023
@PerMac
Member

PerMac commented May 31, 2023

@gchwier This looks like a good scenario to add to the mcuboot+mcumgr tests you're working on.

@github-actions

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@ShBiezeman

ShBiezeman commented Aug 3, 2023

Should this issue also cover the case where a big image (remaining free partitions < 1) loaded into slot number 1 blocks firmware updates because no swap partition is available? I ran into this a couple of days ago, and it should be solvable by reserving a single partition chunk after the app region during compilation. MCUboot shouldn't reserve this chunk though.

I'm willing to look more into this issue but have very limited knowledge of the partition management code.

Found a thread over on the Nordic DevZone describing this issue.

@tejlmand
Collaborator

tejlmand commented Nov 8, 2023

Isn't this issue a duplicate of #27471?

@fabiobaltieri
Member Author

@tejlmand from what I see it isn't. This is about a specific range of sizes that is within the partition but beyond what mcumgr can use, and about mcumgr failing in a way that leaves the device no longer upgradable, which I think is pretty bad.
