H7 dual core OTA fix #2966

robertc2000 · 2024-11-13T11:43:17Z

I encountered an issue while testing that prevented OTA from working properly on dual core H7 devices (H755 and H745).

These devices normally boot from both cores. CM4 requires by default a firmware at address 0x08100000 and CM7 requires a firmware at address 0x08000000.

The issue is we are using address 0x08100000 to store the new firmware, which means that after the next reset, both cores will start executing their corresponding firmwares. I noticed this always leads to a crash and the board gets bricked after this...

I put up a fix to disable CM4 from executing before resetting.

scaprile · 2024-11-13T12:08:18Z

I think we should partition flash differently, in order not to mess with the M4 and be able to update the M7.

scaprile · 2024-11-13T11:59:34Z

src/ota_stm32h7.c

@@ -35,6 +35,9 @@ static struct mg_flash s_mg_flash_stm32h7 = {
 #define FLASH_OPTSR_PRG 0x20
 #define FLASH_SIZE_REG 0x1ff1e880

+#define STM_DBGMCU_IDCODE 0x5C001000


Is this register present in all H7s ? We might break other OTAs by triggering a HardFault on invalid memory access

Yes, I checked that. It appears to be on all H7s.

scaprile · 2024-11-13T12:06:04Z

src/ota_stm32h7.c

+  if (STM_DEV_ID == 0x450) {
+    // H745/H755 and H747/H757 are running on dual core
+    // Disable CM4 to prevent the old and the new firmwares from booting
+    // simultaneously on each core.
+    MG_SET_BITS(MG_REG(bank + FLASH_OPTSR_PRG), MG_BIT(22), 0);
+  }


This is a quick workaround, not a fix...
The user bought a dual-core chip for a reason, and we are rendering the second core unusable after first upgrade.
The problem with dual-core OTA is that any core firmware might want to be updated.
IMO, at a minimum, we should be able to update the core we are running in without disabling the other core.
Ideally, we should be able to update both cores, but I guess that would require a different schema, as each core has its own image.

OK, so in this case we could repartition the flash, like you mentioned. How about using only the first bank (only for these dual core boards)? We'll partition this instead.

So they designed a dual-flash-bank chip, where banks can be swapped, and assigned one bank for one core and the other bank for the other core, effectively rendering the swap feature unusable...
Can we change core base addresses ? If so, how would that work with Cube ? How will users flash their devices ?
Otherwise, we can only work with the portion that belongs to our core, or with both at the same time. The former is the way we've been doing it, the latter requires merging two images into one uploadable chunk; I'd rather avoid that.

One other issue: if we tinker with the M4 memory space, we need to stop that core before doing so, that is, probably the easiest path is to limit to the core we are running in.

When and if a customer request updating the M4, we'll devise a schema to do it; be it a single image or a selector for the core to be updated. For now, IMO, stick to the simplest way to update the M7 that coexists with what Cube will generate.

To answer your questions,

Yes, we can change the base addresses for each core. There are some option bytes which control this and modifying the values seems to be persistent across reboots. I tried to figure out how to do it in Cube, but I can't find such an option. Cube seems to generate code linked at the default addresses for each core. (CM4 code is linked at 0x81.... and CM7 at 0x80...). You can change these option bytes though in STM32CubeProgrammer, but I think that is irrelevant for now.

For now, I don't think we should deal with M4 memory space. As long as we use the first bank only for OTA, everything should be fine, since we are updating only M7.

Agreed, if it can't be done in a way that is transparent to the user (does not require users to be aware of it and go changing stuff or programming in a different way), I think we'd better stick to a single-bank approach. That has already been done and it only requires thinking flash is half the size and there is no swap, so we manually overwrite the lower portion of what is physical bank 1, and don't mess with bank 2, that belongs to the M4.
Is that correct ?
That will let us easily handle updates for the M4 when and if a customer requests that.

Exactly! We'll treat it as a single bank scenario by assuming the 2nd bank does not exist.

robertc2000 · 2024-11-14T12:32:55Z

src/ota_stm32h7.c

@@ -121,6 +124,12 @@ MG_IRAM static bool mg_stm32h7_swap(void) {
  flash_clear_err(bank);
  // printf("OPTSR_PRG 1 %#lx\n", FLASH->OPTSR_PRG);
  MG_SET_BITS(MG_REG(bank + FLASH_OPTSR_PRG), MG_BIT(31), desired);
+  if (STM_DEV_ID == 0x450) {


Looks like H745/H755 and H747/H757 aren't the only ones that use this ID. H743 uses 0x450 as well. I did not expect that, since H743 has a different RM. They shouldn't have allowed to have the same ID, it complicates things. How are we supposed to distinguish between a H743 and a H745 at runtime??

We can always do it at compile time, but let me dive into ST waters and let's see what I can find.

I know. It would be cleaner to do it at runtime though.

Looks like single-core chips are in fact dual-core chips where the M4 has been disabled. This means that everything is the same, but probably we can check the BCM bits in the OPTION bytes.
Whether the chip is a single-core or a dual-core one, I think it is safe to say that if the CM4 is gated (no clock, BCM4 = 0 ), we can assume all flash is available.
WDYT ?
745 RM 13.3.14 SYSCFG_UR1
742 RM 12.3.12 shows this register is not present, notice the next register is UR2. The less-significant 16-bit portion of UR4 is reserved.
Worst case: we get a HardFault when accessing SYSCFG_UR1, this is an invasive way to detect a single core...
Best case: the option byte is readable

I don't see any other option if 743s really are 745s with the M4 disabled

Good idea, but unfortunately UR1 reads as 0x10001 on both H743 and H745/H755... So no hard faults, but instead it returns the wrong value. It should have been 0x10000 instead.

I guess we'll have to decide at compile time if it's dual core or not. I did not find any other options.

So 743s are in fact 745 chips with either a "broken" or removed M4 core...
The other options I can come up with, require collaboration between cores and a timeout... a big no no no...
Let's add a #define for all dual cores. Does this very same situation replicate with the other dual core families we handle ?

Yes, it's the same for all dual core boards we have (H745/H755 and H747/H757). They are all under the same reference manual, anyway.

So, I would add MG_OTA_H7DUALCORE or something like that (give or take one or two underscores)

robertc2000 requested a review from scaprile November 13, 2024 11:43

scaprile reviewed Nov 13, 2024

View reviewed changes

robertc2000 commented Nov 14, 2024

View reviewed changes

H7 dual core OTA compatibility

665298a

robertc2000 force-pushed the ota-h7-dual-core branch from 000eb0a to 665298a Compare November 14, 2024 15:23

scaprile self-requested a review November 14, 2024 17:22

scaprile approved these changes Nov 14, 2024

View reviewed changes

scaprile merged commit 8b95a22 into master Nov 14, 2024
62 of 77 checks passed

scaprile deleted the ota-h7-dual-core branch November 14, 2024 17:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

H7 dual core OTA fix #2966

H7 dual core OTA fix #2966

robertc2000 commented Nov 13, 2024

scaprile commented Nov 13, 2024

scaprile Nov 13, 2024

robertc2000 Nov 13, 2024

scaprile Nov 13, 2024 •

edited

Loading

robertc2000 Nov 13, 2024 •

edited

Loading

scaprile Nov 13, 2024

robertc2000 Nov 13, 2024

scaprile Nov 13, 2024

robertc2000 Nov 13, 2024

robertc2000 Nov 14, 2024

scaprile Nov 14, 2024

robertc2000 Nov 14, 2024

scaprile Nov 14, 2024

robertc2000 Nov 14, 2024

scaprile Nov 14, 2024

robertc2000 Nov 14, 2024

scaprile Nov 14, 2024

H7 dual core OTA fix #2966

H7 dual core OTA fix #2966

Conversation

robertc2000 commented Nov 13, 2024

scaprile commented Nov 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

scaprile Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

robertc2000 Nov 13, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

scaprile Nov 13, 2024 •

edited

Loading

robertc2000 Nov 13, 2024 •

edited

Loading