System DT Meeting Notes 2020

Table of Contents
- 2020-May-13: Attended, Agenda, Action Items, Notes
- 2020-Apr-06: Attended, Agenda, Action items, Notes
- 2020-Feb-13: Attended, Action items, Notes
- 2020-Jan-22: Attended, Action items, Notes

2020-May-13

Attended

broonie
Stefano Stabellini
Nathalie Chan King Choy
cvs
Benjamin Gaignard (ST)
Ed Mooring
Tomas Evensen
Bruce Ashfield
Rob Herring
Ilias
Joakim Bech
Loic Pallardy (ST)
Mark Dapoz
Krzysztof Kepa

Agenda

Stefano to present proposal for bus-firewall configuration in system device tree

Action Items

Nathalie sync up w/ Francois about who should be invited to System DT call (some individuals didn't get this invitation)
Stefano to go back & discuss w/ Xilinx XMPU expert.
Stefano & Tomas to sync-up with Loic & Benjamin. Target for next System DT call to discuss how the 2 proposals combine.

Notes

TO DO: Link to Stefano's slides
Resource Groups will become more important
- Collection of devices accessible by 1 or more domains
- Change we made: Use different property name. To avoid confusion & b/c the definition is different, used "include" instead of "access"
- What if you have device you want to share across multiple domains?
  - Using resource group for sharing, it becomes clear which resources are shared across domains & easier to check
- Also will use for bus firewalls
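A minimal sketch of how a tool could expand resource groups and check which ones are shared. It assumes a hypothetical data model in which each domain has an "access" list of its own devices and an "include" list of resource group names; the dictionary layout and every device, group and domain name are illustrative and are not taken from the slides.

```python
# Hypothetical data model: none of these names come from the proposal itself.
resource_groups = {
    "shared_group": ["mmc0", "shared_mem"],
}

domains = {
    "domain0": {"access": ["uart0"], "include": ["shared_group"]},
    "domain1": {"access": ["eth0"], "include": ["shared_group"]},
}


def effective_resources(domain):
    """Return the domain's own devices plus those of every included group."""
    resources = set(domain["access"])
    for group_name in domain["include"]:
        resources.update(resource_groups[group_name])
    return resources


def sharers(group_name):
    """Return the sorted names of the domains that include the given group."""
    return sorted(name for name, d in domains.items() if group_name in d["include"])
```

With this layout, checking what is shared reduces to asking which domains include a group, e.g. sharers("shared_group").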
Bus Firewalls
- Needs info in list on slides
- Stream ID range -> IDs is an open question, but assume Stream ID for this discussion
  - Other bus firewalls might have different ID space
- address range to block/allow access
- some may have allow rules & block rules
- priority is not required, but often bus firewall has only few slots. So, would be good for tool to know which rules will fit & which will not fit. Make rule that higher priority rule goes in & others as best effort.
- Benjamin: How will it work between 2 address blocks?
  - Stefano: Think example will answer the Q
- Target Example: 2 domains blocking each other's access to their resources
  - CPU cluster of domain 0 is blocked from accessing memory regions & region of device
- Benjamin: phandle on device. Here seem to have double things.
  - Stefano: The rest of slides might provide some clarity
- Tomas: Could be DDR memory, or address-mapped control registers. As long as it's addressable, it's the same. Even if that's controlling an Ethernet, for e.g.
- Rule #1: Address ranges
  - In this slide, wherever it says "memory range", should say "address range", ref. Tomas' comment above.
  - Allows us to avoid writing all the address ranges we are protecting, so reduces some duplication
- Rule #2: phandles
  - Using links instead of stream IDs hard coded in the attribute
  - Makes easier to check for correctness
- Rule #3: firewall-default
  - When you want to block everyone except small set
- Rule #4: "Stream ID Self" is always allowed
  - There is no useful scenario when you want to block yourself from your resources
- Example #1: Same example we saw before, but now we can go in more detail
- Example #2
  - Domain 0 & Domain 1 block everyone else
  - Highest priority is to protect the resource-group
  - Domain 0: All stream ID except your own stream ID are blocked from accessing the memory range & MMIO region of MMC0
  - Resource group is blocking everyone else w/ higher priority. All stream ID except for domain 0 & domain 1 are blocked.
- Example #3: 1 domain has higher priority
  - Everyone is blocking all stream ID
  - Domain 0 blocking at higher priority
  - Domain 1 & Domain 2 are allowing Domain 0 to access
- Benjamin: Opposite of what I proposed from ST POV. Filter rules from network firewall when I see that. ST proposes to set firewall depending on node, more like pin control configuration. This is quite different POV. Difficult for me to catch on this.
  - Tomas: On each node ST says to set each firewall. For us, we see it more of a SW configuration than a HW description, so you can change it without changing the DT that describes the HW. So, we put these in different places. That's a goal for us. Also, we want to express things as "I have resources here, what do I want to protect from" b/c think that's how SW & Safety folks think. The other way would be to say each master doesn't affect memory or a device or someone else, but then if you do it in 1 place, you don't have the security or safety around it. So, if someone adds a new Ethernet controller somewhere, you don't have to remember to say also this one doesn't mess with my region.
    - Keep HW & SW config separate
    - Describe what you want to protect
- Benjamin: We don't have this problem of everything is allowed & need to block everybody or open to some groups.
  - Loic: Maybe not 100% opposite. We can verify at runtime that we can access the resource. So, we know where to look in firewall config so we know where to access it. If we can't, we can request access via secure monitor, that will open firewall for the SW. Here, it's more initial firewall config that you want to describe. Now I understand the different domains, but I don't understand the priorities. For me, you have access or don't.
  - Stefano: In the firewall, you do/don't have access. Priority is to simplify the writing of the rules for someone who wants to protect more than they can protect. If can't protect the priority 9, will throw an error. So, you can write the rules so it matches what does/doesn't get protected.
  - Loic: If you put levels of priority, you will add much complexity in tools to protect something. You may change the firewall config & SW was working fine as priority 6, but then you lose access to the resource b/c you added something w/ priority 7. We should not jeopardize SW execution.
  - Stefano: Agree w/ sentiment that dropping the priority field would simplify the spec & simplify understanding from SW. Reason we might have to have it (maybe optional) is that the HW firewall on Xilinx board is taking complex configuration. So, the tool might have to know the priority to play some complex tricks to generate firewall rule & not simple 1:1 translation.
  - Loic: So, you have certain amt of entries & can't protect everything.
  - Stefano: Additionally, our firewall rules have address masks & need to group together >1 address range. Because of this, the priority info would be useful for Lopper to be able to generate the complex set of rules in the language the firewall understands. If you don't want to use priority, just put 9 for everything. Stefano to go back & discuss w/ Xilinx XMPU expert.
  - Tomas: Could say block, allow, desirable block
  - Loic: Yes, something like that.
  - Tomas: Could create group for these things to be protected
  - Loic: So in your domain you put what MUST be protected & may be protected & not protected
  - Stefano: Yes, Example #2
  - Benjamin: So it means not all the rules may be applied b/c no more space in firewall?
  - Stefano: The idea is that highest priority must be applied (9 in this example), but like the feedback of block, allow, block-desirable, allow-desirable. Will go back & try to clarify.
  - Mark: How would user know if it's protected if it's desirable? It may change
  - Stefano: SW will only get description of what it should access. We are generating a traditional DT out of this.
  - Loic: Think this is where it is complementary w/ Benjamin proposal. At runtime, can check b4 accessing peripheral that we have access for it. Important b/c w/ system DT you want to link firewall config w/ DT node enabling & broadcasting info for all the SW running on the platform. But, you know it was working for customer & customer can update kernel & DT with/without changing the firewall. Where HW is not safe, you will catch & not boot kernel & put message on console. Benjamin has done proposal on list: Before probing drivers check that it's allowed & that way know that firewall is well configured & aligned w/ DT.
  - Tomas: I'm hearing that 1 view of firewall is high-level initial firewall (what we have here). You guys need lower-level firewall at runtime.
  - Loic: Yes, this is why both proposals can fit together. We need way to put together initial firewall. ST has multiple firewalls. We should, thanks to this list, be able to generate firewalls for DDR, internal RAM, etc. Today, we don't have generation from 1 to DTS/DTB & customer may update over the air just a part of system. Need to be able to boot in safe way U-Boot & kernel to check that it's OK.
  - Tomas: Runtime verification is really important. We discussed in a previous call, if you have a description like this, how you describe your firewall will be vendor specific. We will have a back-end to Lopper specific to us. ST might want to have a Lopper back-end that adds per-device attributes.
  - Loic: we put link where to check in HW. It's fixed. Like interrupts, how they are fixed. CAN you know is protected by this part of firewall & so you can check it's well configured there. Here it's more dynamic for initial config of firewall that we'll generate. In Lopper, will have different generation of firewall for NXP, Xilinx, ST, etc.
  - Tomas: Yes. We are trying to see if there is a way at higher level for user who moves from ST, NXP, Xilinx part doesn't have to do it in a different way.
  - Stefano: Or at least can use same flow to generate firewall rules
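A minimal sketch of how Rule #3 (firewall-default) and Rule #4 ("Stream ID Self" is always allowed) could combine into one access decision. The function name, parameters, the "allow"/"block" strings and the choice to check the allow list before the block list are all assumptions made for illustration; the notes do not define the evaluation order or the property encoding.

```python
def is_access_allowed(master_id, owner_id, allow_ids, block_ids, firewall_default="allow"):
    """Decide whether a bus master may access a protected address range.

    Hypothetical model: explicit allow/block lists are checked first, then
    the default applies. The owner's own ID is always allowed (Rule #4).
    """
    if master_id == owner_id:
        return True  # Rule #4: a domain never blocks itself
    if master_id in allow_ids:
        return True
    if master_id in block_ids:
        return False
    return firewall_default == "allow"  # Rule #3: fall back to the default
```

Setting firewall_default to "block" and listing a small set in allow_ids gives the "block everyone except small set" case from Rule #3.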
Open issue: ID Space
- Stefano: Xilinx firewall takes IDs that are effectively Stream IDs (IO MMU ID). What do you use?
- Loic: We can identify all the masters
- Stefano: I'll have to be careful how I word it so each vendor can have its own ID
- Loic: Example: PCIe master on bus w/ lot of security issues. When we are on open platforms, each CPU on our SoC has own ID, but masters like PCIe can select the ID. If I don't have close & secure platform, I can have CortexA controlling PCIe & PCIe can access what Cortex A can access. If I want to secure the platform & create sandbox for PCIe, I can dynamically change ID of the PCIe. PCIe will need its own domain, sandbox, memory. Have this on set-top-box chipset b/c 4K programs & encrypted video, have to close everything to ensure the key will never be sent & can't attack the SoC. Need to restrict access of PCIe card.
- Stefano: PCIe is dynamic & more complex
- Loic: Can start by doing it static. If we want to isolate PCIe, can do it from beginning. Customer may want
- Stefano: What is name of property you use? master ID?
- Loic: can call it Master ID
- Stefano: Good way
- Rob: IDs are usually HW-specific. This is why we have variable # cells, that makes up ID for controller/provider
- Loic: It's a bus ID, so ID for transactions sent by 1 master
- Rob: Ultimately, you want ID tied to HW somehow. IOMMU: Stream ID + some other info. Want to avoid making up your own #s.
- Stefano: Could say they are bus mastering IDs to avoid making up #
- Rob: Clock IDs are usually just made up, but sometimes correspond to register offsets. To back up a bit: Where is the description of the HW in non-DT terms, b/c you're giving me the solution, but it's hard to see the problem. Can you express it as "request access from OS", "boot time configuration". Can you enumerate the use cases & requirements?
- Tomas: In our case, the problem statement is firmware during boot. Get master IDs & address ranges to protect. HW registers to be programmed by the firmware that's running on a special processor.
- Rob: I expect most implementations will be firmware based for all the resources. I'm getting patches on list from ST & proposals on this call from Xilinx, and discussion on the call. Bit of disconnect. Would be good if you could work together to come up w/ 1 solution, or why there should be 2 solutions.
- Tomas: We should get together w/ ST
- Loic: We need to align on vocabulary & come back on execution domains to see what is an execution domain (not just SW, U-Boot, kernel, Xen) but also master & mode (secure/non-secure). So, we can see where ID is coming from & is it enough or if it needs to be associated to something else.
- Stefano: Can put the 2 proposals side-by-side & see where they are aligned/not
- Tomas: Assumption has been that this will be pre-processed to get to firmware. This info by itself will not be there at runtime, to set up the initial firewalls.
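Rob's point about a variable number of cells can be sketched as follows: an ID is written as a provider phandle followed by as many cells as that provider declares, so each vendor's firewall can define its own ID width. The flat-list layout, the cells_by_phandle lookup table and the function name are assumptions for illustration, not an agreed binding.

```python
def split_specifiers(prop, cells_by_phandle):
    """Split a flat cell list into (phandle, id_cells) specifiers.

    Each specifier is a provider phandle followed by as many ID cells as
    that provider declares, mirroring how '#...-cells' properties work.
    """
    specifiers = []
    i = 0
    while i < len(prop):
        phandle = prop[i]
        n = cells_by_phandle[phandle]
        id_cells = tuple(prop[i + 1 : i + 1 + n])
        if len(id_cells) != n:
            raise ValueError("truncated specifier for phandle %d" % phandle)
        specifiers.append((phandle, id_cells))
        i += 1 + n
    return specifiers
```

In this sketch a provider with 1 cell could carry a plain master ID, while a provider with 2 cells could carry an ID plus extra vendor-specific info.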

2020-Apr-06

Attended

Grant Likely
Tomas Evensen
Arnd Bergmann
Benjamin Gaignard (ST)
Loic Pallardy (ST)
Dan Milea
Etsam Anjum
Frank Rowand
Joakim Bech
Bill Fletcher
Ed Mooring
Francois Ozog
Mark Dapoz
Mpujol
Rob Herring
Stefano Stabellini
Nathalie Chan King Choy
Bruce Ashfield
Kevin Chappuis

Agenda

Follow up from the System DT FAQ talk during Linaro Tech Days
Firewall
Lopper
Continue to discuss configuration lifecycle

Action items

Stefano: Will think through if can provide what you need in execution domains for runtime. This example more focused on boot time.
Bruce: Send out info on Lopper once it's posted on GitHub

Notes

Linaro Tech Days System DT Presentation
- Who attended? 8 of the attendees on this call
- Summarizing what was covered in that talk
  - What is System DT?
  - Domains -> More specific: Execution domains
  - HW description & configuration
  - Example shows all 3 concepts that we've already seen
  - Default
    - We discussed it before, but never summarized it.
    - Why? Both convenient & common. Also turns system DT to a smoother addition to DT spec, less revolutionary & more of an addition than changing.
  - Interrupt controllers
  - Instead of putting under "chosen", put under its own top-level node "domains"
    - An execution domain might have boot args & need its own chosen node & would be confusing to have nested chosen
- Grant: Like the way that "domains" is a common container point for all the stuff in the execution domain
- Rob: Looks like what I was proposing where we move everything down a level. Difference is CPU R5 is a phandle here, which would be at the top level.
- Stefano: Yes, CPU cluster containing the 2 R5s.
- Rob: Why have it at top level? Why not in the domain?
- Stefano: Similar to your proposal b/c your question about chosen reserved memory made me re-examine this. Want to have distinction between description & configuration - it's in a way static, it's shipped w/ the HW. Execution domain concept is not just where it's running, but what memory, what devices are assigned & how.
- Tomas: R5: Want to describe physical attributes of the 2 R5s. Could configure as 2 domains of R5s or lockstep R5. Want to have distinction between how you specify the HW properties & how they are configured.
- Rob: Works fine on CPUs b/c they don't have any addressing in the sense of the root of the DT is the view of the address space. But when you have nodes where there is addressing, then you're back to the same problem where root node has single fundamental view of address space. E.g. Regular memory nodes have an address
- Stefano: Idea is to be configured as a range. Memory attribute under CPU just for memory ranges.
- Rob: Say you have a device at address 1000. For A cores, that's 1 device. For R cores that's some other device or memory. If you want to describe the HW at the root level. You have 2 things at the same address.
- Stefano: If we have 2 different devices that are at same address of each CPU. From the HW perspective they would be described w/ same trick used for the interrupt controllers: Indirect bus & address map. Refer to the "Interrupts" slide in presentation. Then at execution domain level, you 1st link to the CPU (one of the 2 R5s or both), then view of the system becomes the view of that CPU cluster & it flows from there which devices are visible at which address. Lopper should be able to check (in future).
- Rob: Indirect bus definition is allowed but not valid & hopefully someday we'll check for this.
- Rob: The root node is 1 view & the only way around that is to add 1 level.
- Grant: I agree, if bus is not on the root, they can't have reg property & ranges property. These node names are invalid. If they were inaccessible buses, could do without addresses attached.
- Rob: But that gives new problem: How do you have more than one?
- Grant: So would have to give a specific name instead of a generic name.
- Stefano: Can we still have sub-node w/ address?
- Grant: Yes, but you wouldn't have to have @address on the bus node.
- Stefano: It makes sense b/c doesn't automatically translate to parent address node.
- Grant: I can't recall if you need a ranges property to make it all work.
- Rob: You'd need ranges to be translatable
- Grant: This would be start of translation domain. You probably don't need ranges… might be handled by indirection from execution domain… That shouldn't be too massive of an impact on what has been done. Should be fairly straightforward.
- Tomas: Good feedback
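A minimal sketch of why "ranges" matters for translation, as Rob notes above. Each entry maps a window of child-bus addresses onto the parent bus; without a matching entry the address is not visible from the parent, which is the situation for an indirect bus. The tuple layout (child_base, parent_base, size) follows the usual devicetree ordering, but the function and example values are illustrative, and an empty "ranges;" (identity mapping) is not modelled here.

```python
def translate(child_addr, ranges):
    """Translate a child-bus address to the parent bus via 'ranges' entries.

    ranges is a list of (child_base, parent_base, size) tuples. Returns None
    when no entry covers the address, i.e. it is not visible from the parent.
    """
    for child_base, parent_base, size in ranges:
        if child_base <= child_addr < child_base + size:
            return parent_base + (child_addr - child_base)
    return None
```

In this model the "device at address 1000" example would translate differently (or not at all) depending on which bus, and therefore which CPU cluster's view, the lookup starts from.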
Bus Firewalls
- Stefano:
  - We understand the question a little better now.
  - Lopper would look at System DT & execution domain config & have plug in to generate config needed by the bus firewall. The info is all there, just not in the format that the bus firewall needs.
  - Issue could be too much info
- Tomas:
  - 1 way to do it is to have a way to say: protect this thing from this master & what should not be protected
  - From usability POV, you want to protect a domain (its memory, devices, registers)
  - So, the way to specify what should be protected is through domain concept
  - But then there's problem that Stefano will talk about next
- Frank: Has decision been to have system DT include config or still cautious?
- Grant: Want to avoid too much data in DT, but system is only valid if we know how system is used. Collecting info of how it fits together in domain node seems appropriate to me (gut feel, can't really do strict policy here)
- Frank: We have to be explicit on if we think it's a valid configuration & explicitly justify it
- Tomas: That's why we're really trying to distinguish the config info & put it all in one place in "domains"
- Frank: Let's specify that domains
- Benjamin: Config is owned by device node. We are more in view to do like pin controller configuration b/c main point for us is to change configuration at one time & switch device from Cortex-A to Cortex-M. That's what I've proposed on mailing list. Been working on firewall bus. Will send out next week.
- Frank: Interesting if adding/removing devices from system view when system is up. See that with overlays & FPGA is the main user of overlays & they are tiptoeing thru it carefully. You have to be super careful.
- Rob: I think intent is to do it at boot time, instead of when running?
- Benjamin: Let Linux assign frame buffer over to co-processor
- Tomas: We have similar use case w/ FPGA. We use system DT, but then we convert that into overlays. Bit scary to take away stuff w/ overlays - not very stable. You still have that info in this domain node & then you prepare an overlay & we tell the firmware who owns what. We don't change the regular device, we just change how the DT looks w/ overlays.
- Benjamin: For us, we put the info in the device node.
- Tomas: You could still use that & then ___, then you could use Lopper. That becomes an ST-special way at runtime. We could still specify it as we have on the screen & could be translated to what ST needs.
- Stefano: Is it Linux itself that makes the change, or do you have separate little CPU to do the system-wide config?
- Benjamin: Initially it is configured secure & then switch from secure to non-secure co-processor. There is still HW control of if this operation is allowed or not.
- Stefano:
  - Will think through if can provide what you need in execution domains for runtime. This example more focused on boot time.
  - Would this be enough to cover the boot time part of the problem? Assuming you had your own plug-in for Lopper to turn it into what you need.
  - Do we have enough capability in bus firewall itself to protect everything we need to protect?
  - There are only so many ranges you can protect & typically you have more ranges you want to protect than what you can protect. There's no concept of priority or ranking in the domain.
  - Tomas & I were thinking of adding priority attribute to convey priority of protecting a domain.
  - Do we need even more granular priority attribute - at range level instead of execution domain level?
    - e.g. Ethernet controller in access list is more important to protect than MMC controller
    - Then Lopper back-end could know how many slots are available in bus firewall
- Tomas: Is this something that should be specified, or is this vendor-dependent?
- Stefano: Most important to protect resources of 1 domain from another domain.
  - e.g. everything in Microblaze domain is important to protect, but which resource in this domain is most important to protect if we don't have enough slots?
- Tomas: Mental model: If you have a domain & you want to protect everything you can address in this domain, and it’s highest priority, it means all the other domains have to be prevented from touching my domain. We think this is how a user will want to think about it. But then need to consider if we run out of protection registers.
- Benjamin: So something with lower priority is not protected?
- Tomas: So then Lopper should spit out warning that only up to priority N was protected
- Loic: If we want to protect 1 IP, it should be applied. We need 2 options at Lopper level: Device accessible when protected, or protected from everybody. Imagine that you want to protect 1 device but we can't b/c lower priority. E.g. Won't pass POS certification. Lopper should always generate same output & black/white.
- Tomas: If it's only in 1 domain, it should not be accessible if you have enough registers. If it's shared, it needs to be in multiple places. Then, what if not enough registers - in access list, want to prioritize which gets protected. E.g. if have POS registers, must be protected. Then can have other things that are nice to have protected.
- Mark: So to protect memory from other domains, you have to configure other domains so they can't DMA into it.
- Stefano: It might be possible to do what you describe, but this is not what this is for, in theory. It's not for configuring SMMU. 1 configuration SoC-wide. Usually fixed # (e.g. 8-12) of ranges that can be specified.
- Tomas: that's more logical way for user to specify & then have tooling to tell the firmware.
- Mark: SMMU is fine b/c page table based. Things like RDC aren't. Could get complicated if you have multiple priorities. You need to know in a safety system that everything is protected.
- Tomas: Tooling needs to report what did/didn't get protected. If do other way, where each thing is specified what needs to be avoided, it's very error-prone.
- Stefano: We don't mean best effort. We convey the info in the DT. If you say everything with priority 9 must be protected & if you can't do that, you get error in Lopper.
- Tomas: Not something you discover at runtime that it can't be protected. You have a tool that does the allocation.
- Mark: Usually it's all or nothing - either you protect everything or it's not enough.
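A minimal sketch of the allocation step discussed above: rules go into a limited number of firewall slots highest priority first, the tool errors out if a priority 9 rule does not fit, and everything else that does not fit is reported. It assumes one rule occupies exactly one slot and that a larger number means higher priority; the function name and rule names are hypothetical and this is not Lopper code.

```python
def allocate_rules(rules, slots, must_priority=9):
    """Assign (name, priority) rules to a limited number of firewall slots.

    Rules are placed highest priority first. It is an error if any rule at
    must_priority does not fit; lower-priority rules that do not fit are
    returned so the tool can report them.
    """
    ordered = sorted(rules, key=lambda r: r[1], reverse=True)
    placed, dropped = ordered[:slots], ordered[slots:]
    unmet = [name for name, prio in dropped if prio >= must_priority]
    if unmet:
        raise ValueError("cannot protect mandatory rules: " + ", ".join(unmet))
    return placed, dropped
```

Because the allocation runs in the tool, the "what did/didn't get protected" report is available at build time rather than being discovered at runtime. Real firewalls with address masks would need a more involved packing step than this 1:1 model.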
Tomas: Lopper
- Kalray is testing out Lopper
- Bruce is working on putting it on a more official GitHub
- Doesn't have the firewall stuff in it yet
- Transforming System DT -> DT
- Will send out more info this week
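A minimal sketch of the System DT -> DT idea as tree pruning: keep the nodes a domain may access, keep their ancestors so the tree stays connected, and drop the rest. The nested-dict tree model, the access set and the function name are assumptions for illustration only; this is not Lopper's actual API or data structure.

```python
def prune(node, access):
    """Return a pruned copy of node for one domain, or None if nothing is kept.

    node is a dict with 'name' and 'children'; access is the set of node
    names the domain may use. A node is kept when it is listed in access or
    when at least one of its descendants is kept.
    """
    kept_children = []
    for child in node["children"]:
        pruned = prune(child, access)
        if pruned is not None:
            kept_children.append(pruned)
    if node["name"] in access or kept_children:
        return {"name": node["name"], "children": kept_children}
    return None
```

Calling prune once per domain would yield one traditional-looking tree per domain from the same system-wide description.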

2020-Feb-13

Attended:

Loic Pallardy (ST), Mark Dapoz, Rob Herring, Tomas, Stefano Stabellini, Dan Milea, Ed Mooring, Ilias Apalodimas, Joakim Bech, Nathalie Chan King Choy, Pierre (Kalray), Dan Driscoll, Grant Likely (Arm), Krzysztof Kepa

Action items

Stefano: review memory & reserved memory to make sure it's consistent
Tomas/Stefano: We should add FAQ to the System DT write-up
Tomas: Ask Bruce to send out Lopper more widely, even if draft, soon
Tomas: Drive documenting Xilinx internal discussion on firewalls & share
All: Please share your ideas & then we can discuss a proposal
Nathalie: Put on next call agenda to continue to discuss configuration lifecycle next meeting

Notes

Stefano & Tomas wrote a doc to explain what System DT is about concisely & the use cases & sent to list
- How does concept of default view benefit the use cases or not? What is the relation between them?
- What is missing & can anything be explained better?
- It's a great way to start b/c lots of scattered concepts in System DT. This is 1st attempt to explain how they come together
Rob: Have not had a chance to go through it
Stefano: What do you think should be covered in the doc @ high level?
- We have use cases
- We have interrupts description
- We have execution domain explanation
- How chosen & config properties
- How to describe multiple heterogeneous CPU clusters
- Memory & Reserved memory
  - Stefano: review memory & reserved memory to make sure it's consistent
- Tomas/Stefano: We should add FAQ to the System DT write-up
  - How do we do XYZ?
  - How do you add a new device?
  - How do you add protection?
  - How do you deal w/ interrupt that's only accessible from one?
- Stefano: We do get some good Qs that come up & good to address them
Tomas: Hasn't been clear that System DT can be used in multiple ways
- Host-only tool & shouldn't have to change anything w/ client & just split it up. Linux doesn't have to worry about it.
- Grant's point that maybe it should be readable directly from Linux -> Default, which has some other interesting properties & makes it easier to put into description of Device Trees
- Need to look at these 2 cases separately
  - Let's start with host-only b/c less impact
  - Make sure it's backward compatible
- What do others think?
- Dan Driscoll: Makes sense to me. Then System DT is available & has some tooling around it & Linux uses what's generated from the System DT?
- Tomas: Still want to make sure Linux won't choke on a System DT
- Mark Dapoz: Combined system one is useful for Hypervisor case. Don't have to be backwards compatible with Linux in that case
- Dan D: Yes, we've done @ Mentor, but it was proprietary. Would prefer solution created outside.
Mark Dapoz: WR same way & prune it down. In the tooling good if we can do it. Haven't seen any good UI tooling to do this. Very hard to express in text. Need a graphical visualization.
- Rob: Are any of these tools public?
- Mark: Don't think any tools exist
- Tomas: Bruce working on Lopper & sent out to some ppl. Intent is to make it available
  - Tomas: Ask Bruce to send out Lopper more widely, even if draft, soon
  - Initially, need to work out the text format before we get to the graphical tooling
- Stefano: Have way to assign devices thru DT in Xen. Too difficult for users.
- Grant had to drop off approx here due to conflict
- Dan D:
  - Agree we need textual definition nailed down B4 tools. We had evolving content internally & the tools had hard time to keep up
  - Graphical side is very hard. Partitioning is hard without impact to guests. Hard to solve even with tool. SW change often needed to support the tools.
- Tomas:
  - Should make sure even if we have Python-like tool that we have a C/C++ version that we can read with libfdt & can do runtime tooling as well
  - Lopper is Python on top of dtc & libfdt
  - Can have different data-driven back-end plug-ins
    - Have DT back-end that can prune for 1 or multiple domains
    - Will have RTOS back-end for #defines that can compile in
    - Will have one for Xilinx firmware
- Dan D: Will Lopper get open sourced? When?
- Tomas: Yes. TBD under what project. Flexible. Don't want it to be just a Xilinx thing.
- Dan D: We agree with path forward with System DT & held off on our tool development in this area. Maybe there are things we can contribute to help progress Lopper.
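A minimal sketch of what the RTOS back-end for #defines mentioned above might emit: one C macro per device base address, ready to compile in. The input format (name, base address pairs), the _BASE suffix and the function name are all assumptions; the notes do not describe the real back-end's output.

```python
def emit_defines(devices):
    """Render one '#define <NAME>_BASE 0x...' line per (name, base) pair."""
    lines = []
    for name, base in devices:
        # Turn the node name into a valid upper-case C identifier.
        macro = "".join(ch if ch.isalnum() else "_" for ch in name).upper()
        lines.append("#define %s_BASE 0x%08XU" % (macro, base))
    return "\n".join(lines)
```

A data-driven design like this keeps each back-end small: the pruned tree is the shared input and only the rendering differs per target.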
Loic: Question about access for assigning peripherals to a domain. May be difficult for end user.
- Could be shared: GPU banks, some clocks, certain IP
- e.g. U-Boot, then kernel
- What about big firewall definitions -> which processor can access or not?
  - You need a table to configure your firewall for peripheral access & memory firewall
  - You need a DT node indicating how the peripherals are accessible & secure/non-secure side
  - If you want to verify access, you can look at this table & not duplicate info in your domain
- Tomas: In the domain description, you are specifying what memory you are using (might be shared w/ other domains), what devices you're accessing (some might be unique via indirect bus)
  - List serves 2 purposes & need to differentiate. Started internal discussion & should start external
  - Access list so Lopper can prune tree & throw away what's not needed for 1 domain. Doesn't by itself protect any memory or devices.
  - In our devices, we have HW registers that protect regions & devices from other bus masters
  - Usually you have a more limited set of those protections
    - 1:1 mapping
      - In 1 domain & not shared -> should be protected
      - We don't have enough HW to protect everything from everything else
      - But also maybe you want to allow peek/poke
    - You have a domain that's more robust & maybe you need to safety certify it or need more security
      - Maybe can add attribute that says if it's not explicitly shared, then set up protection/firewall
      - Was at high level what we were thinking, but still discussing
    - At more fine grained level, for things we include in domain, be able to override
      - Allow access even if not explicitly protected
    - Domain specification allocation & not spread out for different devices & memories
      - Would mean config info not all in one place
- Loic:
  - You need all the info for peripherals in 1 place b/c usually have 1 large firewall. Same from memories.
  - What you describe is only understood by Lopper. If we come back to Grant's requirement for kernel to be able to understand.
- Tomas: If you send unaltered system DT to Linux, it sees everything unless on an indirect bus. My internal model is that Linux is the master & allocates stuff to the other processors. So, if we add features to Linux in conjunction with remoteproc
- Loic: Disagree. Linux won't be master for firewall management. Might be Cortex-A with TF-A/OP-TEE. Might have dedicated co-processor in charge of security that applies rules & then boot Linux.
- Stefano: If you have full system DT to platform config processor. Firmware comes up & configures. Similarly, hypervisor coming up & Setting up protection for each domain. If bus firewall already configured & Linux or other OS boots up, maybe not a good idea to pass it a system DT.
  - Spec POV vs. Use-case POV
- Rob: You can have both use cases
  - Secure world has done some partitioning & Linux can do further partitioning or it gets what's partitioned
- Stefano: Example of how that will work on Xilinx
  - Firmware configures the whole platform & give System DT to it & it configures bus firewall, etc.
  - Start Cortex-A and Xen will do further partitioning
  - I don't think it's a great idea to tell Xen everything incl. what it can't access b/c outside its realm
  - At runtime DT should describe what each part should care about
  - Last leaf of the tree will only see normal DT by then
- Rob: So you're arguing there is no host tool? Secure world gets the whole thing & strips out.
- Dan Driscoll: You can do either way & define the partitioning up front
- Rob: The one to worry about is doing it at runtime. If you can do it at runtime you can do it at build time
- Tomas: Firmware doesn't have RAM or resources. Can read DT, but can do on tools side. Or, if you have more dynamic
- Rob: Host tool might not at runtime, but if it works at runtime then will work at build time
- Loic: Think runtime tool will be too heavy to implement in TF-A or small FSBL
- Rob: But you're suggesting firewall binding, which is runtime
- Loic: For me, what's described in domain w/ access, can't imagine TF-A to create & setup the firewall
- Tomas: For us, Lopper to create table for firmware
- Stefano: Advantage of generating table - description can also work for hypervisors & different privilege levels. Could cover simply. If we have a well-written but just bus firewall, then not as flexible. Maybe could write it so could be read by Lopper or other tools.
- Stefano: Don’t imagine that we will pass full System DT to little firmware with tiny memory
- Tomas: Want system to be flexible enough if you have powerful system that can be dynamic, or have tool that helps you do it statically for low-resource processors
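
The static option discussed above (a host tool producing a table that small firmware consumes instead of parsing a DT) could look roughly like the following sketch. Everything in it is assumed for illustration: the `domains` dictionary layout, the `master_id` and `access` field names, and the `fw_rule` struct name are hypothetical and are not taken from Lopper or from any proposed binding.

```python
# Hypothetical sketch: flatten a per-domain access description into a
# static table that small firmware could consume without parsing a DT.
# The field names (master_id, access) and the C struct name (fw_rule)
# are assumptions made for this example, not a real binding or API.


def build_firewall_table(domains):
    """Return sorted (master_id, base, size) allow-entries for all domains."""
    table = []
    for dom in domains.values():
        for base, size in dom["access"]:
            table.append((dom["master_id"], base, size))
    return sorted(table)


def to_c_array(table, symbol="fw_rules"):
    """Render the table as C source text for a build-time include."""
    rows = ",\n".join(
        f"    {{ 0x{mid:x}, 0x{base:08x}, 0x{size:x} }}"
        for mid, base, size in table
    )
    return f"static const struct fw_rule {symbol}[] = {{\n{rows}\n}};\n"


domains = {
    "linux": {"master_id": 0x1, "access": [(0x00000000, 0x40000000)]},
    "rtos": {"master_id": 0x2, "access": [(0x40000000, 0x10000000)]},
}
table = build_firewall_table(domains)
print(to_c_array(table))
```

A more capable system could run the same `build_firewall_table` step at boot instead of at build time, which is the flexibility Tomas asks for.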
Dan D:
- Shared devices in AMP configuration without hypervisor. Need co-operative understanding between drivers & OS.
- This area is problematic to us
- How to annotate DT node what this means to device driver
- Rob: Doesn’t the HW have to be designed for sharing, so it's implied
- Yes & no. Lots of cases where … [lost]
- Rob: IIC bus with spin lock to arbitrate bus
- Tomas: Normally, you don't share devices. There should be a way to do it in spec. It's still a contract between devices. Don't think have to specify at this level.
- Stefano: This falls into difficult corner cases.
  - Case where need to show different description if it is/isn't host node
  - Don't have to start on day 1 with this
- Mark:
  - They may get the device but someone else may own the interrupt controller
  - Do we need a way of describing that
- Dan D:
  - It's a complex problem that comes up for us when no hypervisor available
  - Curious: Is anyone looking into whether there is a way to address that?
  - We have to modify the OS to handle as needed & it's painful to do for every use case
- Mark
  - Anything outside of regular memory gets complicated fast
- Tomas
  - Could have Lopper spit out a warning if a device is shared
  - If you have that problem, the format itself doesn't preclude sharing
  - Maybe just need more hints to the driver - add-on
- Dan D
  - Need some mechanism to tell end user they are sharing a device & implications
  - Otherwise end user assumes it will just work when they share a device
- Tomas
  - Maybe this is something you address later for end user when you have a GUI
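
Tomas's suggestion that a tool could warn when a device is shared can be sketched as below. This is a minimal illustration, not Lopper code: the `domains` mapping (domain name to a list of device names) is a hypothetical stand-in for whatever the domain description ends up looking like.

```python
# Hypothetical sketch of a "shared device" check a host tool could run.
# The input format (domain name -> list of device names) is assumed.
from collections import defaultdict


def find_shared_devices(domains):
    """Map each device referenced by more than one domain to those domains."""
    users = defaultdict(list)
    for domain, devices in domains.items():
        for dev in devices:
            users[dev].append(domain)
    return {dev: doms for dev, doms in users.items() if len(doms) > 1}


domains = {
    "linux": ["uart0", "i2c0", "eth0"],
    "rtos": ["uart1", "i2c0"],
}
shared = find_shared_devices(domains)
for dev, doms in shared.items():
    print(f"warning: {dev} is shared by {', '.join(doms)}")
```

The check only reports that sharing exists. It says nothing about whether the drivers on each side co-operate, which is the part Dan D describes as painful.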
Tomas: Back to Loic's firewall Q
- Tomas: Drive documenting Xilinx internal discussion on firewalls & share
- All: Please share your ideas & then we can discuss a proposal
Loic: Configuration lifecycle
- System DT will generate 1 time (Lopper)
- So have series of DTB
- So if don't have right updates (kernel w/ new DTB not aligned with firmware one) then can have incorrect access set up
- Maybe can have U-Boot or kernel that is checking configurations are aligned
- DTE have some discussions around that
- Consistency of global system configuration is important
- Stefano: Overall intent is to make DT more & more stable
  - We all know problem you described still exists
  - Would like the problem to go away over time as we make more stable
  - Do we need to add specific version to parts of it to ameliorate the situation? Hoping DT stability will be good enough so we don't need to do it
  - Loic: Not DT compatibility issue, but customer not updating it properly on the product b/c of async updates.
- Agree it's good policy for update OTA, but can imagine something goes wrong: Firmware updated & old Linux images
- Should it be Bootloader that updates DT at runtime?
- Since system DT will generate several configurations, have question regarding lifecycle
- Stefano: Master for original info is the original DT input to Lopper. Then everything else falls from there.
- Tomas: But good Q on how you keep things in sync w/ different teams. Customers will screw up.
- Loic: Need a way to mention to customer when they update, impact of using partial vs full image
- Nathalie: Put configuration lifecycle on next call agenda to continue discussion
- Tomas: We've all had issues with DT coming from all over the place & not generated in nice fashion, even when you have the tools available
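
One way to picture the "check that configurations are aligned" idea is to stamp every generated output with a digest of the System DT it came from and compare the stamps at boot. This is only a sketch under assumptions: the notes do not say how or where such a stamp would be stored, and the `stamp` and `aligned` helpers are hypothetical names.

```python
# Hypothetical sketch: detect outputs generated from different System DTs.
# How a stamp would be carried in a real DTB or firmware image is not
# specified in the notes; plain dictionaries stand in for that here.
import hashlib


def source_digest(system_dt_bytes):
    """Digest identifying the System DT that outputs were generated from."""
    return hashlib.sha256(system_dt_bytes).hexdigest()


def stamp(output_name, system_dt_bytes):
    """Record which System DT an output was generated from."""
    return {"name": output_name, "source": source_digest(system_dt_bytes)}


def aligned(stamps):
    """True when every output was generated from the same System DT."""
    return len({s["source"] for s in stamps}) <= 1


old_dt = b"/* system DT, revision A */"
new_dt = b"/* system DT, revision B */"

firmware = stamp("firmware-table", new_dt)
kernel_ok = stamp("linux.dtb", new_dt)
kernel_stale = stamp("linux.dtb", old_dt)

print(aligned([firmware, kernel_ok]))     # True
print(aligned([firmware, kernel_stale]))  # False
```

The second case corresponds to Loic's scenario of updated firmware paired with an old Linux image.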

2020-Jan-22

Attended:

Joakim Bech, Tomas Evensen, Ilias, Rob Herring, Mathieu Poirier, Stefano Stabellini, Ed Mooring, Mark Dapoz, Nathalie Chan King Choy, Bruce Ashfield, Loic Pallardy (ST), Etsam Anjum (MGC), Dan Driscoll, Saravana Kannan (Google)

Action items:

Stefano: Document a little bit more how the model works
- Remove reserved memory
- Add top-level use-case FAQ (e.g. how to do peripheral assignment to SW)
- Consider putting a qualifier word before "domain" to make it more specific
Everyone: Try to poke holes in the model. Good to have hard questions to think through & answer
Rob: Prototype proposal of changing root
Nathalie: co-ordinate next call over email (2 weeks from now doesn't work b/c Rob can't make it)

Notes:

Tomas: Recap:
- Trying to expand current concept of device tree to describe the whole system
  - Add features to talk about multiple CPU clusters, multiple masters
  - Add config info to say which resources (CPU, memory, I/O device) belong to which domain
- Look at from physical & virtual point-of-view: Would like to be able to describe both AMP & hypervisor use case.
- Splitting how the HW looks from the configuration
  - DT is more about hardware description than configuration
  - Different ppl are coming with the different info
    - HW info: Board, SoC, vendor
    - Who decides how you split up the resources (e.g. between Linux, FreeRTOS, Zephyr) is different (e.g. System architect)
- Stefano: added some features & not yet made proposal for config
Stefano: Rob suggested to write out both HW (new CPU clusters) & configuration (e.g. OpenAMP domain): How do they interact w/ existing?
- Writing an overview was useful to get ppl on the same page & how these features fit together
- Highlighted at least 1 problem:
  - Most top level nodes are to describe the HW (memory, amba bus, cpus).
  - Originally put OpenAMP domain under chosen to mark as configuration. Not really final destination for these domains, but made easy to identify the config & not drastically change structure of DT.
  - Reserved memory is one of few nodes that's already there for configuration. So, why not use that for describing memory? But, this is a bit awkward w/ multiple domains & multiple OS. You want 1 chosen & 1 reserved memory for each domain.
Rob:
- Think you captured it pretty well
- Domain is probably an over-used term
  - HW domains
  - Each HW domain is 1 or more config domains (e.g. R5, Cortex-A, may be divided into secure/non-secure)
  - Stefano: Agree: Domain is overloaded term
  - Loic: In ST we call it "execution context", where we can run SW component. Could be Cortex-A secure or non-secure, or Cortex-M. And, all the peripherals we can access.
  - Tomas: We had tried "execution context" in past & no one had understood it. Xen is calling this "domains". Tricky: In some SoCs this is very configurable. Don't want to change the structure of the DT if you change the configuration.
Stefano:
- Makes sense to keep HW description all together & immutable
  - Used CPU attribute
  - Access list that points to buses
  - Memory was controversial case. Had 2 ways:
    - access list (could go to reserved memory - bit controversial),
    - special memory attribute (memory config carve out for domain)
      - Top-level memory node describes physical memory
      - Config of memory & what devices accessible & what cpu cluster we're running described in the domain under chosen (or clearly config section)
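
The model Stefano describes (immutable HW description, plus a domain under chosen that names a CPU cluster, a memory carve-out, and an access list) can be sketched with nested dictionaries. The node and property names below are simplified placeholders chosen for this example; they are not the actual System DT bindings.

```python
# Hypothetical sketch: derive the view one domain should see from a
# hardware description plus a domain entry under "chosen".
# All node/property names are illustrative, not real bindings.
import copy

system = {
    "cpus_a53": {},
    "cpus_r5": {},
    "memory": {"reg": [(0x0, 0x80000000)]},  # physical memory
    "amba": {"uart0": {}, "uart1": {}, "can0": {}},
    "chosen": {
        "domains": {
            "openamp_r5": {
                "cpus": "cpus_r5",                       # CPU attribute
                "memory": [(0x70000000, 0x10000000)],    # memory carve-out
                "access": ["uart1", "can0"],             # access list
            }
        }
    },
}


def domain_view(system, domain_name):
    """Build the tree one domain should see; the input tree is not modified."""
    dom = system["chosen"]["domains"][domain_name]
    return {
        "cpus": copy.deepcopy(system[dom["cpus"]]),
        "memory": {"reg": list(dom["memory"])},
        "amba": {d: copy.deepcopy(system["amba"][d]) for d in dom["access"]},
    }


view = domain_view(system, "openamp_r5")
print(sorted(view["amba"]))
```

Because `domain_view` only reads from `system`, reconfiguring a domain changes the entry under `chosen` and leaves the hardware nodes untouched.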
Discussion on how much effort should go into making it possible for Linux to understand a System DT. Grant urges to do so for default case. Otherwise, Lopper gives "legacy" OS a DT that it can understand.
- Rob:
  - Doing that w/ memory is trade-off for how OS parses its memory
    - OS can't look at common node describing all of memory
- Stefano:
  - Compatibility w/ legacy OS & domain description is bit of a balance
- Rob:
  - If we change how OS parses, OS has to know how to parse both ways b/c might not have System DT. Legacy OS isn't necessarily "legacy" b/c not getting rid of it. With tool, only have to consider 1 way of doing it.
- Stefano:
  - Just realized an assumption I had: A "legacy" OS will only have to support default. Either System DT will give the "legacy" OS the default view, or have tool (Lopper) give the OS the info it needs.
- Rob:
  - So you're not going to run Linux on R5?
- Tomas:
  - We intended that you would run Lopper first if it's the secondary OS.
  - To get system DT to feed directly into OS came later. Figured System DT is on host side & would always run Lopper. Grant wanted to see if we could have Linux understand System DT.
- Rob:
  - Is it worth the messiness to avoid running Lopper on 1 view of the system vs. doing something more symmetrical & cleaner?
- Tomas:
  - For us, RTOS won't use DT directly. It will use DT to get a .h with #defines in them. B/c resource constrained.
- Rob:
  - Think that's true today but will diminish over time. Think more processors will run Linux in same system
- Tomas:
  - You're saying when running multiple Linux instances, 1 will get default & others will need Lopper
- Etsam:
  - Think System DT should just work if only 1 execution context in system (just 1 Linux), or if there are multiple contexts then have to run Lopper b/c have to do memory partitioning & device assignment.
  - Tomas: Think this is Stefano's intention
  - Stefano: Yes
- Stefano:
  - Rob suggesting: Is it worth trying to maintain max compatibility if makes things awkward?
  - We could maintain backward compatibility, but it doesn't have to be super nice. Will make it easier to write changes to spec if more seamless & not revolutionary. Then if want to run "legacy" OS, it will boot for primary case.
  - Trying to make multiple domain case w/ Lopper nice. Reserved memory makes this awkward
- Etsam:
  - For multiple execution contexts or multiple domains, currently we are putting all the resource assignment info into chosen node. Problem is that OS will have to parse that info.
  - Dan proposed if we separate out the resource assignment to domains vs. what's needed to capture heterogeneity in SoCs
  - Even with that, having multiple execution contexts will need a tool to be run
  - If we move the domain concept out of the main system DT, the changes in OS will be minimal. Will not need to parse domain nodes in chosen
- Stefano:
  - Have not had a chance to read Dan's yet, but think my proposal will require minimal changes. No change for legacy OS in default. Lopper use -> don't need to change OS.
  - If want to enhance an OS to understand System DT w/o Lopper at runtime, the amt of new info to read is quite minimal: if it knows which CPU cluster to use, all the info is there. There is a bit of link jumping & reference checking, but just 3 new attributes to read. Think it will be feasible to add support for that
  - And, is it worth optimizing for this case when you can use Lopper & not need any OS changes?
- Rob: Back to my proposal of domains & changing what the root is
  - If we look at flattened DT, that may only require zeroing out the top level node to make domain at 0 node the root node, which would be a very minimal change
  - Or, at most, when you unflatten it, you could point to a different location & make that your root
  - If you have everything under a domain node & nothing else outside, then it would be a trivial change to the OS. Or, even the bootloader could do it before you give it to the OS. You could do Lopper as part of your boot flow.
- Etsam:
  - Which OS will pick which node from the domain nodes when you have multiple execution contexts?
- Rob:
  - You could say that the default node is domain 0. At some point, you have to pick which is which. Probably a couple ways to do that.
- Tomas:
  - If you do that, you have decided on the domains ahead, but the domains are decided later by system architect (e.g. when you have TrustZone) with same device & multiple domains
- Rob:
  - Probably need 2 levels of domains:
    - HW domains: Pretty much fixed: Cortex-A, R5, microcontroller
    - Within those, you have config domains which handle asymmetric cases, or TZ, or hyper
- Dan:
  - Could have hypervisor running 4 different guests. A cluster could be multiple domains from a SW perspective. So, can't break up HW into subsystem cluster. It's how you map the SW to any execution context.
- Rob:
  - You'd have 2 levels: 1 level is dictated by HW. Within a HW domain, you're creating config or openamp domains
- Dan:
  - OK, in the end, you have a HW cluster that all the devices & buses fall under. Right now there's no relationship back to which CPU cluster
- Stefano:
  - What about device that has been assigned to one of the OpenAMP domains. How would you express that?
- Rob:
  - Good question. If the devices are all viewable by all the HW domains, then it would be at the top level.
- Tomas:
  - Way to think about our system is CPU, memory, devices are fairly independent. Can assign as you want, mostly.
  - Interrupt controllers are some exceptions, but Stefano has way of expressing that.
  - General use case is to have someone else select from HW resources & create the domains. For CPU clusters, how do you tie those to the fixed HW - think Stefano was doing w/ these different buses.
- Stefano:
  - If you consider that devices remain top level (e.g. amba), and link to them under Rob's domains. Not too different. Then OpenAMP domain would be directly under cpu clusters.
- Loic:
  - Benjamin from ST has proposal for domain controller. Working on for kernel & u-boot. He proposed some bindings in each peripheral node to have info about which execution context can access. It's directly defined in the peripheral's node, like pin control. You can then say which domain it is associated.
  - Tomas: Then do you have to change tree when you reconfigure?
  - Dan: Would have to show that it could refer back to multiple execution contexts, whether it can access it or not. Not to do with domain definition, just the HW association
  - Loic: Yes
  - Dan: Seems to have same issue w/ interrupt controller
  - Stefano: Solved in last example by describing all interrupt controllers. MPSoC has multiple interrupt controllers.
  - Dan: Is it changing the bindings or adding new?
  - Stefano: Thanks to Grant's suggestion, could do without any change. You use as interrupt parent of all devices like an interrupt muxer.
  - Dan: Then Lopper has to generate the right piece that this is the interrupt controller that I have
  - Rob: We only solved how to describe more than 1 parent, but not how to select which parent is the correct one. If you have HW domain, then you can stick it in HW domain.
  - Stefano: There is also HW restrictions where something is not visible or accessible between clusters. This is currently done with address map & indirect bus.
  - Dan: OK, in that case the info is all in the system DT & Lopper not necessary. So, we are discussing some cases where don’t need Lopper & other cases where Lopper makes life easier. At some point, we have to choose.
  - Tomas: We didn't originally think it could be backward compatible until we had the conversation w/ Grant. In a way, having default domain makes a lot of sense.
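
Rob's proposal of making a domain node the root can be illustrated on an in-memory tree. This is a sketch only: it works on nested dictionaries rather than on a flattened DT blob, and the `domains`/`domain0` naming and the choice of `domain0` as the default are assumptions drawn from the discussion, not a defined format.

```python
# Hypothetical sketch: select a domain node and treat it as the root.
# Operates on nested dicts; a real implementation would work on the
# flattened or unflattened device tree instead.


def reroot(tree, path):
    """Return the subtree at `path` (a list of node names) as the new root."""
    node = tree
    for name in path:
        node = node[name]
    return node


system = {
    "domains": {
        "domain0": {
            "cpus": {},
            "memory": {"reg": [(0x0, 0x40000000)]},
            "uart0": {},
        },
        "domain1": {
            "cpus": {},
            "memory": {"reg": [(0x40000000, 0x10000000)]},
            "uart1": {},
        },
    }
}


def default_view(tree):
    """Pick domain0 as the default root, as suggested in the discussion."""
    return reroot(tree, ["domains", "domain0"])


print(sorted(default_view(system)))
```

The sketch assumes everything a domain needs sits under its own node. Etsam's question of which OS picks which node, and Stefano's question about devices visible to several HW domains, are not addressed by it.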
Tomas: Are there unsolved issues being able to express all the domains (SW context), HW that is attached to only 1 CPU cluster & not others? Or, are all the issues kind of solved in your mind & more syntax to work out?
- Stefano:
  - Not aware of open issues for the heterogeneous clusters case. It's about syntax & how we want to express & make nicer for 1 type of parsing
  - There are issues for VM use case. Want to wait for these ideas to settle 1st. Think could be solved with additions
    - Memory for VM needs physical & pseudo-physical ranges
    - Where to place virtual-only devices
- Rob:
  - Disagree that there's not open issues & it's just syntax. Syntax is open issues & need to know that they will work for the different cases.
  - Feel like we've gone down this path & each step there's different issues to solve & thinking about if we should back up & go down a different path w/ less issues to solve
- Tomas:
  - Want to make sure that the model holds up
  - Lots of good energy on the topic. Think we all want the same thing & just need to clarify.
