|HW MemoryDefinition, Memory and MemoryMapping [message #1833835]
||Sun, 25 October 2020 11:12
| Henning Riedel
Registered: July 2009
Let's say the processor has several memories, split up into multiple memory banks:
L2 RAM Bank0 = 256kB, Bank1 = 128kB
L3 RAM Bank0 = 768kB, Bank1 = 768kB, Bank2 = 512kB and Bank3 = 0kB
AccelMem (128kB) split into 16kB banks AccelMem0 .. AccelMem7.
L1 TCMA RAM = 32kB
L1 TCMB RAM = 32kB
L2 RAM Bank0 = 512kB, Bank1 = 256kB
Each subsystem has certain L1/L2 interconnects, which are also connected between the subsystems. As it seems, the interconnects also have a port for each memory bank.
So even ProcessingUnit_21 can (and has to be able to) access the L3 memory in SubSys1, including executing code from it and exchanging data.
This access goes through Interconnect2 to Interconnect1 to L3RAM_Bx.
As shown, the memory banks can have different sizes, and they each have an MPU and can also be separately turned off (with some restrictions though). (--> FrequencyDomains and PowerDomains)
So, should each memory bank then have its own MemoryDefinition, assigned to the corresponding Memory?
I guess this would allow modeling/simulating concurrent accesses to the memories, which get blocked or not depending on the placement of data into the banks.
It would also allow specifying which parts should be placed into which banks, e.g. for code executed by PU_21 that does not fit into the L2RAM in SubSys2, while keeping it separate from the code of SubSys1's PU_11/PU_12.
And how does the MemoryMapping work then, in case we would e.g. combine a certain number of banks into a bigger region, e.g.:
AccelMem_B0 and AccelMem_B1 combined into a single memory section/region, with a 24kB InputBuffer for the HWAccel located there, filled by DMA transfers from a HW peripheral. The InputBuffer is used by the HWAccel to process the data and place the output somewhere else, e.g. L3RAM_B2.
The MemoryMapping and PhysicalSectionMapping only allow assigning them to a single Memory.
How can it be specified in the MappingModel that there is a section with e.g. a 24kB buffer spanning the separate AccelMem_B0 and AccelMem_B1 in the HwModel? Or how can these memory banks be combined in the HW or Mapping models?
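For illustration, the bank-spanning question boils down to simple address arithmetic (a Python sketch, not the APP4MC API; the offset and bank size are assumptions):

```python
BANK_SIZE = 16 * 1024  # AccelMem bank size (16kB)

def banks_spanned(offset, size, bank_size=BANK_SIZE):
    """Return the list of bank indices a buffer at `offset` with `size` bytes touches."""
    first = offset // bank_size
    last = (offset + size - 1) // bank_size
    return list(range(first, last + 1))

# A 24kB InputBuffer placed at offset 0 spans AccelMem_B0 and AccelMem_B1:
print(banks_spanned(0, 24 * 1024))  # -> [0, 1]
```

So any section larger than one bank necessarily crosses a bank boundary, which is exactly what a single-Memory MemoryMapping cannot express directly.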
|Re: HW MemoryDefinition, Memory and MemoryMapping [message #1834208 is a reply to message #1834122]
||Thu, 05 November 2020 03:27
| Henning Riedel
Registered: July 2009
Falk Wurst wrote on Tue, 03 November 2020 12:25|
So generally speaking, in case you need a mapping per bank, I would recommend modelling every bank as a single memory with its own MemoryDefinition (since the MemoryDefinition includes the size of the memory). For your example of AccelMem you would just need one MemoryDefinition, because the size of each bank is the same.
That's what I currently do, using MemoryDefinitions for each Memory Bank. I could maybe do a single one for the AccelMem, since each Bank there is the same size.
Afterwards it depends on how you model the ports at the memories: in case every single memory bank has its own port connected to the interconnect, you could have concurrent accesses to the memories (depending on the connection handler configuration "max concurrent transfers").
So, TI (as the chip vendor) really draws separate arrows from the VBUSM SCRM to each memory bank.
This TI document describes the CBA (Common Bus Architecture), including VBUSP, VBUSM, SCR (Switching Central Resource) and Bridges:
If you want to have multiple concurrent transfers on the interconnect, but not to your different memory banks, I would recommend using another connection handler which connects all memory banks via one port with the main interconnect.
Or a completely different approach would be to combine all banks into one memory and use the PhysicalSectionMapping to create a section per bank based on the address (in this case concurrent accesses to the memory banks are not possible). With the access elements of the processing unit and the memory offset, you could still model individual access paths to the different memory banks (physical sections).
Yes, we would lose the concurrent access. This could even get problematic when deciding between DSP code/data vs. M4 code/data vs. R5F code/data.
But it could make it easier to combine banks into a single memory section, like two 16kB AccelMem banks into a single 32kB one.
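The address-based split of one combined memory into per-bank sections could be sketched like this (illustrative Python, not the APP4MC API; the section naming scheme is an assumption):

```python
def sections_per_bank(memory_size, bank_size, name="AccelMem"):
    """Split one combined memory into per-bank sections, identified by start offset."""
    sections = []
    for i in range(memory_size // bank_size):
        sections.append({"name": f"{name}_B{i}",
                         "offset": i * bank_size,
                         "size": bank_size})
    return sections

# 128kB AccelMem modelled as one memory, split into eight 16kB sections:
for s in sections_per_bank(128 * 1024, 16 * 1024):
    print(s)
```

Merging two adjacent entries of this table would then give the combined 32kB section, at the cost of losing the per-bank concurrency.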
Especially if you have each memory bank modelled as a separate memory, you can easily do the mapping, since this is memory-instance specific.
I am not completely sure if I have understood your question with the buffer. Is it the case that you want to combine Bank0 and Bank1 (in total 32kB) and have an additional input buffer of 24kB?
Yes, for the AccelMem, where each bank is 16kB, but we need two banks for the 32kB data samples we receive from our sampling HW. We actually need a double buffer, because while the received data is being processed by the accelerator, the next sample is placed into another two banks combined into a single one. In that case, I could just create two 32kB memories and remove four of the 16kB banks.
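The double-buffer scheme can be sketched as a simple rotation (illustrative Python, not related to the AMALTHEA tooling; the combined region names are assumptions):

```python
# Two combined 32kB regions, each made of two 16kB AccelMem banks.
# DMA fills one region while the accelerator processes the other;
# the roles swap every sample period.
BUFFERS = ["AccelMem_B0+B1", "AccelMem_B2+B3"]

def schedule(periods):
    """Yield (dma_target, accel_source) per sample period."""
    for t in range(periods):
        yield BUFFERS[t % 2], BUFFERS[(t + 1) % 2]

for dma, accel in schedule(4):
    print(f"DMA fills {dma} | HWAccel processes {accel}")
```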
But I had also some other things in mind:
- Memories with gaps, where sections have to "flow around" (usually splitting into two sections, where the first section overflows into the second)
- Split into Safety-Level dependent sections
- Split into core specific sections and inter-core communication sections
- Split into FAST vs SLOW code sections .. L2 Mem = FAST, L3 Mem = SLOW, also depending on DSS_L2 & MSS_L2 vs DSS_L3 vs MSS_L3
- Moving some SW component(s) from one core to another due to better load
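The "flow around" placement from that list could be sketched like this (illustrative Python, not AMALTHEA tooling; the free ranges are made-up addresses):

```python
def place_with_gaps(size, free_ranges):
    """Split a section of `size` bytes across free address ranges, flowing around gaps.

    `free_ranges` is a list of (start, length) tuples in address order;
    returns the placed (start, length) pieces.
    """
    pieces = []
    remaining = size
    for start, length in free_ranges:
        if remaining == 0:
            break
        take = min(length, remaining)
        pieces.append((start, take))
        remaining -= take
    if remaining:
        raise ValueError("section does not fit into the free ranges")
    return pieces

# A 40kB section flows around a gap: 32kB fits before it, the remaining 8kB after.
print(place_with_gaps(40 * 1024, [(0x0000, 32 * 1024), (0xA000, 24 * 1024)]))
```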
Also remember, I have heterogeneous cores in a SoC here; no core can really share code with another.
If I had three cores all being ARM Cortex-R5F, I could share the code of SW components needed on all cores, just using a different configuration and a core-specific runtime context, e.g. AUTOSAR multi-core OS or EcuM/BswM code only once. I could keep everything in one ELF image. Placement would be much easier, since I would not have to consider the split between DSS, MSS and HWA-core code as much, except for FAST vs. SLOW code.
With three different cores (C66x, Cortex-R4, Cortex-R5F), I need three different OSes and, in the worst case, three differently compiled EcuM/BswM variants. Here I have three different ELF images.
For the first scenario I would model this as one memory with double the size of the other banks (you could separate the mapping based on the address for the theoretical bank 0 and bank 1). You could also model a cache module in front of your memory banks which acts as a buffer -> currently there is no concept of mapping data to buffers, because typically labels etc. are not permanently available in caches/buffers.
You could also still model two different access paths from your ProcessingUnit with different MemoryOffsets (one for each theoretical bank).
I hope I could answer most of your points.
Thanks a lot .. I'll try to get the best out of it :) .. At least there is such a model, with tooling around it, and I don't have to come up with something on my own. And at least it is maintained, in contrast to the also interesting-looking Time4Sys project (which is based more on the MARTE profile) and has not received much attention for about two years now.