AM5716: L3/L4 interconnect NoC programming

Part Number: AM5716

Reposting on behalf of the field, as the previous thread on this is now locked without any response.

Hello,

The AM5716 has an L3/L4 interconnect NoC whose architecture uses separate command & acknowledge and response & acknowledge phases, so commands and responses do not have to block each other. I'd like to know how to program our software so that its accesses through the NoC are nonblocking, because the access time to peripherals is critical for our application.
 
For example, the ideal sequence for me is one in which commands are issued to each peripheral continuously without waiting for a response, and processing such as the read or write starts as soon as the response is received. How should I program the software to achieve this? Also, how do I know that the response has been received?

I saw the NOTEs on P.3021 of the TRM (SPRUHZ7E) and searched for information on the Arteris and Sonics websites, but there seems to be only information about how their NoC is implemented on a device. I'd like a document that describes the programming method.

In any case, could you tell me whether there is a specific programming method for using the NoC architecture well? Do I need to be aware of particular programming techniques? If possible, I would appreciate comments from experts on the device architecture.

Regards,
Kazu

  • Hi Mukul,

    I have notified the factory team. They will respond here.
  • Based on discussion with the team, we do not believe that there is much software programmability in the system interconnect/NOC for this.
    One potential option is to configure writes from the CPU as posted instead of non-posted. In the posted case the response from L3 is immediate and the command to the peripheral proceeds in the background. In the non-posted case the response from the peripheral is sent to the CPU.
    As you mentioned, in the posted case there must be some way for software to learn from the peripheral that the request has been successfully serviced, potentially by reading some end point registers.
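A minimal sketch of the posted-write-then-check pattern described above, assuming a hypothetical peripheral with a data register and a status register whose bit 0 is set once the write has been serviced. The register layout and the names `periph_regs_t`, `periph_post_write` and `periph_wait_done` are illustrative only and do not come from the TRM or PDK. Whether the store is actually posted depends on the memory attributes of the mapping, which this code does not configure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical peripheral register block; a real device defines its own layout. */
typedef struct {
    volatile uint32_t data;    /* written by software */
    volatile uint32_t status;  /* bit 0: set by the peripheral once the write is serviced */
} periph_regs_t;

#define PERIPH_STATUS_DONE (1u << 0)

/* Issue the write and return immediately. With a posted mapping the CPU
 * does not wait for the peripheral's response. */
static inline void periph_post_write(periph_regs_t *p, uint32_t value)
{
    p->data = value;
}

/* Poll the end point status register up to max_polls times.
 * Returns true once the peripheral reports the request as serviced. */
static inline bool periph_wait_done(const periph_regs_t *p, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (p->status & PERIPH_STATUS_DONE) {
            return true;
        }
    }
    return false;
}
```

On target, `p` would point at the peripheral's mapped base address. Software could issue writes to several peripherals first and call `periph_wait_done` on each only when the result is actually needed.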
  • Hello Mukul,

    Thank you for your information. I rewrote the question.

    If the default access method from the CPU to the peripheral is a non-posted write, please tell me in detail how I can achieve a posted write.

    As you mentioned, does the software need to read each endpoint's registers (e.g. UART) after writing data to it?

    In that case the software overhead is not small, so I expect the processing speed to be slower.

    Or can I switch to posted writes simply by setting registers of the CPU or L3?

    Regards,
    Kazu

  • Kazu

    Can you elaborate further on the use case and also tell me which masters/CPUs are involved in it (ARM or DSP, and other masters?).

    Posting from the A15 will likely require the correct MMU attributes (I do not have any software examples).

    Most accesses apart from strongly-ordered ones have the ability to post from the A15.

    This approach may in general be expensive in SW, as SW must make sure, by a per-register read or some other way, that the end point is ready for the next data etc.
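One way to reduce the per-register read cost mentioned above is to amortize a single status read over a batch of writes. The sketch below assumes a hypothetical transmit FIFO that exposes its free space in a register. The layout and the names `fifo_regs_t` and `fifo_send_batch` are made up for illustration and are not the AM57xx UART register map.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transmit FIFO registers; not a real AM57xx layout. */
typedef struct {
    volatile uint32_t tx_data;  /* one byte is queued per store */
    volatile uint32_t tx_free;  /* number of free FIFO entries */
} fifo_regs_t;

/* Send up to len bytes using one status read for the whole batch.
 * Returns the number of bytes actually written. */
static inline size_t fifo_send_batch(fifo_regs_t *f, const uint8_t *buf, size_t len)
{
    size_t room = f->tx_free;              /* single read covers the batch */
    size_t n = (len < room) ? len : room;  /* never write more than fits */
    for (size_t i = 0; i < n; i++) {
        f->tx_data = buf[i];               /* stores can be issued back to back */
    }
    return n;
}
```

The caller retries with the remaining bytes later, so the read overhead is paid once per batch rather than once per write.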

     

  • Mukul,

    Thank you for your reply.

    In our system, IPU1_C0 mainly accesses the peripherals. It's one of the two Cortex-M4 cores in IPU1 (not IPU2).

    I've already confirmed that the CM4 can access GPIO, UART and so on correctly. I have also confirmed that configuring IPU1_UNICACHE_MMU makes the program run faster and the LED blinking interval shorter.

    I referred to the GEL and CSL sample programs of PDKv1.0.6, but I could not find the initialization code for IPU1_MMU.

    Please refer below.
    AM571x TRM (SPRUHZ7F) : Figure 7-1. IPUx Subsystem Overview

    If I need a particular MMU configuration in order to use the NoC efficiently, please tell me the configuration of the IPU1 MMU (IPU1_UNICACHE_MMU or IPU1_MMU). If I also need to set something like NoC configuration registers to use the NoC efficiently, please tell me about that as well. Thank you.

    Regards,
    Kazu

  • Kazu
    What is the end goal for this implementation?
    Do you have a certain latency target that you are trying to hit with the IPU1 M4 to the end peripherals?
    Have you tried just using priority etc?

    On IPU MMU programming, my colleague pointed me to the following resources/information that you may find helpful:

    The PDK and IPC examples for M4 do provide AMMU configuration to use to access peripherals on L3/L4 interconnect. Handling of AMMU in PDK driver examples is different from how IPC uses these settings and has been explained here:
    processors.wiki.ti.com/.../Linux_IPC_on_AM57xx

    If you are measuring latency using bare-metal IPU1 code, then the AMMU settings to access a peripheral like PCIe are shown in the CSL example:
    pdk_am57xx_1_0_8\packages\ti\csl\example\pcie\write_loopback

    Regards
    Mukul
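To put a number on the latency question raised above, a rough measurement loop can be sketched as follows. This is only an illustration: the tick source is passed in as a function pointer because the thread does not say which counter or timer is used on the M4, and `tick_fn_t`, `fake_now` and `avg_read_ticks` are hypothetical names. `fake_now` is a stand-in that lets the sketch be tried off-target.

```c
#include <stdint.h>

/* Hook that returns a free-running tick count. Hypothetical: on target it
 * would wrap whatever cycle counter or timer the core provides. */
typedef uint32_t (*tick_fn_t)(void);

/* Stand-in counter for trying the sketch off-target: advances 100 ticks per call. */
static uint32_t fake_ticks;
static uint32_t fake_now(void)
{
    fake_ticks += 100u;
    return fake_ticks;
}

/* Average ticks per read of *reg over n back-to-back reads (n must be > 0).
 * The result includes loop overhead. */
static uint32_t avg_read_ticks(const volatile uint32_t *reg, uint32_t n, tick_fn_t now)
{
    uint32_t start = now();
    for (uint32_t i = 0; i < n; i++) {
        (void)*reg;                    /* volatile read is performed every iteration */
    }
    uint32_t elapsed = now() - start;  /* unsigned subtraction tolerates one wrap-around */
    return elapsed / n;
}
```

Running the same loop before and after a change to the AMMU or cache settings gives a like-for-like comparison for a given peripheral register.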
  • Mukul,

    Thank you for your information.

    Let me describe our situation a bit more below. I'll send you the latency target later.

    CA15 uses RTOS (own scheduler) instead of Linux and CM4 is bare-metal (NO-OS).

    I've already referred to the PDK CSL example you suggested. As mentioned earlier, I used and customized it for the LED blinking interval test.

    Regards,
    Kazu

  • Hi Kazu
    Ok, thanks for the additional clarifications. Feel free to share the remaining information on or off the E2E (via Todoroki san) on customer specifics.
    I will reiterate that, for the scenarios you have shared, there are no relevant NOC registers that we recommend programming.
    Most of this should be manageable with the MMU/cache settings, priority and potentially posted vs non-posted access (with the caveats previously stated).

    Let me know if we can close this thread or you need further help.

    Regards
    Mukul
  • Hello Mukul,

    Thanks a lot. We'll enable the cache and try some AMMU configurations. If we do not get the performance we expect from the NoC, I'll ask in a new thread.

    Regards,
    Kazu