
ARM - DSP sync latency workarounds?

Other Parts Discussed in Thread: SYSBIOS

Hi,

Using a TI K2H EVM we measured the ARM-DSP round-trip time of MessageQ
messages. It seems to be around 70-100 microseconds, even when the shared
memory is placed in MSMC SRAM. Since we could not find any official figures,
not even in the MessageQ performance tuning guides, we tend to believe this
number. A 100 usec round trip is far too long for us, as we planned to use
this mechanism to synchronize the DSP cores with PCIe interrupts handled on
the ARM every 500 microseconds. Carving PCIe out of ARM Linux offered bleak
prospects anyway, and this way more than one DSP core can be
synchronized...

These findings turned us towards the HW semaphore module. The naive approach
of including csl_semAux.h and driving the module from user space failed
promptly. Do you think there is any support for this on the Linux side?
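
To make it concrete, this is roughly the kind of user-space access we
attempted. It is only a sketch: it assumes the semaphore module can be
mmap()ed through /dev/mem (which CONFIG_STRICT_DEVMEM may forbid), the base
address and register layout are placeholders to be checked against the K2H
data manual, and the direct-access protocol (a read returns 1 when the
semaphore was free and is now ours, writing 1 releases it) is just my reading
of the SEM module documentation:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEM_BASE  0x02640000UL  /* placeholder; verify in the data manual */
#define SEM_SIZE  0x1000
#define SEM_IDX   5             /* arbitrary example semaphore number */

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    void *map = mmap(NULL, SEM_SIZE, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, SEM_BASE);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    volatile uint32_t *sem = map;

    /* Direct access: spin until a read reports the semaphore as ours. */
    while ((sem[SEM_IDX] & 1u) == 0)
        ;
    /* ... critical section shared with the DSP cores ... */
    sem[SEM_IDX] = 1u;          /* release */

    munmap(map, SEM_SIZE);
    close(fd);
    return 0;
}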

Since we are concerned about latency, we are also trying to use the semaphore
module's hardware address directly from our PCIe driver. There we have trouble
requesting the ARM CorePac semaphore interrupt SEM_INT8. Are there any tricks
that we might be missing?

request_irq(8u, &Sem_IRQHandler, IRQF_SHARED | IRQF_DISABLED, gDevName, gDev)

fails with error -22 (EINVAL, invalid argument).
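
Our current guess is that 8u is the SoC-level event number rather than the
Linux IRQ number request_irq() expects; on a device-tree kernel that number
has to come from the IRQ domain mapping, and IRQF_DISABLED is deprecated in
recent kernels. Something along these lines is what we are trying next; the
device-tree node path and interrupt index are made-up placeholders:

#include <linux/errno.h>
#include <linux/interrupt.h>
#include <linux/of.h>
#include <linux/of_irq.h>

static irqreturn_t Sem_IRQHandler(int irq, void *dev_id)
{
    /* acknowledge SEM_INT8 in the semaphore module here */
    return IRQ_HANDLED;
}

static int sem_hook_irq(void *gDev, const char *gDevName)
{
    struct device_node *np;
    int virq;

    np = of_find_node_by_path("/soc/hwsem");   /* placeholder node */
    if (!np)
        return -ENODEV;

    /* Map the hardware interrupt (index 0 here) to a Linux virq. */
    virq = irq_of_parse_and_map(np, 0);
    of_node_put(np);
    if (!virq)
        return -EINVAL;

    /* IRQF_DISABLED dropped; IRQF_SHARED still needs a non-NULL dev_id. */
    return request_irq(virq, Sem_IRQHandler, IRQF_SHARED, gDevName, gDev);
}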

Do you think this naive, direct handling of the semaphore module from kernel
space could work?

Unfortunately, using the plain IPC interrupts won't do either, because the
MessageQ infrastructure seems to be using them, and we would like to keep
MessageQ for other purposes.

I still wonder what could take 70-100 usecs if IPC is interrupt based
and SRAM latencies don't seem to be that long either.

Even the vaguest ideas would be most welcome... These difficulties were
not planned for in our project, so we really have a panic on our hands.

Regards,

Janos
  • Can you please take a look at the suggestions in the following e2e post, where a similar issue was reported, and let us know whether the CMEM package meets your requirements?

    http://e2e.ti.com/support/embedded/tirtos/f/355/p/290974/1022301.aspx

    Regards,

    Rahul

  • Hi Rahul,

    It's very nice to read about another soul who is concerned about latencies and is getting results similar to mine...

    As far as I understand, CMEM is for contiguous physical memory allocation, and I hope to use it later on to share data between the ARM and the DSPs. However, to delineate my problem better, here are a few words about our vision of the software architecture:

    Since the ARM Linux side owns all the external connections (PCIe, SRIO, HyperLink, Ethernet), it handles the PCIe input as root complex. An FPGA is connected and configured to transport large amounts of data directly into DSP core0's L2 memory via PCIe. That all seems to work so far.
    The DSP cores are daisy-chained and then synchronized with the incoming data, so that the limited space of L2 memory can be used properly as buffers. Between the DSP L2 memories the data is transferred by DMA, as we try to avoid shared memory usage for as long as we can. Input data arrives every 500 usecs and the FPGA raises a PCIe interrupt whenever a transfer has finished. We handle this event in our PCIe driver's interrupt handler, and we would prefer to do the synchronization right there. So here we are desperately in need of low-latency signals between the ARM and the DSPs (both directions would be nice).

    Once the DSP chain has done its job, processing continues on the ARM and the data is sent to another K2H SoC via HyperLink.

    Considering our specific memory throughput constraints, this kind of processing seemed to be the only way. The DDR throughput still worries us, but we are not there yet.

    The DSP L2 size and the data throughput give us a 500 usec constraint, which forces us to use low-latency signalling.

    MessageQ was well promoted and many transport-layer documents mention it. But when we came to ARM-DSP signalling we were stuck with ti.ipc.transports.TransportRpmsg (no choice there), and it does not support DSP-DSP messages. Using two different transports on the DSPs seems to be a legend. I find it odd that msgcom is not even mentioned in most of the documentation (e.g. it is not in the MCSDK user guide). How does it relate to everything else (Linux, SYS/BIOS, or the formidable resource manager)? If the Rpmsg transport is so slow, why aren't there any other transport options available?
    As for msgcom: is it true that there is no async mechanism on the ARM side to get notified of new messages? Are the 1000 usec round-trip times in the linked conversation confirmed? What latency limit do you think we can hope for using msgcom?

    Looking at the IPC abstractions first (MessageQ, Notify) and matching them to the hardware (IPC interrupt lines, HW semaphores, Multicore Navigator...) looks promising. Is it true that using Linux (i.e. the ARM cores) leaves us without any of this low-latency signalling?

    Other desperate measures we've tried:
    - using the HW semaphore module from Linux kernel space => trouble with interrupts (clearly my incompetence here); there are only 6 semaphore interrupt lines on the ARM anyway, and I can't be sure whether some TI native module already uses them.
    - busy wait on a DSP L2 memory value written by the ARM driver (a rough sketch follows after this list); here the DSP somehow gets an event combiner interrupt, and we don't understand why.
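
    For completeness, the ARM side of that busy-wait experiment looks roughly like the sketch below. It assumes DSP core0's L2 is reachable through its global address (0x10800000 on this device family, to be checked against the K2H memory map) and that a plain ioremap'd store is acceptable; the flag offset is a placeholder for a word we reserve in the DSP linker map:

    #include <linux/errno.h>
    #include <linux/io.h>
    #include <linux/types.h>

    #define DSP0_L2_GLOBAL  0x10800000UL   /* check against the K2H memory map */
    #define FLAG_OFFSET     0x0            /* placeholder, reserved in the DSP linker map */

    static void __iomem *dsp_flag;

    static int sync_flag_init(void)
    {
        dsp_flag = ioremap(DSP0_L2_GLOBAL + FLAG_OFFSET, sizeof(u32));
        return dsp_flag ? 0 : -ENOMEM;
    }

    /* Called from the PCIe interrupt handler once the FPGA transfer is done. */
    static void sync_flag_kick(u32 seq)
    {
        writel(seq, dsp_flag);   /* the DSP core spins on this word */
    }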

    What would you recommend, given that we don't have the resources to dig deep into this Linux kernel K2H adaptation?

    Regards,

    Janos

  • Hi, Janos,

    Anything between the ARM and the DSP that goes through the kernel gets the protection kernel space provides, but it pays for that in latency, which is probably what your numbers reflect. On the receiving side the interrupt also goes through the kernel, which is the same scenario and likewise costs latency. Bypassing the kernel from user space can be done through QM_LLD. However, TI is in the process of providing a Navigator transport to fill the gap. It should be in one of the 3.1.x releases coming in the next few months.

    Rex