
Delay between ISR and Task activation via Semaphore

Other Parts Discussed in Thread: OMAP-L138, OMAPL138

Hello,

 

the OMAP I am using is connected via an SPI link to another processor. I want to exchange data between this CPU and the OMAP. The CPU acts as master and the OMAP as slave. When the CPU wants to receive data from the OMAP, it pulls a GPIO high and an ISR is invoked on the OMAP side. This ISR posts a semaphore on which the "writer task" pends. The writer task then signals the CPU via another GPIO that it is ready to write, and outputs the data to the SPI bus using Stream_write. Basically my code looks something like this:

 

static Void GpioInputIsr(Ptr ignore)
{
    Gpio_PinCmdArg  pinCmdArg;

    /* debug GPIO is cleared here */

    (void)ignore;    /* unused parameter */

    pinCmdArg.pin = GPIO_PIN_TO_DRIVER_NUMBER(6, 14);
    Gpio_clearInterruptStatus(gpio0, &pinCmdArg, NULL);

    Semaphore_post(tx_event);

    /* debug GPIO is set here */
}

 

 

Void HOST_TaskRawDataTransmitter(UArg a, UArg b)
{
    while (1)
    {
        Gpio_PinCmdArg  pinCmdArg;

        Semaphore_pend(tx_event, BIOS_WAIT_FOREVER);
        /* debug GPIO is toggled here */

        /* signal CPU that we are now ready to write */
        pinCmdArg.pin   = GPIO_PIN_TO_DRIVER_NUMBER(2, 12);
        pinCmdArg.value = 0;
        Gpio_setPinVal(gpio0, &pinCmdArg, NULL);

        Stream_write(spiHandle, out_buffer, out_len, BIOS_WAIT_FOREVER, NULL);

        pinCmdArg.pin   = GPIO_PIN_TO_DRIVER_NUMBER(2, 12);
        pinCmdArg.value = 1;
        Gpio_setPinVal(gpio0, &pinCmdArg, NULL);
    }
}

 

 

All in all, the whole transfer lasts around 3.2 ms (1024 bytes of data are transferred).

 

For debugging I clear a debug GPIO when the ISR starts and set it when the ISR finishes; furthermore, another debug GPIO is toggled right after the Semaphore_pend call in the task. The problem is that this task activation takes too long (see the picture; the delay is between the two bars). The strange thing is that this period of time is nearly as long as one SPI transfer. Also, the long task-activation delay only seems to occur while the Stream_write call is active. If it is commented out and the task only pends on the semaphore, the task-activation delay is around 2 µs, which is what I would expect.

 

I am using bios_6_32_05_54, ipc_1_23_05_40 and pspdrivers_02_10_01 on an OMAP-L138.

 

Any help on this issue would be very appreciated.  If you need more information or explanation please let me know.

 

Kind Regards,

Steve

 

  • Steve,

    I'm not really an expert on Stream_write() usage but I'll give you my analysis anyway.

    From your description, it appears that the 3.2 ms delay is due to time spent in the Stream_write() call. (You can verify this by toggling the I/O pin you're monitoring before and after the call to Stream_write().)

    The BIOS_WAIT_FOREVER argument to the Stream_write() function tells it not to return until the transaction is complete.

    My guess is that you've wired up the SPI stream so that Stream_write() waits for the other CPU to acknowledge reception of the buffer before returning.

    I suspect this is where all the time is being spent (i.e. waiting for the other CPU to ack the transfer).
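    A minimal sketch of that pin-toggling check. Gpio_setPinVal() and Stream_write() are the PSP calls from the code above; they are replaced here by stubs so only the bracketing pattern itself runs:

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Stand-ins for the PSP driver types and calls (Gpio_setPinVal and
     * Stream_write are the real PSP APIs; these stubs exist only so the
     * bracketing pattern can be shown and executed on its own). */
    typedef struct { unsigned pin; unsigned value; } Gpio_PinCmdArg;

    static unsigned last_pin_value = 99;   /* records the last pin write */

    static void Gpio_setPinVal(void *handle, Gpio_PinCmdArg *arg, void *eb)
    {
        (void)handle; (void)eb;
        last_pin_value = arg->value;
        printf("debug pin %u -> %u\n", arg->pin, arg->value);
    }

    static void Stream_write_stub(void)
    {
        /* in the real code, all of Stream_write()'s time is spent here */
    }

    int main(void)
    {
        Gpio_PinCmdArg dbg = { 5, 1 };     /* hypothetical debug pin */

        Gpio_setPinVal(NULL, &dbg, NULL);  /* rising edge just before the call */
        assert(last_pin_value == 1);

        Stream_write_stub();               /* the span between the two edges on
                                              the scope is time inside the call */

        dbg.value = 0;
        Gpio_setPinVal(NULL, &dbg, NULL);  /* falling edge just after the call  */
        assert(last_pin_value == 0);
        return 0;
    }
    ```

    Whatever the scope shows between those two edges is time spent inside Stream_write() itself, separating driver latency from everything around it.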

    Alan

  • Alan,

     

    thanks for your answer. I've figured out that the problem seems to have another cause, namely, I think, at the beginning of the SPI communication. Figure 1 shows the beginning of the SPI communication, where

    • the brown signal is SCK,
    • the red signal is MISO (Data from OMAP to CPU),
    • the green signal is the interrupt (at each falling edge the interrupt ends) and
    • the yellow signal is the calling time of Stream_write after task activation.

    Each brown block (we see 23 of them in the figure) is a transfer of 128 bytes, and each cluster of these brown blocks is a transfer of 1024 bytes. As you can see, the interrupt from the other CPU arrives before the clock is started. Afterwards (as I would expect) the task is activated and finally Stream_write is called (the yellow signal goes to 1). However, for some strange reason Stream_write interrupts its execution after the first 128 bytes. I would have expected Stream_write to last as long as it takes to transfer the 1024 bytes (1024 is the bufLen passed to Stream_write). Furthermore, the transfer of the first 128 bytes seems to start much too late (looking at the first rising edge of the red signal). For the second 1024-byte transfer the invocation duration of Stream_write is even shorter, and on the third transfer the SPI driver threw an exception and the application terminated.

    Interestingly, if I don't use a semaphore (i.e. Stream_write simply assumes each transfer on the SPI bus is 1024 bytes), it seems to work as expected (see figure 3). However, I can't do without the semaphore, since the application must be able to handle not only writing those 1024 bytes to the bus but also other sizes and types of transfers.

    Another question: what is the timing requirement for Stream_write together with the SPI driver? Figure 2 shows the very first transfer at greater zoom. There we can see that the time between the start of Stream_write and the first pattern written to the SPI bus (first edge of the red signal) is around 300 µs. Could it be that the SPI clock starts too early for Stream_write, and that this confuses the whole transfer?

     

    So what could be the problem here? Is this the right forum to discuss this? Shall I start a new thread?

     

    Thanks for your help.

     

    Kind Regards,

    Steve

    Figure 1

     

    Figure 2

     

    Figure 3

  • Hi Steve,

     

    I have a couple of questions -

     

    In which mode have you configured the SPI (interrupt or DMA)?

     

     
    Steve Kreyer said:
    However, for a strange reason Stream_write interrupts its execution after the first 128 bytes. I would have expected that Stream_write lasts as long as the 1024 bytes have to be transfered (1024 is specified by the passed bufLen to Stream_write).
     

    Looking at the graphs, the brown signal, which represents the clock, is itself not continuous; there are gaps in between. Is there any particular reason for this? The same applies to the data, since the SPI is in slave mode.

     

    Also, since you are programming Stream_write to write 1024 bytes, it should write all 1024 bytes without any gaps; yet, as seen in the graph, it writes in 128-byte blocks. How do you know that each block is 128 bytes?

     

    Also, can you please check again and tell us the time gap between the start of the clock and the start of the first data transmission?

     

    Best Regards,

    Raghavendra

  • Hi Raghavendra,

     

    thanks for your answer.

     

    Regarding your questions: I've configured the SPI in DMA mode. Normally a packet transfer initiated by the master is 1024 bytes in size. The gap you observed is caused by a delay in the SPI master: after each DMA transfer (128 bytes) within the master, the SPI pauses for some microseconds. For some reason this delay is neither configurable nor avoidable. Because of this pause between the 128-byte blocks, together with the operating frequency of 10 MHz, a transfer of one packet (1024 bytes) takes about 3 ms. The gap between the start of the first clock period and the first transmission of data is around ~300-400 µs.
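    A back-of-the-envelope check shows these numbers are roughly self-consistent (assuming the ~3.2 ms total from the first post and 7 gaps between the 8 DMA blocks):

    ```c
    #include <stdio.h>

    /* Sanity check: 1024 bytes at 10 MHz, sent as eight 128-byte DMA
     * blocks with a fixed pause between consecutive blocks. */

    /* microseconds of pure SPI clocking for a given byte count */
    static double clocking_us(int bytes, double spi_hz)
    {
        return bytes * 8 / spi_hz * 1e6;
    }

    /* average inter-block pause, given total time, clocking time, gap count */
    static double pause_us(double total_us, double clk_us, int gaps)
    {
        return (total_us - clk_us) / gaps;
    }

    int main(void)
    {
        double clk = clocking_us(1024, 10e6);   /* 819.2 us of pure clocking */
        double gap = pause_us(3200.0, clk, 7);  /* 7 gaps between 8 blocks   */

        printf("clocking: %.1f us, per-gap pause: %.1f us\n", clk, gap);
        return 0;
    }
    ```

    At 10 MHz the pure clocking accounts for only ~820 µs of the 3.2 ms, so the remaining ~2.4 ms (roughly 340 µs per gap) sits in the master's inter-block pauses, which matches the observed 300-400 µs gaps.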

     

    What I've done so far: I've observed that it takes around 300-400 µs from the Stream_write invocation to the point where the first data is actually transmitted on the bus. Since our clock was being driven before that point, the whole transfer got messed up (because Stream_write takes so long to prepare everything). So I now give the OMAP around 1 ms between the interrupt (which posts the semaphore) and the start of the SPI clock driven by the master, so that Stream_write (i.e. the SPI driver?) has enough time for preparation.

    However, 300 µs of preparation seems pretty long to me. Can you confirm my observation of Stream_write's behaviour? Or do you have any other idea of what could be going wrong here?

     

    Thanks for your help.

     

    Kind Regards,

    Steve

  • Hmm ok..

    Steve,

    I shall check that (the Stream_write delay) and get back to you soon.

     

    Best Regards,

    Raghavendra

  • Hi Steve,

     

    I conducted a few tests to find out the time taken by the Spi_submit() call until it enables the SPI. When Stream_write() is called, Spi_submit() is what gets executed in the driver. I initiated the performance capture at the beginning of the Spi_submit() call in the driver and ended the capture inside Spi_localEdmaTransfer(), just before the SPI is enabled for the transaction. The absolute time was ~85 µs.

     

    Once the SPI is enabled inside Spi_submit(), the EDMA starts transmitting the data. So the delay introduced by the driver is about 85 µs. Let me check if there are any possibilities for optimizing the code.
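    The same kind of region measurement can be sketched generically outside the driver. A hedged example using POSIX clock_gettime as a stand-in for the profiling hooks used inside the PSP (the busy loop is just a placeholder workload):

    ```c
    #include <stdio.h>
    #include <time.h>

    /* elapsed microseconds between two monotonic timestamps */
    static double elapsed_us(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) / 1e3;
    }

    int main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);

        /* region under test -- e.g. the submit path of a driver call */
        for (volatile int i = 0; i < 100000; i++)
            ;

        clock_gettime(CLOCK_MONOTONIC, &end);

        printf("region took %.1f us\n", elapsed_us(start, end));
        return 0;
    }
    ```

    Bracketing the suspect region at its entry and just before the hardware enable, as done in the driver, isolates pure software preparation time from the bus transfer itself.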

     

     

    Best Regards,

    Raghavendra

  • Hi Steve, Raghavendra

    We are seeing similar issues with the GIO_submit() functionality on the OMAPL138, using the PSP with SPI (5-pin mode)/DMA activation. Here we are running the OMAPL138 CPU at 300 MHz with the SPI configured for 24.576 MHz.

    Essentially we initialise the system and run in an infinite loop before any of our tasks are activated (i.e. the CPU is doing very little). As the OMAPL138 is the SPI master, we use a GPIO line from the FPGA to inform the DSP that the FPGA is ready to make an SPI transfer.

    Looking at the scope display we could see the FPGA interrupt being entered, and we expected an almost immediate de-assertion of the SPI_SCS line to activate the underlying DMA transfers. This was taking between 112 and 130 µs.

    We added a separate GPIO to verify what we were seeing:

    Initialisation:
        { GPIOdebug - LOW }

    FPGA_HWI_ISR (uses dedicated HWI_INT6):
        {
            GPIOdebug - HIGH
            GIO_submit()
            GPIOdebug - LOW
        }

    From the scope readings:

    1: Time from FPGA interrupt assertion to GPIOdebug going HIGH = 1.3 µs

    2: Time for the GPIOdebug HIGH-to-LOW transition surrounding GIO_submit() > 120 µs

    Bearing in mind this reading is taken from within an HWI (which has disabled all other HWIs), how can it take over 100 µs to start an SPI/DMA transfer in what I would expect to be a very optimally coded critical function?
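    For scale, that latency translates into CPU cycles as follows (a quick check assuming the 300 MHz core clock mentioned above):

    ```c
    #include <stdio.h>

    /* CPU cycles consumed by a given latency at a given core clock */
    static long cycles(double latency_us, double cpu_hz)
    {
        return (long)(latency_us * 1e-6 * cpu_hz);
    }

    int main(void)
    {
        /* >120 us inside the HWI at 300 MHz */
        printf("%ld cycles\n", cycles(120.0, 300e6));
        return 0;
    }
    ```

    That is on the order of 36,000 cycles for a single submit call, which puts the "optimally coded critical function" question into perspective.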

    Secondly, if this forces us to use the automatic reload mechanisms within the DMA controller to effectively get an infinite transfer from one GIO_submit() call (if it works and stays synchronised), can we expect to find a similar show-stopper like this buried somewhere in the automatic reload mechanisms?

    BR

    Barry

  • Hi Barry,

     

    All the profiling was done based on PSP 02.10.01 with SYS/BIOS. The same may be applicable to PSP 01.30.01, since the profiling was done inside the driver.

     

    As mentioned in the previous post,

    From the point the packet is submitted (GIO_submit()) until the SPI transfer is enabled inside the driver, it takes about ~85 µs. After reviewing the driver, all the code under submit seems to be essential and cannot be removed (optimized).

     

    Is the GIO_submit you are using an asynchronous call or a synchronous call?

    If it is a synchronous call, it no doubt waits until the requested bytes are processed. In that case, the additional time delay might also depend on the number of bytes being requested.

     Thanks & regards,

    Raghavendra

  • Hi Raghavendra

    Thanks for the prompt reply.

    I am using the asynchronous mode with the DMA callback function operational.

    There is a possible improvement which I noticed:

    The PSP libraries we need have paths set in Preferences -> C/C++ Build -> C6000 Linker -> File search path. This allows us to specify debug/release paths for the respective debug/release builds.

    HOWEVER, we also have a linker command file which, I noticed, had debug-specific paths set for the libraries (our release build was picking up some of these debug libraries). When I consolidated all the paths into the *.TCF file and removed the debug-specific entries from the linker command file, the GIO_submit() performance in the release build improved greatly.

    Still not enough for our intended usage, so we've had to change our interprocessor comms strategy.

    BR

    Barry