EDMA3 Transmit problem

Hello,

I'm using a C6678 on our custom-manufactured board. I'm implementing data transfer between the C6678 and our host (Linux) via PCIe and EDMA3. Receiving data (host to DSP) works fine. Transmitting data from the DSP to the host causes problems.

Setup is as follows:
-Only 1 DMA channel is used (for both transmit and receive), with no parallel transfers
-Linked DMA is used: all descriptors (param sets) are prepared before the DMA is started, and the TCC is updated via linking after each DMA completes
-The end of each DMA is signaled via IRQ; the IRQ bit is polled until the DMA has finished
-Outbound mapping is used for address translation
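
The polling step in the list above can be sketched roughly as follows. This is only an illustration, not the code from the attachment: `wait_for_tcc` and its parameters are hypothetical names, and the two pointers stand in for the EDMA3 interrupt pending (IPR) and interrupt clear (ICR) registers of the shadow region in use. The poll limit is an assumption added so that a missing completion does not hang the loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: poll an interrupt-pending register until the bit for
 * the given TCC is set, then acknowledge it through the clear register.
 * Returns 0 on completion, -1 if max_polls reads passed without the bit. */
int wait_for_tcc(volatile const uint32_t *ipr, volatile uint32_t *icr,
                 unsigned tcc, uint32_t max_polls)
{
    assert(tcc < 32u);                      /* one 32-bit register covers TCC 0..31 */
    const uint32_t mask = UINT32_C(1) << tcc;

    for (uint32_t i = 0; i < max_polls; ++i) {
        if (*ipr & mask) {
            *icr = mask;                    /* write 1 to clear the pending bit */
            return 0;
        }
    }
    return -1;                              /* completion bit never appeared */
}
```

With TCC set to 1, as used later in this thread, the call would poll bit 1 of the pending register.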

I followed the examples in mcsdk_2_01_02_06\tools\boot_loader\examples\pcie\linux_host_loader\pciedemo.c and pdk_C6678_1_1_2_6\packages\ti\csl\example\edma\edma_test.c.

The DMA is done in a simple for-loop over the number of param sets that have been prepared (see code in attachment). The problem is that sometimes the last packet fails. Debugging with a PCIe analyzer (see picture in attachment) shows that the outbound mapping (OB_INDEX at 0x21800200) has not been used for the last packet.

I assume that outbound mapping is disabled in CMD_STATUS too early. But this surprises me, as the IRQ is set up for normal mode -> the IRQ is triggered when the data has been transferred! So there should be no race condition or anything similar.

Please have a look at this. Maybe the DMA completion is handled incorrectly?

Best regards,
Bernd

PS: if a printf is added before CMD_STATUS is set back (outbound mapping disabled), the error does not occur. But this is not a satisfactory solution...

Picture and code snippet.zip
  • Hello,

    the second problem is as follows: the outbound regions are set to 1MB. If the PCIe target address reaches a 1MB boundary, the last packet before the boundary gets an incorrect outbound mapping (see attachment).

    I'm not sure why this happens either. Maybe someone has an idea how this can be avoided?
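
One way to keep a single param set from straddling a 1MB outbound region is to split the transfer at the boundary before the param sets are built. The following is a minimal sketch of that arithmetic only. The helper names are hypothetical, the 1MB region size is taken from the post above, and nothing in this thread confirms that splitting avoids the error.

```c
#include <assert.h>
#include <stdint.h>

#define OB_REGION_SIZE UINT32_C(0x100000)   /* 1 MB, as stated in the post */

/* Bytes left in the 1 MB outbound region that contains addr. */
uint32_t bytes_to_region_end(uint32_t addr)
{
    return OB_REGION_SIZE - (addr & (OB_REGION_SIZE - 1u));
}

/* Nonzero if the range [addr, addr + len) spans more than one region. */
int crosses_region(uint32_t addr, uint32_t len)
{
    assert(len > 0u);
    return len > bytes_to_region_end(addr);
}

/* Length of the first chunk if the transfer is split at the region boundary.
 * The remainder (len minus this value) would go into a further param set. */
uint32_t first_chunk_len(uint32_t addr, uint32_t len)
{
    uint32_t room = bytes_to_region_end(addr);
    return len < room ? len : room;
}
```

For example, a transfer of 8200 bytes starting 4KB below a boundary would be split into a 4096-byte chunk and a 4104-byte chunk, each with its own outbound mapping.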

    Best Regards,
    Bernd

  • Bernd,

    It looks like the OB address is 0xBBE5xxxx, but the last address is 0x6005xxxx, which shows that the OB translation is disabled before the last transfer completes. This is a PCIe outbound write using EDMA (from DSP to RC). How do you set up the OPT field of the EDMA channel: bit 21, bit 20 and bit 11 (see http://www.ti.com/lit/ug/sprugs5a/sprugs5a.pdf, table 2-3)?

    Regards, Eric

       

  • Hello Eric,

    thanks for your quick answer. The OPT field is set up as 0x81101000 (the last param set has the value 0x81101008).

    -PRIV & PRIVID are set
    -only TCINTEN is set (I just want an IRQ when the whole DMA has finished and all data is transmitted)
    -TCCMODE is set to 0 (normal completion after the data has been transferred)
    -TCC is set to 1 for the completion code
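
The two OPT values can be checked by decoding them field by field. The sketch below assumes the OPT bit layout as I read it from the EDMA3 user guide linked above (PRIV bit 31, PRIVID bits 27:24, ITCCHEN 23, TCCHEN 22, ITCINTEN 21, TCINTEN 20, TCC bits 17:12, TCCMODE 11, STATIC 3); the struct and function names are hypothetical and not part of the CSL.

```c
#include <stdint.h>

/* Hypothetical container for the OPT fields discussed in this thread. */
typedef struct {
    unsigned priv;      /* bit 31    */
    unsigned privid;    /* bits 27:24 */
    unsigned itcchen;   /* bit 23    */
    unsigned tcchen;    /* bit 22    */
    unsigned itcinten;  /* bit 21    */
    unsigned tcinten;   /* bit 20    */
    unsigned tcc;       /* bits 17:12 */
    unsigned tccmode;   /* bit 11    */
    unsigned is_static; /* bit 3     */
} OptFields;

/* Split a raw 32-bit OPT word into its fields. */
OptFields decode_opt(uint32_t opt)
{
    OptFields f;
    f.priv      = (opt >> 31) & 0x1u;
    f.privid    = (opt >> 24) & 0xFu;
    f.itcchen   = (opt >> 23) & 0x1u;
    f.tcchen    = (opt >> 22) & 0x1u;
    f.itcinten  = (opt >> 21) & 0x1u;
    f.tcinten   = (opt >> 20) & 0x1u;
    f.tcc       = (opt >> 12) & 0x3Fu;
    f.tccmode   = (opt >> 11) & 0x1u;
    f.is_static = (opt >> 3)  & 0x1u;
    return f;
}
```

Under that layout, 0x81101000 decodes to PRIV=1, PRIVID=1, TCINTEN=1, ITCINTEN=0, TCC=1, TCCMODE=0, and 0x81101008 differs only in the STATIC bit.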

    Are there any mistakes in this setup?

    Best Regards,
    Bernd

  • The setting of the EDMA OPT field looks OK. I didn't see in the code how you trigger the EDMA transfer. What if the sequence in your for loop is re-ordered as below?

    for () {

    1. set ESR bit to trigger the EDMA transfer for the current parameter set

    2. activateChannelAndWaitForCompletion() to make sure transfer is complete (the next parameter set is automatically loaded via linking)

    3. pParamSetup = (CSL_Edma3ParamSetup*) EDMA3_TPCC1_PARAM_SET_ADDRESS(0);

    4. setOutboundRegs(pParamSetup->dstAddr);

    5. pParamSetup->dstAddr = extPCIeAddress(pParamSetup->dstAddr);

    } //end of for loop

    6. disable the OB translation in CMD_STATUS

    Also, would adding some delay cycles between steps 4 and 5 help? If yes, is this acceptable?

    Regards, Eric

  • Hello Eric,

    the EDMA is activated with this line in activateChannelAndWaitForCompletion():
    CSL_edma3HwChannelControl(ChannelHandle, CSL_EDMA3_CMD_CHANNEL_SET, NULL);

    Then ESR is set. The re-ordering is not appropriate as the correction of the address mapping has to be done before EDMA is started -> therefore, getting pParamSetup / setOutboundRegs / extPCIeAddress has to be done before activateChannelAndWaitForCompletion().

    I see that delaying the disabling of the OB translation works (e.g. with a printf before step 6; for whatever reason, using platform_delay has not worked -> are there other delay functions to test?). But I don't understand the reason for this, as the DMA-finish IRQ is set up to be triggered AFTER the data transfer has completed.

    Best Regards,
    Bernd

  • Hello,

    after some further tests, it seems that the problem with the IRQ arriving at the wrong time may be a consequence of the 1MB problem described in my post of Apr 08 2014 06:43 AM.

    So I'm now focusing on solving that problem, as it is 100% reproducible (the other problems often occur only after the 1MB problem has happened). The setup is as described in the post above:
    -transfers of up to 8192 bytes work fine
    -when trying to transfer 8200 bytes and reaching a 1MB address boundary, the EDMA3 controller gets an incorrect outbound mapping (always for the last packet of one 1KB, see picture)
    -the page size of our system is 4KB, so the difference is that this failure happens only for chained EDMA with more than two param sets
    -strangely, this happens in the DMA described by the first param set (which works perfectly in all other tests)

    Once the EDMA is running, I have no chance to take care of the mapping on the DSP side. The EDMA is started and is then handled by the controller. I have no idea why the mapping is wrong for only one packet.

    Best Regards,
    Bernd

  • Hello,

    it seems that there are problems with our Linux driver and the alignment of the chained param sets prepared by the driver. This alignment seems to get the EDMA3 controller into trouble.

    I'll have a closer look at this first.

    Best Regards,
    Bernd

  • Bernd,

    Thanks for letting me know. If you need to insert some delay in DSP cycles:

    extern cregister volatile unsigned int TSCL;

    /*
     *  The Time Stamp Counter is a free-running 64-bit CPU counter that advances each CPU
     *  clock after counting is enabled. The counter is accessed using two 32-bit
     *  read-only control registers, Time Stamp Counter Register Low (TSCL) and
     *  Time Stamp Counter Register High (TSCH). The counter is enabled by writing to
     *  TSCL. The value written is ignored. Once enabled, counting cannot be disabled under
     *  program control.
     */

    TSCL = 0; /* run once during initialization: writing any value enables the counter */

    Void CycleDelay (int32_t count)
    {
        UInt32                  TSCLin;

        if (count <= 0)
            return;

        /* Get the current TSCL  */
        TSCLin = TSCL ;

        /* Busy-wait; the unsigned difference stays correct if TSCL wraps once */
        while ((TSCL - TSCLin) < (UInt32)count);
    }

    Then you can call the CycleDelay() function.

    Regards, Eric

  • Hello Eric,

    I've changed our DMA transfers to 64-bit transfers to avoid alignment problems. Nevertheless, I still have trouble with the mapping. I can see and reproduce two errors:

    1) For a non-chained DMA transfer (only one param set), the completion IRQ sometimes seems to come too early, so that the outbound mapping is a mix of the MSI address and the DMA address (see picture 1).

    Delays are not an option because the code runs sequentially -> the completion IRQ has to be a valid signal for the end of the DMA.

    2) The same error in another form: if another page is used for the second descriptor, the outbound mapping is a mix of the old and the new DMA address (see picture 2).

    I'm completely baffled how this can happen, as the code always sets the outbound mapping correctly and sequentially for each transfer. Therefore I suspect some interference by the DMA controller which may not be under my control.

    I appreciate any idea for solving these problems.

    Best Regards,
    Bernd

    DMA errors.pdf
  • Hello,

    with further tests I've found out that the incorrect outbound mapping depends on how the EDMA controller splits the data into packets.

    With the PCIe analyzer it can be seen that the data is sent in packets of 16x 32-bit words each. If the data length is chosen so that every packet is filled completely, no error occurs. With some left-over data (e.g. the last packet has only 2x 32-bit words), the error is almost always reproducible.
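
The packet arithmetic behind this observation can be sketched as follows. The 64-byte packet size comes from the analyzer trace described above (16 x 32-bit words); the helper names are hypothetical. Padding the length up to a full packet is shown only as an assumption about a possible workaround, and it requires that the host buffer has room for the extra bytes.

```c
#include <stdint.h>

#define PCIE_PACKET_BYTES 64u   /* 16 x 32-bit words, as seen on the analyzer */

/* Number of completely filled packets for a transfer of len bytes. */
uint32_t full_packets(uint32_t len)
{
    return len / PCIE_PACKET_BYTES;
}

/* Bytes in the final, partially filled packet (0 if every packet is full). */
uint32_t tail_bytes(uint32_t len)
{
    return len % PCIE_PACKET_BYTES;
}

/* Length rounded up to the next multiple of the packet size. */
uint32_t pad_to_packet(uint32_t len)
{
    return (len + PCIE_PACKET_BYTES - 1u) / PCIE_PACKET_BYTES * PCIE_PACKET_BYTES;
}
```

This matches the sizes mentioned earlier in the thread: 8192 bytes fill exactly 128 packets, while 8200 bytes leave 8 bytes, i.e. 2x 32-bit words, in the last packet.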

    The second error from the post above still remains: changing the mapping because of another page results in a mixed outbound mapping.

    I suspect that the cause of these problems is the finish IRQ generated by the TCC. But I'm not sure at all.

    Best Regards,
    Bernd

  • Hello,

    using some delay cycles between the individual DMAs of a chained DMA solved both problems. The next DMA is delayed, and with it the switching of the outbound mapping.

    The MSI IRQ that signals to our driver that the whole DMA has finished is delayed as well.

    Nevertheless, maybe somebody has an idea why the finish IRQ of the TCC seems to be raised too early.

    Best Regards,
    Bernd