DMA transfer over PCIe

Hi,

We are using C6657 DSPs connected to a Cavium Octeon processor over a PCIe bus. Our PCIe driver is based on the sample program from TI, which TI tested between an x86 host and the C6657; we built our driver on top of it, making the changes required for the Cavium Octeon in place of x86. All data transfer between the Cavium Octeon and the DSP works fine without DMA.

Now we want to use the DSP DMA for data transfer between the DSP and the Octeon. Can TI help us with how to do this? The sample code TI gave us at that time has one DMA transfer API, but it does not work as is and needs modifications. If TI suggests using the same API, we want to understand it, especially the address translation part. This API uses eDMA transfers.

Here is the PCIe configuration:

Cavium Octeon => root complex, master, 64-bit processor

DSP => C6657, endpoint mode, 32-bit

Following is the API:

/* ============================================================================
 * @func gauss_write_dma
 *
 * @desc Move DMAs contents from GPP memory to DSP Memory. For DSP this is
 * outbound read.
 * flag: 0: Move data inside DSP; 1: Move data between GPP and DSP
 *
 * @modif None.
 * ============================================================================
 */
void gauss_write_dma(gauss_device_t *gss_dev, uint32_t srcAddr, uint32_t dstAddr, uint32_t size, uint32_t flag)
{
    uint32_t *pReg, tmp, pageBase, i, tSize;

    pReg = (uint32_t *)gss_dev->regVirt; /* Point to PCIE application registers */

    /* Move data between GPP and DSP, need to program PCIE OB registers */
    if (flag) {
        iowrite32(0x0, pReg + OB_SIZE/4); /* 1MB outbound translation size */

        if (size <= PCIE_ADLEN_1MB) {
            pageBase = srcAddr & PCIE_1MB_BITMASK;
            iowrite32(pageBase|0x1, pReg + OB_OFFSET_INDEX(0)/4);
            iowrite32(0x0, pReg + OB_OFFSET_HI(0)/4);
        }
        else {
            for (tmp = size, i = 0; tmp > 0; tmp -= PCIE_ADLEN_1MB, i++) {
                pageBase = (srcAddr + (PCIE_ADLEN_1MB * i)) & PCIE_1MB_BITMASK;
                iowrite32(pageBase|0x1, pReg + OB_OFFSET_INDEX(i)/4);
                iowrite32(0x0, pReg + OB_OFFSET_HI(i)/4);
            }
        }
    }

    /* Temporarily re-map IB region 3 from DDR memory to EDMA registers */
    iowrite32(EDMA_TPCC0_BASE_ADDRESS, pReg + IB_OFFSET(3)/4);

    pReg = (uint32_t*)gss_dev->ddrVirt; /* Now it points to the start of EDMA_TPCC0_BASE_ADDRESS */

    while (true) {
        /* Use TC0 for DBS = 128 bytes */
        myIowrite32(0x0, pReg + DMAQNUM0/4);

        /* Set the interrupt enable for 1st Channel (IER). */
        myIowrite32(0x1, pReg + IESR/4);

        /* Clear any pending interrupt (IPR). */
        myIowrite32(0x1, pReg + ICR/4);

        /* Populate the Param entry. */
        myIowrite32(0x00100004, pReg + PARAM_0_OPT/4); /* Enable SYNCDIM and TCINTEN, TCC = 0 */

        if (flag == 1) {
            /* Calculate the DSP PCI address for the PC address */
            tmp = PCIE_DATA + (srcAddr & ~PCIE_1MB_BITMASK);
            myIowrite32(tmp, pReg + PARAM_0_SRC/4);
        } else {
            myIowrite32(srcAddr, pReg + PARAM_0_SRC/4);
        }

        /* Calculate the A & B count */
        if (size > PCIE_TRANSFER_SIZE) {
            tmp = size/PCIE_TRANSFER_SIZE;
            tSize = tmp*PCIE_TRANSFER_SIZE;
            size -= (tmp*PCIE_TRANSFER_SIZE);
            tmp <<= 16;
            tmp |= PCIE_TRANSFER_SIZE;
        }
        else {
            tmp = 0x10000|size;
            tSize = size;
            size = 0;
        }

        myIowrite32(tmp, pReg + PARAM_0_A_B_CNT/4);
        myIowrite32(dstAddr, pReg + PARAM_0_DST/4);

        myIowrite32(((PCIE_TRANSFER_SIZE<<16)|PCIE_TRANSFER_SIZE), pReg + PARAM_0_SRC_DST_BIDX/4);
        myIowrite32(0xFFFF, pReg + PARAM_0_LINK_BCNTRLD/4);
        myIowrite32(0x0, pReg + PARAM_0_SRC_DST_CIDX/4);

        /* C Count is set to 1 since mostly size will not be more than 1.75GB */
        myIowrite32(0x1, pReg + PARAM_0_CCNT/4);

        /* Set the Event Enable Set Register. */
        myIowrite32(0x1, pReg + EESR/4);

        /* Set the event set register. */
        myIowrite32(0x1, pReg + ESR/4);

        /* wait for current DMA to finish. */
        while (true) {
            /* check in steps of 10 usec. */
            udelay(10);
            tmp = myIoread32(pReg + IPR/4);
            if ((tmp & 0x1) == 1) {
                break;
            }
        }

        if (size != 0) {
            srcAddr += tSize;
            dstAddr += tSize;
        } else {
            break;
        }
    }

    /* Clear any pending interrupt. */
    myIowrite32(1, pReg + ICR/4);

    /* Restore pointer */
    pReg = (uint32_t *)gss_dev->regVirt; //Point to PCIE application registers
    iowrite32(DDR_START, pReg + IB_OFFSET(3)/4);
}
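
To make the address translation in the function above easier to follow, here is a minimal host-independent sketch of the arithmetic it performs. The macro values are assumptions modelled on the names used above (they are not shown in this post), the helper names are hypothetical, and the BCNT clamp is an addition of this sketch rather than something the original function does.

```c
#include <stdint.h>

/* Assumed values; the post does not show the real definitions. */
#define PCIE_DATA          0x60000000u  /* assumed DSP-side base of the PCIe data window */
#define PCIE_ADLEN_1MB     0x00100000u  /* 1 MB outbound window size */
#define PCIE_1MB_BITMASK   0xFFF00000u  /* keeps the 1 MB-aligned page base */
#define PCIE_TRANSFER_SIZE 0x80u        /* assumed ACNT: bytes per EDMA array */

/* Page base the host address falls in. The function above writes this value,
 * ORed with the enable bit 0x1, to OB_OFFSET_INDEX(n). */
static uint32_t ob_page_base(uint32_t host_addr)
{
    return host_addr & PCIE_1MB_BITMASK;
}

/* Address the EDMA uses on the DSP side to reach host_addr through window 0:
 * the data-window base plus the offset inside the 1 MB page. */
static uint32_t ob_dsp_addr(uint32_t host_addr)
{
    return PCIE_DATA + (host_addr & ~PCIE_1MB_BITMASK);
}

/* Number of 1 MB windows a buffer touches, counting the offset into the
 * first page. Computed in 64 bits so the sum cannot wrap. */
static uint32_t ob_windows_needed(uint32_t host_addr, uint32_t size)
{
    uint32_t offset = host_addr & ~PCIE_1MB_BITMASK;

    if (size == 0)
        return 0;
    return (uint32_t)(((uint64_t)offset + size + PCIE_ADLEN_1MB - 1) / PCIE_ADLEN_1MB);
}

/* A_B_CNT value for one pass of the transfer loop: BCNT in bits 31:16,
 * ACNT in bits 15:0. *moved receives the bytes covered by this pass. */
static uint32_t edma_a_b_cnt(uint32_t size, uint32_t *moved)
{
    if (size > PCIE_TRANSFER_SIZE) {
        uint32_t bcnt = size / PCIE_TRANSFER_SIZE;

        if (bcnt > 0xFFFFu)  /* clamp added in this sketch; BCNT is a 16-bit field */
            bcnt = 0xFFFFu;
        *moved = bcnt * PCIE_TRANSFER_SIZE;
        return (bcnt << 16) | PCIE_TRANSFER_SIZE;
    }
    *moved = size;
    return (1u << 16) | size;
}
```

With these assumed values, a host address of 0x12345678 gives a page base of 0x12300000 and a DSP-side address of 0x60045678. A 1 MB buffer starting at that address touches two windows, which the `size <= PCIE_ADLEN_1MB` branch above does not appear to cover, since it programs window 0 only.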

If TI has other sample PCIe DMA transfer code for a similar PCIe configuration, please share it.

Thanks

Kashif

  • Hi Kashif,

    The PCIe LLD does not have an EDMA example. The MCSDK has some EDMA sample code in "\ti\mcsdk_2_01_02_06\tools\boot_loader\examples\pcie\linux_host_loader\pciedemo.c". Please take a look at this example code.

    If you need more information about building and running this example, refer to the document "\ti\mcsdk_2_01_02_06\tools\boot_loader\examples\pcie\docs\Readme.pdf".

    Please take a look at the thread below:

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/318202.aspx

    Thanks,

  • I have the following update from my side:

    - I referred to the document tms3206657.pdf and found that the address of EDMA3CC is 0x02740000 (Table 2-2, page 23), not 0x02700000. I replaced the address and tried again, but neither the internal nor the PCIe DMA transfer succeeded.

    - Then I read back each register from the Cavium driver itself, immediately after writing to it. All the values matched.

    - Then I connected to the DSP over JTAG and read the memory at all those registers. Here is the problem: none of the register values matched what I wrote.

    So that means the address in pciedemo.c was not correctly mapped to 0x02740000. The following shows how 0x02740000 is mapped in pciedemo.c, in the function HAL_readDMA():

    #define EDMA_TPCC0_BASE_ADDRESS      0x02740000  /* Originally this was 0x02700000. Not sure why; maybe the pciedemo example was for the C6678. */
    #define DMAQNUM0                     0x0240
    #define ESR                          0x1010
    #define EESR                         0x1030
    #define IESR                         0x1060
    #define IPR                          0x1068
    #define ICR                          0x1070
    #define PARAM_0_OPT                  0x4000
    #define PARAM_0_SRC                  0x4004
    #define PARAM_0_A_B_CNT              0x4008
    #define PARAM_0_DST                  0x400C
    #define PARAM_0_SRC_DST_BIDX         0x4010
    #define PARAM_0_LINK_BCNTRLD         0x4014
    #define PARAM_0_SRC_DST_CIDX         0x4018
    #define PARAM_0_CCNT                 0x401C

    HAL_readDMA()
    {
        …………
        …………
        /* Temporarily re-map IB region 3 from DDR memory to EDMA registers */
        iowrite32(EDMA_TPCC0_BASE_ADDRESS, pReg + IB_OFFSET(3)/4);
        pReg = (uint32_t*)gss_dev->ddrVirt;   /* Now it points to the start of EDMA_TPCC0_BASE_ADDRESS */
        …………
        …………
    }

    Does this line of code really re-map ddrVirt to the EDMA register base? If yes, then all my DSP registers should have been updated. Can you confirm this from the pciedemo.c code? I will post the same update in the E2E community as well.

     

    Thanks

    Kashif 

  • Hi Kashif,

    The MCSDK PCIe Linux host loader EDMA-Interrupt boot example (the pciedemo example) supports the C6670 and C6678 platforms only. The C6657 platform supports only the ddrinit/helloworld/post demos.

    Please use the C6670/78 EDMA-Interrupt boot example code as a reference and develop your application.

    Yes, IB_OFFSET(3) was originally used for DDR; we temporarily changed it to point to EDMA_TPCC0.
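
The inbound remapping confirmed above can be modelled in a few lines. This is only a sketch of the translation as it is described in this thread: the `ib_region` struct and `ib_translate()` helper are hypothetical, and the BAR address and size in the usage note are made-up values, not taken from any real setup.

```c
#include <stdint.h>

#define EDMA_TPCC0_BASE_ADDRESS 0x02740000u /* value quoted earlier in this thread */
#define PARAM_0_SRC             0x4004u     /* register offset quoted earlier in this thread */

/* Hypothetical model of one inbound region: PCIe addresses in
 * [ib_start, ib_start + bar_size) are redirected to ib_offset on the DSP. */
struct ib_region {
    uint64_t ib_start;   /* PCIe bus address the host assigned to the BAR */
    uint32_t ib_offset;  /* DSP internal address written to IB_OFFSET(n) */
    uint32_t bar_size;   /* size of the BAR behind this region */
};

/* Returns 1 and stores the DSP address if pci_addr hits the region, else 0. */
static int ib_translate(const struct ib_region *r, uint64_t pci_addr, uint32_t *dsp_addr)
{
    if (pci_addr < r->ib_start || pci_addr - r->ib_start >= r->bar_size)
        return 0;
    *dsp_addr = r->ib_offset + (uint32_t)(pci_addr - r->ib_start);
    return 1;
}
```

For a hypothetical BAR at 0xF0000000 with IB_OFFSET(3) set to EDMA_TPCC0_BASE_ADDRESS, a host write at BAR offset PARAM_0_SRC should land at DSP address 0x02744004. If JTAG shows nothing there, things worth checking under this model are whether the region's start address still matches the BAR address the host assigned, whether the BAR is large enough to cover the highest offset used (0x401C here), and whether ddrVirt really maps that BAR.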

    Please take a look at below thread:

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/291330/1015977.aspx#1015977

    Thanks,

  • Hi Kashif,

    Any update? Please help us close this thread.

    Thanks,

  • Hi Padmaja,

    Here is an update after our meeting. The subsequent-transfer issue was resolved, and we were able to transfer data over PCIe using eDMA. Now I am facing another issue. Once the Cavium Octeon host gets an MSI interrupt, it transfers all the buffers from the DSP over PCIe eDMA, but after the first interrupt no further interrupts are received by the host, even though the DSP sends an interrupt every 5 ms. I confirmed the DSP interrupts by probing the interrupt pin.

    So it seems the HAL_readDMA() function disables the MSI interrupts (not sure why).

    In your pciedemo code I can see the following around the call to HAL_readDMA():

    static irqreturn_t ISR_handler(int irq, void *arg)
    {
        uint32_t status = HAL_CheckPciInterrupt();

        if (status == 1) {
            HAL_readDMA(DDR_START, rData, DMA_TRANSFER_SIZE, 1);     /* Move from DSP to GPP */
            HAL_PciClearDspInterrupt();
            return IRQ_HANDLED;
        }
        return IRQ_NONE;
    }

    It seems you didn't use MSI interrupts in your example code. I have two questions now:

    1) Can I use MSI interrupts with eDMA transfers?

    2) If yes, why do MSI interrupts appear to be disabled after the first eDMA transfer?

  • Hi Kashif,

    Ans1: Yes.

    Ans2: MSI interrupts are not disabled after the first EDMA transfer. In the pciedemo code the interrupt is triggered only once.

    If you want to use MSI interrupts continuously, you need to update the EOI bits in the IRQ_EOI register. (Example: if you are using MSI0, write 0x4 to the IRQ_EOI register /* end of MSI0, event number=4 */.) Refer to section 3.1.15, End of Interrupt Register (IRQ_EOI), in the PCIe user guide.

    Please refer to the following thread. Hope it helps.

    http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/161475.aspx#589428

    Thanks,

  • Ganapati,

    I wrote "0x4" to IRQ_EOI, but this didn't help at all. Please note that my configuration is different from the example you shared (though very similar). In your example it is the host that generates a periodic interrupt to the DSP, while in my setup the DSP generates an MSI interrupt to the host every 5 ms, and if I use an eDMA transfer (DSP -> host direction) the host does not receive any interrupt from the DSP after the first one.

    If I don't use eDMA but copy the data directly from the DSP (DSP -> host direction), the host continuously receives the periodic interrupts. It is very strange that an eDMA transfer would do something that stops MSI interrupts.

    I need help debugging this. First, how can I confirm whether the DSP has actually generated the second interrupt? Second, if it did generate the second interrupt, why did the host not receive it? I would like to get the first part confirmed with TI's help.

  • I will check these links, but I am looking for an answer to why these MSI interrupts work fine when I don't use eDMA. Why does the problem appear only with eDMA?

    Also, at this stage I don't know whether the MSI problem is on the DSP side or the host side. My objective here is to confirm whether the DSP is sending the interrupt to the host, and I am not getting pointed answers to that.

  • Ganapati,

    I just checked the link, and it says: "Write 0x1 to each bit of the MSI0_IRQ_STATUS will clear the status of the MSI interrupt. It should be done before you exit the ISR and make the MSI vector available for the next triggering."

    But in your example code pciedemo.c you cleared the interrupt with the following function:

    void HAL_PciClearDspInterrupt(gauss_device_t *gss_dev)
    {
        uint32_t *pReg = (uint32_t *)gss_dev->regVirt;
        iowrite32(1, pReg+EP_IRQ_CLR/4);
    }

  • Hi Kashif,

    In the pciedemo.c example code, the HAL_PciClearDspInterrupt(gauss_device_t *gss_dev) function writes to the Endpoint Interrupt Request Clear Register (EP_IRQ_CLR).

    You need to implement code to clear the MSI 0 Interrupt Enabled Status Register (MSI0_IRQ_STATUS).
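
The acknowledge sequence suggested in these replies (clear the pending bit in MSI0_IRQ_STATUS, then write the MSI0 event number to IRQ_EOI) can be sketched against a mock register bank. Everything here is an assumption for illustration: the enum values are array slots, not real register offsets, and `mock_write()` and `msi0_ack()` are hypothetical helpers. Whether these registers are involved at all when the DSP is the endpoint and the Octeon receives the MSI is not settled in this thread.

```c
#include <stdint.h>

/* Hypothetical mock of two PCIe application registers; the enum values are
 * array slots for this sketch, not real register offsets. */
enum { REG_MSI0_IRQ_STATUS, REG_IRQ_EOI, REG_COUNT };

#define MSI0_EOI_EVENT 0x4u /* event number for MSI0 quoted in the reply above */

static uint32_t regs[REG_COUNT];

/* Mock register write: the status register is modelled as write-1-to-clear,
 * the EOI register simply latches the last value written. */
static void mock_write(int reg, uint32_t val)
{
    if (reg == REG_MSI0_IRQ_STATUS)
        regs[reg] &= ~val;
    else
        regs[reg] = val;
}

/* Acknowledge MSI vector `bit` of the MSI0 group at the end of the ISR.
 * Returns 1 if the vector was pending, 0 otherwise. */
static int msi0_ack(unsigned bit)
{
    uint32_t mask = 1u << bit;

    if (!(regs[REG_MSI0_IRQ_STATUS] & mask))
        return 0;
    mock_write(REG_MSI0_IRQ_STATUS, mask);    /* clear the pending vector */
    mock_write(REG_IRQ_EOI, MSI0_EOI_EVENT);  /* signal end of interrupt for MSI0 */
    return 1;
}
```

In a real driver the two writes would go through iowrite32() to the mapped application registers at the offsets given in the PCIe user guide, in place of the mock array used here.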

    Thanks,

  • Ganapati,

    I did clear MSI0_IRQ_STATUS, but this doesn't help either. As I told you earlier, the same code works fine if I don't use eDMA. We have to understand what differs between using eDMA and not using it. Why does the same code work fine without eDMA?

    We need to have a call with TI to discuss the problem and speed this up.