TMS320C6678: evm6678le PCIe outbound read issue!

Part Number: TMS320C6678
Other Parts Discussed in Thread: SYSBIOS

Good day!

I have evm6678le and Xilinx vc707 boards. My CCS code writes some data from the evm6678 to the vc707 over the PCIe interface and then reads it back from the vc707 for verification. After the write I can see the correct data in vc707 memory, but the data read back on the evm6678 is incorrect. Could someone check the code? Am I performing the read correctly?

unsigned int* pciebase;

pciebase = (unsigned int*) 0x60000000;

unsigned int srcbuf[N];

for ...

   srcbuf[i] = *( (volatile uint32_t  *) pciebase   +  i);
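
The write-then-verify flow described above can be sketched as a small helper. This is a minimal sketch, not the code from this project: `pcie_write_and_verify` and its parameters are hypothetical names, and `window` is assumed to point at an already mapped outbound window (0x60000000 in this post). It returns the index of the first word that does not read back as written, or -1 if every word matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write n words through the window, then read them
 * back and compare. Returns the first mismatching index, or -1 if all match. */
long pcie_write_and_verify(volatile uint32_t *window,
                           const uint32_t *src, size_t n)
{
    size_t i;

    assert(window != NULL && src != NULL);

    for (i = 0; i < n; i++) {
        window[i] = src[i];          /* outbound write */
    }
    for (i = 0; i < n; i++) {
        if (window[i] != src[i]) {   /* outbound read and compare */
            return (long)i;
        }
    }
    return -1;
}
```

On the DSP the call would look like `pcie_write_and_verify((volatile uint32_t *)0x60000000, data, N)`; on a host it can be exercised against an ordinary array.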

  • Hi,

    I've notified the PCI experts. Feedback will be posted directly here.

    NOTE that responses may be delayed, because of the Christmas holidays.

    Best Regards,
    Yordan
  • Yordan, thank you for the support!
  • Hi,

    I don't understand the issue. If you have a PCIe connection established between the C6678 and the FPGA and you write something like *(unsigned int*) 0x6000_0000 = 0x12345678 on the C6678 side, for example, then you can see the pattern 0x12345678 land correctly in a mapped memory location on the FPGA side, right? If so, what do you see at 0x6000_0000 in the CCS memory window? And when you read it back with value = *(unsigned int*)0x6000_0000, what ends up in "value"?

    Did you enable the cache on the DSP side? If so, did you invalidate the cache before the read?

    Regards, Eric
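
Whether reads from the PCIe data window can be served from cache depends on the MAR (memory attribute register) that covers that address range. The lookup can be sketched as below. This is a minimal sketch that assumes the C66x CorePac layout of MAR0 at 0x01848000 with one register per 16 MB; the helper names are hypothetical, and the constants should be checked against the CorePac user guide.

```c
#include <assert.h>
#include <stdint.h>

#define MAR_BASE_ADDR   0x01848000u  /* assumed address of MAR0 */
#define MAR_RANGE_SHIFT 24           /* each MAR covers 16 MB = 2^24 bytes */

/* Index of the MAR register that covers a given 32-bit address. */
unsigned mar_index(uint32_t addr)
{
    return (unsigned)(addr >> MAR_RANGE_SHIFT);
}

/* Address of MARn, assuming consecutive 32-bit registers from MAR0. */
uint32_t mar_reg_addr(unsigned n)
{
    assert(n < 256u);
    return MAR_BASE_ADDR + 4u * n;
}
```

Under these assumptions the window from 0x6000_0000 to 0x6FFF_FFFF is covered by MAR96 through MAR111, so those are the registers to inspect in the CCS memory window.
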
  • Hi, Eric!

    Cache is enabled in my project, but srcbuf is not cached. It's a simple variable. Please check my code. The read from the FPGA is at the end of the file.

    #include <xdc/std.h>
    #include <xdc/runtime/System.h>
    #include <ti/sysbios/BIOS.h>
    #include <ti/sysbios/knl/Task.h>
    #include <stdio.h>
    #include <ti/csl/csl_bootcfgAux.h>
    #include <ti/csl/csl_cacheAux.h>
    #include <ti/csl/csl_chip.h>
    #include <ti/csl/csl_xmcAux.h>
    #include <ti/csl/csl_tsc.h>
    #include "pcie_edma_receive.h"
    #pragma DATA_SECTION(dstBuf, ".dstBufSec")
    /* Cache coherence: Align must be a multiple of cache line size (L2=128 bytes, L1=64 bytes) to operate with cache enabled. */
    /* Aligning to 256 bytes because the PCIe inbound offset register masks the last 8bits of the buffer address  */
    #pragma DATA_ALIGN(dstBuf, 256)
    /* last element in the buffer is a marker that indicates the buffer status: full/empty */
    #define PCIE_EXAMPLE_MAX_CACHE_LINE_SIZE 128
    #define PCIE_EXAMPLE_UINT32_SIZE           4 /* preprocessor #if requires a real constant, not a sizeof() */
    #define PCIE_EXAMPLE_DSTBUF_BYTES ((PCIE_BUFSIZE_APP + 1) * PCIE_EXAMPLE_UINT32_SIZE)
    #define PCIE_EXAMPLE_DSTBUF_REM (PCIE_EXAMPLE_DSTBUF_BYTES % PCIE_EXAMPLE_MAX_CACHE_LINE_SIZE)
    #define PCIE_EXAMPLE_DSTBUF_PAD (PCIE_EXAMPLE_DSTBUF_REM ? (PCIE_EXAMPLE_MAX_CACHE_LINE_SIZE - PCIE_EXAMPLE_DSTBUF_REM) : 0)
    struct dstBuf_s {
      volatile uint32_t buf[PCIE_BUFSIZE_APP + 1];
      /* Cache coherence: Must pad to cache line size in order to enable cacheability */
    #if PCIE_EXAMPLE_DSTBUF_PAD
      uint8_t padding[PCIE_EXAMPLE_DSTBUF_PAD];
    #endif
    } dstBuf;
    #define PCIE_EXAMPLE_BUF_EMPTY 0
    #define PCIE_EXAMPLE_BUF_FULL  1
    /*
     *  ======== taskFxn ========
     */
    Void taskFxn(UArg a0, UArg a1)
    {
        System_printf("enter taskFxn()\n");
        Task_sleep(10);
        System_printf("exit taskFxn()\n");
    }
    /*
     *  ======== main ========
     */
    #pragma DATA_ALIGN(pcie_dest_buf, 1024)
    unsigned int pcie_dest_buf[128];
    Int main()
    {
         /*
          * use ROV->SysMin to view the characters in the circular buffer
          */
         //System_printf("enter main()\n");
         int i;
    #define pcie_init_on
    #ifdef pcie_init_on
         // PCIESS
      unsigned int* pl_link_ctrl;
      unsigned int* link_ctrl2;
      unsigned int* pl_gen2;
      unsigned int* cmd_status;
      pl_link_ctrl = 0x21801710;
      link_ctrl2 = 0x218010a0;
      pl_gen2 = 0x2180180c;
      cmd_status = 0x21800004;
      // BAR0..5
      unsigned int* bar0_reg;
      unsigned int* bar1_reg;
      unsigned int* bar2_reg;
      unsigned int* bar3_reg;
      unsigned int* bar4_reg;
      unsigned int* bar5_reg;
      bar0_reg = 0x21801010;
      bar1_reg = 0x21801014;
      bar2_reg = 0x21801018;
      bar3_reg = 0x2180101c;
      bar4_reg = 0x21801020;
      bar5_reg = 0x21801024;
     
      // cmd_status, dbi_cs2 = 1
      *cmd_status  =  (*cmd_status)  | 0x20;
      for(i=0; i<100; i++) asm (" NOP"); // pause
      *bar1_reg = 0x00000fff;
      *bar2_reg = 0x00000fff;
      *bar3_reg = 0x00000fff;
      *bar4_reg = 0x00000fff;
      *bar5_reg = 0x00000fff;
      for(i=0; i<100; i++) asm (" NOP");
      // cmd_status, dbi_cs2 = 0
      *cmd_status  =  (*cmd_status)  & 0xffffffdf;
      for(i=0; i<100; i++) asm (" NOP");
       
      *bar1_reg = (*bar0_reg); 
      *bar0_reg = 0;
      *bar2_reg = 0;
      *bar3_reg = 0;
      *bar4_reg = 0;
      *bar5_reg = 0;
     
      for(i=0; i<100; i++) asm (" NOP"); 
       
      unsigned int* temp_reg;
      unsigned int* bar1_ena;
      unsigned int* bar1_addr;
      unsigned int* inbound_region0;
      bar1_ena  = 0x21800310;
      *bar1_ena = 0x1;
      bar1_addr  = 0x21800314;
      *bar1_addr = (*bar1_reg);
    #endif
      inbound_region0  = 0x2180031c;
      *inbound_region0 = ((unsigned int) (0x1<<28))  | ((unsigned int) &dstBuf);
      CSL_tscEnable();
    unsigned long long int t1, t2;
    t1 = CSL_tscRead();
    t2 = CSL_tscRead();
    System_printf("Time of calling TSC_read func = [%u] cycles\n", (unsigned int)(t2-t1)); /* %u expects a 32-bit value */
    System_flush();
    t1 = CSL_tscRead();
    // EP waits for the data received
    do {
    unsigned int key;
    ///// if (dstBuf.buf[0] == 0x1) {t1 = CSL_tscRead();}
    // Disable Interrupts //
    key = _disable_interrupts();
    //  Cleanup the prefetch buffer also. //
    CSL_XMC_invalidatePrefetchBuffer();
    CACHE_invL1d ((void *)dstBuf.buf,  PCIE_EXAMPLE_DSTBUF_BYTES, CACHE_FENCE_WAIT);
    CACHE_invL2  ((void *)dstBuf.buf,  PCIE_EXAMPLE_DSTBUF_BYTES, CACHE_FENCE_WAIT);
    // Reenable Interrupts. //
    _restore_interrupts(key);
    } while(dstBuf.buf[PCIE_BUFSIZE_APP] != PCIE_EXAMPLE_BUF_FULL);
    t2 = CSL_tscRead();
    System_printf("Time of calling TSC_read func = [%u] cycles\n", (unsigned int)(t2-t1)); /* truncated to 32 bits for %u */
    System_flush();
    void             *pcieBase;
    unsigned int* outbound_lo;
    unsigned int* outbound_hi;
    unsigned int* outbound_size;
    //unsigned int* busnum;
    unsigned int* cfgsetup;
    cfgsetup = 0x21800008;
    *cfgsetup = ((*cfgsetup) & 0xff00e0f8) | 0xff03ffff;
    unsigned int* cmdstatus;
    cmdstatus = 0x21800004;
    *cmdstatus = ((*cmdstatus) | 0x2);
    // busnum = 0x21801018;
    // *busnum = ((*busnum) & 0xff00ff00) | 0xff03ff01;
    outbound_lo  = 0x21800200;
    outbound_hi  = 0x21800204;
    *outbound_lo = 0xf7d00001;
    *outbound_hi = 0x0;
    outbound_size = 0x21800030;
    *outbound_size = 0x0;
    //unsigned int* temp11;
    //temp11 = 0x60000000;
    //unsigned int* temp12;
    //temp12 = 0x60000004;
    pcieBase = 0x60000000;
    for (i=0; i<PCIE_BUFSIZE_APP; i++)
    {
       *((volatile uint32_t *)pcieBase + i) = dstBuf.buf[i];
    }
    for(i=0; i<100; i++) asm (" NOP"); // pause
    unsigned int srcbuf[PCIE_BUFSIZE_APP];
    for (i=0; i<PCIE_BUFSIZE_APP; i++)
    {
      srcbuf[i] = *((volatile uint32_t *)pcieBase + i);
    }
    //srcbuf[0] = *temp11;
    //srcbuf[1] = *temp12;
    for(;;);
         BIOS_start();    /* does not return */
         return(0);
    }
  • Hi,

    Your code looks right.

    for (i=0; i<PCIE_BUFSIZE_APP; i++)
    {
    *((volatile uint32_t *)pcieBase + i) = dstBuf.buf[i];
    }

    for(i=0; i<100; i++) asm (" NOP"); // pause

    unsigned int srcbuf[PCIE_BUFSIZE_APP];
    for (i=0; i<PCIE_BUFSIZE_APP; i++)
    {
    srcbuf[i] = *((volatile uint32_t *)pcieBase + i);
    }

    What is the data in dstBuf.buf[i] when you write to the FPGA? Can you see the pattern at 0x6000_0000 in DSP memory via the CCS memory browser? If so, your DSP code is just reading the same thing at 0x6000_0000 into srcbuf[i], which is NOT cached, so why didn't it work? What do you get?

    Regards, Eric
  • dstbuf simply contains a count from 0 to N. In the CCS memory browser I can't see the data at 0x6000_0000, only zeros. In srcbuf, the even-indexed elements contain addresses instead of data: srcbuf[0] holds 0x60000000 and srcbuf[2] holds 0x60000008. But I can see the data in FPGA memory using the PCITree program.
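
The symptom reported above, where an element holds the address it was read from rather than the data, can be checked for mechanically. The following is a minimal sketch of such a check; `count_address_echoes` is a hypothetical diagnostic, not part of any TI library, and it assumes the buffer was filled by 32-bit reads starting at `base`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical diagnostic: count elements whose value equals the bus
 * address they were read from (base + 4 * index). */
size_t count_address_echoes(const uint32_t *buf, size_t n, uint32_t base)
{
    size_t i, hits = 0;

    assert(buf != NULL);
    for (i = 0; i < n; i++) {
        if (buf[i] == base + (uint32_t)(4u * i)) {
            hits++;
        }
    }
    return hits;
}
```

A nonzero count for `count_address_echoes(srcbuf, PCIE_BUFSIZE_APP, 0x60000000)` would confirm that the read completions carry addresses rather than the stored data, which points at the device answering the read rather than at the DSP-side loop.
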
  • Hi,

    A simple debug step is to manually edit 0x6000_0000 and 0x6000_0004 to some value in the CCS memory window. Can you then confirm that the corresponding changes show up in FPGA memory? If so, the PCIe mapping/translation is working. If not, I really don't understand how you can see the FPGA memory via PCITree.

    Regards, Eric

  • I guess this means that for some reason I can write to 0x6000_0000 but not read from it. I can't explain it either.

  • Hi,

    0x2180_0200 and 0x2180_0204 configure your first outbound translation region, which starts at 0x6000_0000. If you can see the pattern written by the DSP on the FPGA side, then you should see the same pattern at 0x6000_0000. However, if you can't see the pattern there, I don't even understand how the FPGA can see it. How do you verify the data on the FPGA side? Do you have a JTAG debugger, or is the PCITree software used to access the FPGA?

    Regards, Eric
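
The mapping described in the reply above, from a DSP address in the first outbound region to the address that appears on the PCIe bus, can be sketched as plain arithmetic. This is a minimal sketch under assumptions taken from the posted code and the usual reading of the PCIe user guide: OB_SIZE values 0 to 3 select 1, 2, 4, or 8 MB regions, the upper bits of OB_OFFSET_INDEX0 supply the region base, and bit 0 is the enable flag. `ob_translate_region0` is a hypothetical name, and OB_OFFSET0_HI is taken as 0 as in the post.

```c
#include <assert.h>
#include <stdint.h>

#define PCIE_DATA_BASE 0x60000000u  /* start of the PCIe data window */

/* Low 32 bits of the PCIe address for a DSP address in outbound region 0.
 * ob_size: 0 = 1 MB, 1 = 2 MB, 2 = 4 MB, 3 = 8 MB per region (assumed). */
uint32_t ob_translate_region0(uint32_t dsp_addr,
                              uint32_t ob_offset_index0,
                              unsigned ob_size)
{
    uint32_t region_mask;

    assert(ob_size <= 3u);
    assert(dsp_addr >= PCIE_DATA_BASE);

    region_mask = (1u << (20u + ob_size)) - 1u;  /* offset bits inside a region */
    assert((dsp_addr - PCIE_DATA_BASE) <= region_mask);  /* must fall in region 0 */

    /* Upper bits come from OB_OFFSET_INDEX0 (the enable bit is masked off);
     * lower bits pass through from the DSP address. */
    return (ob_offset_index0 & ~region_mask) | (dsp_addr & region_mask);
}
```

With the values from the posted code (OB_OFFSET_INDEX0 = 0xf7d00001, OB_SIZE = 0), a DSP access to 0x60000004 would target PCIe address 0xf7d00004, so that is the bus address the FPGA BAR has to claim for both writes and reads.
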
  • I can see it with PCITree. I tried different data arrays from dstbuf, and the FPGA memory is filled correctly.
  • Hi,

    You could probably save the 256 MB of memory from 0x6000_0000 to 0x6FFF_FFFF into a data file with CCS, then search it for the pattern you wrote into the FPGA. If you find it, we can look into how the outbound (OB) mapping got shifted away from the first OB region. If you can't find it, I really don't know how the PCIe write worked. Do you have any emulation access to the FPGA to make sure the data is really there?

    Regards, Eric
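
The search suggested in the reply above can be sketched as a word-wise scan over the saved dump once it has been loaded into memory. This is a minimal sketch: `find_word_pattern` is a hypothetical helper, and it assumes the dump has already been converted to an array of 32-bit words in the same byte order as the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the word index where pat first occurs in
 * dump, or -1 if it does not occur. */
long find_word_pattern(const uint32_t *dump, size_t dump_words,
                       const uint32_t *pat, size_t pat_words)
{
    size_t i;

    assert(dump != NULL && pat != NULL);
    if (pat_words == 0 || pat_words > dump_words) {
        return -1;
    }
    for (i = 0; i + pat_words <= dump_words; i++) {
        if (memcmp(&dump[i], pat, pat_words * sizeof(uint32_t)) == 0) {
            return (long)i;
        }
    }
    return -1;
}
```

Multiplying the returned index by 4 and adding 0x6000_0000 gives the DSP address where the pattern was found, which shows how far the data sits from the start of the first outbound region.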