EVM C6474 and SRIO CSL

Hi.

I am using the EVM C6474 by Spectrum Digital, and I am new to the C64x+ multicore architecture as well as SRIO.

I have been trying to use the SRIO example project in the latest C6474 CSL (i.e. "srio_evm_dio_example") to send simple 256k packets from one DSP to the other through Port 1 using the DirectIO method. However, the project seems to be hugely unstable. It runs successfully only about 2 out of every 5 tries (and I have done the load/run procedure many, many times). When it stalls, it usually does so while both processors are attempting to initialize their ports (i.e. when both DSPs' STDOUT windows display "Debug: Waiting for SRIO Port 1 to be initialized.."). Moreover, when the project does run successfully, only the source processor's stdout window displays output, as follows:

Debug: Waiting for SRIO Port 1 to be initialized
Debug: SRIO Port 1 has been initialized...

Debug: Starting Transfer 1 on SRIO Port 1
Error: Port 1 has Input/Output Error... clearing

Debug: SRIO Transfer 1 was completed...
Debug: Starting Transfer 2 on SRIO Port 1
Debug: SRIO Transfer 2 was completed...
Debug: Starting Transfer 3 on SRIO Port 1
Debug: SRIO Transfer 3 was completed...
Debug: Starting Transfer 4 on SRIO Port 1
Debug: SRIO Transfer 4 was completed...
Debug: Starting Transfer 5 on SRIO Port 1
Debug: SRIO Transfer 5 was completed...
Debug: Starting Transfer 6 on SRIO Port 1
Debug: SRIO Transfer 6 was completed...
Debug: Starting Transfer 7 on SRIO Port 1
Debug: SRIO Transfer 7 was completed...
Debug: Starting Transfer 8 on SRIO Port 1
Debug: SRIO Transfer 8 was completed...
Debug: Starting Transfer 9 on SRIO Port 1
Debug: SRIO Transfer 9 was completed...
Debug: Starting Transfer 10 on SRIO Port 1
Debug: SRIO Transfer 10 was completed...

Debug: Write CMD Test Passed

I am worried about the "Error: Port 1 has Input/Output Error... clearing" message above, which does not seem right to me. I don't really know what it means, because the CSL APIs and reference guide are so high-level. The target processor also NEVER displays anything other than "Debug: Waiting for SRIO Port 1 to be initialized", although the code for this DSP clearly indicates that there should be more output messages on success, which to me is a further indication of instability in the app. I have made sure to follow the correct load and run procedure: load and run the target first (i.e. DSP2, core 0), followed by the source (i.e. DSP1, core 0).

Lastly, running this app also causes BOTH DSPs to stall after each run (whether successful or not), and the ONLY way to try again is to perform a hard reset of the board (or a power cycle), which of course is not ideal.

The following thread seems to point out that there might be an L2 memory violation due to an insufficient memory map setup (in the .cmd file etc.), but I have tried following the advice in this discussion without any success:
http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/439/p/12465/48817.aspx#48817

Anyone else having this same issue? Anyone out there with advice?

I have attached the .cmd and .GEL files used for both the source and target processors for clarity (in zip format). I am suspicious about the .gel file that came with the board, since it has memory map sections that do not correspond to hardware on the C6474 (for example, UTOPIA). Can someone verify whether these memory configs are correct?

Estian.

Files.zip
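The stall at "Waiting for SRIO Port 1 to be initialized" described in the question is easier to diagnose if the wait is bounded rather than open-ended. The following is only a minimal sketch of that idea, not code from the CSL example: `port_ready_fn`, `wait_port_ready` and `ready_after_n` are hypothetical names, and on the EVM the callback would have to wrap whatever status query the example project already uses.

```c
#include <stddef.h>

/* Hypothetical callback type: returns nonzero once the port reports
   "initialized". On the target this would wrap the CSL status query. */
typedef int (*port_ready_fn)(void *ctx);

/* Poll at most max_polls times. Returns the number of polls used (>= 1)
   on success, or -1 if the port never reported ready. */
static long wait_port_ready(port_ready_fn is_ready, void *ctx, long max_polls)
{
    long i;
    for (i = 0; i < max_polls; i++) {
        if (is_ready(ctx)) {
            return i + 1;
        }
    }
    return -1; /* timed out: report it instead of spinning forever */
}

/* Stand-in callback for host-side experiments: ctx points to a countdown,
   and the "port" becomes ready once the countdown reaches zero. */
static int ready_after_n(void *ctx)
{
    long *remaining = (long *)ctx;
    if (*remaining > 0) {
        (*remaining)--;
    }
    return *remaining == 0;
}
```

With a timeout in place, a failed run can print which DSP gave up and after how many polls, instead of both cores hanging silently.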
  • Can we get the LSU register dumps for a good transaction as well as for a failing one? It is possible that the registers are not initialized properly. The only way to check is to make sure that the LSU registers and the ERR stat registers are initialized before each run.

     

    Thanks,

    Arun.

  • Ok.

    How exactly do I do a register dump?

    I assume it has something to do with exporting the contents of the memory through the disassembly window..

    Would you mind explaining?

    Thanx. Estian.

  • Hi.

     

    Can anyone help me with this request?

     

    Estian.

     

  • Estian,

     

    You can get the offsets for the LSU regs on page 29 of sprug23B. By register dumps, we mean getting the contents of the regs from the memory window or using the CSL handle. Please post the contents here.

  • Sorry, I meant page 109, not 29.
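As an alternative to copying from a memory window, the registers can also be read from code. The sketch below is an illustration under stated assumptions, not part of the CSL: `dump_regs` is a hypothetical helper, and the base address and word count are taken from the LSU range quoted later in this thread (02D00400 - 02D0047C). It prints one address/value pair per line, which is easier to compare between a good and a failing run than disassembly output.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed from the range quoted in this thread: 0x02D00400..0x02D0047C. */
#define SRIO_LSU_BASE  0x02D00400u
#define SRIO_LSU_WORDS 32u   /* (0x7C / 4) + 1 32-bit registers */

/* Print `count` 32-bit words starting at `regs`, labelling each with the
   address it would have if `regs` were located at `base_addr`.
   Returns the number of nonzero words, as a quick sanity figure. */
static unsigned dump_regs(const volatile uint32_t *regs,
                          uint32_t base_addr, unsigned count)
{
    unsigned i;
    unsigned nonzero = 0;

    for (i = 0; i < count; i++) {
        uint32_t v = regs[i];
        printf("%08lX %08lX\n",
               (unsigned long)(base_addr + 4u * i), (unsigned long)v);
        if (v != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
```

On the DSP the call would look like `dump_regs((const volatile uint32_t *)SRIO_LSU_BASE, SRIO_LSU_BASE, SRIO_LSU_WORDS);`. On a host PC the same function can be exercised with an ordinary array.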

  • Thanx, Aditya.

    Ok, so after some struggle, here is how things are standing at the moment:

    I have been stepping through the code for the source and target processors to see if I could pinpoint the source of the failure.

    Firstly, I changed the infinite loops at the end of both the source and target code (i.e. while(1)) into finite loops to see whether the code on both processors actually completes during a free run. However, the target processor still seemed to hang after packet transfer 5. This was solved by increasing the target's delay before each next packet by a factor of 10 (i.e. adding a zero). Doing this now at least guarantees that both the source and target DSPs complete their programs. So, that solves that issue...

    However, the following problems still seem to persist:

    • In the source DSP, the message "Error: Port 1 has Input/Output Error... clearing" still comes up at the beginning of the execution. Thus, there still seems to be an error on the port that needs to be cleared before the first transfer of each run can be started. Why?
    • Not all printf calls actually reach the stdout window, even when stepping over the code. In the source DSP, none of the stdout messages reach the stdout window after the "Error: Port 1 has Input/Output Error... clearing" message, even though there are quite a few more printf calls in the code after this one. So, to me it seems as if the stdout service is failing somewhere along the way. Or could it be that the source and target stdout executions are somehow interfering with each other?
    • Simply restarting the code on each processor and doing a re-run is not successful at all. Both processors simply hang when doing this. I have to at least do a warm reset (i.e. XWRSTz) and reload the code before I can successfully re-execute the code on both processors. At first, I thought it was because the code does not properly close the SRIO module at the end of each run. So, I added a proper CSL_srioClose(hSrio) line at the end of both the source and target code, but alas, no success. Can anyone clarify?

    So, finally, as requested, here are the states of the LSU registers after each processor's successful run. Note: I simply copied the contents of the relevant LSU register area (i.e. 02D00400 - 02D0047C) from the Disassembly window after a successful run:

    Target DSP

    02D00400          :

    02D00400 00000000            NOP           

    02D00404          :

    02D00404 00000000            NOP           

    02D00408          :

    02D00408 00000000            NOP           

    02D0040C          :

    02D0040C 00000000            NOP           

    02D00410          :

    02D00410 00000000            NOP           

    02D00414          :

    02D00414 00000000            NOP           

    02D00418          :

    02D00418 00000000            NOP           

    02D0041C          :

    02D0041C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00420 00000000 ||         NOP           

    02D00424 00000000            NOP           

    02D00428 00000000            NOP           

    02D0042C 00000000            NOP           

    02D00430 00000000            NOP           

    02D00434 00000000            NOP           

    02D00438 00000000            NOP           

    02D0043C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00440 00000000 ||         NOP           

    02D00444 00000000            NOP           

    02D00448 00000000            NOP           

    02D0044C 00000000            NOP           

    02D00450 00000000            NOP           

    02D00454 00000000            NOP           

    02D00458 00000000            NOP           

    02D0045C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00460 00000000 ||         NOP           

    02D00464 00000000            NOP           

    02D00468 00000000            NOP           

    02D0046C 00000000            NOP           

    02D00470 00000000            NOP           

    02D00474 00000000            NOP           

    02D00478 00000000            NOP           

    02D0047C 0000FFFF            STW.D2T2      B0,*+B15[255]

     

    Source DSP:

    02D00400          :

    02D00400 00000000            NOP           

    02D00404          :

    02D00404 1088D100            .word         0x1088d100

    02D00408          :

    02D00408 1088CEE0            .word         0x1088cee0

    02D0040C          :

    02D0040C 00000000            NOP           

    02D00410          :

    02D00410 61FACE01            .word         0x61face01

    02D00414          :

    02D00414 00000054 ||         STH.D1T1      A0,*-A0[0]

    02D00418          :

    02D00418 00000000            NOP           

    02D0041C          :

    02D0041C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00420 00000000 ||         NOP           

    02D00424 00000000            NOP           

    02D00428 00000000            NOP           

    02D0042C 00000000            NOP           

    02D00430 00000000            NOP           

    02D00434 00000000            NOP           

    02D00438 00000000            NOP           

    02D0043C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00440 00000000 ||         NOP           

    02D00444 00000000            NOP           

    02D00448 00000000            NOP           

    02D0044C 00000000            NOP           

    02D00450 00000000            NOP           

    02D00454 00000000            NOP           

    02D00458 00000000            NOP           

    02D0045C 0000FFFF            STW.D2T2      B0,*+B15[255]

    02D00460 00000000 ||         NOP           

    02D00464 00000000            NOP           

    02D00468 00000000            NOP           

    02D0046C 00000000            NOP           

    02D00470 00000000            NOP           

    02D00474 00000000            NOP           

    02D00478 00000000            NOP           

    02D0047C 0000FFFF            STW.D2T2      B0,*+B15[255]

     

    Can you see where the issue is?

    Regards.

    Estian.
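The nonzero words in the source DSP dump can be read as register fields rather than as instructions. The sketch below decodes the value at 02D00410 (0x61FACE01) and the packet type in the value at 02D00414 (0x00000054). It assumes these addresses are LSU1_REG4 and LSU1_REG5, and it assumes the field layout as I understand it from the SRIO user's guide (OUTPORTID in bits 31:30, PRIORITY in 29:28, XAMBS in 27:26, ID_SIZE in 25:24, DESTID in 23:8, interrupt request in bit 0, and the packet type in bits 7:0 of REG5). Both assumptions should be checked against sprug23B before relying on the result. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of LSUn_REG4, under the ASSUMED layout described above. */
typedef struct {
    unsigned outport_id; /* bits 31:30 */
    unsigned priority;   /* bits 29:28 */
    unsigned xambs;      /* bits 27:26 */
    unsigned id_size;    /* bits 25:24 */
    unsigned dest_id;    /* bits 23:8  */
    unsigned intr_req;   /* bit 0      */
} lsu_reg4_fields;

static lsu_reg4_fields decode_lsu_reg4(uint32_t v)
{
    lsu_reg4_fields f;
    f.outport_id = (unsigned)((v >> 30) & 0x3u);
    f.priority   = (unsigned)((v >> 28) & 0x3u);
    f.xambs      = (unsigned)((v >> 26) & 0x3u);
    f.id_size    = (unsigned)((v >> 24) & 0x3u);
    f.dest_id    = (unsigned)((v >> 8) & 0xFFFFu);
    f.intr_req   = (unsigned)(v & 0x1u);
    return f;
}

/* Packet type from LSUn_REG5, ASSUMED to sit in bits 7:0
   (upper nibble FTYPE, lower nibble TTYPE). */
static unsigned lsu_reg5_packet_type(uint32_t v)
{
    return (unsigned)(v & 0xFFu);
}
```

Under these assumptions, 0x61FACE01 decodes to output port 1, priority 2, a 16-bit destination ID of 0xFACE, and the interrupt request bit set, while 0x54 is FTYPE 5 / TTYPE 4, i.e. an NWRITE. That would be consistent with a DirectIO write on Port 1.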

  • Estian,

     

    What version of csl are you using?  I will try to duplicate your issue here.

     

    Regards,

    Travis

  • Hi Travis.

    I'm using CCS 3.3 for this.

    However, after testing it on CCS 4.0 it seems the symptoms are exactly the same.

    Thanx

    Estian.

  • My apologies, that did not answer your question..

    I am using CSL v. 03.03.01.001.

    Regards.

    Estian.

  • Estian,

    I will look at this further tomorrow on an EVM, but just looking through the CSL, I'd make the following changes to see if they have any effect.

    Src:

    Change this to:     lsu_conf.intrReq               = 0;                     /* Interrupts                   */

    Comment out:

    //       /* Read the Port Error Status register. */
    //       response.index = SRIO_PORT_NUMBER;
    //       CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_SP_ERR_STAT, &response);
    //
    //        /* Was there an error detected? */
    //        if ((response.data & ~0x2) != 0)
    //        {
    //            /* Error was detected; we need to clear this before we can proceed... */
    //            printf ("Error: Port %d has Input/Output Error... clearing\n", SRIO_PORT_NUMBER);
    //
    //            /* Clearing the errors... */

    //           CSL_srioHwControl (hSrio, CSL_SRIO_CMD_SP_ERR_STAT_CLEAR, &response);

    //        }

     

    tgt:

    Comment out:

    //        /* Read the Port Error Status register. */
    //        response.index = SRIO_PORT_NUMBER;
    //        CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_SP_ERR_STAT, &response);
    //
    //        /* Was there an error detected? */
    //        if ((response.data & ~0x2) != 0)
    //        {
    //            /* Error was detected; we need to clear this before we can proceed... */
    //            CSL_srioHwControl (hSrio, CSL_SRIO_CMD_SP_ERR_STAT_CLEAR, &response);
    //        }

     

    Regards,

    Travis
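For readers wondering what the commented-out check actually tests: the expression `(response.data & ~0x2) != 0` treats every bit of the port error status word except bit 1 as an error. The sketch below isolates that logic in a plain function so it can be tried on a host PC. It assumes bit 1 is the "port OK" flag, which is my reading of why the example masks it out. The function and macro names are hypothetical and nothing here calls the CSL.

```c
#include <stdint.h>

/* ASSUMPTION: bit 1 of the port error status word means "port OK". */
#define SP_ERR_STAT_PORT_OK 0x2u

/* Same test as the example code: any bit other than PORT_OK counts as
   an error, including status bits that may simply be left over from
   link training or from a previous run. */
static int sp_err_stat_has_error(uint32_t sp_err_stat)
{
    return (sp_err_stat & ~SP_ERR_STAT_PORT_OK) != 0;
}
```

Because the mask is this broad, a single leftover status bit is enough to trigger the "Input/Output Error... clearing" path, which may explain why the message appears once at the start of a run and not again.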

     

  • Hi Travis.

    Thanx for the feedback!

    Following your advice above seems to make both Source and Target DSP's run smoothly and complete with all stdout messages reaching the output window.

    However, I still cannot restart the programs (i.e. Debug->Restart or, Ctrl+Shift+F5) without warm-resetting the DSP's and reloading the code. In fact, the target processor does not even want to 'Go Main' when simply restarting the program after one successful run.

    What's up with that?

    Regards.

    Estian.

  • Estian,

    I've been playing around with this, and I think the problem is not with the code, but perhaps with the code gen tools.  I am having problems with the printf statements.  The code is completely executed and all packets are received, but the printf statements don't always work.  I've been testing on CCS3.3 and CGT6.0.8. What version of the code gen tools do you have?  I'll try to ask around internally and see if there is a fix.

     

    Regards,

    Travis

  • Hi Travis.

    Thanx again for the feedback!

    Yes, the printf seems to be a little unstable. To me it means that there might be conflict between the two stdout functions on the DSPs during run-time. However, that is a lesser evil for now. My worst problem is that the DSP has to be warm reset (i.e. XWRSTz) and completely reloaded (every time) before the program successfully runs again..

    It seems to me that the SRIO module is not properly re-initiated at the start of each run, and there seems to be some residual error from each previous run that needs to be cleared. I have not been able to solve this issue yet. For the moment I am happy to do the warm reset every time. However, when integration time comes on this project, it will have to be sorted out.

     

    Regards.

    Estian.

  • Estian,

     

    Ok, good news, I think we've figured out both SRIO example issues. 

    First the printf issue.  After stepping through the code and trying different code gen tools, we verified that the printf was not being optimized out of the code by the compiler, in fact, the program was entering and exiting the printf function just fine.  Of course the output wasn't being correctly displayed though.  The problem turned out to be the stack AND heap size that was set in the link.cmd file.  Obviously that is dependent on the actual program and wasn't set high enough. 

    Second, the reset issue, or more appropriately, the ability to recompile, reload, and rerun the program without power cycling the board and restarting CCS.  This is because of the way SRIO initiates.  I'm attaching a new srio_disconnect function, along with a new src and tgt program that calls this function.  All you have to do is add the srio_disconnect.c file to your project and rebuild.  This function will completely disable SRIO once the example code has been run, so that you can reload a new one normally.  Make sure to halt the tgt device after running, then just reload tgt and src programs after building with any new changes.

    Please verify the answer or let me know that you were able to get it working.  I'm removing your link to the other forum post you had, since it is not related.

    Regards,

    Travis

     

    C6474_SRIO_CSL_FIX.zip
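A quick way to test whether the heap is the limiting factor for printf, before editing link.cmd, is to probe it at the top of main. This is only a diagnostic sketch and not the fix described above: it assumes the C runtime takes its stdout buffer from the heap, and `heap_can_hold` is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the heap can currently satisfy an allocation of nbytes,
   0 otherwise. The block is freed again immediately. The pointer is
   volatile so the allocation is less likely to be optimized away. */
static int heap_can_hold(size_t nbytes)
{
    void *volatile p = malloc(nbytes);
    if (p == NULL) {
        return 0;
    }
    free(p);
    return 1;
}
```

If `heap_can_hold(BUFSIZ)` returns 0 on the target, a too-small heap setting in the linker command file is a likely suspect for the missing stdout messages.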
  • Hi Travis.

    Thanx a lot for this! I will test this and get back soon..

    I am unable to verify this answer, because it was posted as a reply and not a suggested answer.

    You might want to change it.

    Regards.

    Estian.

  • My apologies.

    I see it has changed now.

    I will verify if it works..

    Estian.

  • Ok.

    The good news from my side is that the adapted code you posted works great!

    However, it is essential that you halt the target processor before doing a restart (i.e. Ctrl+Shift+F5), otherwise it all goes unstable again. This is easily overcome by changing the Target code to a finite loop that finishes after each transmission, which guarantees that the SRIO on the Target re-initializes properly on a restart.

    Thanx a lot Travis!

    Regards.

    Estian.

  • Hi Estian,

     

    I am seeing the same issue with the SRIO example (srio_evm_dio_example). Could you post your working code?

     

    It would be very helpful to us.

     

    Thanks & Regards,

    Pubesh