Only a limited number of doorbells can be received?

I am using the C6455 EVM to do SRIO doorbell testing. On the master board, I have the following code, which sends 100 doorbells to the slave board (mezzanine). However, on the slave side, I only receive 6 doorbells. Also, the printout on the slave side is very slow and does not keep pace with the doorbells the master board sends. I am wondering what causes this problem. Thank you very much.

Master code

srio_init();

for (i=0;i<100;i++)
{
    printf("*****************************************.\n");   
    printf("Door Bell 1.\n");
    //Reserved  Doorbell Reg #  rsv  Doorbell bit
    //9  2  1   4
    srio_doorbell(0x23);//Doorbell Reg #1   Doorbell bit 0011
    delay(10000000);
}

Slave code

Uint32 *db1_ICCR = (Uint32 *)0x02D00218;
for (i=0;i<100;i++)
{
 // clear the doorbell interrupt
 *db1_ICCR = (Uint32)0xFFFFFFFF;
 // wait for the doorbell interrupt;
 resp_db.index = 1;
 resp_db.data = 0;
 do {
         CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_DOORBELL_INTR_STAT, &resp_db);
 } while(resp_db.data == 0x00);
 printf("%d Doorbell received. %d\n",i,resp_db.data);
}

The output window on the slave board shows the following:

0 Doorbell received. 8
1 Doorbell received. 8
2 Doorbell received. 8
3 Doorbell received. 8
4 Doorbell received. 8
5 Doorbell received. 8

  • The way the Doorbell works from the sender side is this: you program the LSU, it sends the doorbell, it waits for the response, and then you can program it again. Regardless of the response that comes back to the LSU (DONE or RETRY), the hardware does nothing further with this transaction. So, if it is a RETRY, software needs to resend the doorbell by initiating another doorbell transaction so that the interrupt is not missed on the slave side.

    The way the Doorbell works on the receive side is this: The MAU receives the doorbell and determines which doorbell interrupt ICSR bit it is being asked to set, if the doorbell interrupt ICSR bit is not already set then it sets it and sends a DONE response, if the doorbell interrupt ICSR bit is already set then it sends a RETRY response.

    So, a couple of things to try: are you looking at the response to the LSU? Does the CC say DONE for every transaction, or does it say RETRY? I'm guessing some are retries, because the interrupt is not being cleared fast enough on the slave. You can either simply space out your doorbell transactions on the master side with a timer, or you can use more of the doorbell interrupt ICSR bits by incrementing the doorbell bit and register each time you program the LSU to send the doorbell. If you use the latter method, you can still route the multiple ICSR bits to the same physical interrupt on the slave. When you get an interrupt (there will still be fewer than 100 interrupts), there will be multiple bits set.

    Regards,

    Travis
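
The handshake described above can be modelled on a host PC. The following is a minimal sketch, not CSL or driver code: every name in it (`mau_receive_doorbell`, `slave_clear_doorbell`, `count_done`) is hypothetical, and the ICSR is just a 16-bit variable. It only illustrates why a doorbell aimed at a bit that is still set gets RETRY and has to be resent by software.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the doorbell handshake (not CSL code). */
typedef enum { DB_DONE, DB_RETRY } db_response;

/* Receive side: set the requested ICSR bit, or answer RETRY if it is still set. */
db_response mau_receive_doorbell(uint16_t *icsr, unsigned bit)
{
    assert(icsr != NULL && bit < 16);
    uint16_t mask = (uint16_t)(1u << bit);
    if (*icsr & mask) {
        return DB_RETRY;                 /* previous doorbell not cleared yet */
    }
    *icsr = (uint16_t)(*icsr | mask);
    return DB_DONE;
}

/* Slave software: clear a serviced bit (stands in for the ICCR write). */
void slave_clear_doorbell(uint16_t *icsr, unsigned bit)
{
    assert(icsr != NULL && bit < 16);
    *icsr = (uint16_t)(*icsr & ~(1u << bit));
}

/* Send n doorbells to bit 3 without resending; the slave clears the bit only
 * after every clear_every-th doorbell. Returns how many were answered DONE. */
unsigned count_done(unsigned n, unsigned clear_every)
{
    assert(clear_every > 0);
    uint16_t icsr = 0;
    unsigned done = 0;
    for (unsigned i = 1; i <= n; i++) {
        if (mau_receive_doorbell(&icsr, 3) == DB_DONE) {
            done++;
        }
        if (i % clear_every == 0) {
            slave_clear_doorbell(&icsr, 3);
        }
    }
    return done;
}
```

In this model, `count_done(100, 1)` gives 100, while `count_done(100, 4)` gives 25: the other 75 doorbells were answered with RETRY and would each need another LSU transaction.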

  • Thank you, Travis.

    I will double-check whether the master keeps getting RETRY. I think this is the case, because the other data transfers (direct I/O) that follow each doorbell never start, and I do not see new data coming from the master after the first several successful doorbell transactions.

    As for the interrupt not being cleared fast enough on the slave, do you mean this is the reason the slave responds at a slower pace? If so, is there any way to make the clearing faster? Even assuming this is the case, why does the slave stop responding after 6 doorbells? That looks less like a speed problem and more like the slave not accepting any further doorbells from the master.

    I will keep you updated based on my experiments.

    Thank you

     

     

  • Update on the methods you recommended

     

    Method 1: space out the doorbell transactions on the master side with a timer

    It did not work, even though I slowed down my doorbell transactions with a huge delay.

     

    Method 2: use more of the doorbell interrupt ICSR bits

    It did not work either. The symptom is the same as before: the slave stopped receiving doorbells after 6 iterations.

    There is a very interesting result. If I set the doorbell bit to 1 on the master side, then in a successful case the slave reads resp_db.data as 2. Should these two be the same?

     

    On the master side

    CSL_SrioDirectIO_ConfigXfr lsu_conf;
    // Reserved | Doorbell Reg # | rsv | Doorbell bit
    //          |       1        |     |      1
    lsu_conf.doorbellInfo = 0x21;

     

    On the slave side

    resp_db.index = 1;
    resp_db.data = 0;
    do {
        CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_DOORBELL_INTR_STAT, &resp_db);
    } while(resp_db.data == 0x00);
    printf("%d Doorbell received. %d\n",i,resp_db.data);

     

  • I can't make sense of the doorbell bit discrepancy. When you program the doorbell info field in the LSU register, it should have the format: 9b reserved, 2b for the doorbell register #, 1b reserved, and 4b for the doorbell bit #. So for each Doorbell request, you can only specify one specific ICSR bit to be set in the slave device. I'm not sure how this could change from the master to the slave side. Are you re-writing all the LSU registers for each Doorbell you try to send?

    Did you check to see that you are getting a DONE CC for the LSU each time?

    Another thing I thought of while looking at your code snippet: when you clear the doorbell interrupt bit on the slave, are you also writing the interrupt pacing register so that it can fire a new interrupt to the CPU? The pacing register has to be written each time you clear the ICSR bit.

     

    Regards,

    Travis
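
One possible reading of the numbers above, assuming the slave's status query returns the ICSR contents as a bit mask: the master programs a bit *number*, while the slave reads back the *mask* with that bit set. The helpers below are a sketch with hypothetical names (`db_info_pack`, `db_info_reg`, `db_info_bit`, `db_info_icsr_mask`); only the field layout (9b reserved, 2b register #, 1b reserved, 4b bit #) comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Doorbell info layout as described in the post above:
 * [15:7] reserved, [6:5] doorbell register #, [4] reserved, [3:0] doorbell bit #.
 * All function names here are illustrative, not part of the CSL. */
uint16_t db_info_pack(unsigned reg, unsigned bit)
{
    assert(reg < 4 && bit < 16);
    return (uint16_t)((reg << 5) | bit);
}

/* Doorbell register number selected by the info field. */
unsigned db_info_reg(uint16_t info)
{
    return (info >> 5) & 0x3u;
}

/* Doorbell bit number selected by the info field. */
unsigned db_info_bit(uint16_t info)
{
    return info & 0xFu;
}

/* Mask expected in the selected ICSR when only this doorbell is pending
 * (assumes the status read returns the raw 16-bit ICSR contents). */
uint16_t db_info_icsr_mask(uint16_t info)
{
    return (uint16_t)(1u << db_info_bit(info));
}
```

Under that assumption, 0x21 selects register 1, bit 1, so the slave would read 0x2, and 0x23 selects register 1, bit 3, so the slave would read 0x8, which matches the "8" printed in the first post.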

  • In the EVM test, I call this function on the master side to set the doorbell bit in every iteration. If necessary, I can change the bit value.

    // doorbell message
    void srio_doorbell(Uint16 db_info)
    {
        Uint8 lsu_no;
        CSL_SrioDirectIO_ConfigXfr lsu_conf = {0};

        /* Create an LSU configuration */
        lsu_conf.srcNodeAddr           = 0;             /* Local address */
        lsu_conf.dstNodeAddr.addressHi = 0;
        lsu_conf.dstNodeAddr.addressLo = 0;             /* Remote address */
        lsu_conf.byteCnt               = 0;
        lsu_conf.idSize                = 1;               /* 16 bit device id */
        lsu_conf.priority              = 2;               /* PKT priority is 2 */
        lsu_conf.xambs                 = 0;               /* Not an extended address */
        lsu_conf.dstId                 = LARGE_DSP2_ID;
        lsu_conf.intrReq               = 0;               /* No interrupts */
        lsu_conf.pktType               = SRIO_PKT_TYPE_DOORBELL; /* Doorbell type */
        lsu_conf.hopCount              = 0;               /* Valid for maintenance pkt */
        lsu_conf.doorbellInfo          = db_info;         /* a doorbell pkt */

        lsu_conf.outPortId             = 0;               /* Tx on Port 0 */

        lsu_no = SELECTED_LSU;
        CSL_srioLsuSetup (hSrio, &lsu_conf, lsu_no);

        /* Wait for the completion of transfer */
        response.index = lsu_no;
        do {
            CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_LSU_BSY_STAT, &response);
        } while(response.data == 1);
    }

    In my original code, after the master has run 8 iterations, the CC has the value 111 (Transaction complete, Packet not sent due to unavailable outbound credit at given priority). I do not know how this happened. In the first 7 iterations, the CC in RIO_LSU1_REG6 (0x02D00418) is always 000 (Transaction complete, No Errors (Posted/Non-posted)).

    In the latest code, I did the following for writing the interrupt pacing register. (I am wondering if you have received the source code from Aaron.)

     Uint32 *INTDST1_RATE_CNTL = (Uint32 *)0x02D00324;
      *INTDST1_RATE_CNTL = 0;

    Thanks a lot, Travis.
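
A sketch of how the clear and the pacing write could be kept together on the slave, so that one is never done without the other. The function name `slave_ack_doorbells` is hypothetical, the registers are passed in as pointers so the logic can be tried on a host, and it assumes the ICCR is write-one-to-clear. Writing back only the bits that were actually read, rather than 0xFFFFFFFF, avoids wiping a doorbell that arrives between the read and the clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: acknowledge whatever doorbells are pending.
 * icsr      - doorbell interrupt status register (read)
 * iccr      - doorbell interrupt clear register (assumed write-1-to-clear)
 * rate_cntl - interrupt pacing register for the routed interrupt destination
 * Returns the pending mask that was acknowledged (0 if nothing was pending). */
uint32_t slave_ack_doorbells(volatile uint32_t *icsr,
                             volatile uint32_t *iccr,
                             volatile uint32_t *rate_cntl)
{
    assert(icsr != NULL && iccr != NULL && rate_cntl != NULL);

    uint32_t pending = *icsr;
    if (pending != 0) {
        *iccr = pending;     /* clear only the bits that were seen */
        *rate_cntl = 0;      /* rewrite pacing so the next interrupt can fire */
    }
    return pending;
}
```

On the target, the pointers would be the addresses already used in this thread (for example 0x02D00218 for the ICCR and 0x02D00324 for INTDST1_RATE_CNTL), plus the matching ICSR address from the device documentation.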

  • That should be good for the pacing register.

    I think you are on to something with the CC=111.  If you get the outbound credit CC, you have to reprogram the LSU to try and send the Doorbell again.  I'm assuming you have other traffic on the same outbound port, which is temporarily congesting it and causing the outbound credit manager to time out.  Can you confirm this?  (I'm not able to run your code at this point.) This is why you are only seeing a few interrupts on the slave side, so you will need to resend the doorbell and not count that as an iteration from the master side. 

  • As for reprogramming the LSU, I think my code already does that in every iteration. Is this correct?

    About other traffic: in this code, you can see that I only call srio_write_final(NWRITE) before the doorbell iterations begin. So there is no other traffic at all in between.

    Based on your suggestion, resending should solve the issue. However, in my code, the 100 iterations should already act as resends, since I do not change any information on the master side. But the slave never acknowledged any doorbell from the master after that point. My understanding is that no matter how often the doorbell is resent, the slave never recovers from the CC 111 condition. Am I correct?

    Here is what was in LSU1_REGn (n from 0 to 6) before and after CC 111 appeared in REG6, just FYI.

    RIO_LSU1_REG0 0x00000000
    RIO_LSU1_REG1 0x00900208
    RIO_LSU1_REG2 0x00900008
    RIO_LSU1_REG3 0x00000000
    RIO_LSU1_REG4 0x20000200
    RIO_LSU1_REG5 0x00000054
    RIO_LSU1_REG6 0x00000000

    RIO_LSU1_REG0 0x00000000
    RIO_LSU1_REG1 0x00000000
    RIO_LSU1_REG2 0x00000000
    RIO_LSU1_REG3 0x00000000
    RIO_LSU1_REG4 0x21000200
    RIO_LSU1_REG5 0x002300A0
    RIO_LSU1_REG6 0x0000000E
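
The REG6 value can be decoded and used to drive a bounded resend loop. This is a sketch under stated assumptions: the field positions (BSY in bit 0, completion code in bits 3:1) are assumed here and are consistent with 0x0000000E being reported as CC 111; `send_once_fn`, `send_doorbell_until_ok` and `fake_send_once` are hypothetical names. On the target, the hook would program the LSU, wait for BSY to clear and return REG6. The loop is bounded so that a persistent failure is reported instead of spinning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed RIO_LSUn_REG6 layout: bit 0 = BSY, bits 3:1 = completion code. */
#define LSU_REG6_BSY(v)   ((v) & 0x1u)
#define LSU_REG6_CC(v)    (((v) >> 1) & 0x7u)
#define LSU_CC_OK         0x0u   /* transaction complete, no errors */
#define LSU_CC_NO_CREDIT  0x7u   /* not sent, no outbound credit    */

/* Hypothetical hook: send one doorbell, wait for completion, return REG6. */
typedef uint32_t (*send_once_fn)(uint16_t db_info, void *ctx);

/* Resend until CC == 000. Returns the number of attempts used,
 * or 0 if max_attempts were made without a clean completion. */
unsigned send_doorbell_until_ok(send_once_fn send_once, void *ctx,
                                uint16_t db_info, unsigned max_attempts)
{
    assert(send_once != NULL);
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        uint32_t reg6 = send_once(db_info, ctx);
        if (LSU_REG6_BSY(reg6) == 0 && LSU_REG6_CC(reg6) == LSU_CC_OK) {
            return attempt;
        }
    }
    return 0;
}

/* Host-side stand-in for the hardware: reports CC = 111 until the counter
 * pointed to by ctx reaches zero, then reports CC = 000. */
uint32_t fake_send_once(uint16_t db_info, void *ctx)
{
    unsigned *failures_left = (unsigned *)ctx;
    (void)db_info;
    if (*failures_left > 0) {
        (*failures_left)--;
        return 0x0000000Eu;   /* CC = 111, BSY = 0, as in the dump above */
    }
    return 0x00000000u;       /* CC = 000, BSY = 0 */
}
```

A return value of 0 means every attempt failed, which is the situation described in this post: counting such a doorbell as sent would leave the slave waiting for an interrupt that never arrives.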

  • I am trying to find the reason for CC 111. Is it possible that the DSP program keeps reading the ICSR in the following code, so that the bit is being read at the moment it should be cleared? That could create a conflict that prevents it from being cleared successfully.

     resp_db.index = 1;
     resp_db.data = 0;
     do {
            CSL_srioGetHwStatus (hSrio, CSL_SRIO_QUERY_DOORBELL_INTR_STAT, &resp_db);
        } while(resp_db.data == 0x00);

     

  • For closure on this issue, I wanted to mention that there were some problems in the initialization of the SRIO peripheral. The symptom was that the original code only allowed 8 Doorbell messages to be sent, after which they stopped arriving on the destination device. We think this was related to the fact that the port mode was not set up correctly: SP_IP_MODE[31:30] did not match PER_SET_CNTL[8]. The outbound credit watermarks were not set up correctly either, which may have added to the problem. When we switched to a known good initialization sequence (attached), the issue was solved. We also added the software error recovery snippet below, which clears the input/output error-stopped states.

    Regards,

    Travis

     

     

    #define SP0_CS_TX    (*(volatile Uint32*)(0x02D14014))
    #define SP1_CS_TX    (*(volatile Uint32*)(0x02D14114))
    #define SP2_CS_TX    (*(volatile Uint32*)(0x02D14214))
    #define SP3_CS_TX    (*(volatile Uint32*)(0x02D14314))

    void clear_port(void)
    {
        /* Send Packet Not Accepted Control Symbol + Link Request Control Symbol */
        /* on each port to clear the input/output error-stopped states.          */
        SP0_CS_TX = 0x40FC8000;
        SP1_CS_TX = 0x40FC8000;
        SP2_CS_TX = 0x40FC8000;
        SP3_CS_TX = 0x40FC8000;
    }

  • Once again, thank you for your feedback, Travis.

    Using this initialization code and the corresponding main function from Aaron, the program works well on the C6455 EVM.

    However, after I moved it to our digital board, a new issue appeared. The initialization does not complete (SP_ERR_STAT never becomes 0x00000002), so the connection cannot be set up between the FPGA and the DSP (C6455). Previously, our DSP code worked well; at least, the SRIO initialization completed on the DSP side when the FPGA started up.

    Currently, I am comparing the Serial RapidIO peripheral registers after running your initialization code with those after running my original initialization code. I found many differing register values, but I cannot determine which modifications are necessary for the doorbell issue and which should stay the same for our project's connection.

    Regarding the part of your reply that says "We think this was related to the fact that the port mode was not set up correctly: SP_IP_MODE[31:30] did not match PER_SET_CNTL[8]. The outbound credit watermarks were not set up correctly either, which may have added to the problem. When we switched to a known good initialization sequence (attached), the issue was solved.": in fact, I am using initialization code similar to Appendix A in SPRU976. It is very interesting to find that SP_IP_MODE[31:30] cannot be set to 01, no matter what value I write to force it.

    It would be very helpful if you could come up with initialization code based on Appendix A and my code. The source code provided by you and Aaron did help, but the initialization inconsistency needs to be investigated. Otherwise, it will still be very hard for us to handle this issue and apply the fix on our own digital board.

    Thank you very much,

  • Thanks for the feedback.  Looking at the Appendix A, it needs to be updated because it incorrectly sets BOOT_COMPLETE before you set the 1X_MODE variable in the SP_IP_MODE register.  Additionally, it incorrectly sets the watermarks to 0.  I will look at getting this updated in the documentation.  Again the docs

    Regarding your FPGA board, you will have to modify the port_width and the PLL multiplier ratio (for 2.5 Gbps instead of 3.125 Gbps), and also check PORT_OK only on Port 0.

    -Travis

  • Hi, I'm working with the EVM 6474 and I have a problem with the doorbell: it is not sent to the destination.

    I don't get the DONE response; instead I get "request not sent due to Xoff". I have run your example with my own SRIO init.

    Can you help me find the problem?