SRIO Discovery Algorithms

Hello,

I am trying to discover the switch in my network.  I am on a C6678 EVM. 

I receive a port OK, so I am confident in the SERDES settings.  I have pre-tested ackID alignment, maintenance write, and maintenance read routines.  These can be found in my other posts.  They work with a static configuration, so to use them here I change the destination ID to 0xFF.

I can send a promiscuous maintenance read packet - out of power up - and receive a response from the switch.

However, I cannot align the ackIDs in order to do anything without a power cycle.  Any promiscuous write attempt returns a timeout error code.  (In other words, whether I have just powered on or reloaded the emulator, I cannot get a promiscuous write attempt to work.)

Do you think this is a problem with the settings in my switch or the settings on the DSP?  Do I need to set a bit on the DSP to accept a promiscuous write response?

Is there a discovery algorithm available to reference?

thanks,

Brandy

  • The switch is a TSI578 on a NAT MCH, if that makes any difference.  I am using a uTCA setup.

  • With more investigation and testing, the problem is definitely the ACKIDs.  I can't get them to align.  My alignment routine can read what it should be on the DSP and reset the DSP to match.  Then I try to write the switch ACKID to make sure everyone is happy - nothing.  I can't write to the switch.

     

    If I power cycle, I can read some of the registers but they seem delayed.

     

    Please help!!

    Brandy

  • Just do a port reset on the switch; you can do it on a port-by-port basis. Unless you are afraid of losing some packets, this is the easiest way.

    I think you have to enable the port on the switch through its registers to be able to send packets to other ports (with hop count > 0). You also need to set up routing before sending data.

  • Hi Alexey,

    I think I tried to reset the port on the switch, but it seems that the switch won't accept any packets once the ackIDs are misaligned.  Maybe I didn't use the right register; do you know the offset of the port reset register?

    I do this: 

    CSL_SRIO_SendPortLinkMaintRequest(hSrio, 0, 3); //Reset the link

    I have the enable port registers working on the original switch I was using.  So that code is setup and works when the ACKs are aligned.  I will also set the route once I can reliably talk to the switch.

    Thanks for your advice!

    Brandy

     

  • Hi,

    I think I was wrong that you can reset ports one by one. I checked, and it seems the switch we are working with resets all ports and routing tables, so be careful. You should also get the library for your switch and adapt it for your DSP (you need to write a couple of OSAL functions). The library is provided by the manufacturer. There should also be docs with the switch that describe how to do resets and register access the right way.

    Back to the topic, I use the following code for reset (you don't always know the port you are connected to, so I pass a std::set; partner_port is the switch port the DSP is connected to):

    void RecoverLink(const SrioProcessingElement &partner,
                     const std::set<int> &likely_partner_ports) {
      // get srio handle
      CSL_SrioHandle srio_handle = CSL_SRIO_Open(kDefaultSrioInstanceNumber);
      if (srio_handle == NULL) {
        Raise(kErrorSrioHandle);
        return;
      }
      // get local port that the partner is connected to
      int srio_port = partner.local_port();
      // hotplug link restore procedure described in srio switch manual
      // clear local outstanding packets and reset ackid
      srio_handle->RIO_SP[srio_port].RIO_SP_ACKID_STAT = 0x80000000;
      // reset srio ports on both ends twice
      for (int reset_count = 0; reset_count < 2; ++reset_count) {
        // issue reset request to partner
        srio_handle->RIO_SP[srio_port].RIO_SP_LM_REQ = 0x3;
        // wait for response
        while (!(srio_handle->RIO_SP[srio_port].RIO_SP_LM_RESP & 0x80000000)) {
        }
        // reset local port
        ResetSrio();
      }
      // issue link-request/input-status from all likely switch ports
      for (std::set<int>::const_iterator port = likely_partner_ports.begin();
           port != likely_partner_ports.end(); ++port) {
        // cmd command spans bits 29-31, command is 4
        partner.WriteRegister(0x140 + 0x20*(*port), 0x4);
      }
      srio_handle->RIO_ERR_DET = 0x0;
      srio_handle->RIO_SP[srio_port].RIO_SP_ERR_STAT = 0x00020200;
      srio_handle->RIO_SP_ERR[srio_port].RIO_SP_ERR_DET = 0x0;
    }

    SrioProcessingElement is a wrapper around SRIO calls and config structures. Functions that might be useful are below:

    void SrioProcessingElement::WriteRegister(uint32_t register_offset,
                                              uint32_t register_value) const {
      // make a copy of stored config
      SRIO_LSU_TRANSFER lsu_config = lsu_transfer_config_;
      // setup write information
      lsu_config.rapidIOLSB = register_offset;
      lsu_config.ttype = 1;
      *data_ = SwapEndianness(register_value);
      // write
      ExecuteLsuTransaction(&lsu_config);
    }

    The address of data_ is copied to lsu_transfer_config_ in the class constructor:

      lsu_transfer_config_.dspAddress =
          reinterpret_cast<size_t>(ConvertAddressToGlobal(data_));

    void SrioProcessingElement::ExecuteLsuTransaction(
        SRIO_LSU_TRANSFER *lsu_config) const {
      // check initialization
      if (srio_handle_ == NULL) {
        Raise(kErrorSrioHandle);
        return;
      }
      // code finished accessing memory, write cache back
      Srio_osalEndMemAccess(data_, kCacheLineSize);
      // wait for lsu to become available
      while (CSL_SRIO_IsLSUFull(srio_handle_, lsu_number_)
             && CSL_SRIO_IsLSUBusy(srio_handle_, lsu_number_)) {
      }
      // get transaction context
      uint8_t transaction_context = 0;
      uint8_t transaction_id = 0;
      CSL_SRIO_GetLSUContextTransaction(srio_handle_, lsu_number_,
                                        &transaction_context, &transaction_id);
      // send request
      CSL_SRIO_SetLSUTransfer(srio_handle_, lsu_number_, lsu_config);
      // wait around till the transfer is completed.
      while (CSL_SRIO_IsLSUBusy(srio_handle_, lsu_number_)) {
      }
      // wait till right transfer completed
      uint8_t completion_code = 0;
      uint8_t context_bit = 0;
      do {
        CSL_SRIO_GetLSUCompletionCode(srio_handle_, lsu_number_, transaction_id,
                                      &completion_code, &context_bit);
      } while (transaction_context != context_bit);
      // code starts accessing memory again, invalidate cache
      Srio_osalBeginMemAccess(data_, kCacheLineSize);
      // set error according to completion code
      assert(completion_code < kLsuCompletionCodeCount);
      if (kCompletionCodeToErrorMap[completion_code] != kErrorNone) {
        Raise(kCompletionCodeToErrorMap[completion_code]);
      }
    }

  • Hi Alexey,

    Thanks for the code snippet.  I tried the loop reset as you said, but I still can't write to the partner to reset it.

    Just let me verify:  the SERDES settings have to be right because I have a PORT_OK, correct?

    Thanks for your thoughts,

    Brandy

  • Hi,

    I think Port_ok means that the SRIO physical layer is working, so the SERDES is OK. If you can communicate with the switch after initial boot-up, then your settings should be correct. I am not sure how you reset the DSP SRIO; I do this by disabling the SRIO controller and then running my SRIO init function again, as after initial power-on. You should not reset or reinitialize QMSS/CPPI, though (I don't have the code at the moment, so I cannot give more details).

    You can verify that reset happened on the switch side by running the software that comes with the switch. Send a few packets to change ack ids on the switch side, run your reset function. Other things that might cause problems:

    • The register address for issuing the link-request/input-status: I use "0x140 + 0x20*(*port)", but yours might be different.
    • Writing to a switch register is strange. There is some byte-order handling going on, and the switch documentation is somewhat misleading. At least it took me some time to make a working version of the register write function (though I am not 100% sure about it). Try to read and write some switch registers through the DSP and verify with the switch software that your function works.

    Alexey
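
The two points above (the per-port register offset and the byte order of the written value) can be sketched in plain C as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the 0x140 + 0x20 * port layout is taken from the post above and may not match your switch, and whether a swap is needed at all depends on your device and driver.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the per-port link maintenance request register, using the
 * 0x140 + 0x20 * port layout quoted in the post above. This layout is an
 * assumption; check it against the register map of your switch. */
uint32_t link_maint_req_offset(unsigned port)
{
    assert(port < 16); /* the port dump in this thread lists ports 0 to 15 */
    return 0x140u + 0x20u * port;
}

/* Reverse the byte order of a 32-bit word. Hypothetical stand-in for the
 * SwapEndianness() helper mentioned in the thread. */
uint32_t swap_endianness32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |
           ((value & 0x0000FF00u) << 8)  |
           ((value & 0x00FF0000u) >> 8)  |
           ((value & 0xFF000000u) >> 24);
}
```

For example, the command value 0x4 becomes 0x04000000 after the swap, which is one quick thing to compare against what the switch software reports when a write appears to have no effect.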

  • Brandy,

    Remember the input/output port error clearing that we previously discussed.  That state can happen even with port_ok valid, but prevent packets from being sent.  Also, Alexey made a good comment about endianness on writing to the switch MMR.  You may have to play with that to get it formatted correctly.

    Regards,

    Travis

  • Hello Travis,

    You mean the magic number?  Got it.  In fact, if I just run the exact code that I used for the TSI577 - it will not work.  So the error clearing, the ack id alignment, the enabling of the ports, that worked with TSI577 does not work with TSI578. 

    if (portIsOkay)
    {
        LOG_INFO("Port %i is okay", i);
        CSL_SRIO_GetPortType(hSrio, i, &myValue);
        LOG_TRACE("Port %i has type %i, ", i, myValue);
        CSL_SRIO_GetSupportedPortWidth(hSrio, i, &myValue);
        LOG_TRACE("supports port widths 0x%x and ", myValue);
        CSL_SRIO_GetInitializedPortWidth(hSrio, i, &myValue);
        LOG_TRACE("initialized to port width 0x%x.", myValue);

        // Always write magic number to clear error stop conditions after initialization
        hSrio->RIO_PLM[i].RIO_PLM_SP_LONG_CS_TX1 = 0x2003F044;
        LOG_DEBUG("Correct Output Error Stop Condition for Port %i.", i);
    }

    I thought I had played around with the endianness a bit, but I will try again.  Unfortunately, since the switch is on an MCH, I don't see that I have JTAG access to it. 

    Thanks,
    Brandy

     

     

  • Brandy,

    I'm not sure why there would be any difference in the behavior of the two switches.  I'll check around and see if anything comes up.

    Regards,

    Travis

  • Thanks,  I'll keep trying!

     

    Brandy

  • Hi,

    So I got a bit of a diagnostic tool from NAT.  When I run the tool, it checks the status of the port.  At the point where I get a port OK, the port also comes up with errors.  I mean the switch reports Port OK with a stop condition.  Here are the results from the tool:


    SRIO (RET=0/0x0): 11
    Enter device (RET=1/0x1):
    Clear after read? (0=No, 1=Yes) (RET=0/0x0):
    SRIO port states of device 1
       port  0 : ctrl 0x50600001 stat 0x00000002 PORT_OK      initialized: x4
       port  1 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  2 : ctrl 0x50620001 stat 0x00000001 PORT_UNINIT
       port  3 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  4 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port  5 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  6 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port  7 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  8 : ctrl 0x50600001 stat 0x00000002 PORT_OK      initialized: x4
       port  9 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 10 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 11 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 12 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 13 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 14 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 15 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
    SRIO (RET=0/0x0): 11
    Enter device (RET=1/0x1):
    Clear after read? (0=No, 1=Yes) (RET=0/0x0):
    SRIO port states of device 1
       port  0 : ctrl 0x50600001 stat 0x00000002 PORT_OK      initialized: x4
       port  1 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  2 : ctrl 0x50620001 stat 0x00000001 PORT_UNINIT
       port  3 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  4 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port  5 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  6 : ctrl 0x50600001 stat 0x00030202 PORT_OK      initialized: x4
       port  7 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port  8 : ctrl 0x50600001 stat 0x00000002 PORT_OK      initialized: x4
       port  9 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 10 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 11 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 12 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 13 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT
       port 14 : ctrl 0x50600001 stat 0x00000001 PORT_UNINIT
       port 15 : ctrl 0x00600001 stat 0x00000001 PORT_UNINIT

    The port ok occurs when I enable the driver with

    CSL_SRIO_EnablePeripheral(hSrio);

    Any thoughts about why the switch is starting off with errors?

    Thanks,

    Brandy

  • Brandy,

    Unsure why, but this is exactly the scenario we have seen before, where you can get port_ok but be in the input or output error-stopped state.  In your case, it is output error-stopped.  Sending the magic number and aligning ackIDs before any packets are attempted should clear both of these error states on both devices.  The output error-stopped state usually means the port received a PNA (packet-not-accepted) control symbol, so it almost sounds like the switch tried to send a packet to the DSP before the ackIDs were aligned.

    Regards,

    Travis
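
The "port OK but error-stopped" condition described above can be checked mechanically from a status word such as the 0x00030202 reported for port 6. The sketch below assumes the tool's "stat" column is the RapidIO Port n Error and Status CSR and uses the bit positions that register commonly has in little-endian bit numbering; both points are assumptions to verify against your switch manual, and the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the port error and status word. */
#define SP_ERR_STAT_PORT_UNINIT      (1u << 0)
#define SP_ERR_STAT_PORT_OK          (1u << 1)
#define SP_ERR_STAT_INPUT_ERR_STOP   (1u << 8)
#define SP_ERR_STAT_INPUT_ERR_ENC    (1u << 9)
#define SP_ERR_STAT_OUTPUT_ERR_STOP  (1u << 16)
#define SP_ERR_STAT_OUTPUT_ERR_ENC   (1u << 17)

/* True when the link is trained (PORT_OK) but the port is in an input or
 * output error-stopped state, so packets may still not get through. */
bool port_ok_but_stopped(uint32_t stat)
{
    bool ok = (stat & SP_ERR_STAT_PORT_OK) != 0;
    bool stopped = (stat & (SP_ERR_STAT_INPUT_ERR_STOP |
                            SP_ERR_STAT_OUTPUT_ERR_STOP)) != 0;
    return ok && stopped;
}
```

Under these assumptions, 0x00030202 decodes to PORT_OK plus output error-stopped, output error-encountered, and input error-encountered, while the 0x00000002 values on ports 0 and 8 decode to a clean PORT_OK.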

  • Brandy,

    I did some checking with one of our engineers that worked with our chassis setup and the NAT MCH with the TSI578 on board.  He was making performance measurements, so his method would be to do a run, then reset the EVM and reset the switch port connected to the DSP.  That way he didn't do any Ackid alignment, as both sides were always reset to 0.  He mentioned that you should have received a NAT MCH manual.  Unfortunately, I cannot provide ours to you directly due to NDAs and privacy issues.  He used scripts to reset the port and re-establish port data rate.  Essentially this was done by...

    " 0xA0 is being written to register 0x10 of the MCH to set the speed to 3.125Gbaud and to hold the TSI578s in reset. Then a 0xA3 is written to register 0x10 to set the speed and take the TSI578s out of reset.  The bottom two bits of register 0x10 of the MCH control resetting the TSI578s."

    Hope that helps,

    Travis
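
The two values in the quoted MCH register 0x10 sequence can be read as a speed setting combined with two reset-control bits. The sketch below only composes the values; it does not show how they reach the MCH, and the macro and function names are hypothetical. The only facts taken from the quote are that 0xA0 selects 3.125 Gbaud with the TSI578s held in reset, that 0xA3 releases them, and that the bottom two bits control the reset.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCH_REG10_SPEED_3G125     0xA0u /* value quoted for 3.125 Gbaud */
#define MCH_REG10_TSI_RESET_MASK  0x03u /* bottom two bits: TSI578 reset control */

/* Compose the register 0x10 value for the two steps of the quoted sequence:
 * first hold the switches in reset, then release them at the same speed. */
uint8_t mch_reg10_value(bool release_from_reset)
{
    uint8_t value = MCH_REG10_SPEED_3G125;
    if (release_from_reset) {
        value |= MCH_REG10_TSI_RESET_MASK;
    }
    return value;
}
```

Calling mch_reg10_value(false) and then mch_reg10_value(true) yields 0xA0 followed by 0xA3, matching the order of writes in the quote.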

  • OK, then something weird is happening, because even right out of power-up I cannot communicate with the switch.  My post from Jan 10th is from a power cycle of the chassis, so both the MCH and the EVM should be at zero!

    Would it be possible to get together with NAT and help me work through this?  They have started a problem support document that I could email you with everything we've already tried.

    Thanks,
    Brandy

  • Brandy,

    Just to be clear since I wasn't explicit earlier, the scripting I referred to is done through the console port, so PC --> USB --> MCH and uses hyperterminal on the PC.  It wasn't sent through the SRIO link itself.

    To your point, ackid alignment should not be an issue out of power-up since all are at zero.  However, I wouldn't rule out the possibility of input/output error stopped state. 

    You can send me the doc and I'll take a look.

    Regards,

    Travis

  • Travis,

    I removed this line of code (in red) in my reset routine and now I get a bit further.  That is, my maintenance writes are getting a completion code saying that the transfer was good.  However, my reads are still not getting a valid response.


    while ((hSrio->RIO_SP[0].RIO_SP_LM_RESP & 0x0000001F) != 0x10)
    {
        // issue reset request to partner
        //hSrio->RIO_SP[0].RIO_SP_LM_REQ = 0x3;
        CSL_SRIO_SendPortLinkMaintRequest(hSrio, 0, 3);

        // wait for response; the response is cleared after reading, so you can't trust a debug statement of this
        while ((hSrio->RIO_SP[0].RIO_SP_LM_RESP & 0x80000000) == 0) ;
        LOG_TRACE ("2 Link Main Response 0x%08x, ackid 0x%x, link stat 0x%x", hSrio->RIO_SP[0].RIO_SP_LM_RESP, (hSrio->RIO_SP[0].RIO_SP_LM_RESP & 0x000003E0) >> 5, (hSrio->RIO_SP[0].RIO_SP_LM_RESP & 0x0000001F));

        // reset local port
        // CSL_SRIO_SetEventMgmtResetRequest(hSrio,0xFF);
    }

    I also went back and tested this routine with the Vadatech MCH/TSI577 switch.  It still works like a champ!

    Thanks,

    Brandy
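
Because the link maintenance response is cleared after it is read, as the comment in the loop above notes, one option is to read the register once into a local variable and decode that single snapshot. The sketch below uses only the masks that appear in the loop (0x80000000 for response valid, 0x000003E0 for the ackID, 0x0000001F for the link status); the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of one link maintenance response snapshot. */
typedef struct {
    bool    valid;        /* bit 31: a response has been received */
    uint8_t ackid_status; /* bits 9:5: ackID reported by the link partner */
    uint8_t link_status;  /* bits 4:0: link status code */
} lm_resp_t;

/* Decode a raw value that was read from the register exactly once. */
lm_resp_t decode_lm_resp(uint32_t raw)
{
    lm_resp_t resp;
    resp.valid        = (raw & 0x80000000u) != 0;
    resp.ackid_status = (uint8_t)((raw & 0x000003E0u) >> 5);
    resp.link_status  = (uint8_t)(raw & 0x0000001Fu);
    return resp;
}
```

A caller would copy the register into a uint32_t first, pass that copy to decode_lm_resp(), and then log or compare the decoded fields, so that the logged values and the value tested by the loop come from the same read.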

  • Brandy,

    BrandyJ said:

    // reset local port
    // CSL_SRIO_SetEventMgmtResetRequest(hSrio,0xFF);
    }

    This line doesn't cause a port reset.  By writing this register, all you do is clear the RIO_EM_RST_PORT_STAT register.  This register indicates which port on the DSP received a reset link-request control symbol from its partner(s).  It is simply a flag register and should have no bearing on the behavior of sending/receiving packets.  On the DSP, which only supports 4 ports, only bits 3:0 are valid and the rest are reserved, but writing 0xFF instead of 0x0F shouldn't matter either.

    Something else is going on.

    Regards,

    Travis

  • Yeah, I agree it must be something else, since when I commented it out - there was no change :) 

    Thanks for continuing to support! 

  • Thank you everyone for your advice.  It turns out the MCH I was using had a hardware problem.  I am now back to the problem at hand: the discovery algorithm.  I will use all your hints here and start coding.  If I have more questions, I will start another post.

    Brandy