Hello,
My application is running on a board with multiple TI C6678 chips and 2 SRIO Switches.
I am using SRIO communication between C6678s either directly (chip-to-chip via SRIO Ports) or indirectly (via SRIO Switches).
I have several problems, but the biggest of them is that Type 11 messages are lost during SRIO communication.
Here are the configuration and test details.
1. Configuration - SRIO Device
- Four 1x SRIO Ports are configured; SRIO Port 3 is not used.
- Each SRIO Port operates at 1.25Gbps
- Each SRIO device is connected to the neighbouring SRIO devices via SRIO Ports 1 and 2.
- Each SRIO device is connected to the remote (non-neighbouring) SRIO devices via Port 0, which is connected to an SRIO Switch.
- 8 Device IDs (one per Core) are defined as follows: one ID is defined using the standard CSR and the remaining 7 IDs are defined using the TLM Port Base Routing registers.
- 6 Garbage queues are defined to collect descriptors in error situations.
2. Configuration - Queues, SRIO Drivers, Sockets, etc.
- We have a "total communication" requirement for SRIO: each Core on any C6678 chip should be able to communicate via SRIO with any Core on any other C6678 chip on the board.
- Type 11 messages are used for communication
- Given the requirement above and various limitations (16 SRIO queues per device, etc.), I came up with the SRIO topology below.
- 3 SRIO Queues/C6678 chip: each queue is assigned (CSL_SRIO_SetTxQueueSchedInfo) to one active SRIO Port (0, 1 and 2).
- 1 TX Free Queue/Core: 1 descriptor
- 1 RX Free Queue/Core: 1023 descriptors
- 1 RX Completion Queue/Core
- 4 SRIO Drivers/Core: Application Managed, Polling Mode
- 3 SRIO TX Drivers are associated (1:1) to the 3 SRIO Queues above. Each of the 3 Drivers manages SRIO transmission on a separate SRIO Port. These 3 Drivers are NOT configured for the receive operation.
- 1 SRIO RX Driver configured for the receive operation: It defines a Receive Flow which accepts messages for a given Device ID (CSL_SRIO_SetFlowControl). The Receive Flow uses the RX Free and RX Completion Queues above.
- 4 SRIO Sockets: Type 11, Raw, Non-Blocking, Multi Segment, one for each SRIO Driver above.
- The 3 SRIO sockets associated with the TX drivers have Pending Packet Count set to 8.
- The SRIO socket associated with the RX driver is bound to the Core's Device ID, accepts ANY Mailbox/Letter and has Pending Packet Count set to 1023 (the size of the RX Free Queue).
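The port selection implied by this topology can be sketched as below. This is only an illustration under assumptions that are not stated above: it supposes the chips are numbered around a ring, with the next chip reached through Port 1 and the previous chip through Port 2, and `NUM_CHIPS` and `select_tx_port` are hypothetical names that do not belong to the SRIO LLD. Any other destination goes through Port 0 and the switch.

```c
#include <assert.h>

/* Hypothetical value: the actual number of chips on the board is not given. */
#define NUM_CHIPS 4

enum { PORT_SWITCH = 0, PORT_NEIGHBOUR_NEXT = 1, PORT_NEIGHBOUR_PREV = 2 };

/* Pick the SRIO port (and therefore the TX queue, driver and socket)
 * for a destination chip, assuming the chips are numbered around a ring. */
int select_tx_port(int src_chip, int dst_chip)
{
    assert(src_chip >= 0 && src_chip < NUM_CHIPS);
    assert(dst_chip >= 0 && dst_chip < NUM_CHIPS);
    assert(src_chip != dst_chip);   /* same-chip traffic does not use SRIO */

    int next = (src_chip + 1) % NUM_CHIPS;
    int prev = (src_chip + NUM_CHIPS - 1) % NUM_CHIPS;

    if (dst_chip == next) return PORT_NEIGHBOUR_NEXT;  /* direct link */
    if (dst_chip == prev) return PORT_NEIGHBOUR_PREV;  /* direct link */
    return PORT_SWITCH;                                /* via the switch */
}
```

With these assumptions, chip 0 would reach chip 1 through Port 1, chip 3 through Port 2, and chip 2 through Port 0.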
3. SRIO Driver
- I made the following change to the standard driver as supplied by TI (pdk_C6678_1_1_2_5/packages/ti/drv/srio): when an SRIO Driver is started and a TX Queue is specified, the driver opens the associated CPDMA channel.
4. Tests
- Set-up:
- The 8 Cores on the same C6678 chip wake up every 10 ms and each Core sends a 1500-byte Type 11 message to a remote (non-neighbouring) Core.
- Since the destination Core is remote, SRIO Port 0 and the SRIO Switch will be used.
- Execution:
- Test 1: the 8 sending Cores send in 100 consecutive 10 ms time slots. Results are correct: each sending Core sends 100 messages and the destination Core receives a total of 800 messages.
- Test 2: the 8 sending Cores send in 200 consecutive 10 ms time slots. Results are NOT correct: each sending Core sends 200 messages, but the destination Core receives fewer messages than expected (e.g. 1222 instead of 1600).
- Investigation:
- When messages are "lost", no errors are reported (TX, RX, Garbage queues, etc.) and all the TX and RX queues on all Cores hold the expected number of descriptors.
- If I "space" the sending (e.g. every 3rd 10 ms time slot) and repeat Test 2, the result is correct: 1600 messages are received.
- If I increase the number of RX descriptors and the RX socket Pending Packet Count (e.g. 2047 instead of 1023) and repeat Test 2, the result is correct again: 1600 messages are received.
- The bitrate of Tests 1 and 2 is the same; the only difference is that Test 2 runs twice as long. Message size (I tried 100 bytes and 1500 bytes) seems irrelevant.
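To put the numbers above side by side, here is a small sketch of the arithmetic. It is not a diagnosis, it does not call the SRIO LLD, and the function names are made up for illustration. It computes the payload rate offered by the 8 senders and checks whether the total number of messages in a run exceeds the number of RX descriptors.

```c
#include <assert.h>

/* Payload bits per second offered by `cores` senders, each sending one
 * message of `msg_bytes` bytes every `period_ms` milliseconds. */
double offered_load_bps(int cores, int msg_bytes, int period_ms)
{
    assert(period_ms > 0);
    return (double)cores * msg_bytes * 8.0 * 1000.0 / period_ms;
}

/* Total messages the destination Core should receive after `slots`
 * time slots in which every sender transmits once. */
int expected_messages(int cores, int slots)
{
    return cores * slots;
}

/* Returns 1 when a run sends more messages than there are RX descriptors,
 * i.e. when descriptors must be recycled for the run to complete. */
int exceeds_rx_pool(int total_messages, int rx_descriptors)
{
    return total_messages > rx_descriptors;
}
```

With the values from the tests, `offered_load_bps(8, 1500, 10)` gives 9.6 Mbit/s of payload, a small fraction of the 1.25 Gbps port rate. Test 1 sends 800 messages (below 1023 descriptors), Test 2 sends 1600 (above 1023, below 2047). The "spaced" run also sends 1600 messages against 1023 descriptors and still succeeds, so the descriptor count alone does not explain the loss.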
5. Request
- I tried different investigation paths and I am looking for new ideas.
- I would appreciate any suggestions/help to solve this problem.
6. Please note that
- I have read the SRIO User Guide, SRIO LLD, Silicon Errata, etc.
- I have followed some of the related Forum discussions.
- I have successfully run most of the SRIO test programs provided by TI (pdk_C6678_1_1_2_5/packages/ti/drv/srio), including the test using 2 EVMs and a Break Out Card.
Thanks,
Sergiu