C6670 EMAC No TX Free Descriptor error

Hi TI experts,

I am trying to have each of the C6670's 4 cores simultaneously send 5 MAC packets back to back, each with a 600-byte data payload, every 1.5 microseconds through the C6670's EMAC port, and I get a "No TX Free Descriptor" error.

My setup: C6670 on a custom board, CCS v5.3, SYS/BIOS v6.33.06.50.

My project is based on C:\ti\pdk_C6670_1_1_2_6\packages\ti\drv\exampleProjects\PA_multicoreExample_exampleProject.

In my project, a GPIO interrupt every 1.5 microseconds notifies each of the C6670's 4 cores to send out one MAC packet with a 100-byte data payload, and everything works well.

But if I change this to 5 packets, each with a 600-byte payload, per 1.5 ms per core, I get a "No TX Free Descriptor" error.

As I understand it, each time a core sends a packet it pops a Host Descriptor from the TX Free Queue, sets the data buffer pointer in the Host Descriptor, and then pushes it onto the PA TX Queue (648), so the packet is sent out. After that, the Host Descriptor is returned to the TX Free Queue, so the TX Free Queue should never be exhausted.

I suppose that if I increase the payload size or the number of packets per interval, the Host Descriptors cannot return to the TX Free Queue in time, so the TX Free Queue runs empty and I get the "No TX Free Descriptor" error.

At the same time, I suspect that "each of the C6670's 4 cores simultaneously sending 5 MAC packets back to back, each with a 600-byte data payload, every 1.5 microseconds" is beyond the performance limit of the C6670's QMSS and Gigabit EMAC.

Can anyone give me some advice on how to proceed?
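To make this hypothesis concrete, here is a rough host-side model in plain C. It is only a sketch: it does not use the QMSS LLD, it just counts descriptors, and the names `TxPoolModel` and `first_starved_tick` are hypothetical. It assumes the core pops a fixed number of descriptors per send interval and the hardware recycles at most a fixed number per interval.

```c
#include <stdint.h>

/* Hypothetical host-side model of a TX free-descriptor pool.
 * Not QMSS code: it only counts descriptors. */
typedef struct {
    uint32_t free_count;   /* descriptors currently in the TX free queue */
    uint32_t in_flight;    /* descriptors pushed for transmit, not yet recycled */
} TxPoolModel;

/* Simulate `ticks` send intervals. In each interval the core tries to pop
 * `pops_per_tick` descriptors, then the hardware recycles at most
 * `returns_per_tick` of the in-flight ones.
 * Returns the 1-based interval in which a pop first fails, or 0 if none. */
uint32_t first_starved_tick(uint32_t pool_size, uint32_t pops_per_tick,
                            uint32_t returns_per_tick, uint32_t ticks)
{
    TxPoolModel m;
    uint32_t t, p, done;

    m.free_count = pool_size;
    m.in_flight  = 0;

    for (t = 1; t <= ticks; t++) {
        for (p = 0; p < pops_per_tick; p++) {
            if (m.free_count == 0)
                return t;              /* models "No TX Free Descriptor" */
            m.free_count--;
            m.in_flight++;
        }
        /* Recycle as many in-flight descriptors as the drain rate allows. */
        done = (m.in_flight < returns_per_tick) ? m.in_flight : returns_per_tick;
        m.in_flight  -= done;
        m.free_count += done;
    }
    return 0;
}
```

In this model the pool never starves while the recycle rate keeps up with the pop rate, and it starves after a finite number of intervals as soon as the pop rate is higher, no matter how large the pool is. For example, `first_starved_tick(8, 5, 3, 1000)` reports starvation in the third interval.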
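A quick back-of-the-envelope check of the offered load, as a sketch only. The function name `offered_load_bps` is mine, and it assumes standard Ethernet framing overhead of 38 bytes per frame on the wire (preamble and SFD 8, MAC header 14, FCS 4, inter-frame gap 12).

```c
#include <stdint.h>

/* Per-frame wire overhead on Ethernet: preamble + SFD (8) + MAC header (14)
 * + FCS (4) + inter-frame gap (12) = 38 bytes. */
#define ETH_OVERHEAD_BYTES 38u

/* Offered load in bits per second when `cores` cores each send `pkts`
 * frames carrying `payload_bytes` of payload every `period_ns` nanoseconds. */
double offered_load_bps(uint32_t cores, uint32_t pkts,
                        uint32_t payload_bytes, double period_ns)
{
    double bits = 8.0 * cores * pkts * (payload_bytes + ETH_OVERHEAD_BYTES);
    return bits * 1e9 / period_ns;
}
```

With 4 cores, 5 packets, and 600-byte payloads, `offered_load_bps(4, 5, 600, 1500.0)` (a 1.5 microsecond period) comes out at roughly 68 Gbit/s, far above what a 1 Gbit/s port can carry, while `offered_load_bps(4, 5, 600, 1.5e6)` (a 1.5 ms period) is roughly 68 Mbit/s. So the answer depends heavily on which period actually applies.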

The configuration code for QMSS/CPPI and the TX/RX queues is below:

cppi_qmss_mgmt.c
/**  
 * @file cppi_qmss_mgmt.c
 *
 * @brief 
 *  This file holds all the APIs required to configure CPPI/QMSS LLDs and 
 *  to send/receive data using PA/QM.
 *
 *  \par
 *  ============================================================================
 *  @n   (C) Copyright 2009, Texas Instruments, Inc.
 * 
 *  Redistribution and use in source and binary forms, with or without 
 *  modification, are permitted provided that the following conditions 
 *  are met:
 *
 *    Redistributions of source code must retain the above copyright 
 *    notice, this list of conditions and the following disclaimer.
 *
 *    Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the 
 *    documentation and/or other materials provided with the   
 *    distribution.
 *
 *    Neither the name of Texas Instruments Incorporated nor the names of
 *    its contributors may be used to endorse or promote products derived
 *    from this software without specific prior written permission.
 *
 *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
 *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
 *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 
 *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 
 *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 
 *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 
 *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 
 *  OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
*/
#include "multicore_example.h"
#include <ti/drv/qmss/qmss_firmware.h>
//JF: Added for Multicore SGMII Interrupt
#include <ti/sysbios/family/c64p/EventCombiner.h>

/* QMSS device specific configuration */
extern Qmss_GlobalConfigParams  qmssGblCfgParams;
/* CPPI device specific configuration */
extern Cppi_GlobalConfigParams  cppiGblCfgParams;

#if ( !defined( _LITTLE_ENDIAN ) && !defined( _BIG_ENDIAN ) ) \
||  ( defined(_LITTLE_ENDIAN ) && defined( _BIG_ENDIAN ) )
#error either _LITTLE_ENDIAN or _BIG_ENDIAN must be defined
#endif


#define PA_TX_Queue 736
#define PA_TX__General_Queue 1000

/* Number of Tx Free descriptors to allocate */
#define     NUM_TX_DESC                 260096//260096//NUM_HOST_DESC/2

/* Number of Rx Free descriptors to allocate */
#define     NUM_RX_DESC                2048// NUM_HOST_DESC/2

/* Buffer sizes configured for 
 * -  maximum command size to PA 
 * -  Maximum size of the control messages
 *    from DSP
 * - Maximum size of the packets being transmitted
 */
#define TX_BUF_SIZE 		(((700+16)/16)*16)
#define RX_BUF_SIZE 		TX_BUF_SIZE  


/* Host Descriptor Region - [Size of descriptor * Number of descriptors] 
 *
 * MUST be 16 byte aligned.
 */
#pragma DATA_SECTION(gHostDesc, ".sharedDDR")
#pragma DATA_ALIGN (gHostDesc, 16)
UInt8 far gHostDesc[SIZE_HOST_DESC * NUM_HOST_DESC];

/* Buffers to be used for TX */
//#pragma DATA_SECTION (cppiMemTX, ".cppiMemTX");
//#pragma DATA_ALIGN(cppiMemTX, 16)
//Uint8 cppiMemTX[NUM_TX_DESC][TX_BUF_SIZE];

/* Buffers to be used for RX */
#pragma DATA_SECTION (cppiMemRX, ".cppiMemRX");
#pragma DATA_ALIGN(cppiMemRX, 16)
Uint8 cppiMemRX[NUM_RX_DESC][RX_BUF_SIZE];

/* QMSS queue handles */

/* Queue with free descriptors */
#pragma DATA_SECTION(gGlobalFreeQHnd, ".sharedDDR")
Qmss_QueueHnd                  far         gGlobalFreeQHnd;

/* TX queues used to send data to PA PDSP/CPSW.*/
#pragma DATA_ALIGN   (gPaTxQHnd, 128)
#pragma DATA_SECTION(gPaTxQHnd, ".sharedDDR")
Qmss_QueueHnd           far                gPaTxQHnd [NUM_PA_TX_QUEUES];

/* TX queue with free descriptors attached to data buffers for transmission.*/
#pragma DATA_ALIGN   (gTxFreeQHnd, 128)
#pragma DATA_SECTION(gTxFreeQHnd, ".sharedDDR")
Qmss_QueueHnd             far              gTxFreeQHnd;

/* RX queue with free descriptors attached to data buffers to be used
   by the PASS CPDMA to hold the received data.*/
#pragma DATA_ALIGN   (gRxFreeQHnd, 128)
#pragma DATA_SECTION(gRxFreeQHnd, ".sharedDDR")
Qmss_QueueHnd              far             gRxFreeQHnd;

/* RX queue used by the application to receive packets from PASS/CPSW.
   Each core has an independent RX queue. */
#pragma DATA_SECTION(gRxQHnd, ".sharedDDR")
Qmss_QueueHnd             far              gRxQHnd[NUM_CORES];


/* CPPI Handles used by the application */
#pragma DATA_SECTION(gCpdmaHnd, ".sharedDDR")
Cppi_Handle               far              gCpdmaHnd;

#pragma DATA_SECTION(gCpdmaTxChanHnd, ".sharedDDR")
Cppi_ChHnd                  far            gCpdmaTxChanHnd [NUM_PA_TX_QUEUES];

#pragma DATA_SECTION(gCpdmaRxChanHnd, ".sharedDDR")
Cppi_ChHnd                 far             gCpdmaRxChanHnd [NUM_PA_RX_CHANNELS];

Cppi_FlowHnd                   gRxFlowHnd;


//******************************************************************//
//JF: Added for Multicore SGMII Interrupt
//******************************************************************//
#define		RX_INT_THRESHOLD			1u
#define		PA_ACC_CHANNEL_NUM			0u
#pragma DATA_ALIGN (gHiPriAccumList, 16)
UInt32                                  gHiPriAccumList[(RX_INT_THRESHOLD + 1) * 2];
Bool                                    gIsPingListUsed = 0;
//******************************************************************//
//JF: Added for Multicore SGMII Interrupt
//******************************************************************//

//JF: Add for MemoryRegion1 used by SRIO Driver
#define NUM_HOST_DESC0               128
#define SIZE_HOST_DESC0              48
#pragma DATA_SECTION(host_region, ".sharedDDR")
#pragma DATA_ALIGN (host_region, 16)
Uint8   far host_region[NUM_HOST_DESC0 * SIZE_HOST_DESC0];


//for NJ SGMII Send
#define		LAN_Buffer_Length_MAX		1024
#define		LAN_Buffer_HEAD_Length		14


/* Constructed data packet to send. 
   Each core will have a slightly modified version
   of this packet which is stored in the core's local memory. */
#define PACKET_UDP_DEST_PORT_SHIFT  36
#define PACKET_PAYLOAD_SHIFT        42
//#pragma DATA_ALIGN(pktMatch, 16)   /* pktMatch is commented out below, so the pragma is disabled too */
//UInt8 pktMatch[] = {
//							0x00, 0x0A, 0x35, 0x01, 0x02, 0x03,                      /* Dest MAC */
//							0x00, 0x0A, 0x35, 0x01, 0x02, 0x07,                      /* Src MAC  */
//                            0x08, 0x00,                                              /* Ethertype = IPv4 */
//                            0x45, 0x00, 0x00, 0x6c,                                  /* IP version, services, total length */
//                            0x00, 0x00, 0x00, 0x00,                                  /* IP ID, flags, fragment offset */
//                            0x05, 0x11, 0x32, 0x26,                                  /* IP ttl, protocol (UDP), header checksum */
//                            0xc0, 0xa8, 0x01, 0x01,                                  /* Source IP address */
//                            0xc0, 0xa8, 0x01, 0x0a,                                  /* Destination IP address */
//                            0x12, 0x34, 0x56, 0x78,                                  /* UDP source port, dest port */
//                            0x00, 0x58, 0x1d, 0x18,                                  /* UDP len, UDP checksum */
//                            0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39,          /* 80 bytes of payload data */
//                            0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, 0x40, 0x41,
//                            0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49,
//                            0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, 0x50, 0x51,
//                            0x52, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59,
//                            0x5a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, 0x60, 0x61,
//                            0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69,
//                            0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, 0x70, 0x71,
//                            0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79,
//                            0x7a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, 0x80, 0x81  };

/* Tx/Rx packet counters */
volatile UInt32						gTxCounter = 0, gRxCounter = 0;
//JF: Added for Multicore SGMII Interrupt
UInt32 printf_flag = 0;

/* High Priority Accumulation Interrupt Service Handler for this application */

/** ============================================================================
 *   @n@b Convert_CoreLocal2GlobalAddr
 *
 *   @b Description
 *   @n This API converts a core local L2 address to a global L2 address.
 *
 *   @param[in]  
 *   @n addr            L2 address to be converted to global.
 * 
 *   @return    UInt32
 *   @n >0              Global L2 address
 * =============================================================================
 */
UInt32 Convert_CoreLocal2GlobalAddr (UInt32  addr)
{
	UInt32 coreNum;

    /* Get the core number. */
    coreNum = CSL_chipReadReg(CSL_CHIP_DNUM); 

    /* Compute the global address. */
    return ((1 << 28) | (coreNum << 24) | (addr & 0x00ffffff));
}    

/** ============================================================================
 *   @n@b Init_Qmss
 *
 *   @b Description
 *   @n This API initializes the QMSS LLD on core 0 only.
 *
 *   @param[in]  
 *   @n None
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 Init_Qmss (Void)
{
    Int32                       result;
    Qmss_MemRegInfo             memCfg;
    Qmss_InitCfg                qmssInitConfig;
    Qmss_MemRegInfo     		memRegInfo;
    Cppi_DescCfg                cppiDescCfg;
    UInt32                      numAllocated;

    /* Initialize QMSS */
    memset (&qmssInitConfig, 0, sizeof (Qmss_InitCfg));

    /* Set up QMSS configuration */

    /* Use internal linking RAM */
    qmssInitConfig.linkingRAM0Base  =   0;   
    qmssInitConfig.linkingRAM0Size  =   0;
    qmssInitConfig.linkingRAM1Base  =   0x0C000000;
    qmssInitConfig.maxDescNum       =   NUM_HOST_DESC*2;
    
    qmssInitConfig.pdspFirmware[0].pdspId = Qmss_PdspId_PDSP1;
#ifdef _LITTLE_ENDIAN    
    qmssInitConfig.pdspFirmware[0].firmware = (void *) &acc48_le;
    qmssInitConfig.pdspFirmware[0].size = sizeof (acc48_le);
#else
    qmssInitConfig.pdspFirmware[0].firmware = (void *) &acc48_be;
    qmssInitConfig.pdspFirmware[0].size = sizeof (acc48_be);
#endif    

    /* Initialize the Queue Manager */
    result = Qmss_init (&qmssInitConfig, &qmssGblCfgParams);
    if (result != QMSS_SOK)
    {
        System_printf ("SGM Error initializing Queue Manager SubSystem, Error code : %d\n", result);
        return -1;
    }

    /* Start Queue manager on this core */
    Qmss_start ();

    /* Setup the descriptor memory regions. 
     *
     * The Descriptor base addresses MUST be global addresses and
     * all memory regions MUST be setup in ascending order of the
     * descriptor base addresses.
     */

    /* Initialize and setup CPSW Host Descriptors required for example */
    memset (gHostDesc, 0, SIZE_HOST_DESC * NUM_HOST_DESC);
    memCfg.descBase             =   (UInt32 *) gHostDesc;
    memCfg.descSize             =   SIZE_HOST_DESC;
    memCfg.descNum              =   NUM_HOST_DESC;
    memCfg.manageDescFlag       =   Qmss_ManageDesc_MANAGE_DESCRIPTOR;
    memCfg.memRegion            =   Qmss_MemRegion_MEMORY_REGION0;
    memCfg.startIndex           =   0;

    /* Insert Host Descriptor memory region */
    result = Qmss_insertMemoryRegion(&memCfg);
    if (result == QMSS_MEMREGION_ALREADY_INITIALIZED)
    {
        System_printf ("Memory Region %d already Initialized \n", memCfg.memRegion);
    }
    else if (result < QMSS_SOK)
    {
        System_printf ("Error: Inserting memory region %d, Error code : %d\n", memCfg.memRegion, result);
        return -1;
    }    

    //JF: Add for MemoryRegion1 used by SRIO Driver
    /* Memory Region 1 Configuration */
    //memRegInfo.descBase         = (UInt32 *)l2_global_address((UInt32)host_region);
    memRegInfo.descBase         = (UInt32 *) host_region;
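    memRegInfo.descSize         = SIZE_HOST_DESC0;  /* must match the size used to declare host_region */
    memRegInfo.descNum          = NUM_HOST_DESC0;   /* must match the count used to declare host_region */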
    memRegInfo.manageDescFlag   = Qmss_ManageDesc_MANAGE_DESCRIPTOR;
    memRegInfo.memRegion        = Qmss_MemRegion_MEMORY_REGION1;//JF:change from Qmss_MemRegion_MEMORY_REGION_NOT_SPECIFIED
    memRegInfo.startIndex		= NUM_HOST_DESC;//JF: JF added
    result = Qmss_insertMemoryRegion (&memRegInfo);
    if (result < QMSS_SOK)
    {
        System_printf ("Error inserting memory region: %d\n", result);
        return -1;
    }

    /* Initialize all the descriptors we just allocated on the
     * memory region above. Setup the descriptors with some well
     * known values before we use them for data transfers.
     */
    memset (&cppiDescCfg, 0, sizeof (cppiDescCfg));
    cppiDescCfg.memRegion       =   Qmss_MemRegion_MEMORY_REGION0;
    cppiDescCfg.descNum         =   NUM_HOST_DESC;
    cppiDescCfg.destQueueNum    =   QMSS_PARAM_NOT_SPECIFIED;     
    cppiDescCfg.queueType       =   Qmss_QueueType_GENERAL_PURPOSE_QUEUE;
    cppiDescCfg.initDesc        =   Cppi_InitDesc_INIT_DESCRIPTOR;
    cppiDescCfg.descType        =   Cppi_DescType_HOST;
    
    /* By default:
     *      (1) Return descriptors to tail of queue 
     *      (2) Always return entire packet to this free queue
     *      (3) Set that PS Data is always present in start of SOP buffer
     *      (4) Configure free q num < 4K, hence qMgr = 0
     *      (5) Recycle back to the same Free queue by default.
     */
    cppiDescCfg.returnPushPolicy            =   Qmss_Location_TAIL;    
    cppiDescCfg.cfg.host.returnPolicy       =   Cppi_ReturnPolicy_RETURN_ENTIRE_PACKET;    
    cppiDescCfg.cfg.host.psLocation         =   Cppi_PSLoc_PS_IN_DESC;         
    cppiDescCfg.returnQueue.qMgr            =   0;    
    cppiDescCfg.returnQueue.qNum            =   QMSS_PARAM_NOT_SPECIFIED; 
    cppiDescCfg.epibPresent                 =   Cppi_EPIB_EPIB_PRESENT;
    
    /* Initialize the descriptors, create a free queue and push descriptors to a global free queue */
    if ((gGlobalFreeQHnd = Cppi_initDescriptor (&cppiDescCfg, &numAllocated)) <= 0)
    {
        System_printf ("Error Initializing Free Descriptors, Error: %d \n", gGlobalFreeQHnd);
        return -1;
    }
    else
    {
        System_printf ("Initializing Free Descriptors. \n");
    }        
   
    /* Queue Manager Initialization Done */
    return 0;
}


/** ============================================================================
 *   @n@b Init_Qmss_Local
 *
 *   @b Description
 *   @n This API initializes the QMSS LLD in cores other than core 0.
 *
 *   @param[in]  
 *   @n None
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 Init_Qmss_Local (Void)
{
  Int32            result;

  while(1)
  {
      /* Block until Qmss_init() has completed by core 0 */
      result = Qmss_start();
      if(result == QMSS_NOT_INITIALIZED)
      {
          System_printf ("QMSS Not yet Initialized\n");
          continue;
      }
      else if (result != QMSS_SOK)  {
        System_printf ("Qmss_start failed with error code %d\n", result);
        return (-1);
      }

      if (result == QMSS_SOK) 
      {
          break;
      }
  }

  return 0;
}


/** ============================================================================
 *   @n@b Init_Cppi
 *
 *   @b Description
 *   @n This API initializes the CPPI LLD, opens the PASS CPDMA and opens up
 *      the Tx, Rx channels required for data transfers.
 *
 *   @param[in]  
 *   @n None
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 Init_Cppi (Void)
{
    Int32                       result, i;        
    Cppi_CpDmaInitCfg           cpdmaCfg;
    UInt8                       isAllocated;        
    Cppi_TxChInitCfg            txChCfg;
    Cppi_RxChInitCfg            rxChInitCfg;

    /* Initialize CPPI LLD */
    result = Cppi_init (&cppiGblCfgParams);
    if (result != CPPI_SOK)
    {
        System_printf ("Error initializing CPPI LLD, Error code : %d\n", result);
        return -1;
    }

    /* Initialize PASS CPDMA */
    memset (&cpdmaCfg, 0, sizeof (Cppi_CpDmaInitCfg));
    cpdmaCfg.dmaNum     = Cppi_CpDma_PASS_CPDMA;
    if ((gCpdmaHnd = Cppi_open (&cpdmaCfg)) == NULL)
    {
        System_printf ("Error initializing CPPI for PASS CPDMA %d \n", cpdmaCfg.dmaNum);
        return -1;
    }    

    /* Open all CPPI Tx Channels. These will be used to send data to PASS/CPSW */             
    for (i = 0; i < NUM_PA_TX_QUEUES; i ++)
    {
        txChCfg.channelNum      =   i;       /* CPPI channels are mapped one-one to the PA Tx queues */
        txChCfg.txEnable        =   Cppi_ChState_CHANNEL_DISABLE;  /* Disable the channel for now. */
        txChCfg.filterEPIB      =   0;
        txChCfg.filterPS        =   0;
        txChCfg.aifMonoMode     =   0;
        txChCfg.priority        =   2;
        if ((gCpdmaTxChanHnd[i] = Cppi_txChannelOpen (gCpdmaHnd, &txChCfg, &isAllocated)) == NULL)
        {
            System_printf ("Error opening Tx channel %d\n", txChCfg.channelNum);
            return -1;
        }

        Cppi_channelEnable (gCpdmaTxChanHnd[i]);
    }

    /* Open all CPPI Rx channels. These will be used by PA to stream data out. */
    for (i = 0; i < NUM_PA_RX_CHANNELS; i++)
    {
        /* Open a CPPI Rx channel that will be used by PA to stream data out. */
        rxChInitCfg.channelNum  =   i; 
        rxChInitCfg.rxEnable    =   Cppi_ChState_CHANNEL_DISABLE; 
        if ((gCpdmaRxChanHnd[i] = Cppi_rxChannelOpen (gCpdmaHnd, &rxChInitCfg, &isAllocated)) == NULL)
        {
            System_printf ("Error opening Rx channel: %d \n", rxChInitCfg.channelNum);
            return -1;
        }

        /* Also enable Rx Channel */
        Cppi_channelEnable (gCpdmaRxChanHnd[i]);    
    }
    
    /* Clear CPPI Loopback bit in PASS CPDMA Global Emulation Control Register */
    Cppi_setCpdmaLoopback(gCpdmaHnd, 0);   

    //JF: Added for IPC work well
    result = Cppi_init (&cppiGblCfgParams);
    if (result != CPPI_SOK)
    {
        System_printf ("Error initializing CPPI LLD, Error code : %d\n", result);
        return -1;
    }

    /* CPPI Init Done. Return success */
    return 0;
}    

/** ============================================================================
 *   @n@b Setup_Tx
 *
 *   @b Description
 *   @n This API sets up all relevant data structures and configuration required
 *      for sending data to PASS/Ethernet. It sets up a Tx free descriptor queue,
 *      PASS Tx queues required for send.
 *
 *   @param[in]  
 *   @n None
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 Setup_Tx (Void)
{
    UInt8                       isAllocated;
    UInt32						i;
    Qmss_Queue                  qInfo;
    Ptr                   		pCppiDesc;

	UInt32 coreNum;
    coreNum = CSL_chipReadReg(CSL_CHIP_DNUM);

    /* Open all Transmit (Tx) queues. 
     *
     * These queues are used to send data to PA PDSP/CPSW.
     */
	for (i = 0; i < NUM_PA_TX_QUEUES; i ++)
	{

		if ((gPaTxQHnd[i] = Qmss_queueOpen (Qmss_QueueType_PASS_QUEUE, QMSS_PARAM_NOT_SPECIFIED, &isAllocated)) < 0)
		{
			System_printf ("Error opening PA Tx queue \n");
			return -1;
		}
	}
	SYS_CACHE_WB ((void *)gPaTxQHnd, 128, CACHE_WAIT);
    

    /* Open a Tx Free Descriptor Queue (Tx FDQ). 
     *
     * This queue will be used to hold Tx free descriptors that can be filled
     * later with data buffers for transmission onto wire.
     */
//    if ((gTxFreeQHnd = Qmss_queueOpen (Qmss_QueueType_STARVATION_COUNTER_QUEUE, PA_TX_Queue + coreNum, &isAllocated)) < 0)
//    {
//        System_printf ("Error opening Tx Free descriptor queue \n");
//        return -1;
//    }

    if ((gTxFreeQHnd = Qmss_queueOpen (Qmss_QueueType_GENERAL_PURPOSE_QUEUE, PA_TX__General_Queue + coreNum, &isAllocated)) < 0)
    {
        System_printf ("Error opening Tx Free descriptor queue \n");
        return -1;
    }
    
    SYS_CACHE_WB ((void *)&gTxFreeQHnd, 128, CACHE_WAIT);
            

    qInfo = Qmss_getQueueNumber (gTxFreeQHnd);

    /* Attach some free descriptors to the Tx free queue we just opened. */
    for (i = 0; i < NUM_TX_DESC; i++)
    {
        /* Get a free descriptor from the global free queue we setup 
         * during initialization.
         */
        if ((pCppiDesc = Qmss_queuePop (gGlobalFreeQHnd)) == NULL)
        {
            break;                
        }

        /* The descriptor address returned from the hardware has the 
         * descriptor size appended to the address in the last 4 bits.
         *
         * To get the true descriptor address, always mask off the last 
         * 4 bits of the address.
         */
        pCppiDesc = (Ptr) ((UInt32) pCppiDesc & 0xFFFFFFF0);

//        /* Populate the Tx free descriptor with the buffer. */
//        Cppi_setData (Cppi_DescType_HOST, pCppiDesc, (Uint8 *)(&cppiMemTX[i]), TX_BUF_SIZE);
//
//        /* Save original buffer information */
//        Cppi_setOriginalBufInfo (Cppi_DescType_HOST, pCppiDesc, (Uint8 *)(&cppiMemTX[i]), TX_BUF_SIZE);

        /* Setup the Completion queue:
         *
         * Setup the return policy for this desc to return to the free q we just
         * setup instead of the global free queue.
         */
        Cppi_setReturnQueue ((Cppi_DescType) Cppi_DescType_HOST, pCppiDesc, qInfo);

//        Cppi_setPacketLen    (Cppi_DescType_HOST, pCppiDesc, TX_BUF_SIZE);
//
//        SYS_CACHE_WB (pCppiDesc, SIZE_HOST_DESC, CACHE_FENCE_WAIT);

        /* Push descriptor to Tx free queue */
        Qmss_queuePushDescSize (gTxFreeQHnd, pCppiDesc, SIZE_HOST_DESC);
    }
    if (i != NUM_TX_DESC)
    {
        System_printf ("Error allocating Tx free descriptors \n");            
        return -1;
    }

    /* All done with Tx configuration. Return success. */
    return 0;
}

/** ============================================================================
 *   @n@b Setup_Rx
 *
 *   @b Description
 *   @n This API sets up all relevant data structures and configuration required
 *      for receiving data from PASS/Ethernet. It sets up a Rx free descriptor queue
 *      with some empty pre-allocated buffers to receive data, and an Rx queue
 *      to which the Rxed data is streamed for the example application. 
 *
 *   @param[in]  
 *   @n SGMII_ISR_arg   Interrupt handler that is plugged into the event
 *                      combiner for the accumulator event.
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 Setup_Rx (void * SGMII_ISR_arg)
{
    UInt8                       isAllocated;
    UInt32						i;
    Qmss_Queue                  rxFreeQInfo, rxQInfo;
    Ptr                   		pCppiDesc;
    Cppi_RxFlowCfg              rxFlowCfg;
    Ptr                         pDataBuffer;
    Uint32                      mySWInfo[] = {0x11112222, 0x33334444};
	UInt32                      coreNum;

	 //JF: Added for Multicore SGMII Interrupt
	UInt16                      numAccEntries, intThreshold;
	UInt8                       accChannelNum;
	Int32                       result;
	Qmss_AccCmdCfg              accCfg;
	Int16                       eventId;

    /* Get the core number. */
    coreNum = CSL_chipReadReg(CSL_CHIP_DNUM); 
    
    /* Open a Receive (Rx) queue. 
     *
     * This queue will be used to hold all the packets received by PASS/CPSW
     *
     */
    if ((gRxQHnd[coreNum] = Qmss_queueOpen (Qmss_QueueType_GENERAL_PURPOSE_QUEUE, RX_QUEUE_NUM_INIT+coreNum, &isAllocated)) < 0)
    {
        System_printf ("Error opening gRxQHnd queue \n");
        return -1;
    }            
    rxQInfo = Qmss_getQueueNumber (gRxQHnd[coreNum]);

//if(coreNum == 0)
//{
    //******************************************************************//
    //JF: Added for Multicore SGMII Interrupt
    //******************************************************************//
    intThreshold    =   RX_INT_THRESHOLD;
    numAccEntries   =   (intThreshold + 1) * 2;
    accChannelNum   =   PA_ACC_CHANNEL_NUM+coreNum;
    memset ((Void *) gHiPriAccumList, 0, numAccEntries * 4);
    result = Qmss_disableAccumulator (Qmss_PdspId_PDSP1, accChannelNum);
    if (result != QMSS_ACC_SOK && result != QMSS_ACC_CHANNEL_NOT_ACTIVE)
    {
        System_printf ("Error Disabling high priority accumulator for channel : %d error code: %d %d\n",
                      accChannelNum, result, coreNum);
        return -1;
    }
    accCfg.channel             =   accChannelNum;
    accCfg.command             =   Qmss_AccCmd_ENABLE_CHANNEL;
    accCfg.queueEnMask         =   0;
    accCfg.listAddress         =   Convert_CoreLocal2GlobalAddr((Uint32) gHiPriAccumList);
    accCfg.queMgrIndex         =   gRxQHnd[coreNum];
    accCfg.maxPageEntries      =   (intThreshold + 1); /* Add an extra entry for holding the entry count */
    accCfg.timerLoadCount      =   0;
    accCfg.interruptPacingMode =   Qmss_AccPacingMode_LAST_INTERRUPT;
    accCfg.listEntrySize       =   Qmss_AccEntrySize_REG_D;
    accCfg.listCountMode       =   Qmss_AccCountMode_ENTRY_COUNT;
    accCfg.multiQueueMode      =   Qmss_AccQueueMode_SINGLE_QUEUE;//Qmss_AccQueueMode_SINGLE_QUEUEQmss_AccQueueMode_MULTI_QUEUE
    if ((result = Qmss_programAccumulator (Qmss_PdspId_PDSP1, &accCfg)) != QMSS_ACC_SOK)
    {
        System_printf ("Error Programming high priority accumulator for channel : %d queue : %d error code : %d\n",
                        accCfg.channel, accCfg.queMgrIndex, result);
        return -1;
    }
    eventId     	=   48;
    EventCombiner_dispatchPlug (eventId, (EventCombiner_FuncPtr)SGMII_ISR_arg, (UArg)NULL, TRUE);
    //******************************************************************//
    //JF: Added for Multicore SGMII Interrupt
    //******************************************************************//
//}



    /* The following RX queues are shared between cores, so their
       initialization is done by core zero only*/
    if(!coreNum)
    {   
        /* Open a Rx Free Descriptor Queue (Rx FDQ). 
         *
         * This queue will hold all the Rx free descriptors. These descriptors will be
         * used by the PASS CPDMA to hold data received via CPSW.
         */
        if ((gRxFreeQHnd = Qmss_queueOpen (Qmss_QueueType_STARVATION_COUNTER_QUEUE, QMSS_PARAM_NOT_SPECIFIED, &isAllocated)) < 0)
        {
            System_printf ("Error opening Rx Free descriptor queue \n");
            return -1;
        }            
        
        SYS_CACHE_WB ((void *)&gRxFreeQHnd, 128, CACHE_WAIT);
        
        rxFreeQInfo = Qmss_getQueueNumber (gRxFreeQHnd);

        /* Attach some free descriptors to the Rx free queue we just opened. */
        for (i = 0; i < NUM_RX_DESC; i++)
        {
            /* Get a free descriptor from the global free queue we setup 
             * during initialization.
             */
            if ((pCppiDesc = Qmss_queuePop (gGlobalFreeQHnd)) == NULL)
            {
                System_printf ("Error popping descriptor.\n");
                break;                
            }

            /* The descriptor address returned from the hardware has the 
             * descriptor size appended to the address in the last 4 bits.
             *
             * To get the true descriptor address, always mask off the last 
             * 4 bits of the address.
             */
            pCppiDesc = (Ptr) ((UInt32) pCppiDesc & 0xFFFFFFF0);
            
            pDataBuffer = (Uint8 *)(&cppiMemRX[i]);
            /* Populate the Rx free descriptor with the buffer we just allocated. */
            Cppi_setData (Cppi_DescType_HOST, pCppiDesc, (UInt8 *)pDataBuffer, RX_BUF_SIZE);

            /* Save original buffer information */
            Cppi_setOriginalBufInfo (Cppi_DescType_HOST, pCppiDesc, (UInt8 *)pDataBuffer, RX_BUF_SIZE);
            

            /* Setup the Completion queue:
             *
             * Setup the return policy for this desc to return to the free q we just
             * setup instead of the global free queue.
             */
            Cppi_setReturnQueue (Cppi_DescType_HOST, pCppiDesc, rxFreeQInfo);

            Cppi_setSoftwareInfo (Cppi_DescType_HOST, pCppiDesc, (UInt8 *) mySWInfo);

            Cppi_setPacketLen    (Cppi_DescType_HOST, pCppiDesc, RX_BUF_SIZE);
            
            SYS_CACHE_WB (pCppiDesc, SIZE_HOST_DESC, CACHE_FENCE_WAIT);
            
            /* Push descriptor to Rx free queue */
            Qmss_queuePushDescSize (gRxFreeQHnd, pCppiDesc, SIZE_HOST_DESC);           
        }        
        if (i != NUM_RX_DESC)
        {
            System_printf ("Error allocating Rx free descriptors \n");
            return -1;
        }
    }
    
    /* Setup a Rx Flow on each core. The only difference among the cores is the rxQInfo.
     *
     * A Rx flow encapsulates all relevant data properties that CPDMA would
     * have to know in order to successfully receive data.
     */
    /* Initialize the flow configuration */
    memset (&rxFlowCfg, 0, sizeof(Cppi_RxFlowCfg));
    rxFreeQInfo = Qmss_getQueueNumber (gRxFreeQHnd);

    /* Let CPPI pick the next available flow */
    rxFlowCfg.flowIdNum             =   CPPI_PARAM_NOT_SPECIFIED;    

    rxFlowCfg.rx_dest_qmgr          =   rxQInfo.qMgr;    
    rxFlowCfg.rx_dest_qnum          =   rxQInfo.qNum;  
    rxFlowCfg.rx_desc_type          =   Cppi_DescType_HOST; 

    rxFlowCfg.rx_ps_location        =   Cppi_PSLoc_PS_IN_DESC;  
    rxFlowCfg.rx_psinfo_present     =   1;    /* Enable PS info */
    
    rxFlowCfg.rx_error_handling     =   0;    /* Drop the packet, do not retry on starvation by default */       
    rxFlowCfg.rx_einfo_present      =   1;    /* EPIB info present */       
    
    rxFlowCfg.rx_dest_tag_lo_sel    =   0;    /* Disable tagging */
    rxFlowCfg.rx_dest_tag_hi_sel    =   0;    
    rxFlowCfg.rx_src_tag_lo_sel     =   0;    
    rxFlowCfg.rx_src_tag_hi_sel     =   0;    

    rxFlowCfg.rx_size_thresh0_en    =   0;    /* By default, we disable Rx Thresholds */
    rxFlowCfg.rx_size_thresh1_en    =   0;    /* By default, we disable Rx Thresholds */
    rxFlowCfg.rx_size_thresh2_en    =   0;    /* By default, we disable Rx Thresholds */
    rxFlowCfg.rx_size_thresh0       =   0x0;
    rxFlowCfg.rx_size_thresh1       =   0x0;
    rxFlowCfg.rx_size_thresh2       =   0x0;

    rxFlowCfg.rx_fdq0_sz0_qmgr      =   rxFreeQInfo.qMgr; /* Setup the Receive free queue for the flow */
    rxFlowCfg.rx_fdq0_sz0_qnum      =   rxFreeQInfo.qNum;    
    rxFlowCfg.rx_fdq0_sz1_qnum      =   0x0; 
    rxFlowCfg.rx_fdq0_sz1_qmgr      =   0x0;
    rxFlowCfg.rx_fdq0_sz2_qnum      =   0x0;
    rxFlowCfg.rx_fdq0_sz2_qmgr      =   0x0;
    rxFlowCfg.rx_fdq0_sz3_qnum      =   0x0;
    rxFlowCfg.rx_fdq0_sz3_qmgr      =   0x0;

    rxFlowCfg.rx_fdq1_qnum          =   rxFreeQInfo.qNum;  /* Use the Rx free queue to pick descriptors */
    rxFlowCfg.rx_fdq1_qmgr          =   rxFreeQInfo.qMgr;
    rxFlowCfg.rx_fdq2_qnum          =   rxFreeQInfo.qNum;  /* Use the Rx free queue to pick descriptors */
    rxFlowCfg.rx_fdq2_qmgr          =   rxFreeQInfo.qMgr;
    rxFlowCfg.rx_fdq3_qnum          =   rxFreeQInfo.qNum;  /* Use the Rx free queue to pick descriptors */
    rxFlowCfg.rx_fdq3_qmgr          =   rxFreeQInfo.qMgr;

    /* Configure the Rx flow */
    if ((gRxFlowHnd = Cppi_configureRxFlow (gCpdmaHnd, &rxFlowCfg, &isAllocated)) == NULL)
    {
        System_printf ("Error configuring Rx flow \n");
        return -1;
    }

    /* All done with Rx configuration. Return success. */
    return 0;
}

/** ============================================================================
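 *   @n@b ICT_BBU_BSP_SGMII_SendPacket
 *
 *   @b Description
 *   @n This API is called to send data out onto the wire over Ethernet.
 *      On success, this API increments a global Tx counter.
 *
 *   @param[in]  payload      Payload bytes to copy into the frame
 *   @param[in]  length       Payload length in bytes
 *   @param[in]  dst_MACAddr  6-byte destination MAC address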
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
UInt8 myDSP_MACAddr_copy[4][6];
UInt32			queue_pkt_num  = 0;
Int32 ICT_BBU_BSP_SGMII_SendPacket (UInt8 *payload, Uint32 length, UInt8 * dst_MACAddr)
{
    Cppi_Desc*      pCppiDesc;
    UInt32          dataBufferSize;
    char            psFlags = (!cpswSimTest)?pa_EMAC_PORT_0:pa_EMAC_PORT_1;	//hby_change;
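    /* NOTE: pktMatch lives on the stack, but the descriptor pushed below still
     * points at it after this function returns, so the hardware may read the
     * buffer after its storage has been reused. */
    UInt8           pktMatch[LAN_Buffer_Length_MAX];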
    Int16      		ii;
    UInt32			cycle_i = 0;


	UInt32 coreNum;
    coreNum = CSL_chipReadReg(CSL_CHIP_DNUM);

    for (ii=0; ii<6; ii++)
	{
		pktMatch[ii] = dst_MACAddr[ii];
	}
    for (ii=0; ii<6; ii++)
	{
		pktMatch[ii+6] = myDSP_MACAddr_copy[coreNum][ii];
	}
    pktMatch[12] = 0x08;
    pktMatch[13] = 0x00;

    if (length < LAN_Buffer_Length_MAX - LAN_Buffer_HEAD_Length)
    {
    	for (ii=0; ii<length; ii++)
		{
			pktMatch[ii+LAN_Buffer_HEAD_Length] = payload[ii];
		}
    	for (ii=length; ii<LAN_Buffer_Length_MAX - LAN_Buffer_HEAD_Length; ii++)
		{
			pktMatch[ii+LAN_Buffer_HEAD_Length] = 0;
		}
    }
    else
    {
    	System_printf ("The data you transmit is too long \n");
    	BIOS_exit (-1);
    }

    do
    {
    	queue_pkt_num = Qmss_getQueueEntryCount(gTxFreeQHnd);
    }while(queue_pkt_num == 0);

    if ((pCppiDesc = Qmss_queuePop (gTxFreeQHnd)) == NULL)
    {
        System_printf ("No Tx free descriptor. Can't run send/rcv test \n");
        return -1;
    }

    /* The descriptor address returned from the hardware has the
     * descriptor size appended to the address in the last 4 bits.
     *
     * To get the true descriptor address, always mask off the last
     * 4 bits of the address.
     */
    pCppiDesc = (Ptr) ((UInt32) pCppiDesc & 0xFFFFFFF0);


    //dataBufferSize  =   sizeof (pktMatch);
    dataBufferSize = length+14;

//    /* Disable Interrupts */
//    key = Hwi_disable();

    /* Cleanup the prefetch buffer also. */
    CSL_XMC_invalidatePrefetchBuffer();

    SYS_CACHE_INV (pCppiDesc, SIZE_HOST_DESC, CACHE_FENCE_WAIT);

    Cppi_setData (  Cppi_DescType_HOST,
                    (Cppi_Desc *) pCppiDesc, 
                    (UInt8 *) Convert_CoreLocal2GlobalAddr((UInt32)pktMatch),
                    dataBufferSize
                 );
    Cppi_setPacketLen (Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, dataBufferSize);
    
    if (cpswLpbkMode != CPSW_LOOPBACK_NONE)
    {
        /* Force the packet to specific EMAC port if loopback is enabled */
        Cppi_setPSFlags(Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, psFlags);
    }
    else
    {
        Cppi_setPSFlags(Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, 0);
    }

    /* Clear PS Data */
    Cppi_setPSLen (Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, 0);

    SYS_CACHE_WB (pCppiDesc, SIZE_HOST_DESC, CACHE_FENCE_WAIT);
  
//    /* Reenable Interrupts. */
//    Hwi_restore(key);

    /* Send the packet out the mac. It will loop back to PA if the mac/switch
     * have been configured properly
     */
    Qmss_queuePushDescSize (gPaTxQHnd[8], pCppiDesc, SIZE_HOST_DESC);

    /* Increment the application transmit counter */
    gTxCounter ++;

    /* Give some time for the PA to process the packet */
    //CycleDelay (10000);
//	for(cycle_i = 0; cycle_i < 1000; cycle_i++)
//		asm(" nop 5");

    return 0; 
}

/** ============================================================================
 *   @n@b ReceivePacket
 *
 *   @b Description
 *   @n This API is called to Receive packets.
 *
 *   @param[in]  
 *   @n None
 * 
 *   @return    Int32
 *              -1      -   Error
 *              0       -   Success
 * =============================================================================
 */
Int32 ReceivePacket (void)
{
	Cppi_Desc     *hd;
	Int            j;
	UInt32         coreNum;
    Int32          status=0;

    /* Get the core number. */
    coreNum = CSL_chipReadReg(CSL_CHIP_DNUM); 
	
	/* Wait for a data packet from PA */
    for (j = 0; j < 100; j++)  
    {
      CycleDelay (1000);
      if (Qmss_getQueueEntryCount (gRxQHnd[coreNum]) > 0)   
      {
    	System_printf ("One Packet Received Done.\n");
        hd = (Cppi_Desc *)(((UInt32)Qmss_queuePop (gRxQHnd[coreNum])) & ~0xf);
        if(VerifyPacket(hd) != 0)
            status=-1;
      }
    } 
    
    return (status);
}
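
One thing that stands out in the send function above: pktMatch is a local array, yet its address is written into the descriptor and the function returns right after the push. The block below is only a minimal host-side sketch of one alternative, assuming each Tx descriptor owns a statically allocated buffer that stays valid until the descriptor comes back to the free queue. The names TxSlot, txPoolInit, txPoolPop, txPoolPush and txBuildFrame, as well as the pool depth and buffer size, are hypothetical and are not part of the QMSS/CPPI LLD. The ring stands in for the Tx free queue so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_TX_SLOTS  8u     /* hypothetical pool depth */
#define TX_BUF_SIZE   1536u  /* hypothetical: room for one full frame */
#define ETH_HDR_LEN   14u    /* dst MAC + src MAC + EtherType */

/* Hypothetical stand-in for one Tx descriptor and the buffer it owns. */
typedef struct {
    uint8_t  buf[TX_BUF_SIZE];
    uint32_t len;
} TxSlot;

static TxSlot   txSlots[NUM_TX_SLOTS];   /* static storage, not stack */
static TxSlot  *freeRing[NUM_TX_SLOTS];  /* models the Tx free queue  */
static unsigned freeHead, freeCount;

/* Put every slot on the free ring (models the one-time Tx setup). */
void txPoolInit(void)
{
    unsigned i;
    freeHead = 0;
    freeCount = 0;
    for (i = 0; i < NUM_TX_SLOTS; i++)
        freeRing[freeCount++] = &txSlots[i];
}

/* Pop a free slot, or return NULL if the pool is exhausted. */
TxSlot *txPoolPop(void)
{
    TxSlot *slot;
    if (freeCount == 0)
        return NULL;
    slot = freeRing[freeHead];
    freeHead = (freeHead + 1u) % NUM_TX_SLOTS;
    freeCount--;
    return slot;
}

/* Return a slot once the hardware is done with it. */
void txPoolPush(TxSlot *slot)
{
    assert(freeCount < NUM_TX_SLOTS);
    freeRing[(freeHead + freeCount) % NUM_TX_SLOTS] = slot;
    freeCount++;
}

/* Build the 14-byte header (EtherType 0x0800, as in the listing) plus the
 * payload in the slot's own buffer.
 * Returns the frame length, or 0 if the payload does not fit. */
uint32_t txBuildFrame(TxSlot *slot, const uint8_t dst[6], const uint8_t src[6],
                      const uint8_t *payload, uint32_t length)
{
    if (length > TX_BUF_SIZE - ETH_HDR_LEN)
        return 0;
    memcpy(&slot->buf[0], dst, 6);
    memcpy(&slot->buf[6], src, 6);
    slot->buf[12] = 0x08;
    slot->buf[13] = 0x00;
    memcpy(&slot->buf[ETH_HDR_LEN], payload, length);
    slot->len = length + ETH_HDR_LEN;
    return slot->len;
}
```

In this model the caller gets NULL when every slot is in flight, which is the same condition that produces the "No Tx free descriptor" message. On the real device the buffer address would be attached to each descriptor once during Tx setup, and the pool depth would have to cover the largest burst that can be outstanding at once.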

  • If I change the code below

        if (cpswLpbkMode != CPSW_LOOPBACK_NONE)
        {
            /* Force the packet to specific EMAC port if loopback is enabled */
            Cppi_setPSFlags(Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, psFlags);
        }
        else
        {
            Cppi_setPSFlags(Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, 0);
        }

    to

     Cppi_setPSFlags(Cppi_DescType_HOST, (Cppi_Desc *)pCppiDesc, 1);

    then the DSP bypasses the ALE and sends the packet out directly through port 1, which is connected to a switch on our custom board.

    With this change, the performance of sending MAC packets improves considerably.

  • Hi Feng Jin,
    Apologies for the delay, and thank you for the update.
    Please refer to the Throughput Performance Guide for C66x KeyStone Devices (SPRABK5A1) for Ethernet performance numbers, and to the thread below for ways to improve performance.
    e2e.ti.com/.../199755
    Thank you.
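
Before comparing against the numbers in the performance guide, it may help to work out the offered load on the wire. The question mentions both 1.5 microseconds and 1.5 ms as the interval, so the sketch below evaluates either. It assumes standard Ethernet per-frame overhead (8 bytes preamble and SFD, 14 bytes header, 4 bytes FCS, 12 bytes inter-frame gap, and a 46-byte minimum payload). The function names are hypothetical helpers, not part of any TI library.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_PREAMBLE_SFD  8u   /* preamble + start-of-frame delimiter */
#define ETH_HEADER        14u  /* dst MAC + src MAC + EtherType       */
#define ETH_MIN_PAYLOAD   46u  /* shorter payloads are padded         */
#define ETH_FCS           4u   /* frame check sequence                */
#define ETH_IFG           12u  /* minimum inter-frame gap             */

/* Bytes one frame occupies on the wire, including the idle gap. */
uint32_t wireBytesPerFrame(uint32_t payloadBytes)
{
    if (payloadBytes < ETH_MIN_PAYLOAD)
        payloadBytes = ETH_MIN_PAYLOAD;
    return ETH_PREAMBLE_SFD + ETH_HEADER + payloadBytes + ETH_FCS + ETH_IFG;
}

/* Bits put on the wire by one burst from all cores. */
uint64_t wireBitsPerBurst(uint32_t cores, uint32_t pktsPerCore,
                          uint32_t payloadBytes)
{
    return (uint64_t)cores * pktsPerCore * wireBytesPerFrame(payloadBytes) * 8u;
}

/* Offered load in Mbit/s when one burst is sent every intervalSeconds. */
double offeredLoadMbps(uint32_t cores, uint32_t pktsPerCore,
                       uint32_t payloadBytes, double intervalSeconds)
{
    assert(intervalSeconds > 0.0);
    return (double)wireBitsPerBurst(cores, pktsPerCore, payloadBytes)
           / intervalSeconds / 1.0e6;
}
```

For 4 cores sending 5 packets of 600 bytes each, one burst is 102080 bits on the wire. At a 1.5 ms interval that is roughly 68 Mbit/s, well under gigabit line rate. At a 1.5 microsecond interval it would be roughly 68 Gbit/s, which no gigabit port can carry. This only bounds the wire rate. How quickly descriptors are recycled back to the Tx free queue is a separate limit.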