TMS320C5504: USB bulk TX using CPPI

Part Number: TMS320C5504

Hi Everyone,

Background:
There is a working application on the TMS320C5504 DSP that communicates with a PC over USB.
The connection uses control transfers on endpoint 0 and bulk IN/OUT
transfers on endpoint 2. The endpoint 2 FIFO is loaded and unloaded by the CPU.
I now need to increase the USB throughput. To achieve this I need
to use the built-in CPPI DMA module. Unfortunately the documentation and the
CPPI module architecture don't make the job simple for me. I'll try to describe
my application, and perhaps someone could give me a tip or point out what I'm doing wrong.

I'm using the SPRUGH9A and SPRS659G documents.

Goal:
The goal is to enable bulk TX transfers (DSP -> PC) on endpoint 2.
As a first step I want to transmit 512 bytes of data. The PC side is prepared
and sends IN tokens on endpoint 2 (I can see them on the USB analyzer).

1. Endpoint 2 configuration

TXFIFOSZ - 0x06 - no double buffering, size 2^(6+3) = 512
TXMAXP - 0x200 - max packet 512
PERI_TXCSR - 0x3400 - TX enable, DMA enable
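
The size relation on the TXFIFOSZ line can be checked with a small helper. This is a minimal sketch in plain C: the function names are hypothetical, and only the 2^(code+3) relation is taken from the post.

```c
#include <stdint.h>

/* FIFO size in bytes for a size code, per the 2^(code+3) relation above.
 * Only meaningful for small codes (the shift must stay below 32 bits). */
uint32_t fifo_bytes_from_code(uint8_t size_code)
{
    return (uint32_t)1 << (size_code + 3u);
}

/* Smallest size code whose FIFO holds one packet of max_packet bytes.
 * The cap at 15 only keeps the shift well defined; the real register
 * field width is an assumption and is not taken from the post. */
uint8_t fifo_code_for_packet(uint32_t max_packet)
{
    uint8_t code = 0;
    while (code < 15u && fifo_bytes_from_code(code) < max_packet) {
        code++;
    }
    return code;
}
```

With these helpers, a size code of 0x06 corresponds to a 512-byte FIFO, which matches the TXMAXP value of 0x200 used here.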

2. Queue Manager initialization

Memory region 0
CPU addr (word) : 0x00DCE0
USB addr : 0x0009B9C0

QMEMRBASE1[0] - 0xB9C0
QMEMRBASE2[0] - 0x0009
QMEMRCTRL1[0] - 0x0000 - 32 descriptors each 32 byte size
QMEMRCTRL2[0] - 0x0000

Linking Ram 0
CPU addr (word) : 0x00DC60
USB addr : 0x0009B8C0

LRAM0BASE1 - 0xB8C0
LRAM0BASE2 - 0x0009
LRAM0SIZE - 0x0020 - 32 entries (not writable or readable!)

Linking Ram 1
CPU addr (word) : 0x00DCA0
USB addr : 0x0009B940

LRAM1BASE1 - 0xB940
LRAM1BASE2 - 0x0009

3. Host packet descriptors

placement in Memory region 0

word 0 - 0x80000200 - type 0x10, 0 additional protocol specific words, size 512 bytes

word 1 - 0x10000000 - port 2, channel 0, sub channel 0, tag 0

word 2 - 0x14004000 - type usb, on chip memory

word 3 - 0x00000200 - size 512

word 4 - 0x01300000 - buffer addr with tx data (sdram CS0)

word 5 - 0x00000000 - only one descriptor

word 6 - 0x00000200 - size 512

word 7 - 0x01300000 - buffer addr with tx data (sdram CS0)
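
The eight descriptor words listed above can be assembled programmatically. The following is a sketch only: the bit positions are inferred from the values in this post (descriptor type and port in bits 31..27, packet type in bits 30..26, on-chip flag in bit 14), the 22-bit packet length mask is an assumption, and the `hpd_t` type and `hpd_fill_tx` name are hypothetical. Check the field layout against SPRUGH9 before relying on it.

```c
#include <stdint.h>

/* Hypothetical in-memory view of one 32-byte host packet descriptor. */
typedef struct {
    uint32_t word[8];
} hpd_t;

/* Fill a single-buffer TX descriptor the way the post describes it. */
void hpd_fill_tx(hpd_t *d, uint32_t port, uint32_t buf_addr, uint32_t len)
{
    /* word 0: descriptor type 0x10, no protocol-specific words, packet length */
    d->word[0] = ((uint32_t)0x10 << 27) | (len & 0x3FFFFFu);
    /* word 1: port number; channel, sub-channel and tag left at 0 */
    d->word[1] = (port & 0x1Fu) << 27;
    /* word 2: packet type USB (5), descriptor in on-chip memory */
    d->word[2] = ((uint32_t)5 << 26) | ((uint32_t)1 << 14);
    d->word[3] = len;       /* buffer length */
    d->word[4] = buf_addr;  /* buffer address */
    d->word[5] = 0;         /* no next descriptor */
    d->word[6] = len;       /* original buffer length */
    d->word[7] = buf_addr;  /* original buffer address */
}
```

Calling `hpd_fill_tx(&d, 2, 0x01300000, 512)` reproduces the word values listed in this section.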

4. Queue Manager configuration

queue 18 - TX for endpoint 2

What needs to be configured here?

5. Channel setup

TXGCR2[1] - 0x8000 - TX enable

6. DMA scheduler configuration

DMA_SCHED_CTRL1 - 0x0000 - one entry only
ENTRYLSW[0] - 0x0001 - channel 1 TX
DMA_SCHED_CTRL2 - 0x8000 - enable

7. push descriptor


CTRL1D[18] - 0x3802 - 32-byte descriptor size bits and LSBs of the descriptor address
CTRL2D[18] - 0x0137 - MSBs of the descriptor address
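
The split of a descriptor address into the two 16-bit halves can be sketched as follows. This assumes the size code occupies bits 4..0 and is encoded as (bytes - 24) / 4, which is consistent with the value 2 used for a 32-byte descriptor above but should be confirmed against SPRUGH9. The `queue_push_t` type and `encode_queue_push` name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t ctrl1d; /* low half: address bits 15..5 plus size code */
    uint16_t ctrl2d; /* high half: address bits 31..16 */
} queue_push_t;

/* Encode a descriptor push for a 32-byte-aligned descriptor address. */
queue_push_t encode_queue_push(uint32_t desc_addr, uint32_t desc_bytes)
{
    queue_push_t p;
    uint32_t size_code;

    /* The low 5 address bits are reused for the size code (assumption). */
    assert((desc_addr & 0x1Fu) == 0u);
    assert(desc_bytes >= 24u && desc_bytes <= 148u && (desc_bytes % 4u) == 0u);

    size_code = (desc_bytes - 24u) / 4u;
    p.ctrl1d = (uint16_t)((desc_addr & 0xFFE0u) | size_code);
    p.ctrl2d = (uint16_t)(desc_addr >> 16);
    return p;
}
```

Under these assumptions, `encode_queue_push(0x01373800, 32)` yields the 0x3802 / 0x0137 pair shown above, while a descriptor placed at the start of memory region 0 (0x0009B9C0) would encode as 0xB9C2 / 0x0009.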

How do I perform a single double-word write on the C5504 DSP?

After writing CTRL1D and CTRL2D for queue 18 I see that the queue has a pending request
(PEND1 bit 2). The status register values are QSTATA[18] - 0x0001, QSTAT1B[18] - 0,
QSTAT2B[18] - 0, QSTATC[18] - 0. No data is transferred on USB.

What am I missing in the configuration?

Other unclear things and questions.

1. Channel and port. In CPPI terminology a channel is described as part
of a port. For USB, each endpoint has a dedicated port with a single channel, number 0.

HPD word 1 contains the port number. Do I understand correctly that for
endpoint 2 the numeric value of the port is 2?

The scheduler description mentions 4 channels (0 to 3). Do those
correspond to endpoints 1-4, or is only channel number 0 valid? I assumed
that those channels correspond to the endpoints. Is this correct?

HPD word 1 also contains a channel value. Should it be 0 (because there is a
note that each port has one channel, number 0), or should it match the channel
used in the scheduler?

2. Do I understand correctly that I don't need to write anything to the linking RAM?

3. Can I use only one HPD per transfer (even for bigger transfers of about 1 MB)?

4. During configuration I noticed that I can't read back the LRAM0SIZE register correctly
after a write; either the write or the read does not work correctly.

5. The CTRL1D and CTRL2D register descriptions are a little unclear. Do I understand
correctly that I should write the address parts of the descriptor to push it?








  • Hi,

    I've notified the USB experts. Their feedback will be posted here.

    Best Regards,
    Yordan
  • Lukasz,

    We are looking into your questions and will get back to you.

    Lali
  • Hi Lukasz,

    First of all, I noticed in your code segment that you are using DARAM for the CDMA. That is incorrect: the CDMA cannot access DARAM. It can only access SARAM.

    You may want to look at the USB CPPI example, CSL_USB_DmaExample, in the CSL, which implements a USB bulk endpoint example using the CPPI. Writing everything from scratch is going to be a challenge. CSL functions such as USB_init(), USB_open(), USB_config() and USB_resetDev() implement USB device initialization and configuration. USB_requestEndpt() and USB_configEndpt() are used to configure endpoints. The key function to look at is usb_isr(), which handles all the USB interrupts (aggregated into one interrupt). It examines the events that caused the USB interrupt and then processes them accordingly. For CDMA (CPPI DMA) transactions (RX or TX), please look at the

    CSL_USB_RX_INT_EPn and

    CSL_USB_TX_INT_EPn events.

    CSL_USB_RX_INT_EPn occurs when EPn has received a packet from the PC. It means the OUT packet from the PC is in your receive buffer (associated with the RX HPD in the queue). Use USB_getDataCountReadFromFifo() to get the number of bytes in the receive buffer and process them. Then use USB_confDmaRX and USB_dmaRxStart() to add the HPD back to the RX request queue.

    CSL_USB_TX_INT_EPn occurs when the previous transmit packet (the TX HPD in the queue) has been transmitted to the PC. You then need to put the next TX packet in the TX request queue with USB_confDmaTx(), followed by USB_dmaTxStart() to mark the TX packet as ready to go (just waiting for the IN token from the PC). If you do not know what to send yet, do not call USB_dmaTxStart(); the IN token from the PC will be NAKed.

    Best regards,

    Ming

    /* Initialize the USB module */
    USB_init();

    /* Open the USB module */
    hUsbDev = USB_open(CSL_USB0);
    if(hUsbDev == NULL)
    {
        printf("\nUSB open failed\n");
        return(result);
    }

    /* Configure the USB module */
    status = USB_config(

  • Hello,

    First of all, thank you for the hints. Unfortunately it looks like I will need to rewrite my USB code using the CSL (which I have already tried, but I ran into problems with my existing host-side driver configuration). I always try to avoid using the CSL (building from scratch requires more understanding and makes it possible to debug the hardware during development).

    By the way, either I'm missing something or I have badly misunderstood the memory map, because I don't see exactly where dual-access RAM (DARAM) is used.

    The DARAM range is: from the CPU point of view 0x0000C0 - 0x010000 (byte) -> 0x000060 - 0x008000 (word); from the USB controller side it is 0x0001000C - 0x00020000 (byte address only).

    And I use :

    Memory region 0 - USB addr : 0x0009B9C0 (SARAM)

    Linking Ram 0 - USB addr : 0x0009B8C0 (SARAM)

    Linking Ram 1 - USB addr : 0x0009B940 (SARAM)

    buffer addr with tx data 0 USB addr : 0x01300000 (sdram CS0)

    Lukasz

  • Hi Lukasz,

    The CPPI uses byte addresses, while the C55x uses word addresses. A word address must be converted into a byte address for the CPPI. The formula is word_addr*2 + 0x80000.
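
The conversion formula can be written as a one-line helper. This is a minimal sketch with a hypothetical function name; it applies the formula exactly as quoted and does no range checking. The SARAM addresses from the original question are used as sample inputs.

```c
#include <stdint.h>

/* Convert a C55x word address to the byte address seen by the CPPI,
 * using the formula quoted in this reply: word_addr * 2 + 0x80000. */
uint32_t cppi_addr_from_word_addr(uint32_t word_addr)
{
    return word_addr * 2u + 0x80000u;
}
```

For the values in the question, word address 0x00DCE0 maps to 0x0009B9C0, 0x00DC60 to 0x0009B8C0, and 0x00DCA0 to 0x0009B940, which agrees with the USB-side addresses listed there.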

    In the CSL, all address conversion is done at the CSL function level. You can write your own code, but please use the CSL implementation as your reference.

    The other thing worth mentioning is that the CPPI is a 32-bit engine, while the C55x is a 16-bit machine, so there is a 16-bit to 32-bit bridge in between. To write a 32-bit CPPI register you need two 16-bit write operations. The write to the lower 16 bits does not trigger the 32-bit register write; it only buffers those 16 bits on the bridge. The write to the upper 16 bits triggers the 32-bit register write, which writes both the lower and upper 16 bits to the CPPI register. You should therefore always write the lower 16 bits first, followed by the upper 16 bits. That is also why the two 16-bit writes should be atomic (no interrupt allowed in between). Please refer to the CSL implementation for details.
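
The write ordering described above can be illustrated with a host-side model of the bridge. This is a sketch, not the CSL implementation: `bridge_model_t`, `cppi_write32`, and the `irq_disable`/`irq_restore` hooks are hypothetical names, and on the real device the two halves would be memory-mapped register writes with interrupts actually masked around them.

```c
#include <stdint.h>

/* Host-side model of the 16-bit to 32-bit bridge: a low-half write is only
 * latched, and a high-half write commits the full 32-bit value. */
typedef struct {
    uint16_t latched_low; /* buffered lower 16 bits */
    uint32_t reg;         /* the 32-bit CPPI register */
    unsigned commits;     /* number of 32-bit writes triggered */
} bridge_model_t;

void bridge_write_low(bridge_model_t *b, uint16_t v)
{
    b->latched_low = v; /* no register update yet */
}

void bridge_write_high(bridge_model_t *b, uint16_t v)
{
    b->reg = ((uint32_t)v << 16) | b->latched_low;
    b->commits++;
}

/* Placeholders for masking interrupts (hypothetical hooks, empty here). */
void irq_disable(void) {}
void irq_restore(void) {}

/* Write a 32-bit value: low half first, then high half, with no
 * interrupt allowed between the two 16-bit writes. */
void cppi_write32(bridge_model_t *b, uint32_t value)
{
    irq_disable();
    bridge_write_low(b, (uint16_t)(value & 0xFFFFu));
    bridge_write_high(b, (uint16_t)(value >> 16));
    irq_restore();
}
```

In this model a lone low-half write leaves the register unchanged, which is why an interrupt handler that touches the bridge between the two halves could corrupt the buffered value.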

    Ming
  • Hello,

    Thanks for the tips. I was already performing the writes as recommended. Unfortunately I'm busy with some other tasks at the moment. I'll post the results of using the CSL.

    Lukasz