MIBSPI with DMA triggered from two tasks

Masayuki MIYOSHI

Prodigy 90 points

Other Parts Discussed in Thread: TMS570LS3137

Hello,

I am trying to check the MIBSPI and DMA function on TMS570LS3137 device.

On my board MIBSPI1, two slaves are connected.
And my RTOS generates two periodic tasks (1ms task and 10ms task).

Now, I would like to carry out SPI communication with SLAVE1 from the 1ms task, and with SLAVE2 from the 10ms task.

Furthermore, in order to reduce the overhead of CPU,
I would like to use RX DMA and TX DMA with MIBSPI1.

The conditions of MIBSPI1 communication are as follows.

"MASTER(TMS570) <-> SLAVE1"

Communication cycle : 1ms (SPI is triggered by 1ms task)
Number of words transmitted per cycle : 1 to 100 words.

(The number is variable and it may change for every cycle. It is decided in the 1ms task).

Word length : 32bits
Baudrate : 5MHz
Chip select : CS_0

"MASTER(TMS570) <-> SLAVE2"

Communication cycle : 10ms (SPI is triggered by 10ms task)
Number of words transmitted per cycle : 1 to 100 words.

(The number is variable and it may change for every cycle. It is decided in the 10ms task).

Chip select : CS_1

Others are the same conditions as SLAVE1.

The priority of SLAVE1 is higher than that of SLAVE2, and the communication with SLAVE2 can be interrupted by the communication with SLAVE1.

I think there are three points that make this problem difficult.

1. The number of words transmitted varies from cycle to cycle.

2. In the worst case, the amount of data to send exceeds the MIBRAM size.
(There are 128 buffers in MIBRAM, and each buffer holds 16 bits of data for transmit and 16 bits for receive, so MIBRAM can hold up to 2048 bits of transmit/receive data.
However, I would like to send 3200 bits (32 bits * 100 words) in the case of maximum transmission.)

3. Cooperation with TX/RX DMA.

Could you please tell me how it can be resolved?

over 12 years ago

0 Anthony F. Seely over 12 years ago

TI__Guru 68920 points

Hi Masayuki,

The DMA can extend the effective size of the Multibuffer RAM so the length (#2) isn't a problem.

The MibSPI can also prioritize one transfer group over the other, so we can make your 1ms task higher priority.

I'll try to find an example for you.

Best Regards,

Anthony

0 Masayuki MIYOSHI over 12 years ago in reply to Anthony F. Seely

Prodigy 90 points

Hi Anthony,

thank you for your reply.

For your information, I am attaching my present project file; 3288.Mibspi_DMA_sample.zip.

(Only sys_main.c , mibspi.c and sys_dma.c are related to this question, although there are many files.)

I created it provisionally in order to check the cooperation of MIBSPI1 and DMA.

In this project, I use two transfer groups, TG0 and TG1.
TG0 is triggered by 1ms task and TG1 is triggered by 10ms task.
If a trigger starts, TX DMA will occur, MIBSPI1 communication starts, and rx data is stored to global variable by RX DMA.

TG0 is assigned 64 buffers and TG1 is also assigned 64 buffers (total 128 buffers).
Since one buffer can contain 16 bits tx data, I use two buffers for one data word transmission (SLAVE's word length is 32bits);
therefore TG0/TG1 can transmit 32 data words (64buffer * 16 bit / 32bit = 32 words) in one trigger cycle.

Under the conditions that the number of data transmission is invariable (fixed to 32 data words for each of TG0 and TG1),
it seems to work well.

However, I would like to send a variable number of data words (up to 100) in one trigger cycle, as I described in my first question.
I tried a method using a DMA chain and a method using a "TG completed" ISR (changing the control packet settings in the ISR), but neither worked well.

I would appreciate it if you would give me an example.

Best Regards,

Masayuki

0 Anthony F. Seely over 12 years ago in reply to Masayuki MIYOSHI

TI__Guru 68920 points

Hi Masayuki,

Thanks for sending your project, it gave me a good idea of what you're trying to do.

Here's the strategy that I am thinking would work.

1) TG0 and TG1 would each be set to buffer size of 1.

2) Set TG0 and TG1 for one-shot mode

3) Use the DMAxCOUNT register to determine the # of transfers (this is where your variable length comes in)

4) DMA would be programmed for however many elements; it should transfer one element (16-bit) for each DMA request from the MibSPI.

5) In the 1ms and the 10ms task, you would:

a) make sure the last transfer completed

b) reprogram the DMA channel to point to your new data buffers

c) reprogram DMAxCOUNT to the new length

d) enable the TG again (bit 31 of TGxCTRL). (Trigger another one-shot).

In this model, you don't run out of MibSPI RAM because you use only 1 location for each TG.

This is like having 2 virtual 'SPI's' if you like, each serviced by DMA with its own RX and TX buffer.

What you are using from the MIB unit is:

- prioritization capability and ability of high priority TG to interrupt lower priority one

- DMA count to automatically generate a certain # of DMA transfers and then stop until triggered again

Best Regards,

Anthony

0 Masayuki MIYOSHI over 12 years ago in reply to Anthony F. Seely

Prodigy 90 points

Hi Anthony,

I am trying the method you described, but it has not worked yet.
My present project is here [removed]
I use 1 mibram buffer for each TG in this project.

Now, I have some questions.

1) priority of TG0 and TG1

I use the TG0 in 1ms task and the TG1 in 10ms task.
However, it seems that TG1 is not interruptible and the communication cycle of TG0 is disturbed.
TG0 is kept waiting until TG1 finishes.

What should I do to give TG0 a higher priority?

2) reprogram the DMA channel to point to your new data buffers

I was able to change the number of words transmitted by rewriting DMAxCOUNT.
But it seems that each new communication begins where the last one left off.
(Does the DMA keep the buffer index of the data that was sent last time?)
I would like to transmit from the head of the data buffer in each cycle.

Would you tell me how to reprogram the DMA channel to point to new data buffers?

3) 2buffer version

For explanation, I call the above project the "1buffer version".
I made another project, the "2buffer version", in which I use two MIBRAM buffers for each TG. [removed]

With the "1buffer version", I have to use TX DMA twice in order to transmit one 32-bit SLAVE data word, which is inefficient.
The "2buffer version" was made so that 32-bit data can be transmitted with a single TX DMA transfer.

With the "2buffer version", the higher 16 bits are not transmitted, although the lower 16 bits are transmitted correctly.

Please see the attached pictures.
The pictures show the result of trying to send four 32-bit data words.

The 1st transmission is correct (32 bits transmitted), but the 2nd and subsequent ones are not transmitted correctly (only the lower 16 bits are transmitted).

I think that NOBRK or BUFID (DMAxCTRL) is related.
Could you tell me the solution?

Best Regards,

Masayuki

0 Masayuki MIYOSHI over 12 years ago in reply to Masayuki MIYOSHI

Prodigy 90 points

I found another problem in the 1buffer version project (7268.mibspi_1buffer).

I set NOBRK=0 for TG1 (DMAxCTRL). Therefore, TG1 transmits only 16bit data (not 32bit data) in one trigger cycle.

I would like to send continuous 32-bit data.

If NOBRK=1 is used, TG1 can transmit 32-bit data continuously, but TG0 cannot interrupt it and the 1ms cycle of TG0 is disturbed.

I also tried TG1's LOCK bit (Multi-buffer RAM Transmit Data Register) to send continuous 32-bit data.

I set LOCK=1 when the higher 16 bits are transmitted, and LOCK=0 when the lower 16 bits are transmitted.

However, the result did not change (only 16 bits are sent in one trigger cycle).

Do you have any idea to solve both the problem of a priority and the problem of the data length?

Best Regards,

Masayuki

0 Masayuki MIYOSHI over 12 years ago in reply to Masayuki MIYOSHI

Prodigy 90 points

Hi Anthony,

I've noticed that the project I attached above has a simple bug (7268.mibspi_1buffer and 6177.mibspi_2buffer).

sys_main.c:

mibspi->DMACTRL[TG0] = (g_DMA_oneshot << 31U) /* oneshot */
| ( (g_TG_buflen-1U) << 24U) /* BUFID */
| (DMA_CH0 << 20U) /* RXDMA_MAP */
| (DMA_CH1 << 16U) /* TXDMA_MAP */
| (1U << 15U) /* RXDMA */
| (1U << 14U) /* TXDMA */
| (1U << 13U) /* NOBRK */
| ((DMA_BUFLEN -1U) << 8U); /* ICOUNT */ /* not used */

At the ICOUNT line (highlighted in red in the original post), the limit of 1Fh was not applied to DMA_BUFLEN (the ICOUNT field in DMAxCTRL is 5 bits wide).
Therefore, the priority setup did not work when DMA_BUFLEN was bigger than 20h (=32).

| ((0U) << 8U); is the correct code, because I use DMAxCOUNT.

Since things have become confusing, please let me ask my questions anew.
Would you forget the two above-mentioned messages I wrote, or can I remove the messages ("May 01 2013 05:25 AM" and "May 01 2013 21:54 PM")?

-----

I am trying your method and I made two projects; "1buffer version" and "2buffer version".

With the "1buffer version", one buffer is assigned to each TG.
As a result, in order to transmit one 32-bit SLAVE data word, I have to use TX DMA twice (16 bits x 2 times).

With the "2buffer version", I use two buffers for each TG.
It can transmit 32-bit data with one TX DMA transfer.

In order to reduce the overhead, I will use the "2buffer version" rather than the "1buffer version": http://e2e.ti.com/cfs-file.ashx/__key/communityserver-discussions-components-files/312/6064.2buf.zip

Now, I would like to ask three questions because the project has some problems.

1) With TG0, only 16 bits of data come out when NOBRK=1.

Please see the attached pictures.

The 1st transmission is correct (32 bits transmitted), but the 2nd and subsequent ones are not transmitted correctly (only the lower 16 bits are transmitted).

The following figure is a screenshot of a memory dump.

As the portion enclosed in red shows, the higher 16 bits are not received in TG0.

What should I do to send 32-bit data in the 2nd and subsequent transmissions?

2) TG1 transmits only one 32-bit word per trigger cycle.

Burst transmission is possible if I set NOBRK=1.
However, TG0 cannot interrupt in that case, and the priority of TG0 and TG1 is reversed.
That is why I have set NOBRK of TG1 to 0.

But if NOBRK is 0, only one 32-bit word is transmitted per cycle.
Please see the following picture for details.

I would like to carry out burst transmission on both TG0 and TG1, as shown in the following figure.

If TG0 is triggered while TG1 is performing a burst transmission, TG0 will preempt the SPI lines.
When the TG0 communication is finished, TG1 will resume transferring the remaining data.
Is this possible?

3) reprogram the DMA channel to point to your new data buffers

I added the function "changePntrOfDMA" to sys_main.c, which performs the DMA reprogramming, but it doesn't work well.

I would like to transmit from the head of a data buffer each cycle.

Would you tell me how to reprogram the DMA channel to point to new data buffers?

Best Regards,

Masayuki

0 Anthony F. Seely over 12 years ago in reply to Masayuki MIYOSHI

TI__Guru 68920 points

Hi Masayuki-san,

I haven't yet read the posts that you've made over the last day or two; my apologies. I was focused on trying to get something to work here that I could upload.

I am close I think, except I probably got the priority between the tasks and the SPI buffers backward from the way you wanted it.

I'll upload these files tonight and then spend some time tomorrow trying to catch up with the work you've done and see how we compare.

Thanks and Best Regards,

Anthony

Here is the project:

5282.forum3.zip

Here is a screenshot: CS0 is the 1ms task and CS1 is the 10ms task. I have the 1ms task sending short bursts and the 10ms task sending longer bursts so it's easy to compare them.

0 Anthony F. Seely over 12 years ago in reply to Anthony F. Seely

TI__Guru 68920 points

Hi Masayuki-san,

I made a few changes to the project I posted on May 2 to try to address your concerns.

I think this should show how to use the 2 buffers together to make the 32-bit transfer without interruption, and also address the question about always starting from the head of the DMA buffer.

Mainly, I changed the DMA mode to Indexed and use the element index to iterate through the buffer, but the frame index is 0 so each frame starts back at the beginning of the buffer.

The 32-bit transfer is done by using the lock.

This still has two problems and I am going to need to consult with the design owner to get an answer, because it's not behaving like I would expect.

The two problems are with switching to channel 1. The first time the MibSPI does channel 1 it is at the 10ms mark but then after this for some reason it is retriggering at 1ms even on the 10ms channel. And the 2nd point is that I can't set NOBRK on the TG0 channel because it seems that if I do this, the DMA is disabled and the data is read into RAM, but the RXEMPTY isn't set and the sequencer seems to stop at TG0. This could be just a misunderstanding on my part on how this is supposed to work.

Best Regards,

Anthony

Project:

0333.forum4.zip

0 Masayuki MIYOSHI over 12 years ago in reply to Anthony F. Seely

Prodigy 90 points

Hi Anthony-'san' :-)

Thank you for uploading the project.

I have a question separate from the problems that you mentioned.

I tried to build it in my environment, but only the first 32 bits in the TX buffer came out on the SIMO line; the next 32 bits didn't come out.
I also checked the received data by enabling analog loopback mode; the SPI result is as follows.

In your environment, is all data in the buffer sent correctly?

---

Among the requirements listed below, I think that the item "Priority" and "Word length(32-bit)" especially make the situation complicated...

・ Communication cycle : SLAVE1=1ms, SLAVE2=10ms
・ Priority : 1ms > 10ms
・ Word length of the SLAVEs : 32-bit
・ Number of transmit data : Not fixed. That number is determined dynamically by the result of calculation in the task.
・ Performance : reduce the load on the ARM core by using DMA with SPI.

Best Regards,

Masayuki

0 Anthony F. Seely over 12 years ago in reply to Masayuki MIYOSHI

TI__Guru 68920 points

Masayuki-san,

I am seeing some issues with the receive data as well.

I understand the requirements that you listed above, and these are the ones I'm trying to work through.

It does seem that the word length of 32 is adding to the complexity, but I don't understand it yet. I am talking to the design team.

Thanks and Best Regards,

Anthony

0 Haixiao Weng over 12 years ago in reply to Anthony F. Seely

TI__Genius 13810 points

2313.forum6.zip

On top of Anthony's project, I did three things:

a) Forced the MibSPI transfer groups into suspend mode instead of skip mode. With this, group0 and group1 each run correctly by themselves (but not together).
b) After step a, if they run together, group1 cannot run even when group0 is free (actually, group0 is suspended, so it blocks group1). To overcome this, I enabled the DMA BTC (block transfer complete) interrupt. In the ISR, I disable the corresponding transfer group (e.g. if the interrupt is caused by group0, disable transfer group0).
c) Increased the 2nd data of group1 to 102 32-bit words to verify that group0 can break into group1's transfer.

0 Masayuki MIYOSHI over 12 years ago in reply to Haixiao Weng

Prodigy 90 points

Hi, Haixiao,

Thank you for your sample. I wrote forum6.out to my board.

1. Only the first 32 bits of TX_DATA0 (0xD0C0E1D1) are transmitted, even though TX_DATA0 = (0xD0C0, 0xE1D1, 0xF2E2, 0x03F3, 0x1504, 0x2615, ...).

Is the following data (0xF2E203F3, 0x15042615, ...) transmitted in your environment?

TG1 has the same problem; transmission of "0xA1B1B2C2" is repeated and 0xC3D3D4E4, ... doesn't come out.

2. Is there any method which meets the requirements without using an ISR?

Best regards

Masayuki

0 Haixiao Weng over 12 years ago in reply to Masayuki MIYOSHI

TI__Genius 13810 points

Thanks for finding that.

The DMA settings in the previous project were not correct. The source frame offset for TX should be 4 (bytes), and the destination frame offset for RX should be 4 (bytes) too. In the old project, all of them were set to 0. That is why you see the same data repeating again and again.

I uploaded a new project with this issue fixed. I verified TX_DATA0, RX_DATA0, TX_DATA1 and RX_DATA1 in the CCS IDE. On the logic analyzer, I only verified a few data words, because my logic analyzer is about 20 years old and it is painful to read the clock and data edges.

5141.forum6.zip

Regards,

Haixiao

0 Masayuki MIYOSHI over 12 years ago in reply to Haixiao Weng

Prodigy 90 points

Haixiao

Thank you for your prompt action.

I changed some parameters and rebuilt the project.

The priority of TG0 and TG1 looks strange to me under the following conditions.

1) baudrate (mibspi.c)

I set baudrate=5MHz for both TG0 and TG1.
The C source is as below. (Because VCLK is set to 90MHz in my environment, I have to set the prescale value to 17 to get a 5MHz clock.)

-----------

mibspiREG1->FMT0 =

...
// | (29U << 8U) /* baudrate prescale */
| (17U << 8U) /* baudrate prescale */
...

mibspiREG1->FMT1 = (4U << 24U) /* wdelay */
...
// | (89U << 8U) /* baudrate prescale */
| (17U << 8U) /* baudrate prescale */
...

2) The number of messages that TG1 transmit. (sys_main.c)

void vTask1(void *pvParameters) // 10ms task
{
...
// dsize = dsize % 4 + 1;
dsize = dsize % 6 + 20;
...
}

The picture shows the result, with above settings.

It seems that TG0 is interrupted by the TG1's data; "0xA1B1B2C2".
What brought this on?

I'm attaching the source code here: 7360.sys_main.c, 4606.mibspi.c.

When I rebuild the project, I use not os_portasm.asm but os_portasm_fix.asm.

By the way, I asked the question ;

" 2. Is there any method which meet the requirement, without using ISR?".

Although there is no reply, isn't there any method?

I'm happy if it is not necessary to use a software interrupt.

I would like to solve this problem only by hardware if it is possible.

Best Regards,

Masayuki

0 Haixiao Weng over 12 years ago in reply to Masayuki MIYOSHI

TI__Genius 13810 points

2705.forum8_Compiler495.zip

In the attached project, find the following code in sys_main.c. If you remove the highlighted portion, you can re-create this problem (TG1 breaks into TG0). If you keep the highlighted portion, the problem goes away.

//====================================
void vTask0(void *pvParameters) /* 1ms task */
{
portTickType xLastWakeTime;
uint32 dsize;

    /* Initialize the xLastWakeTime variable with current time. */
    xLastWakeTime = xTaskGetTickCount();
    mibspiREG1->PCCLR= 1<<4;
    dsize = 5;
    mibspiREG1->PCDIR = 1<<4;
    for (;;)
    {
    mibspiREG1->TGCTRL[0] = (
            (1 << 31)            /* TGENA    */
             | (0 << 30)            /* ONE-SHOT */
             | (0 << 29)            /* PRST     */
             | (TRG_ALWAYS << 20)   /* TRIGEVT */
             | (TRG_DISABLED << 16) /* TRIGSRC */
             | (0 << 8)             /* PSTART */
           );
    mibspiREG1->TGCTRL[1] = (
            (1 << 31)            /* TGENA    */
             | (0 << 30)            /* ONE-SHOT */
             | (0 << 29)            /* PRST     */
             | (TRG_ALWAYS << 20)   /* TRIGEVT */
             | (TRG_DISABLED << 16) /* TRIGSRC */
             | (2 << 8)             /* PSTART */
           );
    /* - configuring dma control packets for task 1 */
    dmaConfigCtrlPacketTX((uint32)(&TX_DATA0),(uint32)(&(mibspiRAM1->tx[0].data)),dsize);
    dmaConfigCtrlPacketRX((uint32)(&(mibspiRAM1->rx[0].data)),(uint32)(&RX_DATA0),dsize);

    /* - setting dma control packets */
    dmaSetCtrlPacket(DMA_CH0,g_dmaCTRLPKTRX);
    dmaSetCtrlPacket(DMA_CH1,g_dmaCTRLPKTTX);

    /* Enable DMA Request */
    mibspiDmaTrigTG0(mibspiREG1, dsize-1);

    /* Vary payload size to illustrate */
    dsize = dsize % 4 + 19;
    ui32_Task0Count++;

    if(ui32_Task0Count==25) mibspiREG1->PCSET= 1<<4;
    vTaskDelayUntil( &xLastWakeTime, (1 / (SCALE * portTICK_RATE_MS)) );
    }
}

Here is my thought:

1. Why does this problem occur?

When TG0 is transferring data or waiting for data (suspended), TG1 cannot be served. However, once the TG0 transfer completes, there is a tiny time slot before TG0 goes into suspend mode. TG1 may squeeze into this tiny time slot and cause the problem.

2. Workaround in the attached project.

a) I moved TG1 to TG2, so now the useful groups here are TG0 and TG2.

b) Whenever I enable TG0, I enable TG1 too. Once enabled, TG1 is a dummy group that is never served; it stays in suspend mode. TG1 then blocks TG2 even in that tiny time slot.

c) When I disable TG0 in the interrupt, I disable TG1 too. Then, TG2 can be served.

So far, I have not found any method that works without using the block transfer end interrupt. It occurs every ms.

Do both TGs need to support a data length of 100 or more? If the higher priority one only needs to support up to 62 32-bit words, we can do it without an interrupt.

Thanks,

Haixiao

0 Masayuki MIYOSHI over 12 years ago in reply to Haixiao Weng

Prodigy 90 points

Haixiao,

Thank you for uploading the solution.
I confirmed that the problem goes away when the dummy TG1 is used.

Both TGs may send about 100 messages, so I will adopt the method that uses the software interrupt.

I appreciate your prompt response.

Best Regards,

Masayuki
