
Using uDMA for getting real time SSI data

Other Parts Discussed in Thread: TM4C129ENCPDT

I am currently working on a project based on the TM4C129ENCPDT where I need to receive and process serial data from an SSI peripheral in real time. To do so, I am using the uDMA controller configured in ping-pong mode to move data from the SSI data register to a suitable buffer in internal RAM.

The ping-pong setup should ensure that data is continuously copied from the SSI with no data loss.

The SSI needs to run at a clock frequency of 16 MHz and my core clock frequency is set to 96 MHz. With an SSI receive FIFO depth of eight 16-bit units, this means that an empty receive FIFO will be filled in 16*8*(1/16000000 Hz) = 8*10^-6 s = 8 µs, or 768 core clock cycles.
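
For reference, the receive channel is configured roughly along the lines below (just a sketch assuming TivaWare driverlib with SSI0 on its default uDMA channel 10; buffer names and sizes are placeholders rather than my actual code):

     // Sketch of the SSI0 RX ping-pong setup (TivaWare driverlib). The uDMA
     // ISR (not shown) reloads whichever half just completed.
     #include <stdint.h>
     #include "inc/hw_memmap.h"
     #include "inc/hw_ssi.h"
     #include "driverlib/sysctl.h"
     #include "driverlib/ssi.h"
     #include "driverlib/udma.h"

     #define BUF_SIZE 256                               // placeholder size

     static uint8_t g_ControlTable[1024] __attribute__((aligned(1024)));
     static uint16_t g_PingBuf[BUF_SIZE];
     static uint16_t g_PongBuf[BUF_SIZE];

     void ConfigureSsiRxDma(void)
     {
         SysCtlPeripheralEnable(SYSCTL_PERIPH_UDMA);
         uDMAEnable();
         uDMAControlBaseSet(g_ControlTable);
         uDMAChannelAssign(UDMA_CH10_SSI0RX);

         // 16-bit items, fixed source (SSI data register), incrementing
         // destination, burst of 4 to match the half-full RX FIFO trigger.
         uDMAChannelControlSet(UDMA_CHANNEL_SSI0RX | UDMA_PRI_SELECT,
                               UDMA_SIZE_16 | UDMA_SRC_INC_NONE |
                               UDMA_DST_INC_16 | UDMA_ARB_4);
         uDMAChannelControlSet(UDMA_CHANNEL_SSI0RX | UDMA_ALT_SELECT,
                               UDMA_SIZE_16 | UDMA_SRC_INC_NONE |
                               UDMA_DST_INC_16 | UDMA_ARB_4);

         // Ping-pong: the primary structure fills the ping buffer, the
         // alternate structure fills the pong buffer.
         uDMAChannelTransferSet(UDMA_CHANNEL_SSI0RX | UDMA_PRI_SELECT,
                                UDMA_MODE_PINGPONG,
                                (void *)(SSI0_BASE + SSI_O_DR),
                                g_PingBuf, BUF_SIZE);
         uDMAChannelTransferSet(UDMA_CHANNEL_SSI0RX | UDMA_ALT_SELECT,
                                UDMA_MODE_PINGPONG,
                                (void *)(SSI0_BASE + SSI_O_DR),
                                g_PongBuf, BUF_SIZE);

         uDMAChannelEnable(UDMA_CHANNEL_SSI0RX);
         SSIDMAEnable(SSI0_BASE, SSI_DMA_RX);
     }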

What happens if another part of my code running in the processor core blocks RAM access for 768 core clock cycles? Based on actual testing, my best guess is that since uDMA access to RAM is always prioritised lower than processor core access to RAM, the uDMA will not be able to empty the SSI FIFO, and an SSI receive overrun (RXOR) will occur. Obviously, this receive overrun ruins my real-time capability.

It seems strange that such an overrun condition can arise when the presence of a ping-pong uDMA mode suggests that the uDMA could, or should, in fact be used for real-time purposes.

Does anyone have any enlightening thoughts on the matter? Am I understanding and using the architecture correctly? Could I do anything to let my uDMA outrank the processor core for RAM access?

  • Hello Simon,

    You are correct. The uDMA has a lower priority than the CPU and the LCD controller on the TM4C129 devices. A better way of handling this is to use the uDMA in burst-only mode, so that the DMA spends more time transferring data per control word fetch from the control table.
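
    In code this is just the burst attribute on the receive channel, something like the sketch below (TivaWare driverlib, assuming SSI0 RX; adjust the channel define for your instance):

         // Sketch: respond only to burst requests so that each control word
         // fetch from the control table moves a whole burst of items.
         // (Keep the transfer size a multiple of the burst size.)
         #include "driverlib/udma.h"

         void SetBurstOnly(void)
         {
             uDMAChannelAttributeEnable(UDMA_CHANNEL_SSI0RX, UDMA_ATTR_USEBURST);
         }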

    Another way to work around the arbitration issue is to use two banks of SRAM for the DMA alone. Since the banks are 4-way interleaved, the CPU can use the other two banks while the DMA fills one bank for ping and another bank for pong.

    Regards
    Amit
  • Hi Amit, thank you for getting back to me :-) I have verified that I am using the uDMA in burst-only mode, and I've been trying to figure out how to ensure that DMA data and CPU data are kept in separate SRAM banks, but I couldn't find any detailed information in the datasheet.

    I know that the SRAM memory map begins at address 0x2000.0000 and that the TM4C129ENCPDT has 256 kB of internal SRAM. Does this mean that I can assume the following memory map scheme?

    SRAM bank 1: 0x2000.0000 - 0x2000.FFFF (64kB)
    SRAM bank 2: 0x2001.0000 - 0x2001.FFFF (64kB)
    SRAM bank 3: 0x2002.0000 - 0x2002.FFFF (64kB)
    SRAM bank 4: 0x2003.0000 - 0x2003.FFFF (64kB)

    ALSO: I have an external SDRAM module connected to the TM4C129ENCPDT via the EPI interface. The SDRAM is currently clocked at 50% of the core clock, but if I set the EPI clock divider to 0, I am in fact able to run the SDRAM at the full core clock.
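
    For reference, the only change I make is the EPI clock divider, roughly as in the sketch below (TivaWare driverlib; a divider of 1 gives half the system clock, 0 gives the undivided system clock):

         // Sketch: run the EPI (and thus the SDRAM clock) at the full system
         // clock instead of half of it.
         #include "inc/hw_memmap.h"
         #include "driverlib/epi.h"

         void SetEpiFullSpeed(void)
         {
             EPIDividerSet(EPI0_BASE, 0);    // 0 = no division (96 MHz here)
         }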

    How is this even possible? My SDRAM module can handle up to 133 MHz, but the TM4C129ENCPDT datasheet specifies that external SDRAM modules can only be run at up to 50% of the clock. Do you know the reason behind this limitation and why I am able to bypass it?

    When I do run the external SDRAM at full core clock, my initial SSI receive overrun problem disappears (even though the SSI and uDMA only work on internal SRAM and thus should not be affected by the speed of the external SDRAM). Can you explain this?

  • Hello Simon,

    Did you check the clock on the CLK pin as 120 MHz? It may be possible for a few "lucky" devices, but by timing specification not every device is guaranteed to work at that frequency.

    As for the original issue: the SRAM mapping you describe is correct for the banks, and you must ensure that one of the banks is used exclusively by the uDMA.

    Regards
    Amit
  • Hi Amit,

    Yes, I used an analog oscilloscope to verify that the SDRAM clock was in fact running at the core clock frequency (96 MHz in my case).

    I'm using the GCC compiler and linker and I know that I can ensure that the uDMA variables are linked to a specific SRAM address using

         __attribute__ ((section(".udma_variables_sram_region"))) ,

    but I couldn't think of a way to keep other variables outside of this region except for specifically linking ALL other variables to another region using

         __attribute__ ((section(".other_variables_sram_region"))) .

    This seems rather drastic, and I'm not sure if it is the right way to do it. Do you have any suggestions?
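
    For completeness, the declarations currently look roughly like the sketch below (buffer names, sizes and the section name are placeholders; the section still has to be mapped to a fixed SRAM range in the linker script):

         // Sketch: pin the uDMA ping/pong buffers into a dedicated linker
         // section; everything else stays in the default .data/.bss sections.
         #include <stdint.h>

         #define BUF_SIZE 256

         __attribute__((section(".udma_variables_sram_region")))
         static uint16_t g_PingBuf[BUF_SIZE];

         __attribute__((section(".udma_variables_sram_region")))
         static uint16_t g_PongBuf[BUF_SIZE];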

    Regards,
    Simon

  • Hello Simon

    In the linker file you can reduce the size of the SRAM region so that the CPU code does not know that another bank is available. In the C code you can then use an address pointer for the DMA table.
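
    Something along the lines of the sketch below (only an illustration; the exact reserved range, region name and offsets depend on your linker file and buffer sizes): reduce the SRAM LENGTH in the linker file, for example from 256K to 192K, and then refer to the now-unused last 64KB from C through plain pointers.

         // Sketch: the linker file's SRAM region is shrunk (e.g. 256K -> 192K)
         // so nothing is ever placed in 0x2003.0000 - 0x2003.FFFF, and that
         // range is then used for the uDMA control table and buffers.
         #include <stdint.h>
         #include "driverlib/udma.h"

         #define DMA_RAM_BASE    0x20030000UL   /* hidden from the linker */

         /* Control table at the start of the reserved range (1024-byte aligned
            by construction), ping/pong buffers placed after it. */
         #define DMA_CTRL_TABLE  ((void *)DMA_RAM_BASE)
         #define DMA_PING_BUF    ((uint16_t *)(DMA_RAM_BASE + 0x400))
         #define DMA_PONG_BUF    ((uint16_t *)(DMA_RAM_BASE + 0x600))

         void UseReservedRam(void)
         {
             uDMAControlBaseSet(DMA_CTRL_TABLE);
             /* DMA_PING_BUF / DMA_PONG_BUF are then passed to
                uDMAChannelTransferSet() instead of linker-allocated arrays. */
         }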

    Regards
    Amit
  • Thank you Amit, that's a nice solution.

    In the end it turned out that the low-priority uDMA channel that we use to transfer data to external SDRAM was bottlenecking the high-priority uDMA channel that we use to get data from the SSI peripheral to internal SRAM.

    When we switched the low priority transfer from external SDRAM to internal SRAM, our timing problems disappeared, and the high priority transfer is able to run with no problems.
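
    For reference, "high" and "low" priority here just means the per-channel priority attribute, roughly as in the sketch below (TivaWare driverlib; the software channel for the bulk copy is an assumption for illustration):

         // Sketch: the SSI RX channel is marked high priority, the bulk
         // memory-to-memory copy (here assumed to run on the software channel)
         // is left at default (low) priority.
         #include "driverlib/udma.h"

         void SetChannelPriorities(void)
         {
             uDMAChannelAttributeEnable(UDMA_CHANNEL_SSI0RX,
                                        UDMA_ATTR_HIGH_PRIORITY);
             uDMAChannelAttributeDisable(UDMA_CHANNEL_SW,
                                         UDMA_ATTR_HIGH_PRIORITY);
         }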

    Regards
    Simon
  • Hello Simon,

    Did you change the high priority destination location as well?

    Regards
    Amit
  • Yes, I tried placing my ping and pong buffers in different SRAM banks, but it did not seem to make any difference, so I reverted that change.
  • Hello Simon

    That is strange. It should have worked, but the root cause you mention (the uDMA being held up by a low-priority transfer to the SDRAM) does make sense as well, since the uDMA core processes one transfer at a time.

    Regards
    Amit
  • Hi Amit,

    On second thought it may not be so strange after all. If you read the processor data sheet carefully, you'll notice that on page 610 it is mentioned that

    "The SRAM is implemented using four-way 32-bit wide interleaved SRAM banks (separate SRAM arrays)"

    I wasn't quite sure what the interleaving part meant, but after doing some research I ended up here. I think interleaving means that the memory map I suggested in a previous post is actually not correct, and that it should probably look more like this:

    SRAM bank 1 SRAM bank 2 SRAM bank 3 SRAM bank 4
    0x2000.0000 0x2000.0004 0x2000.0008 0x2000.000C
    0x2000.0010 0x2000.0014 0x2000.0018 0x2000.001C
    0x2000.0020 0x2000.0024 0x2000.0028 0x2000.002C
    0x2000.0030 0x2000.0034 ... ...

    This would explain why linking a ping buffer to an address in the 0x2000.0000 - 0x2000.FFFF range and a pong buffer to an address in the 0x2001.0000 - 0x2001.FFFF range did not solve my problem, as the ping and pong buffers would actually still be spread across multiple physical SRAM banks.
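
    If this interpretation is right, the physical bank a given word lands in is selected by address bits [3:2] rather than by the upper address bits, i.e. roughly as in the hypothetical helper below (only valid under my assumed interleaving):

         // Hypothetical helper: which physical SRAM bank (0..3) a 32-bit word
         // at 'addr' would live in, assuming word-level 4-way interleaving.
         #include <stdint.h>

         static inline uint32_t SramBankOfWord(uint32_t addr)
         {
             return (addr >> 2) & 0x3u;   // address bits [3:2] select the bank
         }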

    Regards
    Simon

  • Hello Simon,

    I ran a test on the SRAM memory banks and the accesses were in fact as per the address organization mentioned, unless I misread the data. Anyway, I have planned a few more tests to check the actual memory accesses with multiple initiators.

    Regards
    Amit