
interfacing C6455 with an external ADC

Other Parts Discussed in Thread: TMS320C6455, ADS831, CCSTUDIO

Hi all,

  First of all: I'm rather new to the DSP world and I have to start with a non-trivial task...

 

I have the TMS320C6455 EVM board. I need to interface it with an external ADC which runs at about 80MSPS.

I have two options:

 a) an ADC with a serial LVDS output (which thus sends digital data _very_ quickly, at approximately 800 MHz)

 b) an ADC with a parallel CMOS/TTL output (which sends 8 digital bits at a much lower speed, approximately 80 MHz)

I wonder:

1) what is the best choice? I think it's b), given that it's very difficult to operate a digital link at 800 MHz between two separate boards connected by a (hand-made!!) cable

2) which digital bus should I use?

 

For solution a) I've been looking at the McBSP bus documentation. I couldn't find any info about its speed, however.

For solution b) I've been looking at the HPI and EMIF interfaces; both, however, seem to be conceived for external memory interfacing (address & data). I couldn't find info about their speeds either :(

 

Any hints?

Thank you very much indeed,

Francesco Montorsi

 

  • Lower-end data converters (around 1 MSPS or less) are generally connected directly to a DSP via a serial port such as the McBSP or a parallel port such as an asynchronous EMIF; however, the higher-end converters present a problem for I/O throughput on most processors. For example, an 80 MHz ADC produces too much data for an EMIFA to handle on its own.

     

    The only interface on any current TI processor capable of this high throughput is the SRIO interface on the 6455, but this would require an external device (such as an FPGA) to function as a FIFO between the ADC and the DSP. In this example, the FPGA would connect directly to the data converter and buffer the data. Then, when a certain buffer size is reached, it would burst the data into the DSP over a high-speed peripheral such as SRIO.

     

    Be warned that even though the DSP can receive this large throughput, real-time processing is almost impossible to maintain due to the small number of cycles available per sample.

     

    The timing data for the McBSP, EMIF, and  HPI can be found in the Data Sheet http://focus.ti.com/lit/ds/symlink/tms320c6455.pdf.

     

    McBSP – pg. 190

    EMIF – pg. 162

    HPI – pg. 177

  • As Devin explained, your option a) for a serial interface would not be practical, since there is probably not an SRIO-compatible ADC on the market and the McBSP will not run at 800 Mbps.

    There is not going to be a perfect, clean way to interface directly with the EMIFA bus, but let's talk about a couple of possibilities. BTW, the HPI is a slave peripheral on the C6455 so it can only be used by an external bus master trying to read from or write to the C6455's memory space, and it cannot be used the other way around for the C6455 to read directly from an external device.

    In order to read from an ADC like the ADS831 with a parallel bus, there has to be a clock source and the data has to be read synchronously with that clock source.

    One "cheap and easy" way to do this would be to use EMIFA in asynchronous mode with the AOEn signal connected to the ADS831 clock and use the EDMA to constantly read data from the ADS831. There are weaknesses to this solution, though:

    • The EMIFA async mode requires each timing component (R_SETUP, R_STROBE, R_HOLD) to be at least 1 cycle, so there will be at least 3 AECLKOUT cycles for each read. Since the maximum speed is AECLKOUT = 133 MHz, the fastest you could theoretically do this is 133/3 ≈ 44 MHz, which is way slower than the 80 MSPS you want to achieve (see the sketch after this list).
    • Depending on how you set up the EDMA, these reads might have gaps when the EDMA completes an operation. Even if it is a long sequence of reads, there will be a time when the sequence ends and the EDMA has to reload to start a new sequence. There could be gaps in the middle, too, if there were ever stalls that hold up the EDMA writes to the destination, or when the EDMA completes filling a FIFO and starts filling it again (within the Transfer Controller). These delays might be hidden or minimized by careful design, but do keep them in mind.
    • The bad thing about those gaps is that the effective clock from AOEn would not be a pure continuous clock, but would have phase shifts whenever there were gaps. The good thing with this approach, though, is that there is always a rising edge of the clock when the data is to be read, so the ADS831 and the DSP stay in sync forever.
    • I do not know how the ADS831 would behave with a stuttering clock.
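
    Just to make the arithmetic in the first bullet concrete, here is a minimal, stand-alone sketch (plain C, nothing device-specific; the numbers are the ones from the discussion above, so treat them as illustrative rather than verified against your datasheet revision):

    #include <stdio.h>

    int main(void)
    {
        /* Best-case EMIFA asynchronous read timing: each read needs at
           least R_SETUP + R_STROBE + R_HOLD cycles of AECLKOUT (1 each). */
        const double aeclkout_hz = 133e6;  /* max AECLKOUT discussed above */
        const int    r_setup     = 1;
        const int    r_strobe    = 1;
        const int    r_hold      = 1;

        double cycles_per_read = r_setup + r_strobe + r_hold;
        double max_read_rate   = aeclkout_hz / cycles_per_read;

        /* Prints roughly 44.3 MHz, well short of the 80 MSPS target. */
        printf("Max async read rate: %.1f MHz\n", max_read_rate / 1e6);
        return 0;
    }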

    Another "cheap and easy" way to interface to the ADS831 with the EMIFA would be to use the SBSRAM Programmable Synchronous I/F. This would let you keep a free-running clock to the ADS831 and would allow you to read data on every clock cycle during a burst. But this has at least one variation on the weaknesses above, but still a potential weakness:

    • The EDMA would again be set up to continuously read from the ADC and write the data to DSP memory. There are limits to how long a burst can be performed on the EMIFA and from the EDMA, and this could result in gaps where one or more clock cycles go by without the data being read. This means you would lose samples from time to time, if this happened. It is also possible that the EMIFA and EDMA can team up to keep the reads going continuously forever, and this would work perfectly for you.

    On your EVM, you could set up either of these with your EMIFA configuration and EDMA setup, start continuous reads, then look at the AOEn and/or SOEn signals on a scope to see how well they behave. If the SBSRAM option works with no stutters on SOEn, then you have a possible solution.

    A "guaranteed by design" method would require external logic, specifically an FPGA/CPLD that can read continuously from the ADS831 on one side while building a wider word for the EMIFA to read more slowly on the other side. Since the EMIFA can go all the way to 64 bits wide, you can slow down the access rate by reading 64 bits at a time less often than the 8-bit data is coming from the ADC. This would be dependent on your system requirements, for example do you require generating a new result from each byte of data immediately or can you wait for a buffer to fill with say 1024 samples before performing your DSP operations on the data.

    In any of these cases, you have the concern that Devin expressed about "the number of cycles available per sample". If you have to operate on each byte as it comes in, you only have 1000 MHz / 80 MSPS = 12.5 cycles per sample. If you can wait for a buffer to fill and then operate on that buffer, then you have a lot more time to work with high-speed loops and the optimized SIMD (Single Instruction Multiple Data) instructions in the C64x+ arsenal.

    If any of this sounds useful for your application and you want to pursue it more, send us some more details about your application or requirements.

  • Hi,

       first of all, thanks for the reply.

     

    Devin said:
    The only interface on any current TI processor capable of this high throughput is the SRIO interface on the 6455, but this would require an external device (such as an FPGA) to function as a FIFO between the ADC and the DSP. In this example, the FPGA would connect directly to the data converter and buffer the data. Then, when a certain buffer size is reached, it would burst the data into the DSP over a high-speed peripheral such as SRIO.

    I understand; at first glance the SRIO bus seems more complex to handle than the McBSP, EMIF or GPIO, but I'll have a look at the SRIO User's Guide to understand whether it's feasible to use it for an ADC => FPGA => DSP system.

     

    Devin said:
    Be warned that even though the DSP can receive this large throughput, real-time processing is almost impossible to maintain due to the small number of cycles available per sample.

    this shouldn't be a problem: I'm interested in sampling a few milliseconds of the analog input signal, storing all the samples in the DSP (external) memory (the EVM DDR memory) and then processing them later... thus I don't need to do any real-time processing.

     

     

    Devin said:
    The timing data for the McBSP, EMIF, and  HPI can be found in the Data Sheet http://focus.ti.com/lit/ds/symlink/tms320c6455.pdf.
    McBSP – pg. 190
    EMIF – pg. 162
    HPI – pg. 177

    thanks for the pointers! I had hoped to find an explicit (raw) data rate for the various available buses somewhere, instead of the detailed timing characteristics.

    Trying to estimate the data rates myself, I got these numbers:

    EMIFA = about 1/(6 ns), i.e. AECLKIN = 166 MHz

    McBSP = 1/(10 ns) = 100 MHz

    GPIO = CPUclk/24 = 50 MHz

    Can you confirm these (raw) numbers?

    If they're correct, 80 MHz seems to be within the range of the EMIFA bus, doesn't it?

    Maybe I'm missing how the bus exactly works and the effective data-rate is lower because of the protocol?

     

    Thank you very much indeed for the invaluable help!

  • I think the datasheet limits the AECLKIN speed to 133 MHz. If you have a newer version of the datasheet that states 166 MHz, great, but if that figure is just the convenient internal speed, it may not be valid. The effective rate will be reduced depending on the interface format, as I discussed in an earlier post.

    The 100 MHz value for the McBSP is actually 100 Mbps. This is a serial bus, so the byte rate would be slower by a factor of 8.

    GPIO is an interesting idea. Where did you get the CPUclk/24 figure? That sounds like an estimate for the number of CPU cycles it takes to do continuous reads from the GPIO module, but you might be able to do better using EDMA. It would take some analysis to figure out how to sync your reads with the clock rate to the ADS831, and even how to generate that clock. It might be tough to do, but it might be possible.

    If you use the SBSRAM protocol that I mentioned above (read up on the details in the EMIFA User's Guide), you could surely sustain a pretty long continuous burst. How much data do you want to store prior to later evaluation?

  • RandyP said:
    There is not going to be a perfect, clean way to interface directly with the EMIFA bus, but let's talk about a couple of possibilities. BTW, the HPI is a slave peripheral on the C6455 so it can only be used by an external bus master trying to read from or write to the C6455's memory space, and it cannot be used the other way around for the C6455 to read directly from an external device.

    thanks for clearing the fog on this: I feared that the HPI was not suitable for the task...

    RandyP said:
    In order to read from an ADC like the ADS831 with a parallel bus, there has to be a clock source and the data has to be read synchronously with that clock source.

    ....

    On your EVM, you could setup either of these with your EMIFA configuration and EDMA setup to start continuous reads then look at the AOEn and/or SOEn signals on a scope to see how well they behave. If the SBSRAM option works with no stutters on SOEn, then you have a possible solution.

    Good idea! I'll try this to sort things out as soon as possible, but having never used the EMIF bus it will take me some time.

    I also have a problem with the generation of digital test signals, since in my university lab we don't have many digital instruments (mostly analog ones).

     

    RandyP said:
    A "guaranteed by design" method would require external logic, specifically an FPGA/CPLD that can read continuously from the ADS831 on one side while building a wider word for the EMIFA to read more slowly on the other side. Since the EMIFA can go all the way to 64 bits wide, you can slow down the access rate by reading 64 bits at a time less often than the 8-bit data is coming from the ADC. This would be dependent on your system requirements, for example do you require generating a new result from each byte of data immediately or can you wait for a buffer to fill with say 1024 samples before performing your DSP operations on the data.

    That's another possible solution, since I don't need to do real-time processing. Transferring data from the ADC to the DSP EVM board memory with the help of an FPGA would be fine for me (even if it would cost a lot in development/learning time)...

     

    RandyP said:
    If any of this sounds useful for your application and you want to pursue it more, send us some more details about your application or requirements.

    I'm trying to set up a wideband OFDM modem. The problems are mostly in the receiver, and in particular we have no experience with the ADC-DSP link; anyway, I'll post more precise questions about the EMIF bus in the future :)

     

    Thank you very much indeed for your invaluable help!

     

  • Hi RandyP,

    RandyP said:

    I think the datasheet limits the AECLKIN speed to 133 MHz. If you have a newer version of the datasheet that states 166 MHz, great, but if that figure is just the convenient internal speed, it may not be valid. The effective rate will be reduced depending on the interface format, as I discussed in an earlier post.

    As I said, I have never used the EMIF before so I may easily be wrong, but table 7-42 on page 159 gives "6 ns" as the minimum cycle time, thus 1/(6 ns) = 166 MHz. But 166 vs 133 is not critical for me; I just wanted a rough indication of the possible speeds of the various buses to decide which one is suitable for my application...

     

    RandyP said:
    The 100 MHz value for the McBSP is actually 100 Mbps. This is a serial bus, so the byte rate would be slower by a factor of 8.

    indeed

     

    RandyP said:
    GPIO is an interesting idea. Where did you get the CPUclk/24 figure?

    on page 242 of the datasheet, table 7-114 says that for GPIO inputs the pulse duration high must be 12/CPUclk and the pulse duration low must be the same; thus the minimum clock period (for a GPIO pin output used as a clock) seems to be 24/CPUclk...

     

    RandyP said:
    That sounds like an estimate for the number of CPU cycles it takes to do continuous reads from the GPIO module, but you might be able to do better using EDMA. It would take some analysis to figure out how to sync your reads with the clock rate to the ADS831, and even how to generate that clock. It might be tough to do, but it might be possible.

    My vague idea of GPIO usage for the ADC <=> DSP link was: provide the clock to both the ADS831 and the DSP (on a GPIO pin) from an external clock source; then I hoped to be able to use the edge-detection feature of the GPIO to trigger an interrupt when the clock goes high and then read the data coming from the ADC in the interrupt handler. I may need something between the external clock source and the DSP to compensate for the propagation delay of the ADC, however (so that when the interrupt is generated the data is already available on the other DSP GPIO pins to be read)...
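
    Just to show what I have in mind, here is a rough sketch of that setup using direct register accesses (hGpio is the CSL handle I already open in my test program; the DIR/SET_RIS_TRIG/BINTEN register names, their semantics, and the pin assignments are my assumptions from the GPIO documentation, so they need to be checked against csl_gpio.h and the datasheet):

    #include <csl.h>
    #include <csl_gpio.h>

    #define CLK_PIN    0             /* GPIO pin wired to the external clock */
    #define DATA_MASK  0x00000FF0    /* GPIO pins carrying the 8 ADC bits    */

    extern CSL_GpioHandle hGpio;     /* handle opened elsewhere in my code   */
    volatile Uint8 lastSample;

    /* ISR for the GPIO bank interrupt (hooked via the vector table or the
       interrupt dispatcher); it may also need to clear the GPIO interrupt
       status register here. */
    interrupt void gpio_clk_isr(void)
    {
        lastSample = (Uint8)((hGpio->regs->IN_DATA & DATA_MASK) >> 4);
    }

    void gpio_edge_setup(void)
    {
        /* clock pin and data pins are all inputs to the DSP
           (assuming DIR bit = 1 selects input, as in the GPIO guide) */
        hGpio->regs->DIR |= DATA_MASK | (1 << CLK_PIN);
        /* interrupt on the rising edge of the external clock */
        hGpio->regs->SET_RIS_TRIG = (1 << CLK_PIN);
        /* enable the bank interrupt towards the CPU */
        hGpio->regs->BINTEN = 1;
    }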

     

    RandyP said:
    If you use the SBSRAM protocol that I mentioned above (read up on the details in the EMIFA User's Guide), you could surely sustain a pretty long continuous burst. How much data do you want to store prior to later evaluation?

    Well, unfortunately I don't even know that for sure. I think that 1 ms would be enough... do you think it would be feasible?
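
    As a quick back-of-the-envelope check of the storage side (assuming 1 byte per sample; the 2 MB L2 figure for the C6455 is what I read in the datasheet, so please correct me if I misread it):

    #include <stdio.h>

    int main(void)
    {
        /* 1 ms of 8-bit samples at 80 MSPS */
        const double sample_rate = 80e6;                     /* samples/second */
        const double capture_s   = 1e-3;                     /* capture window */
        const double bytes       = sample_rate * capture_s;  /* 1 byte each    */

        /* ~80,000 bytes: this should fit easily in the C6455's 2 MB of L2,
           let alone the EVM's DDR memory. */
        printf("Capture buffer: %.0f bytes\n", bytes);
        return 0;
    }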

     

  • RandyP said:

    That sounds like an estimate for the number of CPU cycles it takes to do continuous reads from the GPIO module, but you might be able to do better using EDMA. It would take some analysis to figure out how to sync your reads with the clock rate to the ADS831, and even how to generate that clock. It might be tough to do, but it might be possible.

    Francesco Montorsi said:

    My vague idea of GPIO usage for the ADC <=> DSP link was: provide the clock to both the ADS831 and the DSP (on a GPIO pin) from an external clock source; then I hoped to be able to use the edge-detection feature of the GPIO to trigger an interrupt when the clock goes high and then read the data coming from the ADC in the interrupt handler. I may need something between the external clock source and the DSP to compensate for the propagation delay of the ADC, however (so that when the interrupt is generated the data is already available on the other DSP GPIO pins to be read)...

    well, today I tried to see how fast the GPIO pins could switch between the high and low states. Basically, the relevant portion of the test program is:

    // generate a clock-like signal on GPIO pin #2
    pinNum = CSL_GPIO_PIN2;
    while (1) {
        CSL_gpioHwControl(hGpio, CSL_GPIO_CMD_SET_BIT, &pinNum);
        CSL_gpioHwControl(hGpio, CSL_GPIO_CMD_CLEAR_BIT, &pinNum);
    }

    but the result is that GPIO pin #2 switches at only about 10 MHz... building the program in release mode makes no difference. I'd eventually like to write that portion of the program in assembly to make sure that the software is not limiting the performance... however, I couldn't find information about writing directly in assembly for this kind of DSP... can someone point me in the right direction?

    Besides, I suspect the DSP is not running at its maximum speed (1 GHz or 1.2 GHz) even though the EVM manual says that by default the DSP runs at 1 GHz (at 1 GHz I should be able to switch GPIO pins at about 1 GHz / 24 ≈ 41 MHz); in fact, I see in the "CCStudio: Parallel Debug Manager" window that the clock for cpu_0 (i.e. for my DSP) is marked as "low".

    I'd like to push the DSP internal PLL up to its maximum limit... so I tried to open an instance of the PLLC module and run:

    CSL_PllcHwSetup hwSetup = CSL_PLLC_HWSETUP_DEFAULTS_1GHZ;
    ...
    /* code to open an instance of CSL_PLLC_1 */
    ...
    status = CSL_pllcHwSetup(hPllc, &hwSetup);

    but the result is pretty weird: the example seems to run but never ends. Trying to halt the debugger I get:

    ------------------------------------

    Error during: Execution,
    An unknown error prevented the emulator from accessing the processor
    in a timely fashion.
    It is recommended to RESET EMULATOR.  This will disconnect each
    target from the emulator.  The targets should then be power cycled
    or hard reset followed by an emureset and reconnect to each target.


    Sequence ID: 15
    Error Code: -1080
    Error Class: 0x00000020

    ------------------------------------

    After restarting both the DSP (with a power off and then power on) and the Code Composer Studio everything works again...

    Any hints would be very appreciated.

    Thanks!

  • Francesco Montorsi said:

    ... the result is that GPIO pin #2 switches at only about 10 MHz... building the program in release mode makes no difference.

    This pretty much says it all. The GPIO path will not work. We should end the discussion there and go back to the SBSRAM/EMIFA method (my favorite), but you have a lot of other questions. Good job on the GPIO speed evaluation, btw; you evaluated GPIO writes, though, and not reads, so there could be some differences but I doubt they would be more than 10% different one way or the other.

    Francesco Montorsi said:

    ... in fact, I see in the "CCStudio: Parallel Debug Manager" window that the clock for cpu_0 (i.e. for my DSP) is marked as "low".

    Why are you using PDM? Do you have more than one C6455 on the JTAG chain? I have no idea what "low" means where you see it.

    Here's a way to find out how fast your processor is running.

    #include <csl.h>
    #include <csl_tsc.h>
    void main()
    {
            CSL_Uint64        counterVal;
            CSL_tscEnable();  // [ed: corrected]
            while (1)
                   counterVal = CSL_tscRead();
    }

    Build/load the program, run to the CSL_tscEnable() call, open a Watch Window, look at a clock with a second hand, click Run, wait 10 seconds, click Halt, and look at counterVal in the Watch Window. Divide that number by 10 and you will have the number of cycles per second that the DSP is running at. You will have to reset or maybe even power cycle to get the TSC counter to start over at zero again.
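
    If you prefer to skip the wristwatch, a variation on the same idea is to difference two TSC reads around the code you want to measure. This is just a sketch, and it assumes CSL_Uint64 is the plain 64-bit integer type it normally is with the C64x+ CSL:

    #include <csl.h>
    #include <csl_tsc.h>

    #define N_ITERATIONS 10000

    /* Count CPU cycles spent in a stretch of code by differencing the
       free-running 64-bit time stamp counter around it. */
    Uint32 measure_cycles(void)
    {
        CSL_Uint64 t0, t1;
        volatile int i;

        CSL_tscEnable();            /* TSC counts CPU cycles once enabled */
        t0 = CSL_tscRead();

        for (i = 0; i < N_ITERATIONS; i++)
        {
            /* ... code under test, e.g. a GPIO set/clear pair ... */
        }

        t1 = CSL_tscRead();
        return (Uint32)(t1 - t0);   /* total cycles for N_ITERATIONS */
    }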

    Francesco Montorsi said:

    I'd like to push the DSP internal PLL up to the maximum limit... however I tried to open an instance to the PLLC module and to run...

    The PLLs can be pretty tricky to work with. There is a PLL User's Guide for the device on our website that has the answers. But the GEL file that you use probably also configures the PLL in the OnTargetConnect() GEL "auto" function. You can look at that GEL script to see what the steps are, then try playing with some of the values to get the results you want. Be sure to save a copy of the original in case you want to go back to it later.

    Francesco Montorsi said:

    but the result is pretty weird: the example seems to run but never ends. Trying to halt the debugger I get:
    ------------------------------------
    Error during: Execution,
    An unknown error prevented the emulator from accessing the processor
    in a timely fashion.

    This is one of those error messages that just tells you something went really wrong. Most likely the attempt to access the PLL made something lock up or run in an invalid state. There is no good answer other than "don't do that". If you copy the GEL script into code, you should be able to make the same settings in the code, but it may be easier for evaluation purposes to just do it in the GEL script. In a real application, you have to do it all in the code, of course.

  • Hi Randy,

        thanks again for the help...

     

    RandyP said:

    This pretty much says it all. The GPIO path will not work. We should end the discussion there and go back to the SBSRAM/EMIFA method (my favorite), but you have a lot of other questions. Good job on the GPIO speed evaluation, btw; you evaluated GPIO writes, though, and not reads, so there could be some differences but I doubt they would be more than 10% different one way or the other.

    I think in the end I'll need to switch to the EMIF, but I also think that somehow I should be able to get more than 10 MHz on the GPIO with a DSP clocked at 1 GHz...

    RandyP said:
    Why are you using PDM? Do you have more than one C6455 on the JTAG chain?

    yes, the C6455 EVM is composed of two boards: the "main" one and the mezzanine one. Thus I have two C6455s in total (even though, at least for now, I've never used the one on the mezzanine board)...

    RandyP said:
    Here's a way to find out how fast your processor is running.

    #include <csl.h>
    #include <csl_tsc.h>
    void main()
    {
            CSL_Uint64        counterVal;
            CSL_tscEnable();  // [ed: corrected]
            while (1)
                   counterVal = CSL_tscRead();
    }

    many thanks for the program! I ran it (for my C6455 I replaced CSL_tscStart() with CSL_tscEnable() to make it compile) and followed your suggested procedure, getting a result of a 1 GHz clock for the DSP... good, because it means I won't need to deal with manual configuration of the DSP clock; it also means that the GPIO is running slow even with the DSP clocked at almost its maximum speed :(

    Before leaving GPIO however I'd like to test/know about two other things:

    1) could it be that CCS is limiting the performance of my DSP? I mean: the EVM needs to communicate with CCS about its status through the USB cable... if I can set up the DSP to run autonomously without the USB cable plugged in, maybe it will go faster... of course, to test this I first need to learn how to flash the EVM flash memory with my program and make the DSP autonomous (and it seems like there's no EVM manual for that).

    2) I saw that the CSL_gpioHwControl() disassembly is quite long: I haven't looked at C6455 ASM details yet, but I wonder if writing the pin set/clear operations directly in assembly could improve performance... or maybe it's just wasted time and the bus speed in practice is far from the theoretical 50 MHz...

    what do you think of the above points? maybe I'm missing something...

    thanks again

  • Francesco Montorsi said:

    1) could it be that CCS is limiting the performance of my DSP? I mean: the EVM needs to communicate with CCS about its status through the USB cable... if I can set up the DSP to run autonomously without the USB cable plugged in, maybe it will go faster... of course, to test this I first need to learn how to flash the EVM flash memory with my program and make the DSP autonomous (and it seems like there's no EVM manual for that).

    There should be a FlashWriter or FlashBurn utility somewhere that came with your EVM. If you do not find it, you should be able to get it from the board vendor's support web site.

    But you do not need to do that for this particular case, at least. First, once you load your program, you can use the CCS menu command Debug->RunFree. This will start the DSP running your code and will also logically remove CCS from communicating with the DSP. To get CCS control back, you have to use Debug->Halt. We always try this because we always suspect CCS of getting in the way, but it never is the problem anymore (10-15 years ago, maybe, but not so much now that we have CCS 3.x). When CCS is running normally, it only does refreshes at certain times, like after a single-step or halt or when you click to refresh a window. These refreshes will affect your performance only while they are happening, so they will not hurt your long-term measurements.

    Second, the fact that you get the same performance with Debug and Release builds means that CPU time is not the critical factor. If you were doing reads and writes to L2 or DDR memory, you would see a significant difference between the two. This tells me that the bulk of the time is spent in the accesses to the config space.

    Francesco Montorsi said:

    2) I saw that the CSL_gpioHwControl() disassembly is quite long: I haven't looked at C6455 ASM details yet, but I wonder if writing the pin set/clear operations directly in assembly could improve performance... or maybe it's just wasted time and the bus speed in practice is far from the theoretical 50 MHz...

    CSL_gpioHwControl() is long because it does a lot of different things. It is basically a big switch statement for all the different control commands you can pass to it. Sometimes the CSL might be inefficient in how it accesses Config space because it is written to be as robust as possible. But in this case I think it only does a single Config bus write so you would not get any appreciable benefit from going to assembly code.

    You have been measuring writes so far, so it might be interesting to measure reads. You could use a while (1) loop that increments a counter and reads the GPIO port, then use your 10 second wristwatch timer and find out how many times it was able to do the reads. It could also be interesting to try EDMA or QDMA to do the GPIO reads, but they will probably be around the same rate as the CPU.
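
    Here is a minimal sketch of that read-rate measurement, reusing the TSC counter from my earlier post (hGpio is assumed to be the GPIO handle you already open in your test program, and the loop count is arbitrary):

    #include <csl.h>
    #include <csl_gpio.h>
    #include <csl_tsc.h>

    #define N_READS 1000000

    /* Time a burst of GPIO IN_DATA reads and report cycles per read. */
    Uint32 measure_gpio_read_cycles(CSL_GpioHandle hGpio)
    {
        volatile Uint32 value;
        CSL_Uint64 t0, t1;
        Uint32 i;

        CSL_tscEnable();
        t0 = CSL_tscRead();
        for (i = 0; i < N_READS; i++)
            value = hGpio->regs->IN_DATA;   /* one config-bus read per pass */
        t1 = CSL_tscRead();

        /* cycles per read; the read rate is roughly CPU MHz / this value */
        return (Uint32)((t1 - t0) / N_READS);
    }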

  • Hi,

    sorry for the long delay in replying, but my co-workers and I tried different solutions in the meantime. We learned how to use the EMIF bus and we're using it with some success. However, we still need to learn how to use it in conjunction with the EDMA (any sample code about this would be fantastic! the available examples perform transfers of memory blocks from one address to another; they never involve an external peripheral).

    We also learned and tested some things on an FPGA to try to solve the problem (but we decided to first try at least a low-speed [10 MHz] direct link between the DSP and the ADC); we also have some news about the GPIO, however.

     

    RandyP said:
    First, once you load your program, you can use the CCS menu command Debug->RunFree. This will start the DSP running your code and will also logically remove CCS from communicating with the DSP. To get CCS control back, you have to use Debug->Halt. We always try this because we always suspect CCS of getting in the way, but it never is the problem anymore (10-15 years ago, maybe, but not so much now that we have CCS 3.x). When CCS is running normally, it only does refreshes at certain times, like after a single-step or halt or when you click to refresh a window. These refreshes will affect your performance only while they are happening, so they will not hurt your long-term measurements.

    You're right; it looks like CCS isn't causing any noticeable performance hit... but:

     

    RandyP said:

    CSL_gpioHwControl() is long because it does a lot of different things. It is basically a big switch statement for all the different control commands you can pass to it. Sometimes the CSL might be inefficient in how it accesses Config space because it is written to be as robust as possible. But in this case I think it only does a single Config bus write so you would not get any appreciable benefit from going to assembly code.

    We haven't tried assembly yet. However, we managed to get an amazing 4x increase in GPIO speed using this code:

    while (1) {
        hGpio->regs->SET_DATA = (1 << pinNum);
        hGpio->regs->CLR_DATA = (1 << pinNum);
    }

    instead of the previous code:

    while (1) {
        CSL_gpioHwControl(hGpio, CSL_GPIO_CMD_SET_BIT, &pinNum);
        CSL_gpioHwControl(hGpio, CSL_GPIO_CMD_CLEAR_BIT, &pinNum);
    }

    The former code, in fact, runs at about 41 MHz vs. the 10-11 MHz of the latter! For a first simple test of DSP-ADC interfacing, we're aiming to do something like:

    while (1) {
        // generate a clock transition for the ADC on the GPIO pin that we
        // have connected from the DSP to the ADC clock:
        hGpio->regs->SET_DATA = (1 << pinNum);
        hGpio->regs->CLR_DATA = (1 << pinNum);

        // read the data the ADC sent over the GPIO bus to the DSP:
        bus[i % BUFFER_SIZE] = hGpio->regs->IN_DATA;
        i++;
    }

    we still cannot test this code in practice because of some trouble we're having with the connectors of the ADC EVM. Anyway, do you think it could work? (obviously all the DSP pins connected to the ADC data pins are configured as inputs; the pin we use to generate the clock for the ADC is the only DSP output)

     

    RandyP said:

    You have been measuring writes so far, so it might be interesting to measure reads. You could use a while (1) loop that increments a counter and reads the GPIO port, then use your 10 second wristwatch timer and find out how many times it was able to do the reads. It could also be interesting to try EDMA or QDMA to do the GPIO reads, but they will probably be around the same rate as the CPU.

    As I said, we're not able to use EDMA/QDMA yet :(

    Any simple example code that uses the EDMA with the GPIO would be very much appreciated.

     

    Another problem we are facing with this low-speed GPIO link between the DSP and the ADC is that we are having trouble finding enough usable GPIO pins on the DSP EVM.

    So far we have managed to get 8 usable pins from the DSP EVM. All other GPIO pins seem to be multiplexed for other purposes, and the mux/demux on the EVM board is controlled by a CPLD. Unfortunately, we are currently unable to use the GPIO pins multiplexed with McBSP1 because the CPLD doesn't generate the correct signals for the mux/demux. Does anyone have experience with this kind of problem?

    We suspect that the problem is not only hardware (i.e. the routing of the GPIO pin signals on the EVM) but also software (if we put an oscilloscope probe right on the pins of the DSP we want to use, we cannot read any of the GPIO signals we send through the DSP software...). As far as we know, however, upon startup all internal DSP subsystems are off and we explicitly turn on _only_ the GPIO, so we don't understand why the DSP does not seem to generate the signals we tell it to generate on some pins...

     

    Thanks a lot for any hint,

    Francesco

     

     

  • With my current setup, I do not have a scope, so my performance measurements are just based on timing how long it takes to go through the code 10000 times and calculating the resulting frequency.

    I was able to duplicate your GPIO performance numbers on my DSK, getting 9.4 MHz with functional CSL and 41.7 MHz with register CSL. A different version of CSL, CGT, and compiler switches might explain the slight difference between our numbers; I am using CSL 3.00.10.02 and CGT 6.1.5 and only -o3.

    When I add the bus[i] = hGpio->regs->IN_DATA; line to the loop, the rate goes down to 11.9 MHz, since an extra read is then going on in the loop. If you include the %BUFFER_SIZE it goes down to 11.1 MHz in my test. Also, from looking at the optimized assembly code, the three GPIO access instructions (2 STWs to set and clear the clock bit and then the 1 LDW to read IN_DATA) are immediately one after the other - this means the clock pulse will not have a 50% duty cycle, and the read could happen at an unexpected relationship to the clock edges depending on where the read phase of the LDW falls relative to the write phase of the STW(s). Plus, everything in the GPIO peripheral runs off a slower clock within the peripheral, so the actual GPIO behavior may have to be determined by bench testing.

    This has been an interesting exercise, but I will still recommend against this approach at these speeds. If you were looking at 1 MHz or less, then you could do things to make sure the GPIO reads and writes happen in the right order, like reading the CLR_DATA register after writing to it and before reading the IN_DATA register (writes will always be in order, reads will always be in order, and writes/reads to the same address will always be in order, assuming the compiler/optimizer plays nicely), as sketched below.
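
    Here is what I mean for a low-rate version of your loop (just a sketch to illustrate the ordering point; the read-back value is thrown away, and the variables are the ones from your own test code):

    volatile Uint32 readBack;

    while (1) {
        hGpio->regs->SET_DATA = (1 << pinNum);        // clock high
        hGpio->regs->CLR_DATA = (1 << pinNum);        // clock low
        readBack = hGpio->regs->CLR_DATA;             // read-back: forces the
                                                      // CLR_DATA write to land first
        bus[i % BUFFER_SIZE] = hGpio->regs->IN_DATA;  // now sample the data pins
        i++;
    }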

    To your question on finding enough GPIOs: the McBSP1 pins (possibly mislabeled as McBSP2) that come to J3 can be used as GPIOs, and the External Interrupt 4-7 pins can be used as GPIOs. This should be enough for your needs, with GP[11:4] providing a reasonably easy-to-get byte of data and GP[0] and GP[3] available for a CLK and maybe something else.

    EDMA3 is one of the most powerful features of the "modern" set of DSPs. It is not trivial to understand, and it is well worth the effort to look through the User's Guide to learn all of its features. In the next few months we should have some online training material to make it a little easier to understand, and our IW6455 4-day workshop is a great way to quickly get up to speed and start developing with many of the features of the C6455.

    The Technical Training Organization has developed a set of example projects that are being integrated into the workshop. They may go through revisions from the early copy that was sent, but I have attached TTO_LLD_example1_async_6455.zip to this posting for your use. It does a simple copy from a SrcBuf to a DstBuf without requiring any external events to synchronize the start of the transfer; it is started asynchronously by direct program control. This example uses the EDMA3 LLD (Low Level Drivers), which is available from the CCS Update Advisor web site.

    There are good comments in main.c to explain what the project does. To see how the EDMA transfer is set up, look at the edma_createChan() function in the edma.c file. For your application, where you want to read from an external peripheral on the EMIF into an internal memory buffer (for best speed, this is my recommendation), you only need to change the source address to the address of the EMIF peripheral. There might be some other little details, like zero-indexing for the source, ACNT/BCNT, and the hard-coded BUFFSIZE, that need to change too, but you will figure those out once you start to understand the LLD example and the EDMA3 peripheral (the sketch below shows the PaRAM-level idea). You will have to configure and initialize the EMIF for SBSRAM access, too.
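
    To show just the PaRAM-level idea (a conceptual sketch, not the LLD API: the field names follow the EDMA3 PaRAM documentation, while the CE2 base address and block size are placeholders that you must check against the C6455 memory map and your EMIFA configuration):

    #include <csl.h>      /* for the Uint8/Uint16/Uint32/Int16 types */

    #define EMIFA_CE2_BASE  0xA0000000u   /* assumed EMIFA CE2 window      */
    #define BLOCK_SAMPLES   1024
    #define BURST_BYTES     8             /* one 64-bit wide EMIFA access  */

    Uint8 dstBuf[BLOCK_SAMPLES];

    typedef struct {
        Uint32 opt;       /* sync mode, transfer-complete code, etc.       */
        Uint32 src;       /* source address                                */
        Uint16 acnt;      /* bytes per array (per burst)                   */
        Uint16 bcnt;      /* arrays per frame                              */
        Uint32 dst;       /* destination address                          */
        Int16  srcbidx;   /* source increment between arrays               */
        Int16  dstbidx;   /* destination increment between arrays          */
    } ParamSketch;        /* link/reload and the C dimension omitted here  */

    void setup_capture_sketch(ParamSketch *p)
    {
        p->opt     = 0;                        /* fill in per the LLD/UG    */
        p->src     = EMIFA_CE2_BASE;           /* fixed EMIF location       */
        p->acnt    = BURST_BYTES;
        p->bcnt    = BLOCK_SAMPLES / BURST_BYTES;
        p->dst     = (Uint32)dstBuf;
        p->srcbidx = 0;                        /* never advance the source  */
        p->dstbidx = BURST_BYTES;              /* pack the buffer linearly  */
    }

    The key point is srcBIdx = 0, so every burst re-reads the same EMIF address while the destination walks linearly through dstBuf; in the LLD example this should correspond to the source-address and index settings made in edma_createChan().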

    TTO_LLD_example1_async_6455.zip