AM2634-Q1: AM2634 + EDMA3 + MCSPI + SPI-ADC System Performance Questions

Part Number: AM2634-Q1
Other Parts Discussed in Thread: AM2634, ADS7038-Q1, ADS7038

Hi,

for our application we need additional analog input channels beyond those available on the AM2634. Mainly because of different voltage levels in the system, an external multichannel ADC with an SPI interface shall be used. The ADS7038-Q1 is currently our preferred solution. However, I am not sure if the chosen solution is the best one, since we use the so-called 'On-The-Fly' channel selection mode of the SPI-ADC. To do so we need to issue a transfer every(!) 1.25 usec to transmit the channel we want to convert next. Basically it works. However, we intend to use the EDMA for many other requests, and I am a little unsure whether this system solution will be able to handle all the data transfers into the on-chip system, despite the fact that this is a multi-core system with a lot of resources and functionality.

Besides the 1.25 us transfers to/from the external SPI-ADC we will need DMA transfers every 5 usec for

- FSI channels
- Data transfers from the Realtime control system to the system memory, etc.

All in all we will need (without any networking DMA) 5..7 channels: two of them every 1.25 usec, the remaining channels every 5 usec. Now, before I ask the questions, I want to show how it shall be done for the external SPI-ADC.
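As a rough sanity check on the numbers above, the DMA event load can be estimated with a few lines of arithmetic. This is only a sketch under the stated figures (2 channels at 1.25 usec, the remaining 3 to 5 channels at 5 usec); the function names are made up for illustration, and the result says nothing about how long the EDMA needs per event.

```python
# Rough EDMA event-rate budget from the figures in this post.
FAST_PERIOD_US = 1.25   # SPI-ADC write and read channels
SLOW_PERIOD_US = 5.0    # FSI, control-subsystem-to-memory transfers, etc.
FAST_CHANNELS = 2       # "two of them every 1.25 usec"


def events_per_tick(total_channels):
    """Number of DMA events that fall into one 5 usec main tick."""
    slow_channels = total_channels - FAST_CHANNELS
    fast_events = FAST_CHANNELS * round(SLOW_PERIOD_US / FAST_PERIOD_US)
    return fast_events + slow_channels


def events_per_second(total_channels):
    """Total DMA events per second for the given channel count."""
    ticks_per_second = round(1_000_000 / SLOW_PERIOD_US)
    return events_per_tick(total_channels) * ticks_per_second


def average_spacing_ns(total_channels):
    """Average time between two DMA events if they were evenly spread."""
    return SLOW_PERIOD_US * 1000.0 / events_per_tick(total_channels)


if __name__ == "__main__":
    for n in (5, 7):
        print(n, "channels:", events_per_tick(n), "events per 5 usec,",
              events_per_second(n), "events per second,",
              round(average_spacing_ns(n)), "ns average spacing")
```

With 5 to 7 channels this gives 11 to 13 events per 5 usec tick. In practice the events are not evenly spread, since several of them are aligned to the same tick, so the worst-case spacing is tighter than the average.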

The drawing shows the 'components' on the SoC that are configured to handle the 1.25us transfers. Not shown on the drawing are the digital isolators needed to isolate the different voltage levels from each other. The AM2634 is on voltage level 'A', the external SPI-ADC is on voltage level 'B'. In real life it looks as follows:

The so-called 'On-The-Fly' mode of the ADS7038 shall be used in this context. However, maybe a better approach exists. It looks like each conversion requires a rising edge of the chip select. In other words, I cannot issue a burst conversion of a bunch of analog input channels with a single pin signal event or so ...

- The 'scope-shot' shows the trigger from the EPWM0 unit every 1.25us (light-blue)
- The chip select from the SPI0 to the external SPI-ADC (Magenta)
- The clock. In this configuration clock speed is 25 MHz, 16 clocks per transfer (Green).

(Not shown on this scope-shot are the data lines SDI and SDO)
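For orientation, the time the SPI frame itself occupies within each 1.25 us period follows directly from the clock settings listed above. A minimal sketch, using only the 25 MHz clock and 16 clocks per transfer from the scope-shot description (the names are illustrative):

```python
# SPI frame duration versus the 1.25 us trigger period.
SCLK_HZ = 25_000_000      # SPI clock from the scope-shot description
BITS_PER_FRAME = 16       # clocks per transfer
TRIGGER_PERIOD_NS = 1250.0


def frame_time_ns(sclk_hz=SCLK_HZ, bits=BITS_PER_FRAME):
    """Time the clock is active for one transfer."""
    return bits * 1e9 / sclk_hz


def remaining_ns(period_ns=TRIGGER_PERIOD_NS):
    """Time per period that is not spent clocking data."""
    return period_ns - frame_time_ns()


if __name__ == "__main__":
    print("frame time:", frame_time_ns(), "ns")
    print("remaining per period:", remaining_ns(), "ns")
```

The frame takes 640 ns, which leaves 610 ns per period for everything else: trigger latency, chip-select setup and hold, and the chip-select high time the ADC needs between frames.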

Since more than 4 external channels are needed, a sequence with a different order is used to select all the channels needed. After 15 usec this sequence repeats. The table that describes the order of this sequence is in main memory and is read by the DMA. The system shall be configured to run endlessly. It shall transfer data every 1.25 and 5 usec and process and filter data every 5 usec. At this level of processing it shall operate essentially jitter-free, as close as possible to what an external FPGA would deliver, to guarantee jitter-free operation at this data collection and pre-processing level.

Now the questions (sorry a lot of questions ...):

Q1: Can this be guaranteed with this system approach? I know this question is a little bit ambiguous. However, I have no idea how to measure the EDMA3 performance with the involved IPs (MCSPI+RTCS+FSI+PRU) on this system without implementing the whole application, which I can't do right now. Any idea how I could measure this or make a potential problem visible is very welcome. Please let me know.

Q2: After the EPWM0 event (1.25 us trigger), more than 300ns pass before the falling edge of the chip select takes place. How much of this time is needed by the DMA? Is this mainly because the DMA is 'slow', or is the root cause the MCSPI-IP configuration? The DMA in this context reads one word (16 bit) from the table in system memory and writes this word to the output register/buffer of the MCSPI component in order to select the channel to be converted next. As far as I understand, the system clock is 200 MHz for the DMA. Correct?
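To put the 300ns into perspective, it can be expressed in DMA clock cycles and subtracted from the period together with the frame time. This is a sketch only: the 200 MHz value is the assumption stated in Q2 and is not confirmed in this thread, and 300 ns is taken as a lower bound since the measured delay is "more than 300ns".

```python
# Trigger-to-chip-select delay in clock cycles, and what is left of the period.
DMA_CLOCK_HZ = 200_000_000    # assumption from Q2, not confirmed
TRIGGER_TO_CS_NS = 300.0      # observed lower bound from the scope
TRIGGER_PERIOD_NS = 1250.0
FRAME_TIME_NS = 16 * 1e9 / 25_000_000   # 16 clocks at 25 MHz


def ns_to_cycles(ns, clock_hz=DMA_CLOCK_HZ):
    """Convert a duration in ns to clock cycles at the given frequency."""
    return ns * clock_hz / 1e9


def slack_ns(latency_ns=TRIGGER_TO_CS_NS):
    """Period minus trigger latency minus frame time."""
    return TRIGGER_PERIOD_NS - latency_ns - FRAME_TIME_NS


if __name__ == "__main__":
    print("latency in cycles:", ns_to_cycles(TRIGGER_TO_CS_NS))
    print("slack per period:", slack_ns(), "ns")
```

Under these assumptions the delay corresponds to about 60 cycles, and roughly 310 ns remain per period after latency and frame time. Any extra delay caused by other DMA traffic would have to fit into that remainder.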

Q3: Is there a better way to periodically issue a write & data transfer operation than the way it is done now on the MCSPI0-IP to select and collect the data from the external SPI-ADC? I use the DMA. I couldn't find any other option, for example triggering directly from a timer or an EPWM unit. That is the reason why I use one DMA channel, and also because I use this 'On-The-Fly' mode of this ADC. However, this is not a must.

Q4: Is there a better way to connect an external multi channel ADC to the AM2634 than the MCSPI interface with respect to the system-transfer performance?

Q5: Is it correct that the DMA read request (one data word received from the external SPI-ADC) must be issued after each word? The read request is needed to store the received value into the data memory of the PRU0 for pre-processing. So I cannot burst-store a packet but must transfer each word? How much time does it take? Or how can I make the time needed visible? As far as I understand, there is no option to route the time the DMA is bus master on the internal system to a GPIO pin to make the bus ownership time visible.

Q6: Is it correct that I cannot use DMA and FIFO within the MCSPI at the same time for the same channel? Here I only use channel0 of the MCSPI. I tried to enable and configure the receive FIFO on the MCSPI, but when I do so I no longer get any DMA read requests ...

Hopefully not too many questions. Anyway thanks for sharing and hopefully I can get some ideas to better 'estimate' if the system performance will be enough under all circumstances or if other concepts maybe are better for this type of application.

br
Markus

  • Hi Markus,

    First of all, I need to clarify a few things in your system:

    1. You said that every 1.25us, you will need to switch ADC channel, there are 8 channels in the SPI-ADC, so one cycle takes 8x1.25us = 10us. I do not understand where the 15us comes from.

    2. You said there is another DMA used to transfer data on FSI every 5us and also a DMA used to transfer data between the Realtime control system to the system memory every 5us. I assume the Realtime control system is a R5F core, right?

    3. What does the PRU0 do here? Does it process the SPI-ADC data?

    4. In your system diagram, the EDMA3 is triggered by the Event from the EPWM0 and also the read request from the SPI-ADC. How is it set up?

    Q1: It looks like you only have a few events 1.25us and 5us, so AM263x should be able to handle your system OK.

    Q2: It really depends on how you trigger the EDMA for McSPI TX from EPWM0. You may want to check TRM section 13.1.3.4.6.5, First MCSPI Word Delay, for the first word delay.

    Q3: The other alternative is to use the PRU to trigger it.

    Q4: The other alternative is to use the PRU to trigger it. I am not sure which way is better, but if you will use the PRU to further process the ADC data, then using the PRU is probably the better way.

    Q5: Given the way the ADS7038-Q1 works, it seems to be the only way. It should not take long though, because if you set up the EDMA3 for McSPI correctly, then it should be triggered by the HW. No SW is involved.

    Q6: According to the TRM 13.1.3.5.2.1.8 Transfer Procedures With FIFO, the FIFO and DMA can be used at the same time.

    Best regards,

    Ming

  • Hi Ming,

    thank you for your reply. I will try to answer your questions to give a better picture.

    1. You said that every 1.25us, you will need to switch ADC channel, there are 8 channels in the SPI-ADC, so one cycle takes 8x1.25us = 10us. I do not understand where the 15us comes from.

    Sorry for not being more descriptive. The basic internal main tick, so to say, for the low-level pre-processing is 5 usec, i.e. 200 kHz. For this we need a few additional external analog signals on another voltage potential. That's why we are adding the external SPI-ADC, galvanically isolated. Not all the input signals connected to this external SPI-ADC are needed every 5 usec. Some of them can be sampled at a lower rate. We identified an assignment of 6 input signals and an acquisition sequence that repeats itself every 3x5usec, i.e. 15 usec. The sampling of the analog inputs on the SPI-ADC is shown below:

    The above sampling scheme shows that for the application we need 'Ud' & 'Udbot' every 5usec, the phase voltages 'UphsR,S,T' every 15usec, and the discharge current every 5usec. In order to achieve this, a sampling period of 1.25 us has been chosen for the external SPI-ADC. There might be other channel sequences. However, we want it to be phase-aligned with the 5usec main tick, since all the other analog input signals are measured directly using the 5 on-chip ADCs of the AM2634. That's how it all comes together: 4 x 1.25 usec = 5 usec.
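A sequence with these properties can be written down and checked in a few lines. This is a sketch of one possible ordering, not necessarily the one in the sampling diagram: the slot order within each 5usec tick is an assumption, and 'Idis' is a placeholder name for the discharge current.

```python
# One possible 12-slot acquisition sequence (12 x 1.25 usec = 15 usec).
SLOT_US = 1.25
EVERY_TICK = ["Ud", "Udbot", "Idis"]        # sampled every 5 usec; Idis is a placeholder name
ROTATING = ["UphsR", "UphsS", "UphsT"]      # one phase voltage per 5 usec tick


def build_sequence():
    """Three 5 usec ticks, each with the fixed signals plus one phase voltage."""
    seq = []
    for phase in ROTATING:
        seq.extend(EVERY_TICK + [phase])
    return seq


def repeat_period_us(seq, name):
    """Spacing between consecutive samples of one signal, with wrap-around."""
    slots = [i for i, s in enumerate(seq) if s == name]
    gaps = {
        (slots[(k + 1) % len(slots)] - slots[k]) % len(seq) or len(seq)
        for k in range(len(slots))
    }
    if len(gaps) != 1:
        raise ValueError(name + " is not sampled at a constant period")
    return gaps.pop() * SLOT_US


if __name__ == "__main__":
    seq = build_sequence()
    print(seq)
    for name in EVERY_TICK + ROTATING:
        print(name, "every", repeat_period_us(seq, name), "usec")
```

The check confirms that the fixed signals recur every 5 usec and each phase voltage every 15 usec, so all 6 signals fit into the 12 slots with equidistant sampling.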

    2. You said there is another DMA used to transfer data on FSI every 5us and also a DMA used to transfer data between the Realtime control system to the system memory every 5us. I assume the Realtime control system is a R5F core, right?

    Sorry for the imprecise naming. The TRM of the AM2634 calls it the "Real-time Control Subsystem" (CONTROLSS). Part of this subsystem are the on-chip ADCs and the trigger sources like the EPWMs. Therefore I am not talking about the R5F ARM cores here. I mean the peripheral HW resources that make it possible to measure the data and bring it into the system every 5usec, ready for being pre-processed on the PRUs. What the PRUs are doing is mainly filtering. The checked and filtered values are distributed using FSI to an external companion chip and to the upper-level R5F CPUs, on which a more traditional control-loop system based on Simulink or similar will be executed at a much lower control loop rate, i.e. 10..20 kHz or so.

    3. What does the PRU0 do here? Does it process the SPI-ADC data?

    Yes!

    4. In your system diagram, the EDMA3 is triggered by the Event from the EPWM0 and also the read request from the SPI-ADC. How is it set up?

    If I want the 1.25usec trigger event to be phase-aligned with the 5usec event for the sampling within the CONTROLSS, I need to use the EPWMs, and therefore I need one EPWM for the 1.25us to trigger one DMA channel, which writes the correct control word from a table in memory to the external SPI-ADC. In my understanding this is the only way to do so. I know the MCSPI also has a DMA write request option. However, how do I phase-align this with the 5usec main tick of the pre-processing control loop? Using two EPWMs this can be done very easily and precisely.

    Q1: It looks like you only have a few events 1.25us and 5us, so AM263x should be able to handle your system OK.

    Okay. Thank you.

    Q2: It really depends on how you trigger the EDMA for McSPI TX from EPWM0. You may want to check TRM section 13.1.3.4.6.5, First MCSPI Word Delay, for the first word delay.

    Thank you for this. I did check it with my current test application. This delay is set to the minimum. Therefore the ~300ns delay must have another root-cause.

    Q3: The other alternative is to use the PRU to trigger it.

    Will discuss this option with my client...

    Q4: The other alternative is to use the PRU to trigger it. I am not sure which way is better, but if you will use the PRU to further process the ADC data, then using the PRU is probably the better way.

    Okay. We will look into it.

    Q5: Given the way the ADS7038-Q1 works, it seems to be the only way. It should not take long though, because if you set up the EDMA3 for McSPI correctly, then it should be triggered by the HW. No SW is involved.

    Yes, once set up, the sample program runs without any CPU involvement. That's correct. The sample programs are made to check whether the internal bandwidth is high enough to get all the needed data in place and transferred every 5us (1.25us). Still, I need a better understanding of the amount of time needed, for example, for the EDMA to process a trigger and to actually execute the data movement.

    For example, the 300ns seen on the MCSPI: is it from the trigger event pre-processing of the EDMA, or is this just the MCSPI internal logic? If it is the latter, then I am confident that the EDMA does not need a lot of time to pre- and post-process the trigger event before it moves a single word. In other words, the EDMA does not have a lot of overhead in terms of time needed until it moves the data. On earlier architectures, single-word data movement operations were not very efficient since the overhead to do so was significant. For larger data blocks this is a no-brainer. That's also one of the reasons I would prefer using the receive FIFO within the MCSPI to collect, for example, 4 words and then issue a single DMA read request from the MCSPI to move the 4 words held in the FIFO in one DMA transaction. The EDMA still needs to read the parameter block, process it and write the changed block back to memory, including the update of all its event registers etc. How much of all this happens in parallel I don't know.
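The trade-off of collecting words in the receive FIFO can be sketched numerically. This assumes one received word per 1.25 us frame and a DMA read request that fires once the chosen number of words is in the FIFO; whether the MCSPI can be configured this way together with the EPWM-triggered write is the open point of Q6, and the function names are illustrative only.

```python
# Effect of batching received words before issuing one DMA read request.
WORD_PERIOD_US = 1.25   # one received word per SPI frame


def rx_requests_per_second(words_per_request):
    """DMA read requests per second when batching that many words."""
    words_per_second = 1_000_000 / WORD_PERIOD_US
    return words_per_second / words_per_request


def oldest_word_wait_us(words_per_request):
    """Extra time the first word of a batch waits in the FIFO."""
    return (words_per_request - 1) * WORD_PERIOD_US


if __name__ == "__main__":
    for n in (1, 4):
        print(n, "word(s) per request:",
              rx_requests_per_second(n), "requests per second,",
              oldest_word_wait_us(n), "us added delay for the oldest word")
```

Batching 4 words reduces the read requests from 800,000 to 200,000 per second, i.e. one per 5 usec tick, at the cost of the first word of each batch reaching PRU0 memory 3.75 us later than with single-word transfers.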

    Q6: According to the TRM 13.1.3.5.2.1.8 Transfer Procedures With FIFO, the FIFO and DMA can be used at the same time.

    Okay, thank you. I will look into it. The last time I tried, it did not work.

    br
    Markus

  • Hi Markus,

    Thank you so much for your detailed answers. Now it is clearer.

    "Still, I need a better understanding of the amount of time needed, for example, for the EDMA to process a trigger and to actually execute the data movement."

    I do not have a definite answer for you right now. I will check with the IP owner next Monday and get back to you early next week.

    Best regards,

    Ming