Hi
We suffer from unexplained jitter in FreeRTOS (Cortex R, core 0).
Measurement method:
Differences are used to assess system jitter.
Simple example, one task, jitter within 2-5 uSecs. Good.
It looks like latency grows when I enable more parts in software, plus some factors that I cannot explain.
The code does not access any hardware. Low-priority tasks wake up periodically but do nothing.
For example, there is a big difference in latency with and without initialization of the network code.
Without network init, latency/jitter is +/- 20 uSec; with network init but the cable disconnected (no interrupts from DMA), it is +/- 80 uSec.
All code runs from DDR; DDR is cached.
Can cache misses explain such behavior?
Any thoughts ?
Hi Rasty,
In a nutshell, with a complex application such as the TCP/IP network stack running, this behavior is expected, because the latency you measured is from a low-priority task. If higher-priority tasks take a long time to finish, the response of the low-priority task is delayed. The interrupt service routine itself is not delayed; only the low-priority task's response is.
One way to reduce the latency is to shorten the execution time of the higher-priority tasks by placing the most frequently used functions and data in TCM or OCRAM instead of DDR. You will need to profile your application to identify which functions and data areas belong in the faster memory.
You can also raise the priority of the task that needs a short response time.
There are only 32 KB of program cache and 32 KB of data cache, so with a program as complex as a TCP/IP network stack, cache misses will certainly increase, and with them the execution time of the higher-priority tasks. That is why you want the most frequently used functions and data in TCM or OCRAM instead of DDR.
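As a rough illustration of the placement suggestion above: GCC and Clang-based compilers accept a `section` attribute on functions and data. The section names `.fast_text` and `.fast_data` and the function `control_step` are hypothetical, and the attribute alone moves nothing; the linker command file still has to map those sections to TCM or OCRAM.

```c
#include <stdint.h>

/* Hypothetical section names: the linker command file must place them
 * in TCM or OCRAM for this to have any effect. */
#define FAST_TEXT __attribute__((section(".fast_text"), noinline))
#define FAST_DATA __attribute__((section(".fast_data")))

/* Frequently accessed state kept out of DDR. */
FAST_DATA static int32_t g_integrator = 0;

/* Hot-path function (hypothetical) kept out of DDR. */
FAST_TEXT int32_t control_step(int32_t error)
{
    g_integrator += error;
    return g_integrator;
}
```

Profiling decides which functions and variables deserve these attributes; tagging everything would simply overflow the small fast memories.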
Best regards,
Ming
Hi,
I think that you did not get me right.
I measure the response of the highest-priority task. It jitters even though nothing runs above it.
I expect some influence of cache refill, but not that high.
Thanks
Rasty
Hi Rasty,
Do you have any tasks with the same priority as the task whose response latency you measure?
Best regards,
Ming
Hi Rasty,
If that is the case, then the delay must be in the network related ISRs. Can you measure the time taken in the network related ISRs using the CycleCounterP APIs?
Best regards,
Ming
Hi,
Network cable is unplugged, no DMA interrupts, checked with break point.
Thanks
Rasty
I found the major source of the real-time problems, but not all of them.
How do I reconfigure Ethernet to work in polling mode, without DMA, without cache invalidation, and without interrupts?
Hi Rasty,
Since this is a question for Ethernet, I will forward your thread to our Ethernet-CPWS expert for further help!
Best regards,
Ming
Hi Rasty
Is it possible to share a test which we can replicate on the EVM?
1. From this image, are you inferring that there is a cache-invalidate API which may be impacting performance? I'll try to look further into the call trace, but I'd expect this to invalidate only a particular cache block and not to affect the cacheability of the high-priority task (unless there is some interdependence, such as the same cache line also holding code for your high-priority task).
2. Is this call trace only when sending/receiving packets? Or does this trigger even when the cable is disconnected?
3. What interrupts are registered (and triggering) for the R5F and what are their priorities? I see a 500us timer interrupt and an Ethernet interrupt(?).
Regards
Karan
Hi
1. I did not expect cache invalidation, because there is an option to define non-cached memory for the stack via the MPU.
2. Just put a breakpoint on the cache-invalidation function and see what happens.
3. Only timer interrupt, and DMA.
Is it possible to switch to polling mode?
Thanks
Rasty
1. GCC-compiled example (not TI CLANG).
2. Rasty to share which SDK example is being used here. TI to reuse (with TI CLANG) that and add a 500us timer task to see if we are able to replicate this issue.
3. Rasty to comment on what is the priority of the 500us timer task.
Is it possible to switch to polling mode?
4. TI to share steps for this.
Regards
Karan
Hi
Please explain how all those questions are related to my request?
** Is it possible to switch the Ethernet driver/stack to polling mode and/or eliminate cache invalidation?
Timer priority is irrelevant, because the problem is a general slowdown of the system after initialization of the Ethernet stack, while the network cable is *not* connected.
Rasty
Your request is a part of question 4 per my notes.
Other questions / comments will help us debug this issue.
Regards
Karan
Hi Rasty,
You can refer "enet_lwip_cpsw" example for polling method implementation.
file:///C:/Repos/mcu_plus_sdk/docs/api_guide_am243x/EXAMPLES_ENET_LWIP_CPSW.html
Can you share your test example to reproduce the issue on my setup?
Best Regards
Ashwani
Notes from 3/7
1. TCP/IP is not time critical, this can be put in DDR un-cached. Ashwani (TI) to provide details on what buffers/code can be put in uncached DDR. Also how to make these code changes for changing the mem map.
Step1. Put Data only in uncached DDR.
Step2. Put .text selectively in uncached DDR.
Regards
Karan
Hi Rasty,
Sorry for delay in response.
Please follow below document for latency benchmark.
Allow me some more time to sync internally and get back to you.
Best Regards,
Ashwani
Hi
How do I change these parameters (they come from a file generated by SysConfig)?
/*! \brief RX packet task stack size */
#define LWIPIF_RX_PACKET_TASK_STACK (1024U)
/*! \brief TX packet task stack size */
#define LWIPIF_TX_PACKET_TASK_STACK (1024U)
/*! \brief Links status poll task stack size */
#if (_DEBUG_ == 1)
#define LWIPIF_POLL_TASK_STACK (3072U)
#else
#define LWIPIF_POLL_TASK_STACK (1024U)
#endif
Thanks
Rasty
Hi Karan
I still did not get an answer to this question.
I tried the following:
1. Defined a memory area in DDR starting at address 0x90000000 as non-cacheable (DDR_NC).
2. Moved the following sections to that non-cacheable area:
DDR : ORIGIN = 0x80000000 , LENGTH = 0x10000000
DDR_NC : ORIGIN = 0x90000000 , LENGTH = 0x10000000
*(*ENET_DMA_DESC_MEMPOOL)
*(*ENET_DMA_RING_MEMPOOL)
/*#if (ENET_SYSCFG_PKT_POOL_ENABLE==1)*/
*(*ENET_DMA_PKT_MEMPOOL)
/*#endif*/
} > DDR_NC
.bss (NOLOAD) : ALIGN (128) {*(ENET_DMA_OBJ_MEM)} > DDR_NC
.bss (NOLOAD) : ALIGN (128) {*(ENET_DMA_PKT_INFO_MEMPOOL)} > DDR_NC
3. Disable cache invalidation.
Once I disable cache invalidation, the stack behaves very strangely.
I need help moving the DMA buffers to non-cached memory and getting rid of cache invalidation.
get rid of cache invalidation
We can not completely get rid of cache invalidation as it is handled in driver code.
DMA descriptor and DMA RING MEMORY should always be cached.
ENET_DMA_OBJ_MEM
This should be in cached section.
ENET_DMA_PKT_MEMPOOL
This is related to packet payload. So, can be cached or un-cached memory location, based on use case.
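To make the cached-versus-uncached trade-off more concrete: when a DMA buffer lives in cached memory, the core has to invalidate the cache lines covering it before reading what the DMA wrote, and cache maintenance always acts on whole lines. The sketch below only computes the line-aligned range such a call would cover. The 32-byte line size is an assumption for the R5F L1 cache, and `cache_range_for` is a hypothetical helper, not an SDK function.

```c
#include <stdint.h>

#define CACHE_LINE 32u  /* assumed L1 line size; check the core's TRM */

typedef struct {
    uintptr_t start;  /* first line-aligned address covered */
    uint32_t  size;   /* bytes covered, a multiple of CACHE_LINE */
} cache_range_t;

/* Round [addr, addr + len) outward to whole cache lines. */
cache_range_t cache_range_for(uintptr_t addr, uint32_t len)
{
    cache_range_t r;
    uintptr_t mask = ~(uintptr_t)(CACHE_LINE - 1u);
    uintptr_t end_aligned = (addr + len + CACHE_LINE - 1u) & mask;

    r.start = addr & mask;
    r.size = (uint32_t)(end_aligned - r.start);
    return r;
}
```

An 8-byte buffer that straddles a line boundary costs two full lines of maintenance, and any unrelated data sharing those lines is affected as well, which is one reason DMA pools are usually aligned and padded to the line size.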
Regards
Ashwani
Please explain why it should be cached.
What are the alternatives? Do you have DMA-less network drivers? Polling?
Thanks
rasty
Please Explain why it should be cached.
If you want to move the DMA descriptors and DMA ring to un-cached memory, then:
1. You will get reduced performance.
2. You need to remove the cache-invalidation code from the driver.
Do you have DMA-less network drivers?
We do not support DMA-less networking with ICSSG and CPSW with 1G.
Regards
Ashwani
Can you send me a summary of the places that I have to change in order to get rid of caching and invalidation?
I'm asking because I already did that on my own, but the stack does not work well. I am probably missing something important.
How do I change these parameters (they come from a file generated by SysConfig)?
1. Generate the files from SysConfig.
2. Edit the generated files.
3. Copy the generated files into the example/project directory.
4. Exclude SysConfig from the build, then build the project.
Can you send me a summary of the places that I have to change in order to get rid of caching and invalidation?
We are working on this.
Regards
Ashwani
I have an indication that cache invalidation is not the whole story.
Even if I comment out cache invalidation, I still have ISR jitter of 50 uSec. I have the impression that something related to Ethernet/DMA disables interrupts for a long time, or that some ISR (like Udma_eventIsrFxn) takes long.
Hi Rasty,
Can you help me with the detailed steps to reproduce the issue on my setup?
I will start with SDK 9.1.
Which SDK example should I use?
What local changes from your setup do I need to add to mine to reproduce the issue?
Regards
Ashwani
You can take any TI TCP/IP example.
On top of it:
1. Add a periodic timer, say 125 uSec.
2. From the timer ISR, give a semaphore to a high-priority task.
3. In the task, measure the difference between task wake-up times. Look for the minimum/maximum.
4. Repeat this test with and without Ethernet traffic.
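The bookkeeping for step 3 can be sketched as below. This is a minimal, host-testable sketch and not code from the SDK: `jitter_stats_t`, `jitter_init`, and `jitter_update` are hypothetical names, and the timestamp is assumed to come from a free-running 32-bit cycle counter read at each task wake-up.

```c
#include <stdint.h>

typedef struct {
    uint32_t last;    /* previous wake-up timestamp, in counter ticks */
    uint32_t min_dt;  /* shortest interval seen so far */
    uint32_t max_dt;  /* longest interval seen so far */
    int      primed;  /* 0 until the first timestamp has been recorded */
} jitter_stats_t;

void jitter_init(jitter_stats_t *s)
{
    s->last = 0u;
    s->min_dt = UINT32_MAX;
    s->max_dt = 0u;
    s->primed = 0;
}

/* Call once per wake-up with the current free-running counter value. */
void jitter_update(jitter_stats_t *s, uint32_t now)
{
    if (s->primed) {
        uint32_t dt = now - s->last;  /* unsigned math survives counter wrap */
        if (dt < s->min_dt) { s->min_dt = dt; }
        if (dt > s->max_dt) { s->max_dt = dt; }
    }
    s->last = now;
    s->primed = 1;
}
```

The spread between `max_dt` and `min_dt`, compared against the nominal timer period, gives the jitter figure; skipping the first sample keeps start-up time out of the statistics.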
Hello Ashwani ,
Here, we can keep source and destination buffer addresses in non-cached memories. So, R5F will directly read data from destination buffer without cache_invalidation after DMA completion. I think this is possible.
Even if I comment out cache invalidation, I still have ISR jitter of 50 uSec. I have the impression that something related to Ethernet/DMA disables interrupts for a long time, or that some ISR (like Udma_eventIsrFxn) takes long.
Rasty,
Please look at the image below. Typically, when a DMA operation is started, we disable all interrupts, and once the DMA has been started we restore them. You can see this operation in the image below. I assume the Ethernet driver uses the same UDMA API to initiate the DMA, so this could be causing the issue.
Regards,
S.Anil.
We use udma+cpsw+lwip so I assume that we use the same API.
What would you suggest?
Rasty, I am not familiar with industrial protocols, but the same API, the Udma_ringQueueRaw function, is used throughout the MCU+ SDK to initiate DMA.
We also need to check which peripherals are used in your application, since we disable and restore interrupts in the same way around some critical operations. If you share the details of all peripherals used in your application, it will really help in debugging the issue further.
We use udma+cpsw+lwip so I assume that we use the same API.
My assumption is the same.
Ashwani , can you confirm here ?
Regards,
S.Anil.
In general I do not understand why TI drivers use HwiP_disable without any wrappers that allow disabling only selected peripheral interrupts.
In general I do not understand why TI drivers use HwiP_disable without any wrappers that allow disabling only selected peripheral interrupts.
For any atomic operation to happen, global interrupts need to be disabled.
Regards
Karan
I did some measurements.
Enqueue and dequeue contribute 8 and 9 uSec of jitter, respectively.
Udma_eventIsrFxn took 54 uSec (!).
From my perspective it is a design flaw.
An ISR shall not do such major work; it must be threaded, with the work done in a task, not in the ISR.
In that case you would not need to disable interrupts globally; a mutex would be enough.
In any case I do not see a reason for disabling interrupts globally; software shall mask only the DMA interrupt.
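The "threaded ISR" idea can be sketched with a single-producer, single-consumer ring: the ISR only records which callback is pending, and a task later runs the callbacks in task context. This is a host-testable sketch under stated assumptions, not the driver's implementation. All names (`event_queue_t`, `evq_push_from_isr`, `evq_drain`, `count_cb`) are hypothetical, the compiler barrier uses GCC/Clang syntax, and a real port would wake the task with an RTOS primitive instead of relying on polling.

```c
#include <stdint.h>
#include <stddef.h>

#define EVQ_LEN 8u  /* must be a power of two */
#define COMPILER_BARRIER() __asm__ volatile("" ::: "memory")

typedef void (*event_cb_t)(void *app_data);

typedef struct {
    event_cb_t cb;
    void      *app_data;
} event_t;

typedef struct {
    event_t           slots[EVQ_LEN];
    volatile uint32_t head;  /* written only by the ISR side */
    volatile uint32_t tail;  /* written only by the task side */
} event_queue_t;

/* ISR side: record the event and return quickly. Returns 0 if full. */
int evq_push_from_isr(event_queue_t *q, event_cb_t cb, void *app_data)
{
    uint32_t head = q->head;

    if ((head - q->tail) >= EVQ_LEN) {
        return 0;  /* full: the caller decides whether to drop or count */
    }
    q->slots[head & (EVQ_LEN - 1u)].cb = cb;
    q->slots[head & (EVQ_LEN - 1u)].app_data = app_data;
    COMPILER_BARRIER();  /* fill the slot before publishing it */
    q->head = head + 1u;
    return 1;
}

/* Task side: run every pending callback in task context. */
uint32_t evq_drain(event_queue_t *q)
{
    uint32_t handled = 0u;

    while (q->tail != q->head) {
        event_t ev = q->slots[q->tail & (EVQ_LEN - 1u)];
        COMPILER_BARRIER();  /* copy the slot before releasing it */
        q->tail = q->tail + 1u;
        if (ev.cb != NULL) {
            ev.cb(ev.app_data);
        }
        handled++;
    }
    return handled;
}

/* Example callback: counts how many times it was invoked. */
void count_cb(void *app_data)
{
    (*(uint32_t *)app_data)++;
}
```

Because each index has exactly one writer, the ISR and the task need no global interrupt disable between them; the time spent in the ISR shrinks to a few stores, and the long callback work becomes preemptible.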
You can refer "enet_lwip_cpsw" example for polling method implementation.
file:///C:/Repos/mcu_plus_sdk/docs/api_guide_am243x/EXAMPLES_ENET_LWIP_CPSW.html
Hi Rasty,
You can use above example as a reference of polling method.
In case of polling mode, application can periodically call EnetDma_retrieve* APIs to get TX free and RX full packets.
Code snippet below shows polling mode usage for receive operation. The periodic task retrieves packets from Enet DMA and passes it to processing stack periodically.
void EnetApp_periodicTask(void)
{
    /* Receive packets from DMA */
    while (true)
    {
        status = EnetDma_retrieveRxPktQ(hRxCh, pRetrieveQ);
        /* Processes the received packets and enqueues into freeQ */
        process(pRetrieveQ);
        status = EnetDma_submitRxPktQ(hRxCh, pFreeQ);
        sleep(100);
    }
}
For TX, polling can be used to retrieve transmission complete packets.
void EnetApp_sendPkt(void)
{
    /* Submit TX ready packets for transmission */
    status = EnetDma_submitTxPktQ(hTxCh, pSubmitQ);
}

void EnetApp_periodicTask(void)
{
    /* Retrieve free TX packets from DMA */
    while (true)
    {
        sleep (100);
        status = EnetDma_retrieveTxPktQ(hTxCh, pRetrieveQ);
    }
}
Best Regards
Ashwani
Action items from today's meeting:
Regards
Ashwani
In task measure the difference in task wakeup time.
Which API are you using to get this value ?
Regards
Ashwani
Hi Rasty,
To reproduce the issue on my setup, I followed the steps below.
Please review the changes and let me know whether they are sufficient to reproduce your setup on my board.
Next Steps:
1. Get the jitter values with Ethernet code enable
2. Get the jitter values with Ethernet code disable
Regards
Ashwani
The task should look like this.
You should not print messages from this task.
volatile uint32_t _diffCounterMax=0;
/* GCC does not accept an attribute after the declarator in a function
 * definition, so it is placed in front. Sem is created elsewhere. */
__attribute__((section("CRITICAL_TEXT_SECTION1")))
void empty_hop2(void *args)
{
    uint32_t startCounter, endCounter, diffCounter;
    (void)args;
    startCounter = CycleCounterP_getCount32();
    while (1)
    {
        xSemaphoreTake( Sem, portMAX_DELAY); /* wait for wake up from TimeTick */
        endCounter = CycleCounterP_getCount32();
        diffCounter = endCounter - startCounter; /* unsigned math handles counter wrap */
        if (diffCounter > _diffCounterMax)
            _diffCounterMax = diffCounter;
        startCounter = endCounter;
    }
}
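The values collected by such a task are raw counter ticks, so they need converting before they can be compared with figures such as "5 uSec". The helpers below are a sketch under assumptions: `CPU_HZ` assumes the counter ticks at an 800 MHz core clock with no prescaler, and both function names are hypothetical.

```c
#include <stdint.h>

#define CPU_HZ 800000000u  /* assumed counter rate; use the real value */

/* Convert a tick difference to whole microseconds (rounded down). */
uint32_t cycles_to_us(uint32_t cycles)
{
    return (uint32_t)(((uint64_t)cycles * 1000000u) / CPU_HZ);
}

/* Signed deviation of one measured interval from the nominal period. */
int32_t interval_error_cycles(uint32_t measured, uint32_t nominal)
{
    return (int32_t)(measured - nominal);
}
```

With these assumptions a 125 uSec period is 100000 ticks, so a maximum interval of 104000 ticks corresponds to a wake-up that came 5 uSec late.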
Hi Rasty Slutsker ,
Yesterday, I was out of office.
Here is updated main.c file while app_main.c file is same as previously shared (excluding Ethernet code)
/cfs-file/__key/communityserver-discussions-components-files/81/4064.main.c
We created 3 Tasks
Timer ISR trigger = > Task 1: Main Task give semaphore to ==> Task 2: Empty Task give semaphore to ==> Task 3: Jitter Task
I am seeing ~1 us jitter in this test.
Now, where do you want me to add/enable the Ethernet code (Task 1, Task 2, or Task 3) so that I can get a new jitter value with Ethernet enabled?
Here
Regards
Ashwani
Suggested patch approach to DMA driver.
From 59a64ce5ba034b13cd7c711192c23007154c40c3 Mon Sep 17 00:00:00 2001
From: rasty slutsker <rasty.slutsker@servotronix.com>
Date: Thu, 21 Mar 2024 11:43:59 +0200
Subject: [PATCH 1/3] 1. Threaded DMA interrupt

---
 .../drivers/makefile.am243x.r5f.ti-arm-gcc    |   4 +
 .../mcu_plus_sdk/source/drivers/udma/udma.c   |   7 +
 .../source/drivers/udma/udma_event.c          | 133 ++++++++++++++++--
 .../source/drivers/udma/udma_ring_common.c    |  36 +++--
 .../source/drivers/udma/udma_utils.c          |   8 +-
 5 files changed, 165 insertions(+), 23 deletions(-)

diff --git a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/makefile.am243x.r5f.ti-arm-gcc b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/makefile.am243x.r5f.ti-arm-gcc
index f748c2c804..19cccf3708 100644
--- a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/makefile.am243x.r5f.ti-arm-gcc
+++ b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/makefile.am243x.r5f.ti-arm-gcc
@@ -196,6 +196,10 @@ FILES_PATH_common = \
 INCLUDES_common := \
     -I${CG_TOOL_ROOT}/include/c \
     -I${MCU_PLUS_SDK_PATH}/source \
+    -IFreeRTOS-Kernel/include \
+    -I${MCU_PLUS_SDK_PATH}/source/kernel/freertos/config/am243x/r5f \
+    -I${MCU_PLUS_SDK_PATH}/source/kernel/freertos/config/am243x/r5f \
+    -I${MCU_PLUS_SDK_PATH}/source/kernel/freertos/portable/TI_ARM_CLANG/ARM_CR5F \

 DEFINES_common := \
     -DSOC_AM243X \
diff --git a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma.c b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma.c
index db4945a294..b9fda63011 100644
--- a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma.c
+++ b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma.c
@@ -64,6 +64,8 @@
 /* ========================================================================== */
 /* Global Variables */
 /* ========================================================================== */
+SemaphoreP_Object dmaPollMutex;
+static int32_t once=1;

 /* None */

@@ -82,6 +84,11 @@ int32_t Udma_init(Udma_DrvHandle drvHandle, const Udma_InitPrms *initPrms)
     DebugP_assert(sizeof(Udma_EventObjectInt) <= sizeof(Udma_EventObject));
     DebugP_assert(sizeof(Udma_RingObjectInt) <= sizeof(Udma_RingObject));
     DebugP_assert(sizeof(Udma_FlowObjectInt) <= sizeof(Udma_FlowObject));
+    if (once)
+    {
+        retVal = SemaphoreP_constructMutex(&dmaPollMutex);
+        once = 0;
+    }

     if((drvHandle == NULL_PTR) || (initPrms == NULL_PTR))
     {
diff --git a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_event.c b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_event.c
index 3065c21dcc..370442f16a 100644
--- a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_event.c
+++ b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_event.c
@@ -40,9 +40,13 @@
 /* ========================================================================== */
 /* Include Files */
 /* ========================================================================== */
-
+#include <stdio.h>
 #include <drivers/udma/udma_priv.h>
-
+#include <kernel/dpl/TaskP.h>
+#include <kernel/dpl/SemaphoreP.h>
+#include <kernel/freertos/FreeRTOS-Kernel/include/FreeRTOS.h>
+#include <kernel/freertos/FreeRTOS-Kernel/include/queue.h>
+#include <kernel/freertos/FreeRTOS-Kernel/include/task.h>
 /* ========================================================================== */
 /* Macros & Typedefs */
 /* ========================================================================== */
@@ -52,7 +56,32 @@
 /* ========================================================================== */
 /* Structure Declarations */
 /* ========================================================================== */
+typedef struct Udma_Event
+{
+    Udma_EventCallback callback;
+    Udma_EventHandle eventHandle;
+    uint32_t eventType;
+    void *appData;
+} Udma_Event;
+#define EVENT_Q_LEN 256
+typedef struct Udma_EventPollObject_t
+{
+    /*
+     * Handle to Qeueue
+     */
+    QueueHandle_t dmaPollQ;
+
+    /*
+     * Handle to input task that sends polls the link status
+     */
+    TaskP_Object dmaPollTaskObj;
+    uint8_t dmaPollTaskStack[1024];
+    char dmaPollTaskName[64];
+    Udma_EventObjectInt *handle;
+    Udma_Event eventQ[EVENT_Q_LEN];
+    StaticQueue_t xStaticQueue;
+} Udma_EventPollObject;

 /* None */

 /* ========================================================================== */
@@ -80,7 +109,7 @@ static void Udma_eventResetSteering(Udma_DrvHandleInt drvHandle,
 /* ========================================================================== */
 /* Global Variables */
 /* ========================================================================== */
-
+extern SemaphoreP_Object dmaPollMutex;
 /* None */

 /* ========================================================================== */
@@ -424,13 +453,33 @@ void UdmaEventPrms_init(Udma_EventPrms *eventPrms)

     return;
 }
-
+uint32_t dmaisr_max = 0;
+uint32_t * pdmaisr_max= &dmaisr_max;
+uint32_t CycleCounterP_getCount32(void);
+void dmaPoll(void *args)
+{
+    Udma_EventPollObject* obj = (Udma_EventPollObject*)args;
+    Udma_EventHandleInt eventHandle = obj->handle;
+    while (1)
+    {
+        BaseType_t stat;
+        Udma_Event event;
+        stat = xQueueReceive(obj->dmaPollQ, &event, SystemP_WAIT_FOREVER );
+        SemaphoreP_pend(&dmaPollMutex,SystemP_WAIT_FOREVER);
+        if ( pdPASS == stat && event.callback)
+        {
+            event.callback(event.eventHandle, event.eventType, event.appData);
+        }
+        SemaphoreP_post(&dmaPollMutex);
+    }
+}
 static void Udma_eventIsrFxn(void *args)
 {
     uint32_t vintrBitNum;
     uint32_t vintrNum;
     uint32_t teardownStatus;
-    Udma_EventHandleInt eventHandle = (Udma_EventHandleInt) args;
+    Udma_EventPollObject* obj = (Udma_EventPollObject*)args;
+    Udma_EventHandleInt eventHandle = obj->handle;
     Udma_DrvHandleInt drvHandle;
     Udma_EventPrms *eventPrms;
     Udma_RingHandleInt ringHandle;
@@ -440,6 +489,8 @@ static void Udma_eventIsrFxn(void *args)
     drvHandle = eventHandle->drvHandle;
     vintrNum = eventHandle->vintrNum;
     DebugP_assert(vintrNum != UDMA_EVENT_INVALID);
+    uint32_t start, end, dif;
+    start = CycleCounterP_getCount32();
     /* Loop through all the shared events. In case of exclusive events,
      * the next event is NULL_PTR and the logic remains same and the while breaks */
     while(eventHandle != NULL_PTR)
@@ -486,8 +537,26 @@ static void Udma_eventIsrFxn(void *args)
                 {
                     if((Udma_EventCallback) NULL_PTR != eventPrms->eventCb)
                     {
-                        eventPrms->eventCb(
-                            eventHandle, eventPrms->eventType, eventPrms->appData);
+                        BaseType_t xHigherPriorityTaskWoken = pdFALSE;
+                        BaseType_t stat;
+
+                        /* We have not woken a task at the start of the ISR. */
+                        Udma_Event event;
+                        event.callback = eventPrms->eventCb;
+                        event.eventHandle = eventHandle;
+                        event.eventType = eventPrms->eventType;
+                        event.appData = eventPrms->appData;
+                        stat = xQueueSendFromISR( obj->dmaPollQ, &event, &xHigherPriorityTaskWoken );
+                        /* Now the buffer is empty we can switch context if necessary. */
+                        if( pdPASS == stat && xHigherPriorityTaskWoken )
+                        {
+                            /* Actual macro used here is port specific. */
+                            portYIELD_FROM_ISR (xHigherPriorityTaskWoken);
+                        }
+/*
+                        eventPrms->eventCb(
+                            eventHandle, eventPrms->eventType, eventPrms->appData);
+ */
                     }
                 }
             }
@@ -496,6 +565,12 @@ static void Udma_eventIsrFxn(void *args)
         /* Move to next shared event */
         eventHandle = eventHandle->nextEvent;
     }
+    end = CycleCounterP_getCount32();
+    dif = end - start;
+    if (dif > dmaisr_max)
+    {
+        dmaisr_max = dif;
+    }

     return;
 }
@@ -748,7 +823,7 @@ static int32_t Udma_eventAllocResource(Udma_DrvHandleInt drvHandle,
     if(UDMA_SOK == retVal)
     {
         /* Do atomic link list update as the same is used in ISR */
-        cookie = HwiP_disable();
+        SemaphoreP_pend(&dmaPollMutex,SystemP_WAIT_FOREVER);

         /* Link shared events to master event */
         eventHandle->prevEvent = (Udma_EventHandleInt) NULL_PTR;
@@ -766,7 +841,7 @@ static int32_t Udma_eventAllocResource(Udma_DrvHandleInt drvHandle,
             eventHandle->prevEvent = lastEvent;
             lastEvent->nextEvent = eventHandle;
         }
-        HwiP_restore(cookie);
+        SemaphoreP_post(&dmaPollMutex);
     }

     if(UDMA_SOK == retVal)
@@ -813,7 +888,7 @@ static void Udma_eventFreeResource(Udma_DrvHandleInt drvHandle,
     uintptr_t cookie;

     /* Do atomic link list update as the same is used in ISR */
-    cookie = HwiP_disable();
+    SemaphoreP_pend(&dmaPollMutex,SystemP_WAIT_FOREVER);

     /*
      * Remove this event node - link previous to next
@@ -831,7 +906,7 @@ static void Udma_eventFreeResource(Udma_DrvHandleInt drvHandle,
         eventHandle->nextEvent->prevEvent = eventHandle->prevEvent;
     }

-    HwiP_restore(cookie);
+    SemaphoreP_post(&dmaPollMutex);

     if(NULL_PTR != eventHandle->hwiHandle)
     {
@@ -865,7 +940,8 @@ static void Udma_eventFreeResource(Udma_DrvHandleInt drvHandle,

     return;
 }
-
+Udma_EventPollObject pollpool[20];
+int poolidx=0;
 static int32_t Udma_eventConfig(Udma_DrvHandleInt drvHandle,
                                 Udma_EventHandleInt eventHandle)
 {
@@ -1058,6 +1134,37 @@ static int32_t Udma_eventConfig(Udma_DrvHandleInt drvHandle,
         }
     }

+    {
+        int32_t retVal = SystemP_SUCCESS;
+        TaskP_Params params;
+        poolidx++;
+        pollpool[poolidx].handle = eventHandle;
+        /*Initialize semaphore to call synchronize the poll function with a timer*/
+        pollpool[poolidx].dmaPollQ = xQueueCreateStatic(EVENT_Q_LEN,sizeof(Udma_Event),
+            (uint8_t*)&pollpool[poolidx].eventQ, &pollpool[poolidx].xStaticQueue );
+
+        if(NULL == pollpool[poolidx].dmaPollQ)
+        {
+            DebugP_logError("[UDMA] Event Q create failed!!!\r\n");
+        }
+        {
+            /* Initialize the poll function as a thread */
+            TaskP_Params_init(&params);
+            sprintf(pollpool[poolidx].dmaPollTaskName,"DMA_poll_irq_%d",(int)eventHandle->coreIntrNum);
+            params.name = pollpool[poolidx].dmaPollTaskName;
+            params.priority = 9; //todo ????
+            params.stack = pollpool[poolidx].dmaPollTaskStack;
+            params.stackSize = sizeof(pollpool[poolidx].dmaPollTaskStack);
+            params.args = &pollpool[poolidx];
+            params.taskMain = &dmaPoll;
+
+            retVal = TaskP_construct(&pollpool[poolidx].dmaPollTaskObj, &params);
+            if(SystemP_SUCCESS != retVal)
+            {
+                DebugP_logError("[UDMA] Poll task create failed!!!\r\n");
+            }
+        }
+    }
     if(UDMA_SOK == retVal)
     {
         /* Register after programming IA, so that when spurious interrupts
@@ -1070,7 +1177,7 @@ static int32_t Udma_eventConfig(Udma_DrvHandleInt drvHandle,
             HwiP_Params_init(&hwiPrms);
             hwiPrms.intNum = coreIntrNum;
             hwiPrms.callback = &Udma_eventIsrFxn;
-            hwiPrms.args = eventHandle;
+            hwiPrms.args = &pollpool[poolidx];
             hwiPrms.priority = eventHandle->eventPrms.intrPriority;
             retVal = HwiP_construct(&eventHandle->hwiObject, &hwiPrms);
             if(SystemP_SUCCESS != retVal)
diff --git a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_ring_common.c b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_ring_common.c
index bcd8c9bf49..e271eae966 100644
--- a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_ring_common.c
+++ b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_ring_common.c
@@ -42,6 +42,7 @@
 /* ========================================================================== */

 #include <drivers/udma/udma_priv.h>
+#include <kernel/dpl/SemaphoreP.h>

 /* ========================================================================== */
 /* Macros & Typedefs */
@@ -69,7 +70,7 @@ static inline void Udma_ringAssertFnPointers(Udma_DrvHandleInt drvHandle);
 /* ========================================================================== */

 /* None */
-
+extern SemaphoreP_Object dmaPollMutex;
 /* ========================================================================== */
 /* Function Definitions */
 /* ========================================================================== */
@@ -350,6 +351,11 @@ int32_t Udma_ringDetach(Udma_RingHandle ringHandle)
     return (retVal);
 }

+uint32_t dmaenq_max = 0;
+uint32_t * pdmaenq_max= &dmaenq_max;
+uint32_t dmadeq_max = 0;
+uint32_t * pdmadeq_max= &dmadeq_max;
+uint32_t CycleCounterP_getCount32(void);
 int32_t Udma_ringQueueRaw(Udma_RingHandle ringHandle,
                           uint64_t phyDescMem)
 {
@@ -377,11 +383,17 @@ int32_t Udma_ringQueueRaw(Udma_RingHandle ringHandle, uint64_t phyDescMem)

     if(UDMA_SOK == retVal)
     {
-        cookie = HwiP_disable();
-
+        SemaphoreP_pend(&dmaPollMutex,SystemP_WAIT_FOREVER);
+        uint32_t start, end, dif;
+        start = CycleCounterP_getCount32();
         retVal = drvHandle->ringQueueRaw(drvHandle, ringHandleInt, phyDescMem);
-
-        HwiP_restore(cookie);
+        end = CycleCounterP_getCount32();
+        dif = end - start;
+        if (dif > dmaenq_max)
+        {
+            dmaenq_max = dif;
+        }
+        SemaphoreP_post(&dmaPollMutex);
     }

     return (retVal);
@@ -413,11 +425,17 @@ int32_t Udma_ringDequeueRaw(Udma_RingHandle ringHandle, uint64_t *phyDescMem)

     if(UDMA_SOK == retVal)
     {
-        cookie = HwiP_disable();
-
+        uint32_t start, end, dif;
+        SemaphoreP_pend(&dmaPollMutex,SystemP_WAIT_FOREVER);
+        start = CycleCounterP_getCount32();
         retVal = drvHandle->ringDequeueRaw(drvHandle, ringHandleInt, phyDescMem);
-
-        HwiP_restore(cookie);
+        end = CycleCounterP_getCount32();
+        dif = end - start;
+        if (dif > dmadeq_max)
+        {
+            dmadeq_max = dif;
+        }
+        SemaphoreP_post(&dmaPollMutex);
     }

     return (retVal);
diff --git a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_utils.c b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_utils.c
index 5c988d3bcc..4ec3284527 100644
--- a/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_utils.c
+++ b/ind_comms_sdk_am243x_09_01_00_03/mcu_plus_sdk/source/drivers/udma/udma_utils.c
@@ -247,7 +247,13 @@ uint64_t Udma_defaultVirtToPhyFxn(const void *virtAddr,
                                   uint32_t chNum,
                                   void *appData)
 {
-    return ((uint64_t) virtAddr);
+#if defined (__aarch64__)
+    uint64_t temp = virtAddr;
+#else
+    /* R5 is 32-bit machine, need to truncate to avoid void * typecast error */
+    uint32_t temp = (uint32_t) virtAddr;
+#endif
+    return ((uint64_t) temp);
 }

 void *Udma_defaultPhyToVirtFxn(uint64_t phyAddr,
--
2.27.0.windows.1
Hi Rasty,
An update:
Experiment_1:
4 Tasks running (Empty_P2, Ethernet_P2, Main_P2, Jitter_P1) + 125-us-ISR
Time duration: 10 minutes
Max jitter calculation w.r.t. 125 us, as below:
Experiment_2:
3 Tasks running (Empty_P2, Ethernet_P2, Main_P2, Jitter_P1) + 125-us-ISR
Time duration: 10 minutes
Max jitter calculation w.r.t. 125 us, as below:
Next Step:
Let me know if you have any inputs here ?
Question:
Based on your inputs, we will tune the example settings.
Regards
Ashwani
Hi Ashwani
There are no requirements on Ethernet traffic; best effort is fine.
Expected jitter of the high-priority task is within 5 uSec under maximum Ethernet load. TCP/IP communication may be slow, but connection drops are not allowed.
Best regards
Rasty
Hi Rasty,
An update: I ran this test for a longer duration and collected the jitter values.
Experiment_1:
4 Tasks running (Empty_P2, Ethernet_P2, Main_P2, Jitter_P1) + 125-us-ISR
Time duration: 60 minutes
Max jitter calculation w.r.t. 125 us, as below:
Experiment_2:
3 Tasks running (Empty_P2, Ethernet_P2, Main_P2, Jitter_P1) + 125-us-ISR
Time duration: 10 minutes
Max jitter calculation w.r.t. 125 us, as below:
Next Step:
Let me know if you have any inputs here ?
These results are without Ethernet cable connected.
+/- 80 uSec.
Are you seeing this jitter with the Ethernet cable connected?
I have not seen the function below get hit without the Ethernet cable connected.
void EnetUdma_txCqIsr(Udma_EventHandle hUdmaEvt,
uint32_t eventType,
void *appData)
{
Regards
Ashwani
Please run with Ethernet cable and some TCP/IP traffic.
A test without the Ethernet cable plugged in makes no sense.
My input is that 25 uSec jitter is not acceptable and needs to be solved.
Please run with Ethernet cable and some TCP/IP traffic.
Thanks Rasty for confirmation.
We re-ran the above experiments with the Ethernet cable connected.
We are seeing the same 15-25 us jitter in steady state (plus 1-2 outliers).
Next Steps:
Regards
Ashwani
Hi Rasty,
Can you make changes on your setup and provide updated results?
Regards
Ashwani
Hi Ashwani
With my patch I achieve much lower jitter: 5-7 uSec.
Jitter of 20 uSec is not what we're looking for.
I see that you only handle the TX path with a flag and polling; what about RX?
rasty
what about RX?
We have added the same function on the Rx side as well with this patch.
Driver build is successful.
C:\ti\mcu_plus_sdk_am64x_09_00_00_35\source\networking\enet> gmake -s -f .\makefile.cpsw.am64x.r5f.ti-arm-clang PROFILE=debug
We are seeing task jitter ~15us.
Below changes on application side:
Note:
Global interrupt is not disabled in our setup.
Regards
Ashwani