Part Number: AM2434
Other Parts Discussed in Thread: SYSCONFIG
System
AM2434 with 1GB external DDR
SDK 11.0.0.15
CCS 20.3.1.5__1.9.1
Compiler: TI CLANG v4.0.4 LTS
sysconfig: 1.25.0
Issue
We randomly hit the following asserts:
- "pbuf_free: p->ref > 0" line 755 of pbuf.c
- "tcp_receive: valid queue length" line 1135 in tcp_in.c
The two issues are grouped in the same topic because they appear to be related. We describe them as random because any change in the code can change the behaviour of the bug: adding a few lines of code, declaring a task as static instead of dynamic, or any other modification can make the issue disappear.
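For readers less familiar with the first assert, here is a minimal sketch of the condition it reports: a reference-counted buffer being freed when its count is already zero. This is a simplified model, not lwIP's code. The names `model_buf`, `model_buf_ref`, `model_buf_free` and `model_double_free_demo` are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted buffer.
 * This is NOT lwIP's struct pbuf, only a model of its ref counter. */
struct model_buf {
    unsigned ref;   /* number of owners still holding the buffer */
};

/* Take an extra reference, as a second owner of the buffer would. */
static void model_buf_ref(struct model_buf *b)
{
    b->ref++;
}

/* Drop one reference.
 * Returns  1 if this call released the buffer,
 *          0 if other owners remain,
 *         -1 if the count was already zero (the state the assert reports). */
static int model_buf_free(struct model_buf *b)
{
    if (b->ref == 0) {
        return -1;
    }
    b->ref--;
    return (b->ref == 0) ? 1 : 0;
}

/* Two owners free the same buffer, but only one reference was ever taken.
 * The second free finds ref == 0. */
static int model_double_free_demo(void)
{
    struct model_buf b = { 1 };
    int first = model_buf_free(&b);   /* releases the buffer */
    assert(first == 1);
    return model_buf_free(&b);        /* count is already zero */
}
```

In this model the failure comes from an unbalanced free. A stale or corrupted view of the counter (for example through cache incoherency or unsynchronized access from two contexts) would produce the same symptom, which may be why the two asserts appear together.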
Actions taken
- Allocated all LWIP buffers to a specific memory location and used MPU settings to mark it as non-cacheable, shareable, cacheable. Result: no improvement
- Allocated all LWIP buffers in a memory location in DDR or MSRAM. In both cases, these locations were marked via MPU settings as non-cacheable, shareable, cacheable. Result: no improvement
- Disabled all types of cache on the core showing this issue. Result: the application runs slowly but without blocking
- Used current traces to track the issue. Result: no findings, probably due to the limits of our tracing tool
- Instrumented the code with additional traces to understand when the problem arises. Result: some of the new traces probably perturb code execution, and the issue is no longer visible
- Invalidated the cache after Rx packets were received by adding a CacheP_inv call at line 1236 of lwip2enet.c for every list->bufPtr. Result: the application runs, but this does not explain why marking the memory area as non-cacheable does not solve the issue
- Increased the number of pools in lwippools.h. Result: the application works, but there is no evidence of a memory issue
- Used an AI agent to check whether our application locks and unlocks the TCP core properly to prevent concurrent access to the memory. Result: the analysis found that all the guards are implemented properly and none are missing in our code
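The last check above was done by static review. A runtime check of lock ownership is one way to complement it. The sketch below is a simplified model: the lock records its owner, and every entry into the stack verifies that the caller holds it. All names (`task_id_t`, `core_lock`, `stack_call_checked`, and so on) are hypothetical. lwIP itself exposes an `LWIP_ASSERT_CORE_LOCKED()` hook that a port can define in lwipopts.h to perform a check of this kind. Whether the SDK port already does so is not something we have verified.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical task identifier; a real port would use the RTOS task handle. */
typedef int task_id_t;
#define NO_OWNER (-1)

/* Model of the core lock: it only records which task currently holds it. */
struct core_lock {
    task_id_t owner;
};

/* Try to take the lock. A real mutex would block instead of returning false. */
static bool core_lock_acquire(struct core_lock *l, task_id_t self)
{
    if (l->owner != NO_OWNER) {
        return false;
    }
    l->owner = self;
    return true;
}

/* Release the lock; only the current owner is allowed to do so. */
static void core_lock_release(struct core_lock *l, task_id_t self)
{
    assert(l->owner == self);
    l->owner = NO_OWNER;
}

/* The check a core-locked assertion hook would perform. */
static bool core_locked_by(const struct core_lock *l, task_id_t self)
{
    return l->owner == self;
}

/* Stand-in for any stack entry point: reports whether the caller
 * entered with the lock held, instead of touching shared state. */
static bool stack_call_checked(const struct core_lock *l, task_id_t caller)
{
    return core_locked_by(l, caller);
}
```

In a real build the check would assert instead of returning a flag, so the first unguarded call stops at the caller and not later inside pbuf.c or tcp_in.c.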
Question
We believe that the workarounds found so far are not robust enough to prevent this kind of issue. Since we are running out of ideas, we would like to know whether there is a way to better understand the source of these issues and how to fix them.