Happy new year! We are trying to chase down a memory error that is a little strange.
As part of our standard DVT suite, we use Linux on the A72 cores to run both memtester and stressapptest over large allocations of memory (ideally, over 85% of LPDDR4) in order to validate the memory subsystem.
We have encountered a strange issue with SDK9.0. When we run our test suite on tisdk-adas-image (vision apps) over these large allocations, we encounter some errors. In contrast, we do not see these same errors when running tisdk-default-image. This leads us to believe that we could be encountering a problem where one of the other processors (MCUs, DSPs) or a DMA engine is accessing memory that is allocated/under test by the Linux application.
We encountered similar errors in SDK8.6, but they were extremely rare and thus difficult to chase down (even to the point that we wondered if they were a real error). With SDK9, the failures are easily replicated.
There have been no changes to memory allocation in the vision apps, and our product has 8GB of memory as standard. There has been no modification of the DDRSS configuration between SDK8.6 and SDK9.0 for our board.
I have several questions:
Thanks,
John
Hi John,
Couple of questions:
-1- Can you tell us if this was on the TI EVM or a custom platform?
-2- Please share here on E2E the parameters you pass to memtester. I will leave it to the expert to comment, but note that tisdk-adas-image has many reserved memory sections that are used by the remote cores (R5Fs and C7x). Are we sure that the memtester tool honors these reservations when allocating the memory buffers for its testing?
Thanks.
As a follow-up, examples of the test commands we are running are:
stressapptest -M 7000 -s 43200 -c 2 -m 2 -W
memtester 7000m
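The commands above hard-code a 7000 MB request. As a minimal sketch only, the size could instead be derived from what Linux reports as available. The helper name `suggest_test_size_mb` and the 0.85 default are assumptions for illustration (0.85 mirrors the "over 85%" goal stated earlier), and the sample text is the `MemAvailable` line from the `/proc/meminfo` output posted in this thread. Note that this only follows what Linux believes is free; it cannot detect reserved regions that the kernel was never told about.

```python
def suggest_test_size_mb(meminfo_text: str, fraction: float = 0.85) -> int:
    """Return a test size in MiB as a fraction of MemAvailable.

    meminfo_text is the content of /proc/meminfo (values are in kB).
    This is a hypothetical helper, not part of memtester or stressapptest.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            available_kb = int(line.split()[1])
            # Scale in kB first, then convert to whole MiB (rounding down).
            return int(available_kb * fraction) // 1024
    raise ValueError("MemAvailable not found in meminfo text")


# Sample line taken from the /proc/meminfo output posted in this thread.
sample = "MemAvailable: 7601052 kB"
size_mb = suggest_test_size_mb(sample)
print(f"memtester {size_mb}M")
```

On a target, the same function could be fed the real file contents, for example `open("/proc/meminfo").read()`.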
Hi Praveen,
These are being performed on our product based on TDA4VM.
Thanks!
John
Your comment regarding reserved memory sections is along the lines of our current theory. In a heterogeneous multiprocessor use-case, how would each OS (in addition to Linux) know which memory regions are reserved (off limits)?
Hi John,
From the Linux perspective, the DTS explicitly calls out reserved memory regions. From the vision_apps perspective, I will let my colleague comment on it.
It depends on the vision_apps firmware binary.
Can you also share the error logs?
- Keerthy
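To illustrate how the reserved regions called out in the device tree can be inspected, here is a minimal sketch that decodes a `reg` property into (base, size) pairs. A running kernel exposes the device tree under `/proc/device-tree`, so the raw bytes could be read from a node under `/proc/device-tree/reserved-memory/`. The function name `parse_reg`, the assumption of 2 address cells and 2 size cells, and the example address and size are all illustrative; check `#address-cells` and `#size-cells` on the actual board.

```python
import struct


def parse_reg(raw: bytes, address_cells: int = 2, size_cells: int = 2):
    """Decode a device-tree 'reg' property into a list of (base, size).

    Device-tree cells are 32-bit big-endian values. The default of
    2 address cells and 2 size cells is an assumption; read the parent
    node's #address-cells and #size-cells to confirm.
    """
    cell_count = address_cells + size_cells
    entry_bytes = 4 * cell_count
    if not raw or len(raw) % entry_bytes:
        raise ValueError("reg length does not match the cell counts")

    regions = []
    for offset in range(0, len(raw), entry_bytes):
        cells = struct.unpack_from(f">{cell_count}I", raw, offset)
        base = 0
        for cell in cells[:address_cells]:
            base = (base << 32) | cell
        size = 0
        for cell in cells[address_cells:]:
            size = (size << 32) | cell
        regions.append((base, size))
    return regions


# Hypothetical region: base 0xA0000000, size 1 MiB (not taken from any real DTS).
example = struct.pack(">4I", 0x0, 0xA0000000, 0x0, 0x00100000)
for base, size in parse_reg(example):
    print(f"reserved: 0x{base:x} - 0x{base + size - 1:x} ({size // 1024} KiB)")
```

If no reserved-memory nodes for the remote cores show up on the running system, Linux has not been told about those carve-outs and may hand that memory to applications.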
Additionally, we have discovered so far that if we move all kernel driver modules out of their normal location in /lib/modules, so that they are not loaded, our memory tests pass.
I'll send you an email with the log information directly. Praveen put us in contact.
How does the kernel manage the reserved memory regions to prevent applications from allocating memory in
In addition, how does TI perform memory tests for LPDDR4 over large regions of memory under stress?
Hi,
When we run our test suite on tisdk-adas-image (vision apps) over these large allocations, we encounter some errors. In contrast, we do not see these same errors when running tisdk-default-image.
The tisdk-adas-image is available from SDK 9.0 whereas tisdk-default-image is in SDK 8.6.
Could you please confirm whether it is working on SDK 8.6 with tisdk-default-image and not working with tisdk-adas-image in SDK 9.0?
Could you also please confirm whether the vision_apps dtbo file is listed in uEnv.txt in the boot partition in SDK 9.0?
The reserved regions for the vision_apps firmware are defined in this device tree overlay, which must be referenced in uEnv.txt.
Regards,
Nikhil
All tests have been performed using SDK9.0. In the last couple of days, we've also been using the SD card images from SDK9.1.
We've also been performing additional tests to further isolate the cause of the problem.
We found that if we move the vision_apps firmware files as shown below, so that they are not loaded, we do not encounter any memory errors during our runs of memtester and stressapptest.
mv /lib/firmware/vision_apps_eaik /lib/firmware/vision_apps_eaik_moved
From this, we can conclude that the problem likely involves one of the other SoC cores.
How does Linux know which memory regions are already allocated?
Is it possible that the memory reservations being established in the vision apps dtbo are not being properly managed?
Thanks for the feedback.
As you suspected, the vision_apps dtbo was not loaded in our system.
After the vision_apps dtbo is loaded, the issue disappears.
However, the usable memory is reduced from 7.6GB to 6GB.
Before loading vision_apps dtbo:
root@j721e-evm:/opt/edgeai-gst-apps# cat /proc/meminfo
MemTotal: 7929112 kB
MemFree: 7471852 kB
MemAvailable: 7601052 kB
After loading vision_apps dtbo:
root@j721e-evm:/opt/edgeai-gst-apps# cat /proc/meminfo
MemTotal: 6321484 kB
MemFree: 6124320 kB
MemAvailable: 6052724 kB
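As a quick check of the two /proc/meminfo snapshots above, the following sketch computes how much memory the overlay removes from Linux. The helper name `parse_meminfo` is an assumption for illustration; the figures are copied from the output above.

```python
BEFORE = """MemTotal: 7929112 kB
MemFree: 7471852 kB
MemAvailable: 7601052 kB"""

AFTER = """MemTotal: 6321484 kB
MemFree: 6124320 kB
MemAvailable: 6052724 kB"""


def parse_meminfo(text: str) -> dict:
    """Parse 'Key: value kB' lines into a dict of integer kB values."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields


before = parse_meminfo(BEFORE)
after = parse_meminfo(AFTER)

# Memory that Linux no longer manages once the overlay is applied.
reserved_kb = before["MemTotal"] - after["MemTotal"]
print(f"Removed from Linux: {reserved_kb} kB ({reserved_kb / 1024:.0f} MiB)")

# Largest whole-MiB request that still fits in MemAvailable afterwards.
max_request_mib = after["MemAvailable"] // 1024
print(f"MemAvailable after overlay: {max_request_mib} MiB")
```

The difference in MemTotal is 1607628 kB, roughly 1.5 GiB. With the overlay loaded, MemAvailable is about 5910 MiB, so a 7000 MB test request like the ones used earlier in this thread would exceed what Linux reports as available and would likely need to be reduced.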
I will take a look at reserve memory defined in "arch/arm64/boot/dts/ti/k3-j721e-rtos-memory-map.dtsi".
Thanks for clarifying this issue.
BR,
Richard
Hi Richard,
Thank you for the update.
If this resolves the issue, could we go ahead and close the thread?
Regards,
Nikhil