Hi TI,
We are working with a TMS320C6678 running the following major software component versions:
SYSBIOS # bios_6_46_04_53
NDK # ndk_2_25_01_11
PDK # pdk_c667x_2_0_6
IPC # ipc_3_44_01_01
CCS Version # 7.1.0.00016
DSP Clock Speed # 1200 MHz
Ethernet (SGMII0) Link Speed # CSL_SGMII_1000_MBPS
We are facing issues with NDK performance. Our application streams a file from an external PC over TFTP, downloading a couple of TFTP blocks every millisecond. We modified the NDK to add this functionality. The problem is that the application cannot download blocks fast enough to keep pace with the rest of the processing. Our profiling shows that downloading one TFTP block of 1468 bytes takes around 250-300K cycles. The DSP gets an interrupt every 1 ms, and the TFTP stream task has a processing budget of around 600K cycles per interrupt. As a result, we can download only 2 TFTP blocks per interrupt, which is not enough to keep up. Is this the expected level of performance for the NDK?
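To make the numbers above concrete, here is a minimal sketch of the cycle-budget arithmetic, using only the figures quoted in this post (1200 MHz clock, 1 ms interrupt period, 600K-cycle budget, 1468-byte blocks, 250-300K cycles per block). The macro and function names are hypothetical helpers for illustration; they are not part of the NDK, SYSBIOS, or our application.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the post; all names here are illustrative only. */
#define CPU_HZ              1200000000ULL  /* 1200 MHz DSP clock            */
#define TICK_HZ             1000ULL        /* one interrupt every 1 ms      */
#define TASK_BUDGET_CYCLES  600000ULL      /* budget of the TFTP task       */
#define TFTP_BLOCK_BYTES    1468ULL        /* payload bytes per TFTP block  */

/* Total CPU cycles available between two consecutive interrupts. */
static uint64_t cycles_per_tick(void)
{
    return CPU_HZ / TICK_HZ;
}

/* Whole blocks that fit in the task budget at a given per-block cost. */
static uint64_t blocks_per_tick(uint64_t cycles_per_block)
{
    return TASK_BUDGET_CYCLES / cycles_per_block;
}

/* Sustained payload throughput in bytes per second. */
static uint64_t payload_bytes_per_sec(uint64_t cycles_per_block)
{
    return blocks_per_tick(cycles_per_block) * TFTP_BLOCK_BYTES * TICK_HZ;
}

/* Cycles spent per payload byte, rounded down. */
static uint64_t cycles_per_byte(uint64_t cycles_per_block)
{
    return cycles_per_block / TFTP_BLOCK_BYTES;
}

/* Sanity check against the observation in the post: 2 blocks per interrupt. */
static void budget_self_check(void)
{
    assert(blocks_per_tick(250000ULL) == 2);
    assert(blocks_per_tick(300000ULL) == 2);
}
```

With these figures, both ends of the measured range give 2 blocks per interrupt, which works out to about 2.9 MB/s of payload (roughly 23.5 Mbit/s), a small fraction of the 1000 Mbit/s link. The per-block cost corresponds to roughly 170-204 cycles per payload byte.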
We have tried the following to improve performance, but nothing helped:
- Enabled caching for the external DDR (cached through L2 SRAM) and turned compiler optimization ON.
- Placed the NDK libraries (.text) in L2SRAM, to rule out cache as the cause.
- Replaced the CSL APIs used in the NIMU driver for cache invalidation with SYSBIOS APIs (https://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/p/271367/960470#960470).
- Commented out the TCP/IP server section of the code and enabled only the TFTP download logic, but this did not help either.
Since none of the above changes helped, we concluded that memory placement is not the issue and that the cause may lie in the SYSBIOS or NDK configuration. We request your help in resolving this issue.
How can we optimize the DSP application, in either SYSBIOS or the NDK, for low latency?
Are there any settings that we missed configuring?
Thanks,
-Pavan