Part Number: TMDXIDK5718
I am trying to measure the execution time of a task running on the M4 with bare-metal code (no OS). How can I do that?
I tried the timer-based example in C:\ti\pdk_am57xx_1_0_15\packages\ti\csl\example\timer\timer_app\main_m4.c, but the measured times are very large. Two successive TIMERCounterGet calls are 120 clock ticks apart. Why is this value so high? Am I doing something wrong?
SYS_CLK1 is the clock source for the timer, and on the IDK board it runs at 20 MHz.
Below are the steps I followed:
1. Started timer4 with a 60-second timeout in interrupt mode.
2. Read TIMERCounterGet before and after the task executes, and take the difference as the number of clock ticks the task needed.
3. Convert the difference in clock ticks to a time period:
time period = clock ticks * 0.05 µs
Since the frequency is 20 MHz, one clock tick is 0.05 µs.
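The conversion in step 3 can be sketched in C as below. This is only a minimal sketch: the helper names `elapsed_ticks` and `ticks_to_ns` are hypothetical (they are not part of the CSL), the 20 MHz figure is the SYS_CLK1 value quoted above, and the `reload` parameter assumes the timer counts up and restarts from its reload value after overflowing.

```c
#include <stdint.h>

/* Assumed timer input clock: 20 MHz SYS_CLK1, as stated in the post. */
#define TIMER_CLK_HZ 20000000u

/* Hypothetical helper: ticks elapsed between two reads of a 32-bit
 * up-counter that restarts from `reload` after it overflows.
 * Handles at most one overflow between the two reads. */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end, uint32_t reload)
{
    if (end >= start) {
        return end - start;
    }
    /* start -> 0xFFFFFFFF, one tick for the overflow, then reload -> end */
    return (UINT32_MAX - start) + 1u + (end - reload);
}

/* Hypothetical helper: convert ticks to nanoseconds using integer math,
 * so no floating-point library is pulled into the M4 build. */
static uint64_t ticks_to_ns(uint32_t ticks)
{
    return ((uint64_t)ticks * 1000000000ull) / TIMER_CLK_HZ;
}
```

With these helpers, the 120 ticks seen between two reads correspond to `ticks_to_ns(120)`, which is 6000 ns (6 µs). Keeping the result in integer nanoseconds avoids using the software floating-point library inside the measurement itself.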
One more question: since the M4 has no hardware FPU, how much execution time is added if I use a software floating-point library in the application?
I think it's OK to use a timer for profiling M4 code. I need to check the M4 bare-metal timer example to see how it operates.
I have a few questions for you:
I don't have any information on M4 floating-point performance without an FPU. I'll investigate this further internally to see if I can locate any data. Does your application require floating-point data? Can you use the DSP extensions for your calculations?
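If the calculations can be moved to fixed-point, the software floating-point library can be avoided altogether. The following is only an illustrative sketch in portable C of a Q15 multiply (values in the range -1.0 to just under 1.0 stored in 16 bits). The names `q15_t` and `q15_mul` are hypothetical and are not taken from any TI library, and the sketch does not use the DSP extension instructions directly.

```c
#include <stdint.h>

/* Q15 fixed-point: value = raw / 32768, range [-1.0, 1.0). */
typedef int16_t q15_t;

/* Hypothetical helper: multiply two Q15 values with rounding and
 * saturation. Assumes right shift of a negative int32_t is arithmetic,
 * which holds on common ARM compilers but is implementation-defined in C. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;  /* Q30 product */
    p += 1 << 14;                         /* round to nearest */
    p >>= 15;                             /* back to Q15 */
    if (p > INT16_MAX) {                  /* only reachable for -1.0 * -1.0 */
        p = INT16_MAX;
    }
    return (q15_t)p;
}
```

For example, 0.5 is stored as 16384, and `q15_mul(16384, 16384)` gives 8192, which is 0.25. Whether this is usable depends on the dynamic range the application needs.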
In reply to Frank Livingston:
Below are my answers to your questions.
1. I am using the bare-metal application with the CSL timer drivers. I test the code from CCS: I first connect to the A15, run the DDR memory configuration, and enable all cores from the Scripts tab in CCS. Then I connect to the M4 and load the code onto it.
On the M4 I initialize the UART, the pinmux configuration, and the module clocks. Then I initialize the timers and try to capture the execution time.
boardCfg = BOARD_INIT_MODULE_CLOCK | BOARD_INIT_PINMUX_CONFIG | BOARD_INIT_UART_STDIO | BOARD_INIT_UNLOCK_MMR;
status = Board_init(boardCfg);
I am using the example file, attached as 6087.main_m4.c.
So I assume the IPU SS clock and IPU SS MMU initialization are done by CCS from the GEL script files during initialization.
2. I am not configuring the unicache inside the M4 code. Could you please suggest the required code changes?
3. I am trying to calculate the execution time of a function.
4. No, I am not doing any peripheral read/write operations.
5. Yes, the delay I measured between two successive TIMERCounterGet calls is 6 µs. Since the M4 runs at 212 MHz, each clock cycle takes only about 4.7 ns, so in 6 µs it can run through roughly 1,270 cycles. Why am I reading such high values?
The input clock for the timer is configured as SYSCLK1, which I believe is the 20 MHz external source on the AM5718 IDK board.
Please suggest the changes required to measure the execution time of a task. We have to make an architectural decision, so please reply as soon as possible.
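The arithmetic in point 5, and a way to measure the fixed cost of reading the timer so that it can be subtracted from task measurements, can be sketched as follows. This is a sketch under assumptions: `read_overhead_ticks`, `ticks_to_cpu_cycles`, and `fake_counter` are hypothetical names, and `fake_counter` is only a host-side stand-in. On the target it would be replaced by a small wrapper that returns `TIMERCounterGet(...)` for the timer4 base address.

```c
#include <stdint.h>

/* Any function that returns the current 32-bit timer count. */
typedef uint32_t (*counter_read_fn)(void);

/* Hypothetical helper: smallest delta seen over n back-to-back read
 * pairs. The minimum approximates the fixed cost of one counter read. */
static uint32_t read_overhead_ticks(counter_read_fn read_counter, unsigned n)
{
    uint32_t best = UINT32_MAX;
    for (unsigned i = 0; i < n; ++i) {
        uint32_t t0 = read_counter();
        uint32_t t1 = read_counter();
        uint32_t delta = t1 - t0;  /* assumes no overflow between reads */
        if (delta < best) {
            best = delta;
        }
    }
    return best;
}

/* Hypothetical helper: CPU cycles that fit into a number of timer ticks. */
static uint64_t ticks_to_cpu_cycles(uint32_t ticks, uint32_t timer_hz,
                                    uint32_t cpu_hz)
{
    return ((uint64_t)ticks * cpu_hz) / timer_hz;
}

/* Host-side stand-in for the hardware counter: advances 60 ticks per
 * read. Replace with a wrapper around TIMERCounterGet on the target. */
static uint32_t fake_counter(void)
{
    static uint32_t ticks;
    ticks += 60u;
    return ticks;
}
```

With a 20 MHz timer and a 212 MHz core, `ticks_to_cpu_cycles(120, 20000000u, 212000000u)` gives 1272 cycles for the 120-tick gap. One timer tick covers about 10.6 CPU cycles, so this timer cannot resolve anything shorter than that.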
In reply to Naveen B:
I am waiting for your reply. Have you tested the code I shared with you?
I've received the code. I'll take a look as soon as possible.
Sorry for the delayed response on this. I've started looking at the example code on M4. Before proceeding, though, I want to make sure this is still a concern for you. Are you still facing this issue?
After enabling the M4 IPU_UNICACHE, the measured time dropped drastically from 210 µs to 20 µs, but that is still not acceptable for us. Given these timings and the fact that the M4 has no FPU, we are unsure how to proceed.
Will enabling the MMU reduce the time any further? Or is there any other configuration I have missed?
Are your updated IPU_UNICACHE settings contained in the code you shared? If not, can you please provide an update?
What is your time period budget?
I enabled the unicache from CCS:
Connect to M4 core -> IPU1_UNICACHE_CFG -> CACHE_CONFIG -> BYPASS -> BYPASS_1
I assume that enabling it from inside the M4 code will not work and that it has to be configured from the A15 or the boot loader. If it is possible from the M4 core, kindly share the code.
We need a time period in the 10 to 20 µs range for some conditional checks. What is the throughput of the M4 core?
This thread concerning UNICACHE configuration for reducing code cycle counts may be of help: https://e2e.ti.com/support/processors/f/791/t/711906