C6748 Slow Speed?

Hello,

I have a C6748_EVM by LOGIC.

I found that its instruction cycle is very slow.

All code and data are in L2RAM.

I checked the PLL using OBSCLK and found that SYSCLK1 was running at 300 MHz.

I observed SYSCLK1, SYSCLK2, and SYSCLK4 on an oscilloscope.

But the CPU was very slow, and I don't know the reason.

The instruction execution speed seems to be 24 MHz rather than 300 MHz. (The EVM has a 24 MHz OSCIN.)

I have experience with the C6424, DM648, and C6747.

Please help me.

  • These are very odd results you have seen. If SYSCLK1 is 300MHz, then that is the speed the DSP is being clocked at and that is the instruction cycle speed.

    Here is a way to find out how fast the DSP is being clocked. Insert the following code at the top of main(), with the #include at the top of the file, of course:

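    CPU clock measurement:
    #include <c6x.h>    /* declares the TSCL and TSCH control registers */

        unsigned long long ullTimeStart, ullTimeDiff, ullTimeEnd;

        TSCL = 0;    /* any write to TSCL starts the free-running time stamp counter */

        /* read TSCL first: that read latches the upper 32 bits for TSCH */
        ullTimeStart = TSCL;
        ullTimeStart += (unsigned long long)TSCH<<32;

        while (1)
        {
            ullTimeEnd = TSCL;
            ullTimeEnd += (unsigned long long)TSCH<<32;
            ullTimeDiff = ullTimeEnd - ullTimeStart;    /* elapsed CPU cycles */
        }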

    1. Set a breakpoint at the TSCL=0; line and run to there.
    2. Open a watch window to see the local variables, in particular ullTimeDiff.
    3. Using a clock with a second hand (like the Windows clock if you double click the time on the taskbar), get ready to time 10 seconds on the clock.
    4. When you are ready, click run.
    5. After 10 seconds, click halt.

    The value of ullTimeDiff will be approximately 10 times the CPU clock frequency. To get better accuracy by eliminating CCS+emulator time delays, let the timer run longer and divide ullTimeDiff by the number of seconds you let it run.
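
The arithmetic behind this measurement can be sketched in plain C that builds on a host PC. This is only an illustration: the helper names `combine_tsc` and `cycles_to_mhz` are hypothetical and not part of any TI header, and the two 32-bit halves are passed in as ordinary integers rather than read from TSCL and TSCH.

```c
#include <stdint.h>

/* Hypothetical helper: build the 64-bit cycle count from its two 32-bit
   halves, the same way the measurement code combines TSCL and TSCH. */
uint64_t combine_tsc(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: average CPU clock in MHz over the timed interval. */
double cycles_to_mhz(uint64_t cycles, double seconds)
{
    return (double)cycles / seconds / 1.0e6;
}
```

For instance, a count near 3,000,000,000 after 10 seconds would correspond to a CPU clock of about 300 MHz, while a count near 240,000,000 would point to 24 MHz.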

    Please report back the clock rate that you determine. Then also add some information about what leads you to say that the "cpu was very slow". Please include the version of CCS you are using and the emulator.

  • Thank you.

    I checked the value of ullTimeDiff. It was about 3,000,000,000 after 10 s, which looks like a good result: it indicates 300 MHz.

    I use CCS 3.3, an XDS510USB emulator, and Code Generation Tools 6.1.9.

    I also tested with another method.

    Using the EVM by LOGIC, I viewed the timing on an oscilloscope.

    The signal was output on GPIO6.14 (TP6 on the EVM).

    The test code is below. It was modified from \C6748_dsp_1_00_00_11\pspdrivers_01_30_01\packages\ti\pspiom\cslr\evm6748\examples\gpio

    The execution times shown in the comments were measured with the oscilloscope.

    The times are the actual measured times, excluding the loop and CSL_FINS operations.

    I think the C6748 DSP is running at less than 300 MIPS. (No pipelining?)

    The speed result was the same in another sample that uses BIOS.

    I captured the asm code in mixed mode, but I can't attach it because it is a JPG file.

    The statement ( v[0] = v[0]+v[1]+v[2]-v[3]+v[4]+v[5]+v[6]-v[7]+v[8]-v[9]; //: 68ns ) compiles to 24 lines of asm code.
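
To compare the oscilloscope times with the clock rate, it can help to convert them into CPU cycles. The sketch below is plain C for a host PC; `ns_to_cycles` is a hypothetical helper introduced only for this illustration, and the results are rough estimates because they depend on exactly what the measured interval includes.

```c
/* Hypothetical helper: number of CPU cycles that fit in a time interval.
   nanoseconds * MHz yields cycles * 1000, hence the division. */
double ns_to_cycles(double nanoseconds, double cpu_mhz)
{
    return nanoseconds * cpu_mhz / 1000.0;
}
```

At 300 MHz, 68 ns is about 20.4 cycles and 300 ns is about 90 cycles. At 24 MHz, 68 ns would be only about 1.6 cycles, which is too short for 24 instructions even if eight of them issued in every cycle (assuming each asm line is one instruction). The measured times therefore look more consistent with a 300 MHz clock executing roughly one instruction per cycle than with a 24 MHz clock.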

    === C code ============= 

    main()

    int v[10]; // local variable in main()

    ...

    while(1)
    {
      v[0]=1;
      CSL_FINS(gpioRegs->BANK[3].OUT_DATA,GPIO_OUT_DATA_OUT14,1);

    //  v[0] = v[0]*v[1]+v[2]-v[3]+v[4]*v[5]+v[6]-v[7]+v[8]*v[9]; //: 68ns
      v[0] = v[0]+v[1]+v[2]-v[3]+v[4]+v[5]+v[6]-v[7]+v[8]-v[9]; //: 68ns

    //  for (i=0;i<2;i++) v[0] = v[0]*v[1]+v[2]-v[3]+v[4]*v[5]+v[6]-v[7]+v[8]*v[9]; //: 300ns

      CSL_FINS(gpioRegs->BANK[3].OUT_DATA,GPIO_OUT_DATA_OUT14,0);
    }

  • You originally said that you thought your DSP was running too slow. You have now verified that it is running at 300 MHz as expected.

    Your problem has now been determined to be the performance of your C code and your implementation of it.

    Please search the Wiki Pages for information on optimizing your project. Look for compiler switches to use, cache settings, and memory location tradeoffs.

  • Thanks for the advice.

    All of the memory used was in L2RAM.

    L1P and L1D are configured entirely as cache.

    I checked the assembly code.

    Could you help me check the registers and conditions on the C6748?

  • Hi

    I don't have any recommendations beyond what Randy mentioned. It seems that you are at least running at 300 MHz and running everything from internal memory with the cache in its default configuration. You mentioned in your first post that you have experience with the C6747. I am curious whether you are seeing a difference in performance for similar code when going from the C6747 to the C6748. I expect similar performance on both devices for identical code running on the DSP from internal DSP memory. If you are somehow getting better perceived performance on the C6747, then maybe there is still some debugging left to do on the C6748.

    Regards

    Mukul