C6747 v Intel (round 2)

Other Parts Discussed in Thread: OMAP-L137, CCSTUDIO

Hi,

Following on from a previous post:

http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/p/37185/130298.aspx

I have a similar situation: a piece of code that runs in 21us on an Intel-based PC under a real-time Linux OS takes 996us on a C6747 (actually the C6747 core of an OMAP-L137 on an EVM board). That is nearly 50 times slower. The C6747 is clocked at 300MHz and is running DSP/BIOS v5. All of the code and data reside in internal RAM (L2). I haven't set any MAR registers, but I don't think I need to, as there is nothing in external RAM. I have removed all debug symbols from the compilation and set optimisation level 2 (which was the default anyway). I haven't actively enabled any cache, but I think this should be automatically enabled for internal RAM (right?).

Are there any other things I should check or set to help my code run faster?

Thanks,

John

 

  • John,

    I think we will need some more details about what your setup is doing. Would it be possible to provide a cut-down project with just the code snippet that you are interested in?

    -Tommy

  • Hi,

    Further to my previous post, profiling of my code has narrowed down my speed problem to a few double precision divides. Specifically, they are reciprocals. And I think that they are taking about 900 cycles each to complete.
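To put the 900-cycle estimate in context, here is a minimal sketch of the cycle-to-time conversion at the 300MHz clock mentioned at the start of the thread. The function names are illustrative assumptions and are not part of any TI library.

```cpp
#include <cassert>

// Illustrative helpers (names are assumptions, not TI APIs).
// With the clock given in MHz, one microsecond spans clock_mhz cycles.
double cycles_to_microseconds(double cycles, double clock_mhz) {
    assert(clock_mhz > 0.0);
    return cycles / clock_mhz;
}

double microseconds_to_cycles(double microseconds, double clock_mhz) {
    assert(clock_mhz > 0.0);
    return microseconds * clock_mhz;
}
```

At 300MHz, `cycles_to_microseconds(900.0, 300.0)` gives 3us per reciprocal, and the 996us run corresponds to `microseconds_to_cycles(996.0, 300.0)`, i.e. 298,800 cycles.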

    I noticed that the fastRTS67x library has a double precision reciprocal function. I have downloaded this but I cannot link it to my project in CCS4. Can anyone help? Here are some details:

    I have copied ‘fastrts67x.h’ and ‘recip.h’ and ‘fastrts67x.lib’ to my project folder and ‘#include’ the dot h files in my source. In Build settings I add reference to the dot lib file.

    The project compiles OK, but I get unresolved symbols at the link stage, as if the linker cannot find the fastRTS library:

    <Linking>

     undefined       first referenced
      symbol             in file
     ---------       ----------------
     recipdp(double) ./Matrix6x6.obj

    error: unresolved symbols remain
    error: errors encountered during linking; "BenchMark.out" not built

    I read in some Forum posts that you have to specify the link order:

    http://e2e.ti.com/support/microcontrollers/tms320c2000_32-bit_real-time_mcus/f/171/p/42613/149124.aspx#149124

    http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/p/63363/227867.aspx#227867

    I have tried this, but I can find no reference to ‘Generated Linker Command files’ in the CCS4 Build Settings page.

     

  • Hi John,

    Can you elaborate on the method you used to reference the library file in the build settings?

    If you have not already done so, try including the library the following way:

    Right click on project
    Select "Build Properties"
    Under the "Tool Settings" tab, select "File Search Path"
    Look for the box labeled "Include library file or command file as input" and click the icon with a green plus sign
    Add the library file by file system or using macros. If using a macro, ensure the path resolves to the correct location.

    The following documentation may also be useful to you, see chapter 8.
    http://www.ti.com/litv/pdf/spru187q

    -Kevin

  • Hello John, your approach is correct. By using the fastMath lib for C67x, you can reduce the double-precision divide time from ~900 clocks to ~61 clocks. Please see the post here: http://e2e.ti.com/support/dsp/tms320c6000_high_performance_dsps/f/115/p/66690/241417.aspx#241417 for information on the upcoming update of the C67x fastMath library. Among the many updates, one adds support for CCS4, including example projects that demonstrate usage inside CCS4. If an early drop of this SW package would be useful to you, please contact us using the information provided here: http://processors.wiki.ti.com/index.php/Software_libraries#Feedback_and_Support_on_DSP_Software_Libraries

    Regards,
    Gagan

  • Kevin,

    I did exactly as you suggested, to try to link the library, with one extra step:

    1. Under CCS Build Settings (in Project Build Properties), select the 'Link Order' tab.

    2. Click 'Add...' and select the fastRTS and rts6740 libraries

    3. Ensure they are in the correct order (fastRTS first)

    I get the linker error whether I do this step or not.

    Gagan,

    I don't mind waiting for the update to fastRTS if it is only going to be a couple of weeks. In the meantime, it would be great to know why the existing library won't work with my project.

    Cheers,

    John.

     

     

  • John,

    Based on the following description you provided

    "I have copied ‘fastrts67x.h’ and ‘recip.h’ and ‘fastrts67x.lib’ to my project folder and ‘#include’ the dot h files in my source. In Build settings I add reference to the dot lib file."

    Can you verify that "${PROJECT_ROOT}" is included in your File Search Path in the C6000 linker settings? Since the project finds the header file during compilation, you can check the macro that was used to specify the path in the 'Include options' of Build Settings and use the same one if your library lies along the same path. To verify the path that the macro resolves to, check the Macros option in Build Settings. If this does not help, can you go to the C6000 Linker options and post the 'All options:' string configured for your project, for further insight?

    Also, the additional step of specifying the rts library is not really necessary as CCSv4 automatically picks the appropriate rts library. Check the following thread for more details

    http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/p/48840/172886.aspx 

    Regards,

    Rahul

  • John,

    For your reference, here is a test project that I created to show the usage of recipdp in the fastrts67x library. The project replicates what you are trying to do by copying over the files from the fastrts library.

    6874.test.zip

    Let us know if this helps you figure out the issue that you were seeing on your project.

    Also, I noticed from your earlier post that you are trying to accelerate a project that uses matrices. TI provides a DSPLIB for the C674x family of devices that has several floating-point matrix and vector operations, which might help in your project.

    Documentation and download link for DSPLIB C674x can be found here.

    http://processors.wiki.ti.com/index.php?title=C674x_DSPLIB

    Regards,

    Rahul

     

  • Rahul,

    Thanks for your response. I did not have “${PROJECT_ROOT}” in the File Search Path for the linker – however, I had included the specific file (including the file system path) in the --library section. Putting “${PROJECT_ROOT}” on the search path yields the same results, i.e. the library is still not found by the linker. Note that “${PROJECT_ROOT}” is NOT included on the search path for the compiler – I guess that this is implied? Here are the ‘All Options:’ for the original way I set it:

    -m"BenchMark.map" --warn_sections -i"C:/Program Files/Texas Instruments2/ccsv4/tools/compiler/c6000/lib" -i"C:/Program Files/Texas Instruments2/ccsv4/tools/compiler/c6000/include" -i"C:/Program Files/Texas Instruments2/bios_5_40_02_22/packages/ti/rtdx/lib/c6000" -i"C:/Program Files/Texas Instruments2/bios_5_40_02_22/packages/ti/bios/lib" --reread_libs --rom_model

    I’ve never looked at this before but look: there appears to be no reference to my file. I checked this twice, it is definitely in the ‘File Search Path’ window under --library, but doesn’t appear in ‘All Options’. Here they are if I, instead, put “${PROJECT_ROOT}” in the search path:

    -m"BenchMark.map" --warn_sections -i"C:/OMAPL137/DSPBIOSTraining/BenchMark" -i"C:/Program Files/Texas Instruments2/ccsv4/tools/compiler/c6000/lib" -i"C:/Program Files/Texas Instruments2/ccsv4/tools/compiler/c6000/include" -i"C:/Program Files/Texas Instruments2/bios_5_40_02_22/packages/ti/rtdx/lib/c6000" -i"C:/Program Files/Texas Instruments2/bios_5_40_02_22/packages/ti/bios/lib" --reread_libs --rom_model

    The path is there this time, but the library is still not found. The additional steps I mentioned were because the other threads I quoted in my previous post said that it was important to specify the link order – specifically that the fast library was linked before the standard one. The posts mentioned “Generated Linker Command Files” as an option for link order, but I could not find that.

    I looked at your example project but CCS would not import the build settings, is it a version issue? Mine is 4.0.1.01001.

    DSPLIB: I checked this out. The C674x version only has single precision functions in it. The C67x version has double precision, but I need to multiply a matrix by a vector and the matrix multiply function specifically excludes matrices with either dimension = 1. Actually, the biggest drain in my code is a matrix inverse - and there is no such thing in DSPLIB.

    Thanks again for your help.

    Cheers,

    John.

  • John,

    Sorry to hear that you are facing this issue when in fact you seem to be taking all the right steps to link the fastrts library. I have a few more suggestions that you can try to see if they resolve your issue. Right click on your project and, using "Add files to project", add the fastrts67x library to your project (if it doesn't exist there already). Once you see the library in the project, right click on the library, pick the "Exclude the file from Build" option, rebuild the project, and check whether the library is being linked in the linking process. Here is the snapshot of what you should be seeing in the project and the console window.

    If this does not resolve the issue, the only way I see to sort this out is for you to replicate my project, for which you might have to update your version of CCSv4, as the version that you are using is a pretty old one. You can do this by going to "Help" -> "Software Updates" -> "Find and Install".

    Also, since you mentioned you are trying to compute the matrix inverse, I would like to inform you that we have an optimized matrix inverse code that we use internally. We do not release it with the C674x DSPLIB library for IP reasons, but the source is made available to customers after some simple formalities. Please drop a note to the developer list at: http://processors.wiki.ti.com/index.php?title=Software_libraries#Developer_Mailing_List

    They should be able to guide you through the process, which is generally very quick.

    Regards,
    Rahul

  • Hi Rahul,

    Thanks for the advice. I tried your suggestion but to no avail. I didn't realise I could paste screen shots before, so I have included the result of my latest attempt below. I have highlighted the areas you said to look for - I think that they are all present and correct. I am now attempting to download the latest version of CCS, but it is taking a while as I have a relatively slow connection. Also, the first attempt at download resulted in a corrupted zip file. So I am trying a second time.

    The matrix inverse code sounds interesting, I will check this out once the fastRTS problem is resolved, it may lead to even more speed increase.

    Thanks again for your help, I'll let you know if I make better progress with the new CCS.

    Cheers,

    John.

    Here is the screenshot:

     

     

  • Hello John, getting the latest tools and looking at the example Rahul provided is the right step. Let me also suggest a quick thing that you can do to get going. Note that the fastRTS release comes with the entire source, so you can add the needed source to your project directly. See below:
    At the CMD prompt:

    > cd C:\CCStudio_v3.3\c6700\mthlib\lib
    > C:\CCStudio_v3.3\C6000\cgtools\bin\ar6x.exe -x  fastrts67x.src

    Add the file C:\CCStudio_v3.3\c6700\mthlib\lib\recipdp.asm to your project. That way you will definitely have the function in your application.

    Thanks,
    Gagan

  • Hi Rahul, Gagan,

     

    Got It!

     

    As always in these situations it was, in hindsight, a simple problem that I, embarrassingly, should have noticed at the start. The issue was that fastRTS is a C library and my program is C++. The header file supplied does not wrap its declarations in extern "C", so the C++ compiler was looking for a C++ (mangled) symbol rather than the C one in the library. The problem is solved by adding the following to the top of the 'fastrts67x.h' file:

     

    #ifdef __cplusplus
    extern "C"
    {
    #endif /* __cplusplus */

     

    and this to the bottom:

     

    #ifdef __cplusplus
    }
    #endif /* __cplusplus */
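For reference, the guards described above can be sketched in a single self-contained file. The prototype mirrors the `recipdp(double)` symbol from the linker error; the stand-in definition and the `scale_by_reciprocal` caller are assumptions added only so the sketch builds without 'fastrts67x.lib'.

```cpp
#include <cassert>

// Guards as they would wrap the prototypes in a C library header.
#ifdef __cplusplus
extern "C" {
#endif

double recipdp(double x);   /* C linkage: the name is not C++-mangled */

#ifdef __cplusplus
}
#endif

// Stand-in definition (an assumption for this sketch only); in the real
// project the definition comes from the library, not from user code.
extern "C" double recipdp(double x) {
    assert(x != 0.0);       // the stand-in does not handle division by zero
    return 1.0 / x;
}

// Hypothetical C++ caller: resolves against the unmangled symbol above.
double scale_by_reciprocal(double value, double divisor) {
    return value * recipdp(divisor);
}
```

Without the guards, a C++ compiler treats `recipdp` as a C++ function and emits a reference to a mangled name, which is why the linker reported the symbol with its parameter list as `recipdp(double)`.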

     

    So in fact, I don't even have to list the library in the linker options: just adding it to the project folder gets it linked. I do have to set the link order so that 'fastrts67x.lib' is linked before 'Generated Linker Command Files'; the latter now appears as an option in the list in the latest CCS. I now have CCS v4.2; I had trouble downloading v4.1.3, as it kept producing invalid or corrupted zip files.

     

    Also, I don't need to specifically call the 'recip' function; the compiler seems to work out that it should use the fast library if the code is written naturally (i.e. 1.0 / <double_variable>).
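As a minimal sketch of the "written naturally" form, the reciprocal can be computed once and reused as a multiplier, so a loop pays for only one double-precision division. `scale_row` is a hypothetical helper, not code from the project in this thread, and whether the toolchain routes the division to the fastRTS routine depends on the link order described above; that is not verified here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: divide each of n elements by pivot using one division.
void scale_row(double* row, std::size_t n, double pivot) {
    assert(pivot != 0.0);
    const double inv = 1.0 / pivot;   // the single double-precision reciprocal
    for (std::size_t i = 0; i < n; ++i) {
        row[i] *= inv;                // multiplies replace per-element divides
    }
}
```

Note that multiplying by a reciprocal can differ from a true division in the last bit of the result, so this trade is only appropriate where that rounding difference is acceptable.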

     

    The upshot of all this is that my function that used to run in 965us now runs in 499us. This is a big improvement, but it is still 24 times slower than the Intel processor, which runs the same code in 21us. So I am now profiling the code to see where the big delays are, now that the double-precision math has been sped up.
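The ratios quoted above can be checked with a short sketch; `time_ratio` is an illustrative name, not part of any library.

```cpp
#include <cassert>

// Illustrative helper (assumed name): how many times longer slow_us is than fast_us.
double time_ratio(double slow_us, double fast_us) {
    assert(fast_us > 0.0);
    return slow_us / fast_us;
}
```

With the figures from this post, `time_ratio(965.0, 499.0)` is about 1.93 (the speed-up from the fast library) and `time_ratio(499.0, 21.0)` is about 23.8 (the remaining gap to the Intel PC).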

     

    Thanks again for your help, I’ll let you know how I get on.

     

    Cheers,

     

    John.