CCS/TMS320C6678: dgemm_test build using CCS v8

Part Number: TMS320C6678

Tool/software: Code Composer Studio

Hi

I successfully built the dgemm example found in linalg_1_2_0_0/examples/dsponly/dgemm_test via the Makefile. The program built this way also runs successfully on the DSP and yields remarkable performance. However, I am running into several issues when building with CCS.

This is how I tried to build it from CCS:
1. I copy-pasted a project I previously created as described in "http://downloads.ti.com/mctools/esd/docs/openmp-dsp/building_openmp_app.html"

2. I copied dgemm_test.c, fc_config_c6678.c, ticblas_config.c and omp_config.cfg into my project folder

3. I added the necessary products via Project -> Properties.

4. If I build now, it throws the following error:
"/home/idris/ti/libarch_1_0_0_0/packages/ti/libarch/src/lib_cachecfg.h", line 38: fatal error #35: #error directive: "Unsupported OS! Please specify either LIB_OPENCL or LIB_RTOS"

I therefore add -DLIB_RTOS to the C6000 compiler flags under Project->Build->C6000 Compiler. This flag is also set when building via Makefile.

5. If I build now, I get:

"/home/idris/ti/libarch_1_0_0_0/packages/ti/libarch/src/lib_cachecfg.h", line 90: fatal error #35: #error directive: "Unsupported TARGET"

So I also add -DSOC_C6678, also passed in the Makefile.

6. It now finishes building but the Linker throws the following error:

 undefined             first referenced    
  symbol                   in file         
 ---------             ----------------    
 cblas_dgemm           ./dgemm_test.obj    
 lib_L1D_config_SRAM   ./ticblas_config.obj
 lib_get_L1D_SRAM_size ./ticblas_config.obj
 lib_get_L2_SRAM_size  ./ticblas_config.obj
 tiCblasDelete         ./ticblas_config.obj
 tiCblasGetSizes       ./ticblas_config.obj
 tiCblasInit           ./ticblas_config.obj
 tiCblasNew            ./ticblas_config.obj
 
error #10234-D: unresolved symbols remain

Unfortunately, I don't have an answer to this. Could someone give me some hints on what I am doing wrong? Any help is appreciated. Please tell me if you need additional information. Thank you very much.

Edit 1:

I was building in Debug mode. However, changing to Release did not help. Here is the complete console output:

---------------------------------------------------------------------------------------

**** Build of configuration Release for project omp_linalglib4 ****

/home/idris/ti/ccsv8/utils/bin/gmake -k -j 4 all -O 
 
Building file: "../dgemm_test.c"
Invoking: C6000 Compiler
"/home/idris/ti/ccsv8/tools/compiler/ti-cgt-c6000_8.2.4/bin/cl6x" -mv6600 -O2 --include_path="/home/idris/ti/openmp_dsp_c667x_2_06_02_01/packages/ti/runtime/openmp" --include_path="/home/idris/ti/openmp_dsp_c667x_2_06_02_01/packages/ti/runtime/openmp/platforms" --include_path="/home/idris/workspace_v8/omp_linalglib4" --include_path="/home/idris/ti/ccsv8/tools/compiler/ti-cgt-c6000_8.2.4/include" --define=LIB_RTOS --define=SOC_C6678 --diag_warning=225 --diag_wrap=off --display_error_number --openmp --preproc_with_compile --preproc_dependency="dgemm_test.d_raw" --cmd_file="configPkg/compiler.opt" "../dgemm_test.c"
Finished building: "../dgemm_test.c"
 
Building target: "omp_linalglib4.out"
Invoking: C6000 Linker
"/home/idris/ti/ccsv8/tools/compiler/ti-cgt-c6000_8.2.4/bin/cl6x" -mv6600 -O2 --define=LIB_RTOS --define=SOC_C6678 --diag_warning=225 --diag_wrap=off --display_error_number --openmp -z -m"omp_linalglib4.map" -i"/home/idris/ti/ccsv8/tools/compiler/ti-cgt-c6000_8.2.4/lib" -i"/home/idris/ti/ccsv8/tools/compiler/ti-cgt-c6000_8.2.4/include" --priority --reread_libs --diag_wrap=off --display_error_number --warn_sections --xml_link_info="omp_linalglib4_linkInfo.xml" --rom_model -o "omp_linalglib4.out" "./dgemm_test.obj" "./fc_config_c6678.obj" "./ticblas_config.obj" -l"configPkg/linker.cmd" -llibc.a 
<Linking>
"configPkg/linker.cmd", line 131: warning #10068-D: no matching section
warning #10247-D: creating output section ".tdata" without a SECTIONS specification
warning #10247-D: creating output section ".tbss" without a SECTIONS specification
 
 undefined             first referenced    
  symbol                   in file         
 ---------             ----------------    
 cblas_dgemm           ./dgemm_test.obj    
 lib_L1D_config_SRAM   ./ticblas_config.obj
 lib_get_L1D_SRAM_size ./ticblas_config.obj
 lib_get_L2_SRAM_size  ./ticblas_config.obj
 tiCblasDelete         ./ticblas_config.obj
 tiCblasGetSizes       ./ticblas_config.obj
 tiCblasInit           ./ticblas_config.obj
 tiCblasNew            ./ticblas_config.obj
 
error #10234-D: unresolved symbols remain
error #10010: errors encountered during linking; "omp_linalglib4.out" not built
 
gmake[1]: *** [omp_linalglib4.out] Error 1
>> Compilation failure
makefile:140: recipe for target 'omp_linalglib4.out' failed
gmake: *** [all] Error 2
makefile:136: recipe for target 'all' failed

**** Build Finished ****

----------------------------------------------------------------

Edit 2:

Creating a makefile-project resolves the matter - still it would be amazing to find another solution.

  • Hi,

    The engineer supporting on LINALG is out of office.

    Your issue seems to be that the project builds with the makefile but not as a CCS project. We have dealt with many such issues. The typical approach is:
    - save the makefile build log
    - check the compiler and linker options and the pre-defined symbols
    - check the include paths
    - check the linked libraries
    - check the linker command file

    All of these can be found in the build log or in the makefile. You then need to add them to your CCS project, as you did with "-DSOC_C6678" and "-DLIB_RTOS".
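
One way to check for missing pre-defined symbols is to compare the two compiler command lines mechanically. The following is a minimal Python sketch, not part of any TI tooling: the helper names `defines` and `missing_defines` are made up for illustration, and it assumes symbols are passed either as `-DNAME` or as `--define=NAME`, the two spellings that appear in this thread.

```python
import shlex


def defines(cmdline):
    """Return the set of symbols passed as -DNAME or --define=NAME."""
    found = set()
    for tok in shlex.split(cmdline):
        if tok.startswith("--define="):
            found.add(tok[len("--define="):])
        elif tok.startswith("-D") and len(tok) > 2:
            found.add(tok[2:])
    return found


def missing_defines(makefile_cmd, ccs_cmd):
    """Symbols present in the makefile command but absent from the CCS one."""
    return sorted(defines(makefile_cmd) - defines(ccs_cmd))
```

To use it, paste one compiler invocation from the makefile build log and the matching one from the CCS console as the two arguments; anything returned is a candidate for Project -> Properties -> Build -> C6000 Compiler.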

    You are already in the final steps: the linking error indicates that some libraries are missing. You need to find them and add them to the CCS project (for example, the makefile build produces a *.map file, which you can search to see which library contains those functions).

    Regards, Eric
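
The map-file search suggested above can be scripted. Here is a rough Python sketch, under the assumption that the lines of the makefile's *.map file which mention a symbol also name the archive it came from; the exact map layout was not checked, and the helper names and the `.ae66` suffix filter are illustrative choices, not a TI API.

```python
import re

# The undefined symbols reported by the linker in this thread.
UNDEFINED = [
    "cblas_dgemm", "lib_L1D_config_SRAM", "lib_get_L1D_SRAM_size",
    "lib_get_L2_SRAM_size", "tiCblasDelete", "tiCblasGetSizes",
    "tiCblasInit", "tiCblasNew",
]


def find_symbol_lines(map_text, symbols):
    """Return {symbol: [map lines that mention it as a whole word]}."""
    hits = {s: [] for s in symbols}
    for line in map_text.splitlines():
        for s in symbols:
            pattern = r"(?<![A-Za-z0-9_])" + re.escape(s) + r"(?![A-Za-z0-9_])"
            if re.search(pattern, line):
                hits[s].append(line.strip())
    return hits


def libraries_on_lines(lines, suffix=".ae66"):
    """Collect tokens ending in `suffix` (assumed to be archive names)."""
    libs = set()
    for line in lines:
        for token in re.split(r'[\s"<>()]+', line):
            if token.endswith(suffix):
                libs.add(token)
    return sorted(libs)
```

Reading the map file into a string and calling `find_symbol_lines(text, UNDEFINED)` gives the lines to inspect; `libraries_on_lines` then pulls out any archive names on those lines, which are the candidates to add under C6000 Linker -> File Search Path.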
  • Hi Eric,

    Please excuse my late answer and thank you for your hints. I will try your suggestions and hopefully be able to post how I resolved the issue shortly.

    Best wishes,

    Idris

  • No problem, let me know the status!

    Regards, Eric
  • Hi, you need to add the libraries below; then the error is gone.

    -l"C:/ti/linalg_1_2_0_0/packages/ti/linalg/lib/libcblas.ae66" \
    -l"C:/ti/libarch_1_0_0_0/packages/ti/libarch/lib/libarch.ae66" \

    I then loaded the .out file onto my EVM6678, but the console shows:

    "
    [C66xx_0] Hello world from thread = 0
    Number of threads = 1
    L2 SRAM size is 393216, total L2 size is 524288."
    It seems that when the program reaches tiCblasNew(), it goes wrong.
    If you have any good tips, contact me at yufujianwork@163.com
  • Hi Fujian,

    Thanks a lot for your hint. Adding these two libraries to Project Properties -> Build -> C6000 Linker -> File Search Path -> Include Library file made it compile. I currently don't have the EVM with me but I will post my results as soon as I get it back.

    Best wishes,

    Idris

    EDIT: Without having my EVM at hand, it seems to me that you only loaded the program onto one core. I thought the program was supposed to say hello from all cores, C66xx_0 to C66xx_7.

  • Hi Idris Kempf,

    How can I load the .out file onto all the cores?

    I went to "Project -> Debug As -> Debug Configurations -> Program" and set all the cores to load the program.

    But it still doesn't work.

    I need your help.

  • Hi,

    It turns out that I have the same problem: the device seems to hang at tiCblasNew(). I can't recall what I did when I ran the program the first time a few months ago, but the program hangs both for the executable built via the makefile and for the one built via CCS.

    I now rebuilt the libarch and linalg libraries as described in http://processors.wiki.ti.com/index.php/Processor_SDK_Linear_Algebra_Library but the problem persists.

    I also connect/launch the program with the options described in http://downloads.ti.com/mctools/esd/docs/openmp-dsp/building_openmp_app.html

    I connect via my target config file, which executes the setup script for core 0, and load the program onto all cores. Once they all point to _c_int00, I start them and get the output:

    [C66xx_0] Hello World from thread = 0
    Number of threads = 8
    [C66xx_1] Hello World from thread = 1
    [C66xx_2] Hello World from thread = 2
    [C66xx_3] Hello World from thread = 3
    [C66xx_4] Hello World from thread = 4
    [C66xx_5] Hello World from thread = 5
    [C66xx_6] Hello World from thread = 6
    [C66xx_7] Hello World from thread = 7
    L2 SRAM size is 393216, total L2 size is 524288.

    It then hangs at the function tiCblasNew(), and when I pause core 0 I see:

    GOMP_critical_start() at tomp_util.h:150
    bli_init() at bli_init.c:55

    Any hints on that @TI please?

    Thank you very much for your help.

    Best wishes,

    Idris

  • Hi Eric, Idris and I both ran into the same problem: it seems the tiCblasNew() function doesn't work at all. Rebuilding the libraries didn't help either.
    I think the .cmd file may be the key to the problem, but I don't know how to configure it correctly. Can you help us?
  • Hi Yu,

    Try pressing Warm Reset (the middle button next to the Ethernet plug). That helped in my case. However, it then blows up while configuring the memory. Could you give us your output for:
    printf("BLAS memory requirements - vfast size: %d, fast size: %d, medium size: %d, slow size: %d.\n", smem_size_vfast, smem_size_fast, smem_size_med, smem_size_slow);

    Thanks
  • Hi,

    I cannot get to line 162; it blows up at line 159.

    The printf call is in the config_mem_for_ticblas() function. I can't get that far, so nothing is printed.

    Second, I want to ask: the .cmd file is generated automatically, right? Do I have to adjust a .cmd file?

  • Did you already try using only the makefile example? Right-click on the target configuration -> Launch, connect to the cores, then Run -> Load Program.

    I also tried reducing the problem size to 100 and pressing the full reset button next to the Ethernet connection. That way you might get to that print statement.

    Linker.cmd is certainly generated automatically. About linker_fc.cmd, I asked TI. That seems to be done by hand.

    Best wishes,

    Idris

  • Hi friend, can you show me your .cmd file? I want to know whether we have the same one.
  • Hi,

    When I connect all 8 cores, the console prints the following:

    "

    C66xx_0: GEL Output: Setup_Memory_Map...
    C66xx_0: GEL Output: Setup_Memory_Map... Done.
    C66xx_1: GEL Output: Setup_Memory_Map...
    C66xx_1: GEL Output: Setup_Memory_Map... Done.
    C66xx_2: GEL Output: Setup_Memory_Map...
    C66xx_2: GEL Output: Setup_Memory_Map... Done.
    C66xx_3: GEL Output: Setup_Memory_Map...
    C66xx_3: GEL Output: Setup_Memory_Map... Done.
    C66xx_4: GEL Output: Setup_Memory_Map...
    C66xx_4: GEL Output: Setup_Memory_Map... Done.
    C66xx_5: GEL Output: Setup_Memory_Map...
    C66xx_5: GEL Output: Setup_Memory_Map... Done.
    C66xx_6: GEL Output: Setup_Memory_Map...
    C66xx_6: GEL Output: Setup_Memory_Map... Done.
    C66xx_7: GEL Output: Setup_Memory_Map...
    C66xx_7: GEL Output: Setup_Memory_Map... Done.
    C66xx_0: GEL Output:
    Connecting Target...
    C66xx_0: GEL Output: DSP core #0
    C66xx_0: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_0: GEL Output: Global Default Setup...
    C66xx_0: GEL Output: Setup Cache...
    C66xx_0: GEL Output: L1P = 32K
    C66xx_0: GEL Output: L1D = 32K
    C66xx_0: GEL Output: L2 = ALL SRAM
    C66xx_0: GEL Output: Setup Cache... Done.
    C66xx_0: GEL Output: Main PLL (PLL1) Setup ...
    C66xx_0: GEL Output: PLL1 Setup for DSP @ 1000.0 MHz.
    C66xx_0: GEL Output: SYSCLK2 = 333.333344 MHz, SYSCLK5 = 200.0 MHz.
    C66xx_0: GEL Output: SYSCLK8 = 15.625 MHz.
    C66xx_0: GEL Output: PLL1 Setup... Done.
    C66xx_0: GEL Output: Power on all PSC modules and DSP domains...
    C66xx_0: GEL Output: Security Accelerator disabled!
    C66xx_0: GEL Output: Power on all PSC modules and DSP domains... Done.
    C66xx_0: GEL Output: PA PLL (PLL3) Setup ...
    C66xx_0: GEL Output: PA PLL Setup... Done.
    C66xx_0: GEL Output: DDR3 PLL (PLL2) Setup ...
    C66xx_0: GEL Output: DDR3 PLL Setup... Done.
    C66xx_0: GEL Output: DDR begin (1333 auto)
    C66xx_0: GEL Output: XMC Setup ... Done
    C66xx_0: GEL Output:
    DDR3 initialization is complete.
    C66xx_0: GEL Output: DDR done
    C66xx_0: GEL Output: DDR3 memory test... Started
    C66xx_0: GEL Output: DDR3 memory test... Passed
    C66xx_0: GEL Output: PLL and DDR Initialization completed(0) ...
    C66xx_0: GEL Output: configSGMIISerdes Setup... Begin
    C66xx_0: GEL Output:
    SGMII SERDES has been configured.
    C66xx_0: GEL Output: Enabling EDC ...
    C66xx_0: GEL Output: L1P error detection logic is enabled.
    C66xx_0: GEL Output: L2 error detection/correction logic is enabled.
    C66xx_0: GEL Output: MSMC error detection/correction logic is enabled.
    C66xx_0: GEL Output: Enabling EDC ...Done
    C66xx_0: GEL Output: Configuring CPSW ...
    C66xx_0: GEL Output: Configuring CPSW ...Done
    C66xx_0: GEL Output: Global Default Setup... Done.
    C66xx_1: GEL Output:
    Connecting Target...
    C66xx_1: GEL Output: DSP core #1
    C66xx_1: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_1: GEL Output: Global Default Setup...
    C66xx_1: GEL Output: Setup Cache...
    C66xx_1: GEL Output: L1P = 32K
    C66xx_1: GEL Output: L1D = 32K
    C66xx_1: GEL Output: L2 = ALL SRAM
    C66xx_1: GEL Output: Setup Cache... Done.
    C66xx_1: GEL Output: Global Default Setup... Done.
    C66xx_2: GEL Output:
    Connecting Target...
    C66xx_2: GEL Output: DSP core #2
    C66xx_2: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_2: GEL Output: Global Default Setup...
    C66xx_2: GEL Output: Setup Cache...
    C66xx_2: GEL Output: L1P = 32K
    C66xx_2: GEL Output: L1D = 32K
    C66xx_2: GEL Output: L2 = ALL SRAM
    C66xx_2: GEL Output: Setup Cache... Done.
    C66xx_2: GEL Output: Global Default Setup... Done.
    C66xx_3: GEL Output:
    Connecting Target...
    C66xx_3: GEL Output: DSP core #3
    C66xx_3: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_3: GEL Output: Global Default Setup...
    C66xx_3: GEL Output: Setup Cache...
    C66xx_3: GEL Output: L1P = 32K
    C66xx_3: GEL Output: L1D = 32K
    C66xx_3: GEL Output: L2 = ALL SRAM
    C66xx_3: GEL Output: Setup Cache... Done.
    C66xx_3: GEL Output: Global Default Setup... Done.
    C66xx_4: GEL Output:
    Connecting Target...
    C66xx_4: GEL Output: DSP core #4
    C66xx_4: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_4: GEL Output: Global Default Setup...
    C66xx_4: GEL Output: Setup Cache...
    C66xx_4: GEL Output: L1P = 32K
    C66xx_4: GEL Output: L1D = 32K
    C66xx_4: GEL Output: L2 = ALL SRAM
    C66xx_4: GEL Output: Setup Cache... Done.
    C66xx_4: GEL Output: Global Default Setup... Done.
    C66xx_5: GEL Output:
    Connecting Target...
    C66xx_5: GEL Output: DSP core #5
    C66xx_5: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_5: GEL Output: Global Default Setup...
    C66xx_5: GEL Output: Setup Cache...
    C66xx_5: GEL Output: L1P = 32K
    C66xx_5: GEL Output: L1D = 32K
    C66xx_5: GEL Output: L2 = ALL SRAM
    C66xx_5: GEL Output: Setup Cache... Done.
    C66xx_5: GEL Output: Global Default Setup... Done.
    C66xx_6: GEL Output:
    Connecting Target...
    C66xx_6: GEL Output: DSP core #6
    C66xx_6: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_6: GEL Output: Global Default Setup...
    C66xx_6: GEL Output: Setup Cache...
    C66xx_6: GEL Output: L1P = 32K
    C66xx_6: GEL Output: L1D = 32K
    C66xx_6: GEL Output: L2 = ALL SRAM
    C66xx_6: GEL Output: Setup Cache... Done.
    C66xx_6: GEL Output: Global Default Setup... Done.
    C66xx_7: GEL Output:
    Connecting Target...
    C66xx_7: GEL Output: DSP core #7
    C66xx_7: GEL Output: C6678L GEL file Ver is 2.0
    C66xx_7: GEL Output: Global Default Setup...
    C66xx_7: GEL Output: Setup Cache...
    C66xx_7: GEL Output: L1P = 32K
    C66xx_7: GEL Output: L1D = 32K
    C66xx_7: GEL Output: L2 = ALL SRAM
    C66xx_7: GEL Output: Setup Cache... Done.
    C66xx_7: GEL Output: Global Default Setup... Done.
    C66xx_0: GEL Output: Invalidate All Cache...
    C66xx_0: GEL Output: Invalidate All Cache... Done.
    C66xx_0: GEL Output: GEL Reset...
    C66xx_0: GEL Output: GEL Reset... Done.
    C66xx_0: GEL Output: Disable all EDMA3 interrupts and events.
    C66xx_1: GEL Output: Invalidate All Cache...
    C66xx_1: GEL Output: Invalidate All Cache... Done.
    C66xx_1: GEL Output: GEL Reset...
    C66xx_1: GEL Output: GEL Reset... Done.
    C66xx_2: GEL Output: Invalidate All Cache...
    C66xx_2: GEL Output: Invalidate All Cache... Done.
    C66xx_2: GEL Output: GEL Reset...
    C66xx_2: GEL Output: GEL Reset... Done.
    C66xx_3: GEL Output: Invalidate All Cache...
    C66xx_3: GEL Output: Invalidate All Cache... Done.
    C66xx_3: GEL Output: GEL Reset...
    C66xx_3: GEL Output: GEL Reset... Done.
    C66xx_4: GEL Output: Invalidate All Cache...
    C66xx_4: GEL Output: Invalidate All Cache... Done.
    C66xx_4: GEL Output: GEL Reset...
    C66xx_4: GEL Output: GEL Reset... Done.
    C66xx_5: GEL Output: Invalidate All Cache...
    C66xx_5: GEL Output: Invalidate All Cache... Done.
    C66xx_5: GEL Output: GEL Reset...
    C66xx_5: GEL Output: GEL Reset... Done.
    C66xx_6: GEL Output: Invalidate All Cache...
    C66xx_6: GEL Output: Invalidate All Cache... Done.
    C66xx_6: GEL Output: GEL Reset...
    C66xx_6: GEL Output: GEL Reset... Done.
    C66xx_7: GEL Output: Invalidate All Cache...
    C66xx_7: GEL Output: Invalidate All Cache... Done.
    C66xx_7: GEL Output: GEL Reset...
    C66xx_7: GEL Output: GEL Reset... Done.

    Then I load the .out file. When the load is done, core 0 is suspended while cores 1 to 7 are all running. I think that after loading the file all cores should be suspended, and they should not run until I press F8 (Resume).

    I still don't get any output from cores 1 to 7. What I get is still:

    "

    [C66xx_0] Hello World from thread = 0
    Number of threads = 1
    L2 SRAM size is 393216, total L2 size is 524288.

    "

  • I read the page "downloads.ti.com/.../building_openmp_app.html" again and found that I forgot to:
    (1) Enable the --openmp compiler option (under Advanced options ‣ Advanced Optimizations)
    (2) Enable the --priority linker option (under C6000 Linker ‣ File Search Path ‣ Search libraries in priority order)

    After that, I got the output:
    "
    [C66xx_0] Hello World from thread = 0
    [C66xx_1] Hello World from thread = 1
    [C66xx_3] Hello World from thread = 3
    [C66xx_7] Hello World from thread = 7
    [C66xx_0] Number of threads = 8
    [C66xx_6] Hello World from thread = 6
    [C66xx_5] Hello World from thread = 5
    [C66xx_2] Hello World from thread = 2
    [C66xx_4] Hello World from thread = 4
    [C66xx_0] L2 SRAM size is 393216, total L2 size is 524288.

    "
    the same output as yours.
  • Well, that's something already. I did not change the linker_fc.cmd at all; it is the same as the one for the example. Does it still blow up at tiCblasNew()?

  • Yes, it still blows up at tiCblasNew().

    I saw this introduction in "downloads.ti.com/.../configuring_runtime.html":

    Configuring Memory Regions

    The OpenMP runtime needs to know the memory ranges corresponding to the various memory regions described in the platform file. It uses this information to set the appropriate caching attributes for the regions.

    Keystone and Keystone II processors have onchip MSMC memory. Part of this memory is configured as non-cacheable and is used by the OpenMP runtime to store shared state.

    // Pull in memory ranges described in Platform.xdc to configure the runtime
    var ddr3       = Program.cpu.memoryMap["DDR3"];
    var msmc       = Program.cpu.memoryMap["MSMCSRAM"];
    var msmcNcVirt = Program.cpu.memoryMap["OMP_MSMC_NC_VIRT"];
    var msmcNcPhy  = Program.cpu.memoryMap["OMP_MSMC_NC_PHY"];

    My .cfg file, however, has an extra line: var ddr3_nc    = Program.cpu.memoryMap["DDR3_NC"];

    I think the .cfg file may need to be adjusted.

  • I didn't put the linker_fc.cmd in my project,
    because if I include it, the build fails with "#10008-D cannot find file "edma3_lld_rm.ae66"",
    even though I already added the library. How do you handle this problem?

    I forgot to mention: it seems that edma3_lld_rm.ae66 is a package of edmamgr.ae66, ecpy.ae66, edma3Chan.ae66, edma3.ae66, ...

    So I changed the .cmd file as shown below:

    // "edmamgr.ae66" (.fardata)
    // "ecpy.ae66" (.fardata)
    // "edma3Chan.ae66" (.fardata)
    // "edma3.ae66" (.fardata)
    // "rman.ae66" (.fardata)
    // "nullres.ae66" (.fardata)
    // "fcsettings.ae66" (.fardata)*/
    "edma3_lld_rm.ae66" (.fardata)
  • I just added the absolute path to all these libraries in the linker_fc.cmd, e.g. instead of

    "edmamgr.ae66" (.fardata)

    I have

    "/home/ti/framework_components_3_40_02_07/packages/ti/sdo/fc/edmamgr/lib/debug/edmamgr.ae66" (.fardata)

    for all .ae66 files.
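
Editing every entry by hand is error-prone, so the substitution described above could also be scripted. This is a minimal Python sketch under stated assumptions: the function name `absolutize_libs` and the `lib_dirs` mapping are hypothetical, and you would have to fill in the directory for each library from your own installation.

```python
import re


def absolutize_libs(cmd_text, lib_dirs):
    """Replace bare "name.ae66" entries with absolute paths.

    lib_dirs maps a library file name to the directory that holds it.
    Entries that already contain a path, or are not in lib_dirs, are kept.
    """
    def repl(match):
        name = match.group(1)
        directory = lib_dirs.get(name)
        if directory is None:
            return match.group(0)  # unknown library: leave untouched
        return '"' + directory.rstrip("/") + "/" + name + '"'

    # Only quoted names without path separators or whitespace are matched.
    return re.sub(r'"([^"/\\\s]+\.ae66)"', repl, cmd_text)
```

Running it a second time on its own output changes nothing, because entries that already contain a `/` no longer match the pattern.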
  • Thank you for your help. I have finally run the example successfully.

    The last issue I faced is that I have to use the original versions of the libraries (libcblas.ae66, libarch.ae66) to build the project.

    If I rebuild the lib using the method in processors.wiki.ti.com/.../Processor_SDK_Linear_Algebra_Library

  • Hi Yu,

    I had exactly the same problem. If I use the library I built myself, it complains that BLAS does not fit in memory. Is it the same for you? I was lucky enough to have saved the libraries TI delivers.

    What was the problem in the end/what fixed your problem?

    Best,

    Idris