Compiler/AM5708: Rebuilding OpenCL

Part Number: AM5708

Tool/software: TI C/C++ Compiler

Hi,

I am using a custom AM5708 board with ti-processor-sdk-linux-am57xx-evm-04.03.00.05.

I have an edge detection application that depends on OpenCL components. When I use the dra7-dsp1-fw.xe66.opencl-monitor and dra7-dsp2-fw.xe66.opencl-monitor firmware that comes with the SDK, the application runs fine on a board with 1GB of DDR3.

However, our custom board is designed with 512MB of DDR3, so the OpenCL resource table has to be modified (CMEM address and size, 0xA0000000 -> 0x90000000). After I rebuilt https://git.ti.com/cgit/opencl/ti-opencl/ at tag v01.01.14.10, the application no longer crashes due to illegal access, but it cannot perform edge detection.

We use OpenEmbedded to build the OpenCL binaries. The same problem appears even when I have not changed any source code.

Can you tell me the likely cause of this problem, or how the OpenCL firmware in the SDK is compiled and generated?

Thanks.

  • Hello CY,

    1) Please try rebuilding OpenCL to run on the TI board with no modifications to make sure that your build process is not causing issues.

    2) Please provide more information on "this application is not crash due to illegal access, but it can't not do edge detection". Is there terminal output that would be useful for understanding the problem?

    Regards,

    Nick

  • Hi CY,

    You mentioned that you patched the CMEM address and size in the resource table and device tree, moving from 0xA000_0000 to 0x9000_0000. Please make sure you also update the OpenCL host side (see the enclosed diff).

    - Yuan

    diff --git a/host/src/core/dsp/dspmem.h b/host/src/core/dsp/dspmem.h
    index a148750..20b2fb9 100644
    --- a/host/src/core/dsp/dspmem.h
    +++ b/host/src/core/dsp/dspmem.h
    @@ -67,7 +67,7 @@
      * resource table defined in monitor_vayu/custom_rsc_table_vayu_dsp.h
      * The values of the defines here must match definitions in the resource table.
      ***************************************************************************/
    -#define AM57_DSP_PHY_ADDR       (0xA0000000)
    +#define AM57_DSP_PHY_ADDR       (0x90000000)
     #define AM57_DSP_VIRT_ADDR      (0x80000000)
     #define AM57_DSP_V2P_OFFSET     (AM57_DSP_PHY_ADDR - AM57_DSP_VIRT_ADDR)
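
For reference, the effect of this one-line change on the derived offset can be worked out directly. This is a minimal Python sketch, not code from ti-opencl: it only mirrors the arithmetic of the three macros in the diff, and the helper name `v2p_offset` is made up for illustration.

```python
# Mirrors the macros from host/src/core/dsp/dspmem.h shown in the diff.
AM57_DSP_VIRT_ADDR = 0x80000000


def v2p_offset(phy_addr, virt_addr=AM57_DSP_VIRT_ADDR):
    """Hypothetical helper: same arithmetic as AM57_DSP_V2P_OFFSET."""
    return phy_addr - virt_addr


old_offset = v2p_offset(0xA0000000)  # AM57_DSP_PHY_ADDR before the change
new_offset = v2p_offset(0x90000000)  # AM57_DSP_PHY_ADDR after the change

print(f"old V2P offset: 0x{old_offset:08X}")
print(f"new V2P offset: 0x{new_offset:08X}")
```

If the host side keeps the old physical address while the resource table uses the new one, the two sides disagree by the difference between these offsets.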
     

  • Hi Yuan,

    I checked out the v01.01.14.10 tag, but dspmem.h there does not contain the code shown in your diff. This is what it has instead:

    #define DSP_36BIT_ADDR          0x800000000ULL
    #define MPAX_USER_MAPPED_DSP_ADDR   0x840000000ULL
    #define ALL_PERSISTENT_MAX_DSP_ADDR 0x880000000ULL
    
    /*****************************************************************************
     * AM57 - DSP Device Memory Physical Addreess
     * 0x0:A000_0000 - 0x0:A00F_FFFF: 16MB shared heap
     * 0x0:A010_0000 - 0x0:A01F_FFFF: 16MB DDR no cache
     * 0x0:A020_0000 - 0x0:B020_0000: 256MB - 32MB General Purpose CMEM memory
     *
     * The first 32MB of CMEM are reserved for the monitor.
     *****************************************************************************/
    #define RESERVED_CMEM_SIZE      (0x02000000)
    
    #define ROUNDUP(val, pow2)   (((val) + (pow2) - 1) & ~((pow2) - 1))
    #define MIN_BLOCK_SIZE                 128
    #define MIN_CMEM_ONDEMAND_BLOCK_SIZE  4096
    #define MIN_CMEM_MAP_ALIGN            4096
    #define MAX_CMEM_MAP_ALIGN            (256*1024*1024)
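
To make the constants above easier to read, here is a small Python sketch of the same arithmetic. It is only an illustration: the function `roundup` copies the expression of the `ROUNDUP` macro, and the power-of-two check is an addition that the macro itself does not perform.

```python
# Values copied from the dspmem.h excerpt above.
RESERVED_CMEM_SIZE = 0x02000000
MIN_BLOCK_SIZE = 128
MIN_CMEM_MAP_ALIGN = 4096


def roundup(val, pow2):
    """Same expression as the ROUNDUP macro; pow2 must be a power of two."""
    # Extra guard, not present in the C macro.
    if pow2 <= 0 or pow2 & (pow2 - 1):
        raise ValueError("pow2 must be a positive power of two")
    return (val + pow2 - 1) & ~(pow2 - 1)


reserved_mb = RESERVED_CMEM_SIZE // (1024 * 1024)
print(reserved_mb, "MB reserved for the monitor")
print(roundup(1, MIN_BLOCK_SIZE))          # smallest block
print(roundup(5000, MIN_CMEM_MAP_ALIGN))   # rounded up to the next 4KB boundary
print(roundup(4096, MIN_CMEM_MAP_ALIGN))   # already aligned, unchanged
```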

  • Hi Nick,

    Sorry for the delayed reply.

    Q1: I don't have a TI board at the moment, but I think our custom board is fine. When we use the firmware from opencl-monitor/1.1.14.10-r0.0/files, the application works.

    Q2: Sorry, my description was not clear.

    On a 512MB board, the physical address range is 0x80000000-0x9FFFFFFF, but the CMEM address is set to 0xA0000000. Running the program then causes an illegal access error. This problem does not occur on a 1GB board.
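
The address arithmetic behind this can be checked with a few lines of Python. This is only a sketch: the helper names are made up, and the 32MB region size is an assumption taken from `RESERVED_CMEM_SIZE` in the dspmem.h excerpt, not the full CMEM size used on the board.

```python
MB = 1024 * 1024
DDR_BASE = 0x80000000
ASSUMED_REGION_SIZE = 32 * MB  # assumption: only the monitor's reserved 32MB


def ddr_range(base, size):
    """Return (first, last) physical address of a DDR bank."""
    return base, base + size - 1


def region_fits(region_base, region_size, ddr_base, ddr_size):
    """True if [region_base, region_base + region_size) lies inside DDR."""
    ddr_end = ddr_base + ddr_size
    return ddr_base <= region_base and region_base + region_size <= ddr_end


first, last = ddr_range(DDR_BASE, 512 * MB)
print(f"512MB DDR: 0x{first:08X}-0x{last:08X}")

for ddr_mb in (512, 1024):
    for cmem_base in (0xA0000000, 0x90000000):
        ok = region_fits(cmem_base, ASSUMED_REGION_SIZE, DDR_BASE, ddr_mb * MB)
        print(f"{ddr_mb}MB board, base 0x{cmem_base:08X}: fits={ok}")
```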

    I am currently using a 1GB board. I checked out 1.1.14.10-r0.0 and rebuilt ti-opencl without changing any code. With that build the edge detection program behaves abnormally: as the second picture above shows, only some of the detected points appear.

    I think the main problem at present is in the build. Can you share the build environment and build commands for reference?

    Thanks.

  • Hi CY,

        You are right.  The v01.01.14.10 code has changed from the old diff.  You do need to make a modification to the platform definition:

    diff --git a/monitor/platforms/am57x/Platform.xdc b/monitor/platforms/am57x/Platform.xdc
    index 2e47673..9abd0f6 100644
    --- a/monitor/platforms/am57x/Platform.xdc
    +++ b/monitor/platforms/am57x/Platform.xdc
    @@ -57,7 +57,7 @@ config ti.platforms.generic.Platform.Instance CPU =
     
           /* Non-cached DDR */
           [ "DDR3_NC",   { name: "DDR3_NC",
    -                          base: 0xA0000000,
    +                          base: 0x90000000,
                               len:  0x01000000,
                               space: "code/data",
                               access: "RWX", } ],
    @@ -66,13 +66,13 @@ config ti.platforms.generic.Platform.Instance CPU =
     
           /* Stack for ocl_service_omp task - 0x10000 for each core */
           [ "DDR3_STACK", { name: "DDR3_STACK",
    -                          base: 0xA1000000,
    +                          base: 0x91000000,
                               len:  0x00020000,
                               space: "data",
                               access: "RWX", } ],
     
           [ "DDR3_HEAP", { name: "DDR3_HEAP",
    -                          base: 0xA1020000,
    +                          base: 0x91020000,
                               len:  0x00FE0000,
                               space: "code/data",
                               access: "RWX", } ],
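
As a sanity check on the relocated addresses, the three regions from this diff can be tested for overlap and for fitting inside a 512MB DDR starting at 0x80000000. This is a hedged Python sketch, not part of the OpenCL build: `check_layout` is a made-up helper and the region list is copied by hand from the diff.

```python
MB = 1024 * 1024

# (name, base, len) copied from the patched Platform.xdc entries above.
REGIONS = [
    ("DDR3_NC", 0x90000000, 0x01000000),
    ("DDR3_STACK", 0x91000000, 0x00020000),
    ("DDR3_HEAP", 0x91020000, 0x00FE0000),
]


def check_layout(regions, ddr_base, ddr_size):
    """Return a list of problems: regions outside DDR or overlapping."""
    problems = []
    ddr_end = ddr_base + ddr_size
    ordered = sorted(regions, key=lambda r: r[1])
    for name, base, length in ordered:
        if base < ddr_base or base + length > ddr_end:
            problems.append(f"{name} is outside DDR")
    for (n1, b1, l1), (n2, b2, _) in zip(ordered, ordered[1:]):
        if b1 + l1 > b2:
            problems.append(f"{n1} overlaps {n2}")
    return problems


total = sum(length for _, _, length in REGIONS)
print(f"total monitor memory: {total // MB} MB")
print("problems:", check_layout(REGIONS, 0x80000000, 512 * MB))
```

The three lengths add up to 0x02000000, which matches `RESERVED_CMEM_SIZE` in dspmem.h. Running the same check with the original 0xA... bases reports all three regions as outside a 512MB DDR.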
    

        The official Processor SDK is built with Yocto: http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Overview_Building_the_SDK.html

        Nick has some good suggestions:

        1) Build a default OpenCL, with nothing changed, see if it works on your 1GB board,

        2) Modify OpenCL source for 512MB memory, build OpenCL, see if it works on your 1GB board,

        3) Take OpenCL build from 2), test it on your 512MB board.

        Some additional comments:

        4) When testing, re-compile and test some simple OpenCL examples on the EVM filesystem first, e.g. null, vecadd

    - Yuan

  • Hi Yuan,

    Thanks for your reply.

    In fact, I had already modified the platform files and custom_rsc_table_vayu_dsp.h as you explained. The application still fails on both the 1GB board and the 512MB board.

    Yes, null, vecadd and the other opencl-examples-1.1.14.10 examples run fine.

    There is one strange phenomenon: with the same code, if I clean and rebuild several times, the application sometimes runs normally, but not reliably.

  • Hi CY,

        Let's confirm what's working and what's not working.

        What's working:

        - You modified the platform file and resource table, rebuilt OpenCL, and deployed it to both the 1GB and 512MB custom boards,

        - All OpenCL examples (/usr/share/ti/examples/opencl/) work after recompiling.

            - cd /usr/share/ti/examples/opencl; make test EXAMPLE_SET=EXCLUDE_PERSISTENT

        - If this is the case, then you do have a working OpenCL on your 512MB custom board

        What's not working:

        - Your edge detection application didn't work as expected with the newly built OpenCL on the 512MB board

        Debugging:

        - Where are you building the edge detection app?  Are you building it on the EVM?  Are you cross compiling it?

        - If cross compiling, make sure the "/usr/share/ti/opencl/" directory deployed on the EVM from the newly built OpenCL is copied to your Linux host machine, and set TI_OCL_INSTALL accordingly.

        - To avoid confusion, maybe try a clean build of your edge detection app on the EVM

        - Add a few "printf"s in your edge detection kernel program, see if the kernel is actually invoked on the DSP

    - Yuan
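
A quick way to check the cross-compile environment mentioned above is a short script on the Linux host. This is a sketch under stated assumptions: `check_ocl_install` is a made-up helper, and the default `subdir` assumes that TI_OCL_INSTALL points at a root directory containing usr/share/ti/opencl, which should be verified against the Makefiles actually in use.

```python
import os


def check_ocl_install(environ, subdir="usr/share/ti/opencl"):
    """Return a list of problems with the TI_OCL_INSTALL setting.

    Assumption: the OpenCL files copied from the EVM live under
    <TI_OCL_INSTALL>/<subdir>. Adjust subdir if your layout differs.
    """
    root = environ.get("TI_OCL_INSTALL")
    if not root:
        return ["TI_OCL_INSTALL is not set"]
    if not os.path.isdir(root):
        return [f"TI_OCL_INSTALL does not exist: {root}"]
    if not os.path.isdir(os.path.join(root, subdir)):
        return [f"missing {subdir} under {root}"]
    return []


if __name__ == "__main__":
    problems = check_ocl_install(os.environ)
    for line in problems or ["TI_OCL_INSTALL points at an existing OpenCL tree"]:
        print(line)
```

The script only confirms that the directory exists. It cannot tell whether the files there come from the newly built OpenCL or from the original SDK.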

  • Hi Yuan,

    Sorry for the delay.

    I have tested the OpenCL examples according to the instructions, and they seem to run normally. The detailed log is saved in log.txt.

    Software info: ti-opencl v01.01.14.10; Makefile modified as in env.patch; build command: make BUILD_AM57=1 MAKECMDGOALS=arm -j 8.

    I currently do not have a 512MB board, so I am using a 1GB board with unmodified OpenCL source code. This should reduce the number of things to troubleshoot.

    log.txt:

    root@am57xx-evm:/usr/share/ti/examples/opencl# ./run.sh
    ===============  platforms  =================
    Compiling main.cpp
    Running platforms
    [  145.157933] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  145.163833] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [  145.171626] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [  145.185079] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  145.190968] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [  145.196978] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    ===============  ccode  =================
    Compiling main.cpp
    Compiling ccode.c
    Running ccode
    ===============  offline  =================
    Compiling main.cpp
    Compiling vadd.cl
    Running offline
    ===============  dgemm  =================
    Compiling main.cpp
    Compiling cblas_dgemm.cpp
    [  172.086554] omap_hwmod: mmu1_dsp1: _wait_target_disable failed
    [  172.086585] omap_hwmod: mmu1_dsp2: _wait_target_disable failed
    [  172.093129] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  172.110704] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    Compiling init.cpp
    ar: creating libcblas_dgemm_dsp.a
    Running dgemm
    [  189.258238] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  189.264133] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [  189.270083] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [  189.283526] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  189.289410] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [  189.295411] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    ===============  simple  =================
    Compiling simple.cpp
    Running simple
    ===============  float_compute  =================
    Compiling main.cpp
    Compiling dsp_compute.cl
    Running float_compute
    ===============  matmpy  =================
    Compiling main.cpp
    Compiling ccode.c
    Compiling kernel.cl
    Running matmpy
    ===============  sgemm  =================
    Compiling sgemm.c
    Compiling sgemm_kernel.c
    [  221.286627] omap_hwmod: mmu1_dsp2: _wait_target_disable failed
    [  221.299159] omap_hwmod: mmu1_dsp1: _wait_target_disable failed
    [  221.311675] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  221.324148] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    Compiling data_move.c
    Compiling main.cpp
    Running sgemm
    [  230.478898] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  230.484801] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [  230.490894] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [  230.504280] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  230.510171] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [  230.516094] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    ===============  dsplib_fft  =================
    Compiling fft_ocl.cpp
    Running dsplib_fft
    ===============  vecadd_openmp  =================
    Compiling main.cpp
    Compiling vadd_openmp.c
    Running vecadd_openmp
    ===============  buffer  =================
    Compiling main.cpp
    Running buffer
    ===============  vecadd  =================
    Compiling main.cpp
    Compiling main_md.cpp
    Running vecadd
    ===============  abort_exit  =================
    Compiling kernel.cl
    Compiling main.cpp
    Running abort_exit
    ===============  vecadd_openmp_t  =================
    Compiling main.cpp
    Compiling vadd_openmp.c
    Running vecadd_openmp_t
    ===============  edmamgr  =================
    Compiling kernel.cl
    Compiling main.cpp
    Running edmamgr
    ===============  null  =================
    Compiling main.cpp
    Running null
    ===============  ooo_callback  =================
    Compiling ooo_callback.cpp
    Running ooo_callback
    ===============  timeout  =================
    Compiling kernel.cl
    Compiling main.cpp
    Running timeout
    ===============  offline_embed  =================
    Compiling vadd.cl
    Compiling main.cpp
    Running offline_embed
    ===============  monte_carlo  =================
    Compiling dsp_ccode.c
    [  305.119603] EXT4-fs (mmcblk0p2): error count since last fsck: 3
    [  305.125563] EXT4-fs (mmcblk0p2): initial error at time 946684809: ext4_mb_generate_buddy:758
    [  305.134057] EXT4-fs (mmcblk0p2): last error at time 1578215643: ext4_mb_generate_buddy:758
    [  307.136603] omap_hwmod: mmu1_dsp2: _wait_target_disable failed
    [  307.150067] omap_hwmod: mmu1_dsp1: _wait_target_disable failed
    [  307.162987] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  307.175889] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  310.239605] EXT4-fs (sda2): error count since last fsck: 1501
    [  310.245395] EXT4-fs (sda2): initial error at time 1565249617: ext4_iget:4697: inode 193
    [  310.253449] EXT4-fs (sda2): last error at time 1565248313: ext4_lookup:1611: inode 1221
    Compiling dsp_kernels.cl
    Compiling cpu_main.cpp
    Running monte_carlo
    [  359.638776] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    [  359.644670] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
    [  359.651948] omap-iommu 41502000.mmu: 41502000.mmu: version 3.0
    [  359.665095] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  359.670976] omap-iommu 40d01000.mmu: 40d01000.mmu: version 3.0
    [  359.678578] omap-iommu 40d02000.mmu: 40d02000.mmu: version 3.0
    root@am57xx-evm:/usr/share/ti/examples/opencl# [  370.407055] omap_hwmod: mmu1_dsp1: _wait_target_disable failed
    [  370.419965] omap_hwmod: mmu0_dsp1: _wait_target_disable failed
    [  370.433652] omap_hwmod: mmu1_dsp2: _wait_target_disable failed
    [  370.447293] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
    
    root@am57xx-evm:/usr/share/ti/examples/opencl#
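
Since the log has no explicit pass/fail lines, a short script can at least list which examples were started and count the kernel messages by source. This is a Python sketch with made-up names (`summarize`, `SAMPLE`) and it makes no judgement about which messages matter.

```python
import re

# "=====  name  =====" banners printed before each example.
SECTION_RE = re.compile(r"^=+\s+(\S+)\s+=+$")
# Kernel lines such as "[  145.157933] omap_hwmod: ..."; may follow a prompt.
KERNEL_RE = re.compile(r"\[\s*\d+\.\d+\]\s+(\S+)")


def summarize(log_text):
    """Return (example names in order, {kernel message source: count})."""
    sections = []
    kernel_counts = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        banner = SECTION_RE.match(line)
        if banner:
            sections.append(banner.group(1))
            continue
        kernel = KERNEL_RE.search(line)
        if kernel:
            source = kernel.group(1).rstrip(":")
            kernel_counts[source] = kernel_counts.get(source, 0) + 1
    return sections, kernel_counts


# A few lines copied from the log above, for illustration.
SAMPLE = """\
===============  platforms  =================
Compiling main.cpp
Running platforms
[  145.157933] omap_hwmod: mmu0_dsp2: _wait_target_disable failed
[  145.163833] omap-iommu 41501000.mmu: 41501000.mmu: version 3.0
===============  null  =================
Running null
[  305.119603] EXT4-fs (mmcblk0p2): error count since last fsck: 3
"""

sections, kernel_counts = summarize(SAMPLE)
print("examples:", sections)
print("kernel messages:", kernel_counts)
```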

    env.patch:

    diff --git a/host/Makefile.inc b/host/Makefile.inc
    index addf8cc..7d60b5a 100644
    --- a/host/Makefile.inc
    +++ b/host/Makefile.inc
    @@ -19,31 +19,31 @@ OCL_VERSIONED_NAME=$(OCL_DPKG_NAME)_$(OCL_VER)
     
     OCL_FULL_VER=$(OCL_MAJOR_VER).$(OCL_MINOR_VER).$(OCL_RELEASE_VER).$(OCL_PATCH_VER)
     
    -SDOMC_SHARED?=/cgnas
    +SDOMC_SHARED?=/home/cy/sdk/am57x/ti-processor-sdk-rtos-am57xx-evm-04.03.00.05
     
     # XDC (not included in CoreSDK)
    -XDC_DIR?=$(SDOMC_SHARED)/xdctools_3_32_01_22_core
    +XDC_DIR?=$(SDOMC_SHARED)/xdctools_3_50_03_33_core
     
     # TI C6x CGT
     
     ifeq (,$(findstring arm, $(UNAME_M)))
    -    TI_OCL_CGT_INSTALL?=$(SDOMC_SHARED)/ti-cgt-c6000-8.2.0-release-linux
    +    TI_OCL_CGT_INSTALL?=$(SDOMC_SHARED)/ti-cgt-c6000_8.2.2
     else
    -    TI_OCL_CGT_INSTALL?=$(SDOMC_SHARED)/ti-cgt-c6000-8.2.0-release-armlinuxa8hf
    +    TI_OCL_CGT_INSTALL?=$(SDOMC_SHARED)/ti-cgt-c6000_8.2.2
     endif
     
     # LLVM
     
     ifeq ($(BUILD_OS), SYS_BIOS)
    -    ARM_LLVM_DIR=$(SDOMC_SHARED)/llvm-3.6.0-20170608-sysbios
    +    ARM_LLVM_DIR=/home/cy/git-store/am57x/ti-llvm/install.arm
     else
    -    ARM_LLVM_DIR=$(SDOMC_SHARED)/llvm-3.6.0-20170608-arm
    +    ARM_LLVM_DIR=/home/cy/git-store/am57x/ti-llvm/install.arm
     endif
     
     ifeq (,$(findstring x86_64, $(UNAME_M)))
    -    X86_LLVM_DIR?=$(SDOMC_SHARED)/llvm-3.6.0-20170608-x86
    +    X86_LLVM_DIR?=/home/cy/git-store/am57x/ti-llvm/install.x86
     else
    -    X86_LLVM_DIR?=$(SDOMC_SHARED)/llvm-3.6.0-20170608-x86_64
    +    X86_LLVM_DIR?=/home/cy/git-store/am57x/ti-llvm/install.x86
     endif
     
     CORESDK_VERSION ?= ti2017.04-rc2
    @@ -57,7 +57,7 @@ else ifeq ($(BUILD_K2E),1)
     else ifeq ($(BUILD_K2G),1)
     	CORE_SDK?=$(SDOMC_SHARED)/ti-processor-sdk-linux-k2g-evm-$(CORESDK_VERSION)
     else ifeq ($(BUILD_AM57),1)
    -	CORE_SDK?=$(SDOMC_SHARED)/ti-processor-sdk-linux-am57xx-evm-$(CORESDK_VERSION)
    +	CORE_SDK?=/home/cy/sdk/am57x/ti-processor-sdk-linux-am57xx-evm-04.03.00.05
     else
     endif
     
    @@ -69,12 +69,12 @@ endif
     
     # ARM GCC
     ifeq (,$(findstring arm, $(UNAME_M)))
    -    ARM_GCC_DIR?=$(SDOMC_SHARED)/gcc-linaro-6.2.1-2016.11-x86_64_arm-linux-gnueabihf
    -    GCC_ARM_NONE_TOOLCHAIN?=$(SDOMC_SHARED)/gcc-arm-none-eabi-4_9-2015q3
    +    ARM_GCC_DIR?=$(SDOMC_SHARED)/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf
    +    GCC_ARM_NONE_TOOLCHAIN?=$(SDOMC_SHARED)/gcc-arm-none-eabi-6-2017-q1-update
     endif
     
     # RTOS packages (BIOS, IPC, FC, EDMA3_LLD etc.)
    -RTOS_INSTALL_DIR?=$(LINUX_DEVKIT_ROOT)/usr/share/ti
    +RTOS_INSTALL_DIR?=/home/cy/sdk/am57x/ti-processor-sdk-linux-am57xx-evm-04.03.00.05/linux-devkit/sysroots/armv7ahf-neon-linux-gnueabi/usr/share/ti
     
     # OMP 
     OMP_DIR?=$(RTOS_INSTALL_DIR)/ti-omp-tree
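
To see at a glance which tool locations and versions this patch changes relative to the Makefile defaults, the removed and added assignments can be paired by variable name. This is a Python sketch with made-up names (`changed_variables`, `SAMPLE`); it expects the raw diff without the forum indentation, and it does not claim that any of these differences causes the problem.

```python
import re

# Matches "-NAME?=value" / "+NAME=value" lines, with optional indentation
# after the diff marker.
ASSIGN_RE = re.compile(r"^([-+])\s*([A-Za-z_][A-Za-z0-9_]*)\s*\??=\s*(.*)$")


def changed_variables(diff_text):
    """Return [(name, old_value, new_value)] for assignments that changed."""
    removed, added = {}, {}
    for line in diff_text.splitlines():
        if line.startswith(("---", "+++")):
            continue  # file header lines, not assignments
        m = ASSIGN_RE.match(line)
        if not m:
            continue
        sign, name, value = m.groups()
        target = removed if sign == "-" else added
        target.setdefault(name, []).append(value.strip())
    changes = []
    for name, old_values in removed.items():
        # Pair the n-th removal of a variable with its n-th addition.
        for old, new in zip(old_values, added.get(name, [])):
            if old != new:
                changes.append((name, old, new))
    return changes


# Two hunks copied from env.patch, de-indented.
SAMPLE = """\
--- a/host/Makefile.inc
+++ b/host/Makefile.inc
-XDC_DIR?=$(SDOMC_SHARED)/xdctools_3_32_01_22_core
+XDC_DIR?=$(SDOMC_SHARED)/xdctools_3_50_03_33_core
-    ARM_GCC_DIR?=$(SDOMC_SHARED)/gcc-linaro-6.2.1-2016.11-x86_64_arm-linux-gnueabihf
+    ARM_GCC_DIR?=$(SDOMC_SHARED)/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf
"""

changes = changed_variables(SAMPLE)
for name, old, new in changes:
    print(f"{name}: {old} -> {new}")
```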
    

  • Hi Yuan,

    I am cross compiling the edge detection app, but I will try building it on the EVM and let you know the result.

    Thanks.