Linux/AM5728: OpenCV DSP build error

Part Number: AM5728

Tool/software: Linux

#include <unistd.h>
#include <time.h>
#include <stdio.h>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

#ifndef _OCL_HPP_
#include "opencv2/core/ocl.hpp"
#endif

using namespace cv;
using namespace std;


int main(){
    VideoCapture cap;
    cap.open("1.avi");

    if(!cap.isOpened())
    {
        cout << "colud not load vodeo...."<<endl;
        return -1;
    }

    ocl::setUseOpenCL(true);
    Mat matSrc;
    while(cap.read(matSrc))
    {
        // Note: VideoCapture delivers BGR frames, so COLOR_BGR2GRAY would give the correct luma weights
        cvtColor( matSrc , matSrc , CV_RGB2GRAY );

        Mat padded;
        // Smallest sizes >= rows/cols that factor into 2s, 3s and 5s (efficient for the DFT)
        int m = getOptimalDFTSize(matSrc.rows);
        int n = getOptimalDFTSize(matSrc.cols);
        // Zero-pad on the bottom and right; the result is stored in padded
        copyMakeBorder(matSrc, padded, 0, m-matSrc.rows, 0, n-matSrc.cols, BORDER_CONSTANT, Scalar::all(0));

        // Planes holding the real and imaginary parts; the imaginary part is initialized to 0
        Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F) };

        // UMat destination so that cv::dft can take the OpenCL path
        UMat complexI;
        merge(planes, 2, complexI);
        cv::dft(complexI, complexI);
    }
    ocl::setUseOpenCL(false);

    return 0;
}
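
A side note on the padding step in the code above: getOptimalDFTSize does not round up to a power of two. According to the OpenCV documentation it returns the smallest size not less than its argument that is a product of 2s, 3s and 5s. The following is a minimal standalone sketch of that rule (no OpenCV required). The helpers isSmooth235 and nextSmoothSize are hypothetical names written for illustration only; they are not OpenCV's implementation.

```cpp
// Hypothetical helper (not part of OpenCV): true if n has no prime
// factors other than 2, 3 and 5.
static bool isSmooth235(int n) {
    if (n < 1) return false;
    const int primes[] = {2, 3, 5};
    for (int p : primes) {
        while (n % p == 0) n /= p;   // strip every factor of p
    }
    return n == 1;                   // nothing else may remain
}

// Hypothetical helper: smallest size >= n that is a product of 2s, 3s and 5s.
// A plain linear search is enough for typical image dimensions.
static int nextSmoothSize(int n) {
    if (n < 1) return 1;
    while (!isSmooth235(n)) ++n;
    return n;
}
```

For example, nextSmoothSize(480) stays at 480 (2^5 * 3 * 5), while nextSmoothSize(481) gives 486 (2 * 3^5), so a 481-pixel dimension would be padded by 5 pixels rather than up to 512.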




We are trying to use the DSP-accelerated cv::dft().

It compiles the OpenCL kernel for about 5 minutes and then returns a compile error:

>>> Optimizer terminated abnormally
>>>> in function ifft_multi_radix_rows()

>> Compilation failure
OpenCL program build log: -D LOCAL_SIZE=50 -D kercn=10 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix2_B5(smem,twiddles+0,ind,1,25);fft_radix5_B2(smem,twiddles+1,ind,2,1T

>> Compilation failure

It succeeds on ARM. My SDK version is 4.2.

Please help us; we need the DSP to accelerate dft().

  • Hi,

    It succeeds on ARM. My SDK version is 4.2.

    I am confused here. Are you saying that it compiles successfully with SDK 4.2, or that it compiles successfully natively on the ARM platform but fails when cross-compiled?

    Rex

  • It compiles the OpenCL kernel for about 5 minutes and then returns a compile error:

    >>> Optimizer terminated abnormally
    >>>> in function ifft_multi_radix_rows()

    >> Compilation failure
    OpenCL program build log: -D LOCAL_SIZE=50 -D kercn=10 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix2_B5(smem,twiddles+0,ind,1,25);fft_radix5_B2(smem,twiddles+1,ind,2,1T

    >> Compilation failure

    It succeeds on ARM. My SDK version is 4.2.

    Please help us; we need the DSP to accelerate dft().
  • So you are saying it works if you compile the application natively on the EVM, but fails when you cross-compile it on a Linux host PC?
  • Hi,

    I just checked with our OpenCV expert. In OpenCV, the OpenCL kernels are always compiled natively on the ARM. OpenCV triggers OpenCL online compilation only when a specific kernel is dispatched for the first time. That's why you see a long execution time initially (the ARM needs to compile the kernels).

    Rex
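
The behaviour described in the post above (a kernel is compiled the first time it is dispatched and reused afterwards) can be pictured with the following toy model. It is not TI's or OpenCV's code: KernelCache, dispatch and the kernel names are hypothetical, and the "compile" step is only a counter increment standing in for the slow on-target compilation.

```cpp
#include <map>
#include <string>

// Toy model of compile-on-first-dispatch (hypothetical, for illustration only).
class KernelCache {
public:
    // Returns true if this dispatch had to "compile" the kernel first.
    bool dispatch(const std::string& kernelName) {
        auto it = dispatches_.find(kernelName);
        if (it != dispatches_.end()) {
            ++it->second;            // already built: just count the reuse
            return false;
        }
        ++compileCount_;             // stand-in for the slow online compilation
        dispatches_[kernelName] = 1;
        return true;
    }

    // Number of distinct kernels that have been "compiled" so far.
    int compileCount() const { return compileCount_; }

private:
    std::map<std::string, int> dispatches_;  // kernel name -> number of dispatches
    int compileCount_ = 0;
};
```

In this model, dispatching the same kernel name repeatedly pays the compile cost once, and only a kernel name that has not been seen before triggers another compile.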


  • Hi,

    I tried cross-compiling your code on my Linux host and was able to build it without any errors. My build command is shown below. I didn't check which libraries are actually needed and simply linked everything.

    ~/work/issues/opencv-build-error$ arm-linux-gnueabihf-g++ -I~/work/ti-processor-sdk-linux-am57xx-evm-05.00.00.15/linux-devkit/sysroots/armv7ahf-neon-linux-gnueabi/usr/include/opencv -I~/work/ti-processor-sdk-linux-am57xx-evm-05.00.00.15/linux-devkit/sysroots/armv7ahf-neon-linux-gnueabi/usr/include/opencv2 -L~/work/ti-processor-sdk-linux-am57xx-evm-05.00.00.15/linux-devkit/sysroots/armv7ahf-neon-linux-gnueabi/usr/lib -g -o test test.cpp -lrt -lopencv_core -lopencv_imgproc -lopencv_video -lopencv_videoio -lopencv_features2d -lopencv_imgcodecs
    a0850461@udb0850461:~/work/issues/opencv-build-error$ ll
    total 268
    drwxr-xr-x 2 user cleartnp 4096 Oct 17 15:43 ./
    drwxr-xr-x 15 user cleartnp 4096 Oct 17 14:18 ../
    -rwxr-xr-x 1 user cleartnp 234564 Oct 17 15:43 test*
    -rw-r--r-- 1 user cleartnp 1066 Oct 17 13:45 test.cpp

    I don't see any issue compiling on the host.
  • The error occurs when compiling the DSP kernels!

    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    ./test

    Run "test" on your ARM Linux system. It compiles the OpenCL kernels for about 5 to 10 minutes and then returns a compile error:

    >>> Optimizer terminated abnormally
    >>>> in function ifft_multi_radix_rows()

    >> Compilation failure
    OpenCL program build log: -D LOCAL_SIZE=50 -D kercn=10 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix2_B5(smem,twiddles+0,ind,1,25);fft_radix5_B2(smem,twiddles+1,ind,2,1T

    >> Compilation failure
  • I don't see the OpenCL compilation error. 

    root@am57xx-evm:~/opencv_dsp_build_error# cat run.sh
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on, canny"
    ./test
    export OPENCV_OPENCL_DEVICE='disabled'
    echo "OpenCL off, canny"
    ./test

    root@am57xx-evm:~/opencv_dsp_build_error# ./run.sh
    OpenCL on, canny
    OpenCL off, canny
    root@am57xx-evm:~/opencv_dsp_build_error#

    Did you run the script on the Linux host?

    Rex

  • cat run.sh
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on!"
    ./test

    Run "test" on the Linux host. The code processes "1.avi" in a loop (while(cap.read(matSrc))).

    It compiles the OpenCL kernels for about 5 to 10 minutes and then returns a compile error:

    >>> Optimizer terminated abnormally
    >>>> in function ifft_multi_radix_rows()

    >> Compilation failure
    OpenCL program build log: -D LOCAL_SIZE=50 -D kercn=10 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix2_B5(smem,twiddles+0,ind,1,25);fft_radix5_B2(smem,twiddles+1,ind,2,1T

    >> Compilation failure

    The ARM needs to compile the DSP kernels, but that compilation returns an error.
    OpenCV triggers OpenCL online compilation, and it is this online compilation that fails.
  • I think you are using it incorrectly. It shouldn't be run on the Linux host. The online compilation should be done on the ARM platform, i.e. the EVM.

    There are 2 stages of compilation for OpenCV + OpenCL.
    1. The ARM application is compiled to generate a binary
    2. When the ARM binary is run, the C6x OpenCL-C compiler is invoked to compile OpenCL kernels.
    This mode is called “online compilation”.
    (By the way, the ARM application won't run on Linux host)

    For details, see downloads.ti.com/.../compilation.html, section “Create an OpenCL program from source, with source in a file”

    Hope this clears the confusion you have. If it resolved your question, please click "Resolved".

    Rex
  • I did not describe it clearly. Because the ARM platform also runs Linux, I referred to it as the Linux host. The binary is not being run on the wrong platform (an ARM application could not run there anyway).
    After my ARM binary had run for 5 to 10 minutes, it returned this compilation error. It really is an online compilation error!

    The code processes "1.avi" in a loop (while(cap.read(matSrc))).

    It compiles the OpenCL kernels for about 5 to 10 minutes and then returns a compile error:

    >>> Optimizer terminated abnormally
    >>>> in function ifft_multi_radix_rows()

    >> Compilation failure
    OpenCL program build log: -D LOCAL_SIZE=50 -D kercn=10 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix2_B5(smem,twiddles+0,ind,1,25);fft_radix5_B2(smem,twiddles+1,ind,2,1T

    >> Compilation failure
  • Hi, 

    Please see my Oct 18 post in which I posted my logs and couldn't reproduce the issue using your OpenCV application. I built your application and named it test.

    Here is the repost:

    I don't see the OpenCL compilation error. 
    
    root@am57xx-evm:~/opencv_dsp_build_error# cat run.sh
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on, canny"
    ./test
    export OPENCV_OPENCL_DEVICE='disabled'
    echo "OpenCL off, canny"
    ./test
    
    root@am57xx-evm:~/opencv_dsp_build_error# ./run.sh
    OpenCL on, canny
    OpenCL off, canny
    root@am57xx-evm:~/opencv_dsp_build_error#

  • Hi,

    Here is a re-run. To show that it is your application running, I first renamed the video file so that the application prints its "could not load video" error message. I then renamed the file back and ran the application again, and it succeeded. I don't see the compilation error. I am using ProcSDK 5.0 (kernel 4.14.40), as shown by the uname command in the last line.

    root@am57xx-evm:~/opencv_dsp_build_error# ls
    2.avi run.sh test test.cpp
    root@am57xx-evm:~/opencv_dsp_build_error# cat test.cpp
    #include <unistd.h>
    #include <time.h>
    #include <stdio.h>
    #include <opencv2/opencv.hpp>
    #include <opencv2/core/ocl.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/highgui/highgui.hpp>

    #ifndef _OCL_HPP_
    #include "opencv2/core/ocl.hpp"
    #endif

    using namespace cv;
    using namespace std;


    int main(){
    VideoCapture cap;
    cap.open("1.avi");

    if(!cap.isOpened())
    {
    cout << "colud not load vodeo...."<<endl;
    return -1;
    }

    ocl::setUseOpenCL(true);
    Mat matSrc;
    while(cap.read(matSrc))
    {
    cvtColor( matSrc , matSrc , CV_RGB2GRAY );

    Mat padded;
    int m = getOptimalDFTSize(matSrc.rows); // Return size of 2^x that suite for FFT
    int n = getOptimalDFTSize(matSrc.cols);
    // Padding 0, result is @padded
    copyMakeBorder(matSrc, padded, 0, m-matSrc.rows, 0, n-matSrc.cols, BORDER_CONSTANT, Scalar::all(0));

    // Create planes to storage REAL part and IMAGE part, IMAGE part init are 0
    Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F) };

    UMat complexI;
    merge(planes, 2, complexI);
    cv::dft(complexI, complexI);
    }
    ocl::setUseOpenCL(false);

    return 0;
    }

    root@am57xx-evm:~/opencv_dsp_build_error# cat run.sh
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on, canny"
    ./test
    export OPENCV_OPENCL_DEVICE='disabled'
    echo "OpenCL off, canny"
    ./test

    root@am57xx-evm:~/opencv_dsp_build_error# ./run.sh
    OpenCL on, canny
    GStreamer: Error opening bin: unexpected reference "1" - ignoring
    colud not load vodeo....
    OpenCL off, canny
    GStreamer: Error opening bin: unexpected reference "1" - ignoring
    colud not load vodeo....

    root@am57xx-evm:~/opencv_dsp_build_error# mv 2.avi 1.avi

    root@am57xx-evm:~/opencv_dsp_build_error# ./run.sh
    OpenCL on, canny
    OpenCL off, canny
    root@am57xx-evm:~/opencv_dsp_build_error#

    root@am57xx-evm:~/opencv_dsp_build_error# uname -a
    Linux am57xx-evm 4.14.40-g4796173fc5 #1 SMP PREEMPT Wed Jul 25 17:05:51 UTC 2018 armv7l GNU/Linux
  • Hi,
    We have been working on this issue for a long time with no progress.
    Do you have any suggestions or updates we can refer to?
  • Hi,

    I used the prebuilt images, cross-compiled your application on the Linux PC, ran the same script on AM5728 GP EVM, but I don't see any issue.
    If I can't reproduce your issue, I have no idea what the problem is. Do you have anything different from my set up?

    Rex
  • Hi,

    Have you compared your setup with mine? Can you reproduce the successful run using exactly the same setup as mine?
    If it runs successfully with the same setup as mine, the next step is to narrow down what causes the failure in your setup.

    Rex
  • Although the details of our environment and code are different, we have this same error:

    >>> Optimizer terminated abnormally
    >>>> in function ifft_multi_radix_rows()

    The error is produced during OpenCL compilation for the DSP, which happens when the executable is run on the ARM.

    What is the secret sauce necessary to get cv::dft(umat, umat) to compile/run on DSP?

  • Hi, Kurt,

    Please see my post on 10/12 in this thread. I am able to compile and run the application calling cv::dft() in the original post. As I mentioned on 10/24, if I can't reproduce the issue, I don't know what causes the failure in your set up. I suggest you look into the differences in the setup from mine.

    Rex
  • Hi,

    We are also affected by this problem, and I think OpenCL and OpenCV behave differently with the prebuilt SDK (on the included SD card) and the self-built SDK. With the prebuilt SDK we were not able to reproduce the problem because we were not able to activate OpenCL in OpenCV. With the self-built SDK, OpenCL is working and most OpenCV functions use the DSP, but cv::dft() crashes the OpenCL compiler.

    We used this C++ code to utilize OpenCV with OpenCL:

    #include <iostream>
    #include <opencv2/imgproc/imgproc.hpp>
    #include <opencv2/core/ocl.hpp>
    
    int main(int argc, char* const argv[]) {
        if (cv::ocl::useOpenCL()) {
            std::cout << "OpenCL is activated" << std::endl;
        } else {
            std::cout << "OpenCL is not used!" << std::endl;
        }
        cv::UMat _mat (64, 64, CV_32F);
        cv::dft(_mat, _mat, 0, _mat.rows);
    }


    Compile with: g++ -o test_dft [file.cpp] -lopencv_core -lopencv_imgproc

    And this script to set up the OpenCL environment and run the test:

    #!/bin/bash
    
    export OPENCV_OPENCL_DEVICE='disabled'
    echo "OpenCL off"
    ./test_dft
    
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on"
    ./test_dft
    



    With the prebuilt SDK we get the following output:

    OpenCL off
    OpenCL is not used!
    OpenCL on
    OpenCL is not used!


    With the self-built SDK we get the following output:

    OpenCL off
    OpenCL is not used!
    OpenCL on
    OpenCL is activated

    >>> Optimizer terminated abnormally
    >>>>    in function ifft_multi_radix_rows()

    >> Compilation failure
    OpenCL program build log: -D LOCAL_SIZE=64 -D kercn=8 -D FT=float -D CT=float2 -D RADIX_PROCESS='fft_radix8(smem,twiddles+0,ind,1,8);fft_radix8(smem,twiddles+7,ind,8,8);' -D REAL_INPUT -D COMPLEX_OUTPUT -D NO_CONJUGATE

    >> Compilation failure

  • Hi, Erik,

    I am not sure whether your issue is the same as that of the original poster, who didn't mention which filesystem was used. I couldn't reproduce the issue using the prebuilt images and the released filesystem with the custom application. Yours seems to work with the released filesystem but fail with the filesystem you built. If that is the case, it sounds to me like a filesystem issue. Could you compare the two filesystems to see whether the OpenCL libraries are missing from yours? If you still have the issue, please open a new thread for it. Thanks!

    Rex
  • Hi,

    I checked again with the latest prebuilt filesystem/rootfs. The one that was on the included SD card seems to be very old and does not contain the OpenCL libraries. But even with the latest rootfs the problem persists.

    Yours seems to work with the released filesystem but fail with the filesystem you built.

    That's not quite true. There is, in fact, no error when I run on the (old) rootfs. But as you can see, OpenCL is not used there even when the environment variables are set. When OpenCL is used, which is what we want, the compilation failure occurs, with both the self-built and the prebuilt filesystem.

    You can test this with the example code and the script I provided in my last post.

    "OpenCL off" should result in "OpenCL is not used!".
    "OpenCL on" should result in "OpenCL is activated" and there should be no error stating "Optimizer terminated abnormally".


  • Hello Erik,

    Rex is not available today, but he should be able to give you feedback by early next week.

    Regards,
    Krunal
  • Hi, Erik,

    Sorry for the slow response. I have been out since Thanksgiving.

    TI's released filesystem was built using Yocto, so it shouldn't make any difference whether you use the released one or one built later.
    I used your script, except that I replaced your application, test_dft, with the example canny_ex1. I still can't reproduce the issue. Please see my logs:

    root@am57xx-evm:~/OpenCL_canny# vi issue.sh
    root@am57xx-evm:~/OpenCL_canny# chmod +x issue.sh
    root@am57xx-evm:~/OpenCL_canny# ./issue.sh
    OpenCL off
    BGR2GRAY tdiff=19.705277 ms
    GaussBlur tdiff=27.881229 ms
    Canny tdiff=59.255075 ms
    OpenCL on
    BGR2GRAY tdiff=26391.439110 ms
    GaussBlur tdiff=14262.671690 ms
    Canny tdiff=17510.216196 ms
    root@am57xx-evm:~/OpenCL_canny#
    root@am57xx-evm:~/OpenCL_canny# ./issue.sh
    OpenCL off
    BGR2GRAY tdiff=12.353232 ms
    GaussBlur tdiff=15.495138 ms
    Canny tdiff=56.830528 ms
    OpenCL on
    [ 2969.074535] omap-iommu 58882000.mmu: 58882000.mmu: version 2.1
    BGR2GRAY tdiff=11.315906 ms
    GaussBlur tdiff=5.877635 ms
    Canny tdiff=4.433805 ms
    root@am57xx-evm:~/OpenCL_canny#
    root@am57xx-evm:~/OpenCL_canny#
    root@am57xx-evm:~/OpenCL_canny# cat issue.sh
    #!/bin/bash

    export OPENCV_OPENCL_DEVICE='disabled'
    echo "OpenCL off"
    ./canny_ex1
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    echo "OpenCL on"
    ./canny_ex1