Compiler/AM5728: OpenCV DSP Offloading Copy Error

Part Number: AM5728

Tool/software: TI C/C++ Compiler

Hi,

I have written an OpenCV example to test the correctness of DSP offloading. It copies one UMat to another; unfortunately, the result of the copy is not correct.

#include <opencv2/opencv.hpp>
#include <cstdio>
#include <cstdlib>
#include <exception>
#include <iostream>
#include <limits>
#include <string>
#include <time.h>
#include <unistd.h>


using namespace cv;

using Pixel = unsigned char;

/* Time difference calculation, in ms units */
double tdiff_calc(struct timespec &tp_start, struct timespec &tp_end)
{
  return (double)(tp_end.tv_nsec -tp_start.tv_nsec) * 0.000001 + (double)(tp_end.tv_sec - tp_start.tv_sec) * 1000.0;
}

bool checkZero(const Pixel* buffer, int size)
{
  for (int i = 0; i < size; ++i) {
    if (buffer[i]) {
      return false;
    }
  }

  return true;
}

void writeZero(Pixel* buffer, int size)
{
  for (int i = 0; i < size; ++i) {
    buffer[i] = 0;
  }
}

int main(int argc, const char** argv)
{
  int width = 21;
  int height = 21;

  if (argc != 3) {
    std::cout << "Invalid parameters!" << std::endl;

    return -1;
  }

  try {
    width = std::stoi(argv[1]);
    height = std::stoi(argv[2]);

    std::cout << "Width = " << width << std::endl;
    std::cout << "Height = " << height << std::endl;
    std::cout << "---" << std::endl;
  }
  catch (const std::exception& e) {
    std::cerr << e.what() << '\n';
    return -2;
  }

  int size = width * height;
  
  auto inputBuffer = new Pixel[size]{0};
  auto outputBuffer = new Pixel[size]{0};
  writeZero(inputBuffer, size);
  writeZero(outputBuffer, size);

  if (checkZero(outputBuffer, size)) {
    printf("Zero, Ok\n");
  }
  else {
    printf("Not Zero, Not Ok!\n");
    return -1;
  }

  struct timespec tp0, tp1, tp2, tp3;

  Mat input(height, width, CV_8UC1, inputBuffer);
  Mat output(height, width, CV_8UC1, outputBuffer);

  randu(input, Scalar::all(std::numeric_limits<Pixel>::min()), Scalar::all(std::numeric_limits<Pixel>::max()));
  

  UMat in;
  UMat out;

  clock_gettime(CLOCK_MONOTONIC, &tp0);

  // input.copyTo(in);
  in = input.getUMat(ACCESS_RW);

  clock_gettime(CLOCK_MONOTONIC, &tp1);

  // output.copyTo(out);
  out = output.getUMat(ACCESS_RW);

  clock_gettime(CLOCK_MONOTONIC, &tp2);

  in.copyTo(out);

  clock_gettime(CLOCK_MONOTONIC, &tp3);

  bool equalFlag = true;
  for (int i = 0; i < size; ++i) {
    if (inputBuffer[i] != outputBuffer[i]) {
      equalFlag = false;
      break;
    }
  }
  if (equalFlag) {
    printf("Equal!\n");
  }
  else {
    printf("Not Equal!\n");
  }

  printf("---\n");
  printf("in = input.getUMat(ACCESS_RW);   -> tdiff=%lf ms \n", tdiff_calc(tp0, tp1));
  printf("out = output.getUMat(ACCESS_RW); -> tdiff=%lf ms \n", tdiff_calc(tp1, tp2));
  printf("in.copyTo(out);                  -> tdiff=%lf ms \n", tdiff_calc(tp2, tp3));

  return 0;
}
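
A more detailed comparison could make the failure easier to describe than a single Equal / Not Equal line. The sketch below is one possible replacement for the comparison loop above and is not part of the tested example: `MismatchReport`, `compareBuffers`, and `printReport` are hypothetical names, and it assumes that both buffers hold `size` elements in row-major order and that `width` is greater than zero.

```cpp
#include <cstddef>
#include <cstdio>

using Pixel = unsigned char;

// Summary of an element-wise comparison between two equally sized buffers.
struct MismatchReport {
  std::size_t count;  // number of differing elements
  std::size_t first;  // index of the first difference (meaningful only if count > 0)
  std::size_t last;   // index of the last difference (meaningful only if count > 0)
};

// Hypothetical helper: walks both buffers completely instead of stopping at
// the first difference, so the extent of the mismatch becomes visible.
MismatchReport compareBuffers(const Pixel* expected, const Pixel* actual, std::size_t size)
{
  MismatchReport report{0, 0, 0};
  for (std::size_t i = 0; i < size; ++i) {
    if (expected[i] != actual[i]) {
      if (report.count == 0) {
        report.first = i;
      }
      report.last = i;
      ++report.count;
    }
  }
  return report;
}

// Prints the summary; linear indices are converted to (row, column) using
// the image width. Assumes width > 0.
void printReport(const MismatchReport& report, std::size_t width)
{
  if (report.count == 0) {
    std::printf("Equal!\n");
    return;
  }
  std::printf("Not Equal! %zu mismatching pixels\n", report.count);
  std::printf("first at row %zu, column %zu\n", report.first / width, report.first % width);
  std::printf("last  at row %zu, column %zu\n", report.last / width, report.last % width);
}
```

In `main`, this could be called as `printReport(compareBuffers(inputBuffer, outputBuffer, size), width);` right after `in.copyTo(out);`.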

If I disable the offloading, the input and output UMats are the same. If I enable the offloading, they differ.
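
Since the result depends on whether offloading is active, it may help to log the requested setting at the start of each run. The sketch below assumes the device is selected through the `OPENCV_OPENCL_DEVICE` environment variable and that the value `disabled` turns OpenCL off; `describeOffloadSetting` and `currentOffloadSetting` are hypothetical helper names. It reports only what was requested, not what OpenCV actually ended up using.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: turns the raw value of OPENCV_OPENCL_DEVICE into a
// short description that can be printed at the top of a test log.
std::string describeOffloadSetting(const char* value)
{
  if (value == nullptr || *value == '\0') {
    return "OPENCV_OPENCL_DEVICE is not set";
  }
  const std::string setting(value);
  if (setting == "disabled") {
    return "OpenCL offloading disabled";
  }
  return "OpenCL device requested: " + setting;
}

// Convenience wrapper that reads the variable from the process environment.
std::string currentOffloadSetting()
{
  return describeOffloadSetting(std::getenv("OPENCV_OPENCL_DEVICE"));
}
```

Printing `currentOffloadSetting()` next to the width and height makes it clear which mode each log belongs to.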

You can also find the example via the link below:

https://gitlab.com/mustafa-gonul/examples/ti-processor-sdk-linux-am57x/opencv-2/-/tree/master/000-template-cv

I would appreciate it if you could help me with this issue.

Kind regards,

Mustafa

  • Hi,

    I forgot to mention that I am using, and have tested against, SDK versions 6.02 and 6.03.

    Kind regards,

    Mustafa 

  • Mustafa,

    Apologies for the delay, I will start looking into this on Monday.

    Could you please let me know whether you have been able to make progress or still need help? I will continue based on your update, or from your previous post if I don't hear back before Monday.

    Regards

    Karthik

  • Mustafa,

    If this is still open, can you please add more details on "If I disable the offloading, the input and output UMats are same. If I enable the offloading these aren't same."?

    Regards

    Karthik

  • Hi Karthik,

    Sorry for the late response. The issue is still open; we have no fix on our side.

    First of all, I would like to explain what enabling and disabling the offloading means:

    By default, the TI version of OpenCV does not offload to the DSP cores. You need to enable the offloading with environment variables such as:

    export TI_OCL_KEEP_FILES=Y
    export TI_OCL_LOAD_KERNELS_ONCHIP=Y
    export TI_OCL_CACHE_KERNELS=Y
    export TI_OCL_VERBOSE_ERROR=Y
    
    export OPENCV_OPENCL_DEVICE='TI AM57:ACCELERATOR:TI Multicore C66 DSP'
    # export OPENCV_OPENCL_DEVICE='disabled'

    For further information please see: https://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components_OpenCV.html

    Secondly, in the example, one matrix is copied to another. Afterwards, the image data in both is expected to be the same. If the offloading is disabled, so that the ARM cores copy the data, the data are the same. If the offloading is enabled, they are not. I expect the same behavior regardless of whether the copy is offloaded to the DSP.

    If you have further questions, please ask. I know the problem has several parts, and I cannot foresee everything you might need up front.

    Kind regards,

    Mustafa