TI C66x Experts-
We have ported OpenCV to C66x and it's working well (we replaced memory management in rts6600_elf.lib and we mapped some OpenCV functions to VLIB functions, among other things). Now we need to enable HAVE_OPENMP and use the C66x compiler's OpenMP capability under these conditions:
- cores 0 to N-1 are running the H.264 encoder, which is provided by TI as highly optimized for multicore operation (N is 2 to 6)
- we need to be able to control the number of cores allocated for OpenMP threads
We basically only need nested for-loop support. OpenCV has a file called parallel.cpp that's used across several modules if HAVE_OPENMP is defined. All that's really happening is a C++ class that can re-factor for-loops depending on available platform multicore options (OpenMP, OpenCL, CUDA, etc). I'd say at least 80% of the full OpenMP capability is not required.
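To make the requirement concrete, here is a minimal sketch of the pattern we mean. It is not OpenCV's actual parallel.cpp code: `Range`, `LoopBody`, `parallel_for_sketch`, `ScaleRows`, and `scale_image` are hypothetical names used only for illustration, and the code is kept C++03-style on the assumption that the target compiler may not support newer language features. The outer loop of a nested for-loop is split across an OpenMP team whose size is capped with the standard `num_threads` clause. Note that `num_threads` only limits the team size; it does not by itself choose which physical cores run the threads, which is part of what we are asking about.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the range / loop-body abstraction.
struct Range { int start; int end; };

class LoopBody {
public:
    virtual ~LoopBody() {}
    virtual void operator()(const Range& r) const = 0;
};

// Splits [range.start, range.end) into one-row stripes and runs them on at
// most maxThreads OpenMP threads. Without OpenMP enabled, the pragma is
// ignored and the loop runs serially.
inline void parallel_for_sketch(const Range& range, const LoopBody& body, int maxThreads)
{
    if (maxThreads < 1) maxThreads = 1;
    #pragma omp parallel for num_threads(maxThreads) schedule(static)
    for (int i = range.start; i < range.end; ++i) {
        Range stripe;
        stripe.start = i;
        stripe.end = i + 1;
        body(stripe);
    }
}

// Example body: the inner loops of a nested row/column loop.
// Each stripe touches disjoint rows, so no synchronization is needed.
class ScaleRows : public LoopBody {
public:
    ScaleRows(std::vector<int>& img, int cols, int factor)
        : img_(img), cols_(cols), factor_(factor) {}

    virtual void operator()(const Range& r) const {
        for (int y = r.start; y < r.end; ++y)
            for (int x = 0; x < cols_; ++x)
                img_[static_cast<std::size_t>(y) * cols_ + x] *= factor_;
    }

private:
    std::vector<int>& img_;
    int cols_;
    int factor_;
};

// Builds a rows x cols image filled with 1 and scales every pixel by factor.
inline std::vector<int> scale_image(int rows, int cols, int factor, int maxThreads)
{
    assert(rows > 0 && cols > 0);
    std::vector<int> img(static_cast<std::size_t>(rows) * cols, 1);
    ScaleRows body(img, cols, factor);
    Range all;
    all.start = 0;
    all.end = rows;
    parallel_for_sketch(all, body, maxThreads);
    return img;
}
```

For example, `scale_image(480, 640, 2, 8 - N)` would express "use only the cores not running the encoder", assuming (hypothetically) an 8-core device and a runtime that honors the requested team size.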
We've been looking at the omp_hello example and have the following questions:
1) Why is MSMC mapped to DDR3 memory in the omp_hello example RTSC platform file? Is it possible to not do this? MSMC memory is used extensively by H.264.
2) It looks like L2 memory is reserved for the stack and the .threadprivate section. We're currently using 64 KB for L2 cache, and most of the remaining L2 memory for H.264, streaming, and network I/O code (we have about 90 KB available). If we don't have enough L2 memory for OpenMP threads, is it possible to map .threadprivate to DDR3 memory, i.e. to separate areas used by each core?
3) To make fundamental changes in how OpenMP uses memory, is there library source code that we can modify? The OpenMP user guide mentions rebuilding the OMP lib with different settings / paths, following steps in the IPC Users Guide. Is this sufficient or are there additional libs we may need to modify?
Thanks.
-Jeff
Signalogic
PS. We're about to start working with H.265, and we need combined OpenMP functionality in that case as well.