OMP threadprivate ignored?

Hi,

When running the following code on my EVM6678 using the "new" OMP-Runtime and the 7.4.1 cgt, I get the following output:


#pragma omp threadprivate(esvPrfLastStop)
static long long esvPrfLastStop = 0;

void main()
{
    int i;
    #pragma omp parallel for
    for(i=0; i < 2; i++) {
        printf("Value is: %lld, address: %x \n", esvPrfLastStop, &esvPrfLastStop);
    }
}


[C66xx_0] Value is: 2, address: 82000248
[C66xx_1] Value is: 2, address: 82000248

It seems that the final OpenMP runtime together with cgt-7.4 ignores the threadprivate pragma and places the variable in shared DDR3 memory (using the vm6678_ddr platform).

However, when using cgt-7.4.0B2 and the beta OpenMP runtime, the threadprivate variable is located in L2SRAM as expected:


[C66xx_0] Value is: 3546716855134859842, address: 82bc48
[C66xx_1] Value is: 3546716855134859842, address: 82bc48

I've uploaded a self-contained test case here: 1325.ThreadprivateTest.zip

Is this a problem with my setup, or a bug in the OMP runtime?

Thanks, Clemens

  • Clemens,

    This is not a problem with your setup or a bug - it's a change in behavior in the compiler/runtime. Previous versions of OpenMP implemented threadprivate by mapping such variables to L2SRAM. This approach fails when the threadprivate section is large or if there is more than one OpenMP thread per core. The current OpenMP runtime implements threadprivate by allocating memory for each thread during thread startup. This allocation is done out of the shared region specified for HeapOMP in the configuration file.

    Ajay
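
A minimal host-side sketch of the "allocate one block per thread at startup" scheme described above. This is not TI's implementation: `tp_block_t`, `tp_thread_init`, `tp_get` and `NUM_THREADS` are hypothetical names, and `malloc` merely stands in for an allocation from the HeapOMP shared region.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_THREADS 2  /* assumption: one OpenMP thread per core, two cores */

/* Hypothetical layout of all threadprivate data of the program. */
typedef struct {
    long long esvPrfLastStop;
} tp_block_t;

/* Initial image that every thread's private block is copied from. */
static const tp_block_t tp_template = { 0 };

/* One pointer per thread; filled in during thread startup. */
static tp_block_t *tp_blocks[NUM_THREADS];

/* Called once per thread at startup. malloc() stands in for HeapOMP.
 * Returns 0 on success, -1 if the allocation failed. */
int tp_thread_init(int tid)
{
    assert(tid >= 0 && tid < NUM_THREADS);
    tp_block_t *blk = malloc(sizeof *blk);
    if (blk == NULL)
        return -1;
    memcpy(blk, &tp_template, sizeof *blk);
    tp_blocks[tid] = blk;
    return 0;
}

/* Returns the private block of thread `tid`. */
tp_block_t *tp_get(int tid)
{
    assert(tid >= 0 && tid < NUM_THREADS);
    return tp_blocks[tid];
}
```

With this scheme the blocks live in shared memory, so an address in the DDR3 range is expected; what makes the data private is that each thread is handed a different block.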

  • Hi Ajay,

    Thanks for the clarification regarding memory placement.

    However, what I still don't understand is why both threads actually share the *same* memory location, although it's annotated as threadprivate.
    As you can see in the output of the program run, esvPrfLastStop is located at 82000248 for both threads, and it also behaves like a shared variable.

    When executing the same code on my Linux machine with gcc, the variable is located at a different address for each thread, which is what I understand by "private":

    [ce@localhost ~]$ ./omp
    Value is: 0, address: 84e65778
    Value is: 0, address: 84e646f8

    Thanks, Clemens
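
The expectation described above can be checked programmatically. The following is a minimal sketch for a host compiler such as gcc (built with `-fopenmp`); the names `last_stop` and `threadprivate_is_private` are hypothetical. Each thread writes a unique token into its copy, waits at a barrier, and verifies that no other thread overwrote it. Note that the OpenMP specification requires the `threadprivate` directive to appear after the declaration of the variable, so the sketch places it there.

```c
#include <stdint.h>
#include <stdio.h>

static long long last_stop = 0;
#pragma omp threadprivate(last_stop)  /* directive follows the declaration */

/* Returns 1 if every thread read back exactly the value it wrote,
 * i.e. the variable behaved as thread-private; returns 0 otherwise. */
int threadprivate_is_private(void)
{
    int mismatches = 0;

    #pragma omp parallel num_threads(2) reduction(+:mismatches)
    {
        int anchor;  /* lives on this thread's stack, so its address is unique */
        long long token = (long long)(intptr_t)&anchor;

        last_stop = token;

        /* Make sure all threads have written before anyone reads back. */
        #pragma omp barrier

        if (last_stop != token)
            mismatches++;

        printf("copy at %p\n", (void *)&last_stop);
    }
    return mismatches == 0;
}
```

If the variable were effectively shared, as in the output from the EVM, one thread's token would be overwritten by the other and the function would be expected to return 0 in most runs.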

  • Clemens,

    I noticed that your config file does not enable caching or write-through for the memory range used by the application heap:

    // Enable Cache Write-back for HEAPOMP
    var Cache        = xdc.useModule('ti.sysbios.family.c66.Cache');
    Cache.setMarMeta(0x80000000, 0x20000000, 0);

    Since the shared region containing the application heap enables the cache (via  cacheEnable: true) this line should instead be:

    Cache.setMarMeta(0x80000000, 0x20000000, Cache.PC | Cache.WTE)
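
To see which memory attribute registers the address range in this call covers: on the C66x each MAR controls a 16 MB window, so the MAR index is the address shifted right by 24 bits. The helpers below are a hypothetical host-side sketch (the names `mar_first` and `mar_last` are not part of any TI API) that just performs that arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define MAR_WINDOW_SHIFT 24u  /* each MAR covers 2^24 bytes = 16 MB */

/* Index of the MAR that covers the first byte of the range. */
unsigned mar_first(uint32_t base)
{
    return (unsigned)(base >> MAR_WINDOW_SHIFT);
}

/* Index of the MAR that covers the last byte of the range. */
unsigned mar_last(uint32_t base, uint32_t size)
{
    assert(size > 0);
    /* 64-bit arithmetic avoids overflow for ranges ending at 4 GB. */
    uint64_t last_byte = (uint64_t)base + size - 1u;
    return (unsigned)(last_byte >> MAR_WINDOW_SHIFT);
}
```

For base 0x80000000 and size 0x20000000 this yields MAR128 through MAR159, i.e. the 512 MB starting at the beginning of DDR3.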

    Another potential reason you're seeing invalid addresses for threadprivate:

    Threadprivate support relies on a function, __c6xabi_get_tp() – this function returns a pointer to a thread’s thread private storage. There is a weak definition of this function in the C6000 runtime and a definition in the OpenMP runtime. By default, the definition of __c6xabi_get_tp used depends on the order in which objects and libraries are specified on the command line. With CCS, the order is:

     -l"./configPkg/linker.cmd"  "./omp_matvec.obj" -l"libc.a"

    This caused the call to __c6xabi_get_tp() from omp_matvec.obj to be resolved against the incorrect definition in rts6600_elf_mt.lib (via libc.a).

    Specifying ‘--priority’ forces the linker to resolve references using the order of libraries/objects specified on the command line. This results in the call to __c6xabi_get_tp being resolved against the definition from the OpenMP runtime library specified in configPkg/linker.cmd.

    The '--priority' option can be enabled via [Project Properties] -> CCS Build -> C6000 Linker -> File Search Path

    Ajay

  • Hi Ajay,

    Thanks for the in-depth analysis and explanation of my threadprivate issue. I'll validate setting the linker priority tomorrow.
    Also, thanks for pointing out the misconfiguration regarding caching. I disabled it intentionally to make sure I am not running into some cache edge case.

    In the meantime I worked around the issue by locating the data structures that should be threadprivate in L2SRAM.

    Thanks, Clemens

  • I just verified today that it was indeed the linker problem you mentioned.

    Linking with --priority does resolve the issue: each thread now gets its own private memory location for every threadprivate-annotated variable.

    Thanks a lot, Clemens