Issues with Code Gen tools migration to 7.2.0B2, 7.2.2



Hello Sir,

 

We are working on a C64x+ DSP. Earlier we were using code gen tools 6.1.x; later we moved to version 7.2.0B2 for ELF support.

Unfortunately, while some projects work without any issue, for other projects the compiler generates incorrect code when -O3 is enabled along with --opt_for_speed=5.

We debugged and found the issue in the code below:

for (i=0; i<numTables; i++) {
    pThis->m_tableInfo[i].bits  = pThis->m_initInfo[i].bits;
    pThis->m_tableInfo[i].table = pThis->m_decInfo+pThis->m_numDecEntries;
    pThis->m_numDecEntries     += 1<<pThis->m_initInfo[i].bits;
}

Later we learned that there were some fixes in the latest code gen tools version, 7.2.2, and migrated to it. But we still observe this issue.

Can anybody guide us in fixing this issue? Please note that the same code works fine with optimization disabled ("-o0").

During our trials we also found that the code generated is inconsistent with 7.2.0B2. :(

 

Best Regards

Rama


  • Unfortunately, there is no way to respond without a test case.  Please submit a test case which can be compiled down to object code.  It does not have to link and run.  Please see the last part of the forum welcome message for more details on submitting a test case.

    Thanks and regards,

    -George

     

  • Hi George,

    Thank you for the response. Please find attached the source code with all the dependencies.
    4010.vc1vdec_ti_huffdec_wmv.txt

     

    The problematic code is at lines 10449 to 10483 of the attached file. The file also contains the working code under #if 0.

    Below are the compiler options we are using:

        -mv64+ \
        --define=ADVANCED_PROFILE \
        --define=C64X \
        --define=STDTYPES="<xdc/std.h>" \
        --define=xdc_target_name__="C64P" \
        --define=xdc_target_types__="ti/targets/std.h" \
        --define=_HARDWARE \
        --define=_DAVINCI \
        --define=_TI_B_FRAME_FLOW \
        --define=_DEBUG \
        --diag_warning=225 \
        --obj_directory="$(OBJ_DIR)" \
        --abi=eabi -g -O3 --opt_for_speed=5


    Please let me know if you need any more information.

    Best Regards

    Rama

  • Thank you for the test case.  I built it and took a quick look at the resulting assembly.  I couldn't see anything wrong.  Still, I turned in SDSCM00039960 to the SDOWP system, so the right experts will take a look.  You can track that issue with the SDOWP link in my sig below.

    Thanks and regards,

    -George

  • Well, we have done all the analysis we can, and we still cannot find anything wrong with the generated assembly.  This is a bit unusual.  We often find the problem with such analysis.

    Is there any way you could be more specific about where the error occurs?  Have you tried single stepping the code through the suspect instructions?  Does your code run under a simulator, or can you modify it to run under a simulator?  If so, would you be willing to send it all to us so we could try to run it under a simulator?

    Thanks and regards,

    -George

  • After several discussions, we confirmed the problem is caused by a compiler bug.

    The problem is not with the shift instruction but with the order of the memory accesses after the following loop is SPLOOPed (software pipelined using the C64x+ SPLOOP mechanism).

        pThis->m_numDecEntries = 0;

        for (i=0; i<numTables; i++) {
            pThis->m_tableInfo[i].bits  = pThis->m_initInfo[i].bits;
            pThis->m_tableInfo[i].table = pThis->m_decInfo+pThis->m_numDecEntries;
            pThis->m_numDecEntries += 1<<pThis->m_initInfo[i].bits;
        }

    After the above loop is SPLOOPed, the 2nd iteration reads pThis->m_numDecEntries before the 1st iteration has written to it, so the final value of pThis->m_numDecEntries is wrong.

    This problem is caused by incorrect memory data dependence analysis in the compiler. We will have it fixed.
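    To make the effect of that ordering concrete, here is a rough host-side model in plain C. It is only a sketch of one possible mis-ordering, not the schedule the compiler actually produced: the InitInfo type, the function names, and the variable mem (standing in for pThis->m_numDecEntries) are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for one element of pThis->m_initInfo. */
typedef struct { int32_t bits; } InitInfo;

/* Intended semantics: each iteration loads the counter only after the
 * previous iteration has stored its update. */
int32_t sum_in_order(const InitInfo *init, int n)
{
    int32_t mem = 0;                    /* models pThis->m_numDecEntries */
    int i;
    for (i = 0; i < n; i++) {
        int32_t loaded = mem;           /* load */
        mem = loaded + (1 << init[i].bits);   /* store */
    }
    return mem;
}

/* Model of the bad ordering: the load belonging to iteration i+1 is
 * performed before the store of iteration i, so it sees a stale value. */
int32_t sum_load_hoisted(const InitInfo *init, int n)
{
    int32_t mem = 0;                    /* models pThis->m_numDecEntries */
    int32_t loaded_next = mem;          /* load for iteration 0 */
    int i;
    for (i = 0; i < n; i++) {
        int32_t loaded = loaded_next;
        if (i + 1 < n) {
            loaded_next = mem;          /* next iteration's load, too early */
        }
        mem = loaded + (1 << init[i].bits);   /* store arrives afterwards */
    }
    return mem;
}
```

    With bits values of 2, 3 and 1, the in-order version accumulates 4 + 8 + 2, while the hoisted-load model loses one of the updates, which is the kind of wrong final count described above.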

    Here is a workaround you can use to keep your project moving until the fix is delivered.

        I32WMV x = 0;

        for (i=0; i<numTables; i++) {
            pThis->m_tableInfo[i].bits  = pThis->m_initInfo[i].bits;
            pThis->m_tableInfo[i].table = pThis->m_decInfo+x;
            x += 1<<pThis->m_initInfo[i].bits;
        }

        pThis->m_numDecEntries = x;

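    With this approach, the running total is promoted from an expensive memory access through pThis to a local variable, which will be placed in a register. Because the loop no longer reads and writes pThis->m_numDecEntries on every iteration, the memory dependence analysis problem is avoided.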

    This promotion will also increase the speed of this loop.
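    For anyone who wants to try the workaround outside the full codec, the following is a minimal self-contained sketch. The struct layouts, the int16_t element type of the decode table, the typedef of I32WMV as int32_t, and the function name init_tables are assumptions made for illustration; only the member names and the loop body come from the code above.

```c
#include <stdint.h>

typedef int32_t I32WMV;                 /* assumed width of I32WMV */

/* Hypothetical layouts; only the member names appear in the thread. */
typedef struct { I32WMV bits; } InitInfo;
typedef struct { I32WMV bits; int16_t *table; } TableInfo;

typedef struct {
    TableInfo      *m_tableInfo;
    const InitInfo *m_initInfo;
    int16_t        *m_decInfo;
    I32WMV          m_numDecEntries;
} Huffman;

/* Workaround form of the loop: accumulate in a local variable and
 * write the member back once, after the loop. */
void init_tables(Huffman *pThis, I32WMV numTables)
{
    I32WMV i;
    I32WMV x = 0;                       /* local running total */

    for (i = 0; i < numTables; i++) {
        pThis->m_tableInfo[i].bits  = pThis->m_initInfo[i].bits;
        pThis->m_tableInfo[i].table = pThis->m_decInfo + x;
        x += 1 << pThis->m_initInfo[i].bits;
    }

    pThis->m_numDecEntries = x;         /* single store after the loop */
}
```

    Calling init_tables on a small input and comparing m_numDecEntries and the table pointers against hand-computed offsets, at both -o0 and -O3, is one way to check whether a given build is affected.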

    Wei