CCSv4 MQT_1_21 evmC6474 Problem running smmqttest example

Eddie3909

Hi all

I am trying to get the MQT package built and running on the C6474.

I was able to build the libraries and smmqttest_64P.pjt by creating the CCSv4 projects from scratch (I tried importing but ran into migration log errors that didn't make any sense to me). Actually, not totally from scratch: I borrowed the .tcf and .tci files from the original MQT_1_21 package so that I didn't have to worry about re-mapping the memory or creating tasks.

I'm able to get the smmqttest_64P example to load to main and run to the end of main, but then it will not hit breakpoints in the worker or boss tasks (I also put in LOG_printf() calls that are not outputting either). When I stop the emulator, it's stuck in HSEMCRIT_getLock(). Any thoughts on what I should try in order to debug it?

Is HSEMCRIT_getLock() run after main() exits? Knowing that would help me debug the problem.

Cheers

over 15 years ago

0 Eddie3909 over 15 years ago

Guru 10385 points

Now that I've read some of "Shared memory message queue Transport (MQT) sprab09.pdf", I think I know what the problem is. If I stop the debug session at a point where the boss thread has the semaphore, the C6474 sem location 0x02B4 0100 is set to 0x0000 0100. This means that core 1 now has the semaphore. If I stop and reload the worker core and then run, it gets hung in hsemcrit_getLock() because the semaphore isn't released.
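A minimal sketch of how that register value decodes, assuming the status word uses the same layout as the CSL SEM_QUERY fields (FREE in bit 0, OWNER in bits 15:8). The mask names and helper functions are hypothetical, not part of CSL or MQT, and the layout should be checked against the C6474 semaphore documentation.

```c
#include <stdint.h>

/* Assumed field layout of a semaphore status word (verify against the
 * C6474 semaphore documentation): bit 0 = FREE, bits 15:8 = OWNER. */
#define SEM_FREE_MASK   0x00000001u
#define SEM_OWNER_MASK  0x0000FF00u
#define SEM_OWNER_SHIFT 8

/* Returns 1 if the semaphore is free, 0 if some core owns it. */
int sem_is_free(uint32_t status)
{
    return (status & SEM_FREE_MASK) != 0;
}

/* Returns the owner ID; only meaningful when the semaphore is not free. */
unsigned sem_owner(uint32_t status)
{
    return (unsigned)((status & SEM_OWNER_MASK) >> SEM_OWNER_SHIFT);
}
```

Under that assumed layout, the value 0x0000 0100 reads as "not free, owner 1", which matches what I see in the debugger.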

When I reload the boss thread and run, it is also getting hung at hsemcrit_getLock().

It appears that the only way to reset the semaphore is to power cycle the board.

I'm going to see if I can get the boss core to reset the semaphore in main. Seems like it should be doing this so that we start from a known state.

Cheers

0 RandyP over 15 years ago in reply to Eddie3909

TI__Guru* 84110 points

The next chip we make with hardware semaphores will have a way to completely reset the module with one operation. But with this version of the semaphore module, doing that in software can be tricky.

Depending on the semaphore-request modes you use, there can be pending requests for semaphores that have to be released before they can be cleared. I wrote the following code to (I believe) fully reset the semaphore module. Any core that may have owned a semaphore may have to run this code simultaneously before all the requests can be cleared. This means that once you load your code and start running again, all three cores must be running and must be able to get to this code while the others are also running this code.

#include <stdlib.h> // for exit()
#include <c6x.h>
#include <csl_chip.h>
#include <csl_sem.h>

CSL_SemContext SemContext;
CSL_SemVal response,query;
CSL_InstNum SemInst = 0;
CSL_SemObj pSemObj;
CSL_Status SemStat;
CSL_SemHandle hSemHandle;
CSL_SemParam Param;
CSL_SemFaultStatus FaultStatus = { 0, (CSL_SemError)0, 0 };

    CSL_SemFlagSetClear_Arg FlagSetClear;
    CSL_SemEOISet EOISet;
    unsigned int NoArg;

    /* init CSL semaphore module */
    CSL_semInit(&SemContext);
    Param.flags = 0; // Semaphore number = 0
    hSemHandle = CSL_semOpen(&pSemObj, SemInst,&Param, &SemStat);
    if ((hSemHandle == NULL) || (SemStat != CSL_SOK)) { exit(1); }

    // clear semaphore module
    {
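        int nQueryCnt;
        int i; // semaphore index (not declared in the code as originally posted)

        // clear any owned semaphores
        do
        {
            nQueryCnt = 0;

            for ( i = 0; i < 32; i++ )
            {
                // read the query register for semaphore i (as posted, this always read QUERY[0])
                volatile unsigned int *pSemQuery = &hSemHandle->regs->QUERY[i];
                unsigned int uQuery = *pSemQuery;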

                if ( CSL_FEXT(uQuery, SEM_QUERY_FREE) == 0 ) // a semaphore is owned
                {
                    nQueryCnt++; // keep track of any owned semaphores

                    // if this semaphore is owned by this core, release it
                    if ( CSL_FEXT(uQuery, SEM_QUERY_OWNER) == CSL_chipReadReg( CSL_CHIP_DNUM ) )
                    {
                        hSemHandle->regs->DIRECT[i] = HW_SEM_RELEASE;
                    }
                }
            }

        } while ( nQueryCnt > 0 ); // even if another core owns a semaphore, keep repeating this in case an old request is pending

        // clear any pending semaphore interrupt flags
        FlagSetClear.mask = 0xffffffff;
        FlagSetClear.masterId = (CSL_SemOwnerId)((int)CSL_SEM_ID0+CSL_chipReadReg( CSL_CHIP_DNUM ));
        CSL_semHwControl(hSemHandle, CSL_SEM_CMD_CLEAR_FLAGS, &FlagSetClear);

        // re-arm semaphore module interrupts
        EOISet = (CSL_SemEOISet)((int)CSL_SEM_REARM_SEMINT0+CSL_chipReadReg( CSL_CHIP_DNUM ));
        CSL_semHwControl(hSemHandle, CSL_SEM_CMD_EOI_WRITE, &EOISet);

        // clear any pending error flags
        CSL_semHwControl(hSemHandle, CSL_SEM_CMD_CLEAR_ERR, &NoArg);

        // re-arm error interrupts (every core will do this, possible race condition?)
        EOISet = CSL_SEM_REARM_SEMINT_ALL;
        CSL_semHwControl(hSemHandle, CSL_SEM_CMD_EOI_WRITE, &EOISet);
    }
    /* init CSL semaphore module */

0 Eddie3909 over 15 years ago in reply to Eddie3909

Guru 10385 points

Yikes, this is even scarier than I thought. The boss thread does not initialize the ->num member of the HSEMCRIT_handle in HSEMCRIT_open(). This is not good, as this variable is used to access the semaphore table. For some reason the passed-in "init" variable is set to zero when HSEMCRIT_open() is called.

But for the worker thread, init is set to 1, hence the ->num variable is set properly when HSEMCRIT_open() is called.

Now to figure out why the boss thread doesn't have init set to "1".

Cheers

0 Eddie3909 over 15 years ago in reply to Eddie3909

Guru 10385 points

The reason hsemcrit_open() does not have "init" set to "1" for the boss thread is that the calling function SMMQT_init() passes FALSE when the boss core is not the one named by procIdToInit:

if (smmqtState.systemConfig->procIdToInit == GBL_getProcId()) {
        /*
         * Determine if local reset for executing core occured.
         * If not, then initialize the smmmqtState array of QUEs.
         */
        resetStat = _SMMQT_isLocalReset(DNUM);
.

}
else {
        /* HERE IS WHERE THE BOSS HsemCrit_open() gets called */
        CRIT_open(smmqtState.systemConfig->critHandle, FALSE);
}

So I thought that maybe there is a shared mem structure for *handle in the hsemcrit_open() function below. When I broke in the function, the address of handle and params both point to the 0x0080 0000 L2 RAM. According to the C6474 data sheet (see section below), this address is for shared memory. So I'll assume that there isn't a problem with not setting the "->num" variable in the boss application.

You can see below where I added code to clear the semaphore. Seems to have fixed the problem.

Int _HSEMCRIT_open(Ptr *handle, Ptr params, Bool init)
{
    volatile Uint32 *sem = _HSEMCRIT_BASEADDR; // I added to release semaphore
    HSEMCRIT_Handle hsemCritHandle = (HSEMCRIT_Handle)(*handle);

    if (init) {
        hsemCritHandle->num = ((HSEMCRIT_Params *)params)->num;
    }

    sem[hsemCritHandle->num] = 1; // I added to release semaphore.

    return (SYS_OK);
}

-----------From the C6474 data sheet sprs552d.pdf ------------------------------------------------------------------------------

Megamodule and allows for common code to be run unmodified on multiple cores. For example, address
location 0x10800000 is the global base address for C64x+ Megamodule Core 0's L2 memory. C64x+
Megamodule Core 0 can access this location by either using 0x10800000 or 0x00800000. Any other
master on the device must use 0x10800000 only. Conversely, 0x00800000 can be used by any of the
three cores as their own L2 base addresses. For C64x+ Megamodule Core 0, as mentioned this is
equivalent to 0x10800000, for C64x+ Megamodule Core 1 this is equivalent to 0x11800000, and for
C64x+ Megamodule Core 2 this is equivalent to 0x12800000. Local addresses should only be used for
shared code or data, allowing a single image to be included in memory. Any code/data targeted to a
specific core, or a memory region allocated during run-time by a particular core should always use the
global address only.

0 RandyP over 15 years ago in reply to Eddie3909

TI__Guru* 84110 points

The example code should work as-is without having to do all this debug that you are doing. Why did it not work right out of the box? What changes did you do to it in the first place to get where you are now?

Eddie said:
I was able to build libraries and smmqttest_64P.pjt by creating the CCSv4 projects from scratch (I tried importing but ran into migration log errors that didn't make any sense to me).

I fear that this original statement has led to some problem that is sending you off in too many directions that you should not have to go.

From the C6474 data sheet sprs552d.pdf said:
Local addresses should only be used for shared code or data, allowing a single image to be included in memory.

This is an unfortunate use of the word "shared". In this case, we are referring to the case where you want to build a single program image that can be loaded and run on each of the 3 cores. By using the local L2 addresses starting at 0x00800000, every core will be able to run its own copy of the same program without impacting the data or program on another core.

This is not what you would refer to as "shared memory", but from the system point-of-view this is a way to share the same program and data addresses on each of the cores.

And the comment that this is "a single image ... in memory" means that the same image is copied to all three cores. So there are three identical copies of a single image.
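As a rough illustration of the mapping quoted from the data sheet, here is a minimal sketch that converts a core-local L2 alias into the corresponding global address. The function name is hypothetical, and it assumes the address passed in lies in the local L2 alias range starting at 0x00800000.

```c
#include <stdint.h>

/* Convert a core-local L2 alias (for example 0x00800000) into the global
 * address seen by other masters, following the data sheet's examples:
 * core 0 -> 0x10800000, core 1 -> 0x11800000, core 2 -> 0x12800000.
 * Hypothetical helper; assumes local_addr is in the local L2 alias range. */
uint32_t l2_local_to_global(uint32_t local_addr, unsigned core)
{
    return 0x10000000u | ((uint32_t)core << 24) | local_addr;
}
```

A pointer that one core hands to another core (or to a DMA master) would need this kind of translation, because the local alias means something different on each core.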

Since this basic point is a confusing one, I would like to recommend that you take a look at some of the online training material we have available. In the Training section of TI.com, there is a training video set for the C6474 at http://focus.ti.com/docs/training/catalog/events/event.jhtml?sku=OLT110002 .

Also, if your questions are really related to the use of the smmqt software, then you may want to summarize your question(s) in a new posting in the Embedded Software -> BIOS forum. I see you have already posted there, so you know about it. It is more likely to be seen by an LLD expert than it is here. And you would not have to put up with my details on hardware initialization, etc.

0 Eddie3909 over 15 years ago in reply to RandyP

Guru 10385 points

Hi Randy

RandyP said:
1. The example code should work as-is without having to do all this debug that you are doing. Why did it not work right out of the box? What changes did you do to it in the first place to get where you are now?

Fortunately, it worked "out of the box" for CCSv3. CCSv4 was another story. The import wasn't smooth at all (see http://e2e.ti.com/support/development_tools/code_composer_studio/f/81/p/35543/124517.aspx#124517).

In the example, if I stop my debug session while the boss has the semaphore, I can't run the example again until I do a HW reset.

Thanks for the sem reset SW, the training video and pointing me to the BIOS group.

Cheers

0 Kishore over 15 years ago in reply to Eddie3909

TI__Mastermind 39560 points

Hi,

I do not know if this question is trivial, but I thought I would go ahead.

I am trying to synchronize the 3 cores, but I am not using semaphores; instead I am using shared memory.

I am using three memory locations (shared volatile pointer addresses, I mean), one for each of the cores to set to 1.

So, each core will be waiting until all three locations are 1.

The logic seems fine thus far. But when I debug in CCS, I find the following:

Suppose core 1 is run first; the three locations' values are (1,0,0).

Now if core 2 is run, the value becomes (0,1,0) instead of (1,1,0).

Is this because core 2 starts executing from the beginning and initialises everything to 0 beforehand?

Is this approach wrong?

Kindly let me know your opinion.

Thanks!

0 Kishore over 15 years ago in reply to Kishore

TI__Mastermind 39560 points

Apologies.

I got the answer...

Thanks

0 RandyP over 15 years ago in reply to Kishore

TI__Guru* 84110 points

M. Faraday, please post your solution for the forum to have. The solution I have used is

volatile int *pCommonMem = (int *)0x80000000; // or whatever address you want, need three words in a row

    // sync all three cores to be starting at the same time
    for ( i = 0; i < 3; i++ ) pCommonMem[i] = 0; // clear 3 words to make sure they're all 0
    while ( pCommonMem[0]+pCommonMem[1]+pCommonMem[2] != 3 ) // wait until all 3 are set to 1
        if ( pCommonMem[CSL_chipReadReg( CSL_CHIP_DNUM )] == 0 ) // if this core's is 0, set to 1
            pCommonMem[CSL_chipReadReg( CSL_CHIP_DNUM )] = 1;

The repeated calls to CSL_chipReadReg() could be optimized a bit, but the optimizer may take care of that anyway. The time to go through the while-loop is the remaining worst-case uncertainty between the three cores.

The volatile memory needs to be non-cached at this point. If you have simple code to force that, it would be nice to show here, too.
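For what it is worth, here is a minimal sketch of the same loop with the core number read once and passed in, written as a single polling pass so it can also be exercised on a host. The name sync_poll and the NUM_CORES macro are hypothetical, and the sketch still assumes the flag words sit in shared, non-cached memory.

```c
#define NUM_CORES 3

/* One polling pass for the calling core. `flags` points at NUM_CORES
 * words of shared, non-cached memory; `core` is the DNUM value, read
 * once by the caller. Returns 1 once every core has set its flag. */
int sync_poll(volatile int *flags, unsigned core)
{
    int sum = 0;
    unsigned i;

    if (flags[core] == 0)   /* another core may have cleared our flag */
        flags[core] = 1;

    for (i = 0; i < NUM_CORES; i++)
        sum += flags[i];

    return sum == NUM_CORES;
}
```

On the target, each core would clear the three words as above, read CSL_chipReadReg( CSL_CHIP_DNUM ) into a local once, and then spin until sync_poll() returns 1.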

0 Kishore over 15 years ago in reply to RandyP

TI__Mastermind 39560 points

Hello Randy,

I am also using a code snippet similar to yours.

Here it is.

volatile unsigned int *ptr_CoreA = (unsigned int *)0x88000000; //shared memory
volatile unsigned int *ptr_CoreB = (unsigned int *)0x88000004;
volatile unsigned int *ptr_CoreC = (unsigned int *)0x88000008;

inside function

    if( coreID == 0)      // coreID here is read from DNUM
        *ptr_CoreA = 1;
    else if (coreID == 1)
        *ptr_CoreB = 1;
    else{
        *ptr_CoreC = 1;
    }

for(;;)
{
    if( (*ptr_CoreA == 1) & (*ptr_CoreB == 1) & (*ptr_CoreC == 1) )
    break ;
}

The only thing is I am not initialising all 3 locations to 0 beforehand. I know this is not good practice. But it works!

Thanks!

0 RandyP over 15 years ago in reply to Kishore

TI__Guru* 84110 points

Since I do not know who you are, I cannot send this privately, so please forgive the critique. Without initializing all 3 locations to 0 beforehand, your solution does *not* work. Consider the case where you load all three cores and run them past this point. If you then halt, edit the code, reload and run again - all 3 locations are still 1 so each core will pass out of the for-loop without synchronization.

It is more common practice to use logical && rather than bit-wise & in the if-statement, is it not? It might be interesting to see how the optimizer handles the two choices differently.
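A minimal host-side sketch of the difference at the language level, using a hypothetical read counter (flag_reads) and helper names that are not from the code above. It says nothing about what the optimizer generates; it only shows that && stops at the first false comparison while & evaluates all three.

```c
int flag_reads = 0;  /* counts how many flags were actually read */

/* Reads one flag and records that the read happened. */
static int flag_is_set(const volatile unsigned int *flag)
{
    flag_reads++;
    return *flag == 1;
}

/* Bit-wise AND: all three comparisons are always evaluated. */
int all_set_bitwise(const volatile unsigned int *a,
                    const volatile unsigned int *b,
                    const volatile unsigned int *c)
{
    return flag_is_set(a) & flag_is_set(b) & flag_is_set(c);
}

/* Logical AND: evaluation stops at the first comparison that is false. */
int all_set_logical(const volatile unsigned int *a,
                    const volatile unsigned int *b,
                    const volatile unsigned int *c)
{
    return flag_is_set(a) && flag_is_set(b) && flag_is_set(c);
}
```

Both forms give the same answer here because each comparison yields 0 or 1; the difference is only in how many volatile reads happen per pass.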

0 Kishore over 15 years ago in reply to RandyP

TI__Mastermind 39560 points

Hello Randy,

Thanks for the critique! It only helps. :)

Well, I had not considered this case. And yes, I should have used the logical AND. I don't know how I missed it; my bad.

But yes, my code only works if it is run once. I will make the corrections.

Best Regards,

Kishore Ramaiah. :)
