OpenEM problem

ruijie yang

Hi everyone,

I'm trying to add the OpenEM demo Example_0 to my project. I simply added most of its source files to my project and modified the cfg file and platform file accordingly.

I run two projects on two cores; the OpenEM parts of the two projects are identical.

Core 0 hangs in ti_em_init_global:

/* waits for the RP event */
cdEventDscPtr = NULL;
while (cdEventDscPtr == NULL)
    cdEventDscPtr = (volatile Emti_CdEventDsc_s*)Emti_popEvent(Emti_svDspMcb.rpQueueIdx);

The return value of Emti_popEvent stays NULL.

I checked the return value when running Example_0 and found it is 0x340B8010, which is in the PDSP1D section.

Also, I traced the variables and noticed a suspicious difference: in Example_0, the value of qmssGblCfgParams->pdspcmdreg[0] changes to 1588 after the line I marked in red below is executed. In my project, pdspcmdreg[0] always stays 0, even though lvPrivateEventPtr is correct (0x340B8000, same as in Example_0) and Emti_svDspMcb.cdQueueIdx is 1588. pdspcmdreg[0] simply never changes its value.

/****************************************************************
******************** Command (CD) Queue ***********************
****************************************************************/
{
    lvRetStatus = ti_em_hw_queue_open(queueNb);
    if (lvRetStatus != EM_OK)
        return EM_ERR_ALLOC_FAILED;

    /* store cd queue idx in DSP SW context */
    Emti_svDspMcb.cdQueueIdx = (uint16_t) queueNb++;
    Qmss_queueEmpty(Emti_svDspMcb.cdQueueIdx);

    /* retrieve private memory region base address */
    lvPrivateEventPtr = Emti_getMemoryRegionBaseAddress(Emti_svDspMcb.cfg.private_region_idx);

    /* store cd queue idx in PDSP SW context */
    *(uint32_t*)(lvPrivateEventPtr) = (uint32_t) Emti_svDspMcb.cdQueueIdx;

    /* move to next private descriptor */
    lvPrivateEventPtr += TI_EM_PRIVATE_EVENT_DSC_SIZE;

    /* Debug Message: */
    Emti_logDbg("Command Queue 0x%x \n", Emti_svDspMcb.cdQueueIdx);
}

So I wonder whether this is a PDSP problem or something else. I set up the PDSP exactly as in Example_0.

What is the problem here?

Thank you!

over 9 years ago

0 ruijie yang over 9 years ago

Intellectual 480 points

Emti_popEvent(queueIdx)
{
    Emti_gvQmAddr = (void*)CSL_QM_SS_CFG_QUEUE_DEQUEUE_REGS;
    return (void*)Emti_gvQmAddr[queueIdx].QUEUE_REG_D;
}

The queueIdx I get in my project is 12308, the same as in Example_0, but I don't know why it's bigger than 8192.

CSL_QM_SS_CFG_QUEUE_DEQUEUE_REGS is defined in qmssGblCfgParams as qmQueMgmtReg. It contains an array of 8192 entries, and the REG_D of every entry is set to 129.

The 12308 is a little weird, but the demo shows it should be that value, and the registers are defined normally. So I can't figure out what the problem is.

I even tried forcing the value of cdEventDscPtr in the expression window, and setting core register A0 (because the correct cdEventDscPtr is held in A0). That does break out of the while loop, but the code then behaves strangely and eventually ends up somewhere unknown.

0 Raja over 9 years ago in reply to ruijie yang

TI__Guru* 81335 points

Hi,

Please provide the information below so that we can help you.

1. DSP Part number.

2. MCSDK package and its version.

3. Any other dependent packages used.

Thank you.

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

Can you post your project files here so that I can import them and try to replicate your issue?

You are able to import Example 0 from the MCSDK and run the example project to completion, correct?

Jason Reeder

0 ruijie yang over 9 years ago in reply to Raja

Intellectual 480 points

1. I'm working on the 6678.

2. I'm using MCSDK 2.1.2.6 and OpenEM 1.0.0.2.

3. In my original project I used BIOS, IPC, PDK, and NDK, which I have now unchecked. My project uses MessageQ and SRIO, which I guess may conflict when configuring and using QMSS and CPPI. But for now I do not run the part that sets up QMSS and CPPI.

Actually, I did not let a single line of code from my original project execute. I just changed my original main to a normal function and start only from the OpenEM main, which is exactly the same as in Example_0. In effect, I am running a pure Example_0 inside my project.

And the platform file is almost the same as Example_0's.

0 ruijie yang over 9 years ago in reply to Jason Reeder

Intellectual 480 points

Jason,

Example_0 runs completely correctly.

The problem I posted above is solved now. The master core no longer loops forever in init_global.

But a new problem came up.

As I said, I have two projects that are pretty much the same. I loaded them on core 0 and core 1.

Now core 1 runs dispatch_once only once, and at the end of dispatch_once it jumps somewhere with no symbols defined, while core 0 keeps calling dispatch_once but svRunningJobNum never decreases. Sometimes svRunningJobNum somehow becomes zero on core 0 and it goes on to statJob(), but the values printed to the console are nearly all 0.

If I load the same project on both core 0 and core 1, both cores loop in dispatch_once seemingly forever. Both cores return with an error on the first line of dispatch_once, where they try to pop an event from a pf or sd queue and get NULL. As a result, svRunningJobNum is never decremented.

svRunningJobNum is located in MSMCSRAM_NC, in section my_svL1Mem, which begins at 0x88000000. I got a warning saying my_svL1Mem should have a STATIC_BASE. I don't really know what that means, but I guess it's not the main problem since I just followed Example_0. However, I noticed that the linker.cmd of Example_0 contains my_svL1Mem.1 and my_svL1Mem.2. Where do these come from? In my project there is just my_svL1Mem.

Anyway, I checked that svRunningJobNum is shared between cores, so this variable should not be the problem.

My main question is why dispatch fails.

Thank you.

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

I'm glad you were able to get through your first issue.

I understand the symptoms of your new problem but without more information I cannot tell the root cause. Can you provide your project files or give me instructions on how to modify Example0 from the MCSDK in order to replicate this issue so that I can debug it further?

Jason Reeder

0 ruijie yang over 9 years ago in reply to Jason Reeder

Intellectual 480 points

Hi Jason,

Thank you, and sorry for replying this late.

My second problem is fixed now. It turned out that I had forgotten two settings in the project properties. It was my own carelessness.

Now I have a new problem and still need your help. I hope you are still following this thread.

Sorry, I can't post my project files here, but you can replicate this problem with just the original demo.

Here's the problem:

1. I made a copy of Example_0 and renamed it Example_1. I changed the core number to #define MY_EM_CORE_NUM (2).

2. I loaded Example_0 on core 0 and Example_1 on core 1. It ran correctly.

3. I created two new platform files, CORE0_PLATFORM and CORE1_PLATFORM. I kept all the segments of ti.runtime.openem.examples.platforms.c6678_Example_0 in the new platforms and added one extra segment, DDR_SRAM, to both.

In CORE0_PLATFORM, DDR_SRAM starts at 0x90000000 and its size is 0x01000000.

In CORE1_PLATFORM, DDR_SRAM starts at 0x92000000 and its size is 0x01000000.

4. I set the Code Memory, Data Memory, and Stack Memory sections to DDR_SRAM.

5. I changed the platforms from ti.runtime.openem.examples.platforms.c6678_Example_0 to CORE0_PLATFORM and CORE1_PLATFORM in the project properties.

6. I rebuilt and loaded Example_0 on core 0 and Example_1 on core 1.

Core 0 runs normally, but core 1 jumps to an address like 0x90xxxxxx. The 0x90xxxxxx range is defined only for core 0. I checked that address in core 0's map file, and it is where rtsc6000_elf.lib is located.

My guess is that core 0 does some global initialization that is stored in its own space but somehow needs to be shared with core 1. Is that right?

Originally, all the memory sections were placed in L2SRAM, and L2SRAM is private to each core. But can a core access another core's L2SRAM through some kind of global address?

I see some memory settings in my_initMemGlobal that I don't fully understand. Do the operations in my_initMemGlobal have something to do with my problem?

Thank you !

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

Can you tell me what you are trying to accomplish by creating separate DDR segments for each core? Once I understand your motivations I may be able to suggest a simpler solution or provide an example to follow.

You asked if each core's L2SRAM memory is accessible by other cores through a global address. The answer is yes: each core has a unique global address for its L2 memory that can be accessed by external masters in the system, including other DSP cores. The internal starting address for the DSP L2 memory is 0x0080_0000. To make a local address global, compute 0x1000_0000 + (corenum * 0x0100_0000) + address. This sounds more complicated than it is; the table below shows the result for each core:

DSP Local and Global L2 Starting Addresses
Core Number	Local L2 Start Address	Global L2 Start Address
0	0x00800000	0x10800000
1	0x00800000	0x11800000
2	0x00800000	0x12800000
3	0x00800000	0x13800000
4	0x00800000	0x14800000
5	0x00800000	0x15800000
6	0x00800000	0x16800000
7	0x00800000	0x17800000

There is also a function (my_em_makeAddressGlobal) in example0 that converts local L2 addresses to global L2 addresses that can be found in the my_em_init.c file. The function takes the local core number and the local address and converts it to global.
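
The conversion rule above can be sketched in a few lines of C. This is only an illustration, not the body of my_em_makeAddressGlobal from the example; the name l2_local_to_global and the constants are made up here, and the 512 KB L2 size is assumed for the C6678.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a C6678; names are hypothetical, not from the MCSDK. */
#define L2_LOCAL_BASE  0x00800000u  /* local (aliased) L2 start address   */
#define L2_LOCAL_SIZE  0x00080000u  /* 512 KB of L2 per core (assumption) */
#define GLOBAL_BASE    0x10000000u  /* global address of core 0's space   */
#define GLOBAL_STRIDE  0x01000000u  /* distance between consecutive cores */

/* Convert a local L2 address to its global equivalent for core_num.
 * Addresses outside the local L2 window are returned unchanged. */
uint32_t l2_local_to_global(uint32_t core_num, uint32_t addr)
{
    assert(core_num < 8u);  /* the C6678 has cores 0..7 */

    if (addr < L2_LOCAL_BASE || addr >= L2_LOCAL_BASE + L2_LOCAL_SIZE) {
        return addr;        /* not a local L2 address */
    }
    return GLOBAL_BASE + core_num * GLOBAL_STRIDE + addr;
}
```

For core 1, a local address of 0x00800000 maps to 0x11800000, matching the table row for core 1.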

The two memory init functions (my_initMemLocal and my_initMemGlobal) are basically just setting up the cacheability of certain memory sections. The outcome of these two functions is to set up the memory map like the below map file snippet where NC stands for non-cacheable. (note that in order to make a portion of MSMC memory non-cacheable you need to create an aliased memory location which the example does and is shown below at address 0x88000000).

Jason Reeder

0 ruijie yang over 9 years ago in reply to Jason Reeder

Intellectual 480 points

Thank you Jason,

There are two main reasons why I created separate DDR segments for each core.

1. The memory sections of the project I added Example_0 to are rather large. The .text, .stack, and other sections total more than 512 KB, so they can't fit into L2SRAM and I have to place them off chip. At first I used a single project for both master and slave cores. But because my code, data, and stack were in DDR, all cores shared one copy of them, and conflicts occurred.

2. Also, I'm using a master-slave model in which the master handles control and communication while the slave cores only communicate with the master core and run the algorithms, a bit like the TI image_processing demo. So I need two projects, one for the master and one for the slave.

Here's my platform for core 1. The core 0 platform is almost the same, except that DDR_SRAM starts at 0x90000000 and DDR_SRAM_M at 0x91000000.

Sorry, the picture is quite blurry.

Did you follow the steps in my last post? Did the same thing happen?

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

It is strange that core 1 would try to access part of the rts6600_elf.lib that is located in a portion of memory that it should have no knowledge of. Looking at your map files for both the core 0 build and the core 1 build can you confirm that the .text and .data sections are being linked into the correct memory locations as you would expect? Core 0 should have nothing in its map file that is linked to 0x92000000 - 0x93FFFFFF and core 1 should have nothing in its map file that is linked to 0x90000000 - 0x91FFFFFF.
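
If it helps when reading the map files, the check amounts to a simple range test. Below is a minimal C sketch; the type and function names are hypothetical, and the ranges are the ones I listed above for your two platforms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: a contiguous memory range. */
typedef struct {
    uint32_t base;  /* first address of the range */
    uint32_t size;  /* length of the range in bytes */
} mem_range_t;

/* DDR ranges assumed from the discussion above:
 * core 0: 0x90000000 - 0x91FFFFFF, core 1: 0x92000000 - 0x93FFFFFF. */
const mem_range_t CORE0_DDR = { 0x90000000u, 0x02000000u };
const mem_range_t CORE1_DDR = { 0x92000000u, 0x02000000u };

/* True when addr lies inside the range r. */
bool addr_in_range(uint32_t addr, mem_range_t r)
{
    return addr >= r.base && (addr - r.base) < r.size;
}
```

Any symbol in the core 1 map file, or any program counter value seen on core 1, for which addr_in_range(addr, CORE0_DDR) is true points at the kind of cross-core reference you are describing.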

If it is possible, I would suggest leaving the Stack Memory section in the L2SRAM of each core, since each core will need to modify its stack separately. The L2SRAM stack will also be much faster than a DDR stack. This should only present an issue if you load the same code into multiple cores (cores 1-7 for instance) since you have separate sections for core 0 and core 1.

I have not had time to follow your steps above in the example yet as I am out of the office for the Thanksgiving holiday.

Let me know if this helps,

Jason Reeder

0 ruijie yang over 9 years ago in reply to Jason Reeder

Intellectual 480 points

Hi Jason,

How's your holiday?

I tried placing the stack section in L2SRAM, but it doesn't work. I also put the task stack into L2SRAM.

The console prints E_spOutOfBounds: Task: 0x90738c54 stack error,SP = 0x84060be8.

I increased the sizes of the program stack and the task stack. Then a stack overflow came up.

However, core 1 still jumps to addresses that only core 0 maps. The problem is still there. T T

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

My holiday was very relaxing. Thanks for asking!

I have been able to replicate your issue by copying the project and then using two separate platform files as you described.

I think I have an idea of what is going wrong but I will continue looking into it and report on what I find tomorrow.

Thanks,

Jason Reeder

0 Jason Reeder over 9 years ago in reply to Jason Reeder

TI__Genius 10415 points

Ruijie,

Take a look at the my_em_initQueues() function in the my_em_init.c file. This function is where core 0 goes through and sets up the OpenEM pieces for the rest of the example:

em_eo_create - creates an execution object
em_queue_create - creates an EM queue
em_eo_add_queue - adds a queue to an execution object
em_eo_start - starts the execution object

These four functions are used three separate times in the my_em_initQueues() function to set up three different types of operations for the cores to implement throughout the rest of the example. The three operations are:

processing the FFT data ('Proc EO' and 'Proc Queue'),
checking the results of the FFT ('Sink EO' and 'Sink Queue')
exiting the program ('Exit EO' and 'Exit Queue')

If you look at the function responsible for the creation of the execution objects (em_eo_create) you will see that it takes three function pointers as parameters in Example 0. The three function pointers are:

my_em_eoStartDefault - defined in my_em_init.c
- This function is called only once by one core and is intended to initialize any global state that the execution object may need. In our example no global state is needed, so the only thing my_em_eoStartDefault does is return a success value, EM_OK. This function will be run when em_eo_start is called on the execution object.
my_em_eoStopDefault - defined in my_em_init.c
- Similar to my_em_eoStartDefault, this function is called only once, from one core, after em_eo_stop is called on the execution object. Once again, it is not needed in our example and just returns a success value, EM_OK.
my_processJob, my_sinkJob, or my_em_exitLocal - defined in either my_em_init.c or my_receive.c
- These are the functions where most of the time in OpenEM will be spent. When an event is taken from a queue by the scheduler and assigned to a core this is the function that the core will run to complete that event.

All of that may have been a review for you but it leads me up to where your error is occurring.

Since core 0 is responsible for initializing the execution objects and event queues, the function pointers that it will use when calling em_eo_create will be the pointers that it is aware of for those functions.

When the code, stack, and data were all stored in local L2SRAM of each core this did not pose a problem because each core had a copy of each function that was stored at the exact same local L2SRAM address as the core 0 functions.

Now that you have created separate DDR sections for each core with different platforms (DDR_SRAM) and placed the code sections in different memories, we are seeing a problem. This is because core 0 is using the address that it is aware of for those functions (found in ~0x90000000 in your case). When core 1 asks for an event from the scheduler, the execution object calls the function (my_processJob, for example) using the address of the function in core 0's dedicated DDR memory.

In my recent experience, the example actually ran correctly on both cores and printed out the results to the console despite core 1 running the code from core 0's DDR space. The only reason I knew something was wrong is because when core 1 was finished it exited using a function in the 0x90000000 memory range.

There are multiple ways to address this issue. The way that I chose to address it is to create a common memory section in each core's L2SRAM memory called .my_em_svEmFunctions (using the Example_0.cfg file). I then used the CODE_SECTION pragma to place all of the functions mentioned above (my_em_eoStartDefault, my_em_eoStopDefault, my_processJob, my_sinkJob, and my_em_exitLocal) into the .my_em_svEmFunctions memory section. It is important to make sure that the functions get linked to the same physical address for each core or the example will not work for the reasons mentioned above.

To use the CODE_SECTION pragma use the following line above each of the 5 function definitions replacing my_processJob with the function name:

#pragma CODE_SECTION(my_processJob, ".my_em_svEmFunctions")
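
As a rough sketch of how that looks in a source file, the pragma goes before the definition of each function. The demo_ names, the simplified signatures, and the status type below are stand-ins invented for illustration; the real functions use the OpenEM types and signatures from the example. The compiler guard is only there so the sketch also builds with a non-TI compiler.

```c
#include <stdint.h>

/* Stand-ins for the OpenEM status type and success code (assumptions;
 * the example itself uses the definitions from the OpenEM headers). */
typedef uint32_t demo_status_t;
#define DEMO_OK ((demo_status_t)0)

/* The pragma names the function and the target section, and must appear
 * before the function is declared or defined. */
#ifdef __TI_COMPILER_VERSION__
#pragma CODE_SECTION(demo_eoStartDefault, ".my_em_svEmFunctions")
#endif
demo_status_t demo_eoStartDefault(void)
{
    return DEMO_OK;  /* no global state to set up in this sketch */
}

#ifdef __TI_COMPILER_VERSION__
#pragma CODE_SECTION(demo_eoStopDefault, ".my_em_svEmFunctions")
#endif
demo_status_t demo_eoStopDefault(void)
{
    return DEMO_OK;  /* nothing to tear down in this sketch */
}
```

The section itself still has to be mapped to L2SRAM, and both projects need to place it at the same address.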

An alternative way to address the issue would be to use aliasing to allow each core to use the same logical memory address while actually accessing different segments of the DDR (similar to how the local L2SRAM addresses work for each core). I did not try that approach.

Thanks,

Jason Reeder

0 ruijie yang over 9 years ago in reply to Jason Reeder

Intellectual 480 points

Thank you Jason!!
It worked!
But I have one more question. When I tried to add my own functions to the EO, say func1 and func2, I put them in L2 using CODE_SECTION. When I checked the map files of the two projects, the addresses of func1 were the same but the addresses of func2 were not. The addresses of my_processJob, my_sinkJob, and the EO start and stop functions also match between the two projects. Only func2 differs: 0x00832150 on the master versus 0x00832148 on the slave.
I tried CODE_ALIGN(func, CACHE_L2_LINE), but it didn't change anything. Actually, I am not sure how to pick the right alignment value.
However, when I moved the definition of func2 from its original .c file to receive.c and rebuilt, func2 got the same address in both projects.
How can I make sure that a function links to the same L2 address in both projects?

0 Jason Reeder over 9 years ago in reply to ruijie yang

TI__Genius 10415 points

Ruijie,

To my knowledge there is no pragma for CODE_ALIGN with the C6000 Compiler. Look at the TMS320C6000 Optimizing Compiler User's Guide (http://www.ti.com/lit/ug/spru187v/spru187v.pdf) section 6.9 for the available pragmas that you can use. The CODE_SECTION pragma is discussed in section 6.9.3.

Ultimately the linker will decide where each function ends up in the memory map. Using pragmas, memory definitions, and sections we can force the linker to place certain functions into certain sections. However, since the linker goes through and links each *.obj file, if two functions are in different *.obj files it will be more difficult to force the linker to place them at the same physical addresses in both builds.

My recommendation would be to place all of the functions that will be performed by multiple cores into one file (receive.c in your case) and then use pragmas, sections, and possibly a new memory definition to ensure that receive.obj (.text) will be the first thing (or only thing) linked into that memory section. This should ensure that all functions end up at the same memory location.
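
Since the addresses have to match between builds, it may be worth checking them automatically instead of by eye. The following C sketch compares two tables of function addresses (for example, copied from the two map files, or published by the master in shared memory at startup) and reports the first mismatch. The fn_addr_t type and the first_addr_mismatch name are hypothetical, not part of OpenEM.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry per function that is shared through an execution object. */
typedef struct {
    const char *name;  /* function name, for reporting only           */
    uint32_t    addr;  /* link address from the map file or &function */
} fn_addr_t;

/* Returns the index of the first entry whose address differs between the
 * two tables, or -1 when all n entries match. Both tables are assumed to
 * list the same functions in the same order. */
int first_addr_mismatch(const fn_addr_t *master, const fn_addr_t *slave, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (master[i].addr != slave[i].addr) {
            return (int)i;
        }
    }
    return -1;
}
```

With the addresses you reported, a table holding func2 at 0x00832150 for the master and 0x00832148 for the slave would be flagged at the func2 entry.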

Thanks,

Jason Reeder
