
How can I use NDK + SRIO together?

Hello,

I'm thinking about how I can combine NDK + SRIO on the C6678.
For example, I'm assuming a use case where SRIO processing runs on cores 0-3 and NDK runs on another core, say core4.
Today I tried running the NDK example (helloworld from MCSDK) and the SRIO example (SRIO_LoopbackTestProject from the PDK package) separately, and they basically work on the EVM... but I had trouble running helloworld on core4; it worked only on core0.

Once NDK helloworld works on core4, I will need to consider the next step: how to run NDK + SRIO together on the C6678. I would appreciate your ideas on this.

These examples use the common LLDs for QMSS and CPPI, so I'm wondering whether I have to merge them into a single executable so that each core shares the HW (QMSS and CPPI) resources correctly.
Should I go this way, or do you have any good ideas for using a separate executable on each core?

Regards,
Kawada 

  • Kawada,

    I thought from another thread (http://e2e.ti.com/support/embedded/bios/f/355/p/291416/1016785.aspx#1016785) that you had already gotten the NDK client working on cores other than core0 on the 6670 DSP. The changes to the client should also help the "hello world" example.

    The next step, making SRIO + NDK work together, seems tougher. You can refer to http://e2e.ti.com/support/dsp/c6000_multi-core_dsps/f/639/t/198711.aspx.

    Regards, Eric

  • Hi Eric, 

    Thank you for your reply.

    Well, I decided to run *separate* executables on each core, with SRIO on core0/1 and NDK on core3 of the C6670 EVM.

    The current status is:
    - SRIO: SRIO_loopbackProject works between core0 and core1 only when NDK is not running on core3.
    - NDK: the client sample works only when SRIO is not running on core0/1.

    So I must have a problem such as a HW resource conflict...

    What I did for these projects:

    - SRIO_loopbackProject

    1. Use Qmss_MemRegion_MEMORY_REGION19 in test_main.c, because NDK does not appear to use this region.
    2. Use HW_SEM 4, 5, and 6 to protect critical sections in test_osal.c; NDK uses HW_SEM 0, 1, and 3 by default.
    3. Do not load the PDSP firmware in Qmss_init (qmssInitConfig.pdspFirmware[0].firmware = NULL; qmssInitConfig.pdspFirmware[0].size = 0; in test_main.c). NDK will be responsible for this (see the sketch after this list).
    4. Change the .cfg file to work only on core0 and core1 (MultiProc.numProcessors = 2; MultiProc.setConfig(null, ["CORE0", "CORE1"]);).
    5. Change the srio_EXAMPLE_NUM_CORES definition from 4 to 2.
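
    For item 3, a minimal sketch of the SRIO-side Qmss_init() call, assuming the Qmss_InitCfg fields from the QMSS LLD; NUM_HOST_DESC and the linking RAM values are placeholders, not the project's actual settings:

    ========================

    /* Sketch only: skip the PDSP firmware download on the SRIO side.
     * NDK on core3 loads the firmware instead. */
    Qmss_InitCfg qmssInitConfig;

    memset (&qmssInitConfig, 0, sizeof (Qmss_InitCfg));
    qmssInitConfig.linkingRAM0Base = 0;              /* 0 = use internal linking RAM */
    qmssInitConfig.linkingRAM0Size = 0;
    qmssInitConfig.linkingRAM1Base = 0;
    qmssInitConfig.maxDescNum      = NUM_HOST_DESC;  /* placeholder */

    /* Leave the PDSP entry empty so this core never touches the firmware */
    qmssInitConfig.pdspFirmware[0].pdspId   = Qmss_PdspId_PDSP1;
    qmssInitConfig.pdspFirmware[0].firmware = NULL;
    qmssInitConfig.pdspFirmware[0].size     = 0;

    /* qmssGblCfgParams comes from the device-specific QMSS configuration */
    if (Qmss_init (&qmssInitConfig, &qmssGblCfgParams) != QMSS_SOK)
        return -1;

    ========================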

     - NDK Client sample project

    1. Apply the NIMU patch so it works on a core other than core0.

    Experiments:
    • I started NDK first on core3 and put a breakpoint after Qmss_init(); at that point the PDSP firmware had been loaded.
      I then started SRIO on core0/1, and everything worked fine.
    • I started NDK first and confirmed it worked fine on core3. I then started SRIO on core0/1; the program did not reach the end successfully, and NDK also stopped working.
    • I started NDK first and confirmed it worked fine on core3. I then started SRIO on core0/1 and put a breakpoint in Qmss_insertMemoryRegion(). At this point, NDK was still alive. I stepped into the code and found that NDK stopped working when the following code was executed on core0.

      qmssLObj.qmDescReg->MEMORY_REGION_BASE_ADDRESS_GROUP[index].MEMORY_REGION_BASE_ADDRESS_REG = (uint32_t)descBasePhy;

    So, my question is:
    Why can NDK stop working when another core touches MEMORY_REGION_BASE_ADDRESS_REG?
    Please note that SRIO is now using Qmss_MemRegion_MEMORY_REGION19.
    Best Regards,
    Kawada 
  • How do you avoid memory conflicts? I wonder where qmssLObj is: in L2 SRAM, MSMC, or DDR?

    - Eric

  • Hi Eric,

    Thank you for your reply.
    As you pointed out, the NDK and SRIO projects can cause memory conflicts when using the default c6670evm platform; they appear to overlap in their MSMC memory usage. I'll fix it by creating/applying a customized c6670 platform for NDK and SRIO.
    (But I'm wondering why I could not see an exception error or system crash on NDK or SRIO... When I halted the cores in CCS, each core looked alive...)

    Now I have other very urgent issues to solve on another project/processor...
    I'll be back once those issues have been solved.

    Best Regards,
    Kawada 

  • Hello Eric,

    I came back here.
    I changed the memory map of the NDK client project, and it now looks like there is no conflict in memory usage between the NDK client and SRIO projects. But I still have the same problem: when the NDK client project is running on core3 and the SRIO loopback project (core0) writes to MEMORY_REGION_BASE_ADDRESS_GROUP[19].MEMORY_REGION_BASE_ADDRESS_REG in Qmss_insertMemoryRegion(), NDK can hang; core3 looks alive, but I can't access the web server from my PC. I will summarize the issue below:

    • Platform : c6670 evm
    • SDKs:
      • mcsdk_2_01_02_06 (patched)
      • pdk_C6670_1_1_2_6
      • the NIMU patch discussed before
    • Projects
      • C:\ti\pdk_C6670_1_1_2_6\packages\ti\drv\exampleProjects\SRIO_LoopbackTestProject on core0/1
      • C:\ti\mcsdk_2_01_02_06\examples\ndk\client\evmc6670l on core3
    • Project changes from the default projects, and their status
      • SRIO_LoopbackTestProject: works on core0/1 if the NDK client is not running on core3
        • Uses a specific memory region (region 19); note that the NDK client uses region 0.
        • The general-purpose queue IDs used by the project are core0 = 1000~ and core1 = 2000~ (assuming NDK uses lower queue IDs). QMSS_PARAM_NOT_SPECIFIED is not used for queue selection in the project. (To be smarter I could use RM for the LLDs, but I'm keeping things simple.)
        • HW semaphores changed in test_osal.c so they do not conflict with NDK's (see the sketch after this list).
        • PDSP firmware is not loaded (an option for concurrent NDK operation on another core; the NDK client project does that instead. This can be disabled for the standalone test.)
        • No QMSS HW setup, such as linking RAMs (an option for concurrent NDK operation on another core; the NDK client project does that instead. This can be disabled for the standalone test.)
           
      • NDK client project: works on core3 if SRIO_LoopbackTestProject is not running on core0/1
        • The NIMU patches have been applied so it can run on cores other than core0.
        • Minor changes in client.c to work on cores other than core0.
        • NDK_PACKETMEM is moved from MSMC to external RAM in client.cfg. The NDK client project now runs only from local L2RAM and external RAM (not using MSMC), while the SRIO loopback project runs from local L2RAM and MSMC (not using external RAM).
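
    A minimal sketch of the HW semaphore change, assuming the CSL semaphore API (csl_semAux.h) and following the usual Osal_qmssCsEnter/Exit pattern from test_osal.c; the indices are the ones mentioned above:

    ========================

    /* Sketch only: claim HW_SEM 4/5/6 on the SRIO side so the critical
     * sections cannot collide with NDK's HW_SEM 0, 1, and 3. */
    #include <ti/csl/csl_semAux.h>

    #define QMSS_HW_SEM   4
    #define CPPI_HW_SEM   5
    #define SRIO_HW_SEM   6

    void* Osal_qmssCsEnter (void)
    {
        /* Spin until the hardware semaphore is acquired */
        while ((CSL_semAcquireDirect (QMSS_HW_SEM)) == 0);
        return NULL;
    }

    void Osal_qmssCsExit (void *CsHandle)
    {
        CSL_semReleaseSemaphore (QMSS_HW_SEM);
    }

    ========================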
           
    • Sequence to reproduce the issue:
      1. Run the default GEL file on core0 for basic HW initialization.
      2. Load the SRIO executables on both core0 and core1.
      3. Load the NDK executable on core3.
      4. Put a breakpoint at Qmss_insertMemoryRegion() in test_main.c (SRIO_LoopbackTestProject) on core0.
      5. Run the NDK executable on core3. The web server is now running, and you can access the web pages from your PC.
        (Note that I'm using a fixed IP address; change it appropriately for your environment.)
      6. Run the SRIO executables on core0/1. Core0 will now break at Qmss_insertMemoryRegion(). Note that the NDK client project is still working correctly.
      7. Step into Qmss_insertMemoryRegion(). You will find the line qmssLObj.qmDescReg->MEMORY_REGION_BASE_ADDRESS_GROUP[index].MEMORY_REGION_BASE_ADDRESS_REG = (uint32_t)descBasePhy; and if you execute it, the NDK web server stops working...

    I'm attaching the related code for your reference.

    • Zipped Loopback from C:\ti\pdk_C6670_1_1_2_6\packages\ti\drv\srio\test. To step into Qmss_insertMemoryRegion(), you need to add/link qmss_drv.c to the project and configure the required include paths.
    • Zipped client from C:\ti\mcsdk_2_01_02_06\examples\ndk

    Can you reproduce the issue at your end?

    Best Regards,
    Kawada 

     ~~~~~~~~~~

    Other references:

    • Uncomment the MRB_PDSP_LOAD definition in test_main.c (SRIO_LoopbackTestProject) for the standalone test.
    • Comment out the MRB_QUEUE_SELFMANAGEMENT definition in test_main.c (SRIO_LoopbackTestProject) to disable the use of specific queue IDs.
    • Change the MRB_QUEUE_MEMREGION definition in test_main.c (SRIO_LoopbackTestProject) to select a specific memory region; 19 is the default.
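
    A sketch of how these switches might look at the top of test_main.c; the names are taken from the list above, and the exact form in the attached project may differ:

    ========================

    /* Sketch only: build-time switches as described above */
    #define MRB_PDSP_LOAD                     /* uncomment for the standalone test      */
    #define MRB_QUEUE_SELFMANAGEMENT          /* comment out to let QMSS pick queue ids */
    #define MRB_QUEUE_MEMREGION         19    /* memory region used by the SRIO test    */

    ========================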

    5270.Loopback.zip

    0675.client.zip

  • Kawada san,

    Let me post the answers here again so we can share them on E2E.

    As we already discussed by email, when we allocate descriptor memory regions to QMSS, the base addresses of the entries must be in ascending order on KeyStone 1. (I heard KeyStone 2 doesn't have this limitation.) You can find this in the Multicore Navigator User Guide by searching for the keyword "ascending".

    In addition, my understanding is that we must not have an empty entry in the middle of multiple memory region entries. For example, when we have regions #0, #1, and #3, entry #2 should be configured too. (I didn't find this in the User Guide...)
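
    For illustration, a minimal sketch of inserting two regions in ascending base-address order with the QMSS LLD's Qmss_insertMemoryRegion(); the buffers, sizes, and indices are placeholders, not values from the projects:

    ========================

    /* Sketch only: region 0 must sit at a lower global address than
     * region 1 on KeyStone 1, and descriptor index ranges must not
     * overlap. */
    Qmss_MemRegInfo memInfo;

    /* Region 0: the LOWER global base address */
    memInfo.descBase       = (uint32_t *) lowAddrDescPool;   /* placeholder */
    memInfo.descSize       = 64;                             /* bytes per descriptor */
    memInfo.descNum        = 32;                             /* multiple of 32 */
    memInfo.manageDescFlag = Qmss_ManageDesc_MANAGE_DESCRIPTOR;
    memInfo.memRegion      = Qmss_MemRegion_MEMORY_REGION0;
    memInfo.startIndex     = 0;
    if (Qmss_insertMemoryRegion (&memInfo) < QMSS_SOK)
        return -1;

    /* Region 1: a HIGHER global base address, indices after region 0 */
    memInfo.descBase       = (uint32_t *) highAddrDescPool;  /* placeholder */
    memInfo.memRegion      = Qmss_MemRegion_MEMORY_REGION1;
    memInfo.startIndex     = 32;
    if (Qmss_insertMemoryRegion (&memInfo) < QMSS_SOK)
        return -1;

    ========================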

    I hope some expert will confirm the above.

    Regards,
    Atsushi

  • Hi Yokoyama-san,

    Thank you for sharing.
    Following your suggestions, the NDK hang issue has been solved, but now I have another issue with NDK + SRIO. I need to analyze it further.

    I'll be back here once I get some updates on this.

    Best Regards,
    Kawada 

  • Well... it looks like I still have a problem in Qmss_insertMemoryRegion().

    I have changed the test code to initialize QMSS from core0.
    I think this is a more generic use case on this platform.
    So the system start-up sequence now looks like this:

    • Load & run the SRIO loopback executable on core0 and core1. To keep the test code simple, I have changed it to run only test_rawSockets(). Core0 will now keep running with correct logs on the CCS console. Note that I have changed the test code to use specific queue IDs for SRIO, as I mentioned before.
    • Load the NDK client executable, put a breakpoint on Qmss_insertMemoryRegion(), and then run core3. It will not initialize the QMSS HW, but the rest of the code should be almost the same as the default sample code bundled with MCSDK.
      Core3 will now break at Qmss_insertMemoryRegion(). You can step into the Qmss code. Note that SRIO on core0/1 keeps running correctly until the following line:

      qmssLObj.qmDescReg->MEMORY_REGION_BASE_ADDRESS_GROUP[index].MEMORY_REGION_BASE_ADDRESS_REG = (uint32_t)descBasePhy;

      SRIO then suddenly stops running (stops logging on the CCS console) once you execute this line on core3.
      At this point, SRIO seems to fail to receive data from the queue...

    • Once you confirm the above SRIO hang, run NDK again from that point. NDK will start working correctly.

    Please note that I've changed the QMSS memory region usage on the cores (per the QMSS restrictions Yokoyama-san explained).

    • core0 (SRIO): QMSS region 0
    • core3 (NDK): QMSS region 1
    • Note that the memories inserted into the regions are mapped in local L2RAM. They are converted to global addresses before insertion (a conversion sketch follows this list), so the inserted memory of region 0 is always at a lower address than that of region 1.
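
    For reference, this is roughly how a local L2 address is converted to a global address on C66x KeyStone 1 devices; the helper follows the common PDK example pattern, so treat it as a sketch rather than the project's exact code:

    ========================

    #include <ti/csl/csl_chipAux.h>   /* CSL_chipReadReg, CSL_CHIP_DNUM */

    /* Sketch: convert a local L2 address (0x0080xxxx) to its global alias.
     * Core n's L2 appears globally at 0x1n80xxxx, so core0 yields 0x1080xxxx
     * and core3 yields 0x1380xxxx, which is why region 0 (core0) naturally
     * sits below region 1 (core3). */
    static uint32_t l2_global_address (uint32_t localAddr)
    {
        uint32_t coreNum = CSL_chipReadReg (CSL_CHIP_DNUM);  /* this core's id */
        return localAddr + 0x10000000 + (coreNum * 0x1000000);
    }

    ========================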

    I have a complete project set to reproduce the issue at your end.
    I've attached it here:

    3348.ndk_srio_test.zip

    You can import the projects and build them. If you follow the sequence above, I believe you can reproduce the issue easily. I want to know why this issue happens... Could you please check this?

    Lastly, the environment:

    Best Regards,
    Kawada 

  • I found that core0 (the SRIO loopback project, using region0) was getting wrong memory from the queue just after core3 (the NDK client project) inserted its memory into QMSS region1. Please refer to this slide.

    5432.Memory leaking in QMSS.pptx

    I'm wondering if this might be a bug in QMSS...

    Regards,
    Kawada 

  • Kawada san,

    I saw the slides.

    First, we need to isolate whether the problem comes from the QMSS hardware or from the QMSS LLD (driver).

    Second, you said the SRIO sample code got a wrong memory address. Did you find which API provided the wrong memory address to the SRIO demo code?

    Regards,
    Atsushi

  • Hi Yokoyama san,

    Thank you for the reply.

    You will find the following code in test_main.c (from SRIO_loopbackTestProject):

    ======================== 

    for (count = 0; count < 10; count++)
    {
        /* Pop off a descriptor */
        ptrHostDesc = (Cppi_HostDesc *)Qmss_queuePop(srioTempQueue);
        if (ptrHostDesc == NULL)
            return -1;

    ======================== 

    This code runs on core0. Also, please remember that core0 is now using region0.
    In the working case, Qmss_queuePop() returns memory (ptrHostDesc) at 0x1083dxxx, and the address returned from Qmss_queuePop() is printed like this:

    Debug(Core 0): Raw Send 0x@1083dd50 Data Size: 152 Iteration 1 passed.

    You can see the same in the slide.
    Note that core0 inserted the memory (@0x1083dxxx) into region 0 during the initialization phase.
    So, getting these addresses via Qmss_queuePop() is always expected.

    But while core0 is running with the above correct logging, once core3 writes the region base address for region1 (via Qmss_insertMemoryRegion), Qmss_queuePop() starts to return memory around 0x1380xxxx. This address range is the global address of L2RAM on core3, and it is exactly the memory inserted by core3 via Qmss_insertMemoryRegion.

    Qmss_queuePop() does not sit on a complicated LLD stack; it is an inline function like this:

    static inline void* Qmss_queuePop (Qmss_QueueHnd hnd)
    {
        return Qmss_osalConvertDescPhyToVirt ((void *) (qmssLObj.qmQueMgmtReg->QUEUE_MGMT_GROUP[hnd].QUEUE_REG_D));
    }

    As you can see, it just reads the QUEUE_REG_D register of the queue dedicated to SRIO (region 0).

    So, the problem is: core0 (using region0) can get memory belonging to core3 (region1) via QMSS.

    Best Regards,
    Kawada 

  • Is anybody looking into this?
    As I mentioned before, the problem is that core0 (using region0) can get memory belonging to core3 (region1) via QMSS. I believe this is NOT an NDK/SRIO issue but comes from the QMSS HW (a bug or some QMSS restriction).

    Best Regards,
    Kawada 

  • I confirmed that NDK + SRIO can work if they use the same QMSS memory region, say region 0. To achieve this, I had to use some tricks (sketched after this list):

    1. The SRIO loopback app (core0) allocates enough memory (for both SRIO and NDK) and initializes QMSS memory region 0 with it via Qmss_insertMemoryRegion().
    2. The SRIO loopback app (core0) passes the queue information to the NDK client side so it can share the memory belonging to region 0, and then waits for the NDK initialization to complete.
    3. The NDK client app gets the queue ID from the SRIO side. NDK (core3) does not call Qmss_insertMemoryRegion() for region 0, but hacks the internal QMSS LLD structure and registers the passed queue ID. After some other initialization, the NDK client app notifies the SRIO loopback app of completion.
    4. The SRIO loopback app and NDK start their operation.
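
    A rough sketch of the handoff in steps 2 and 3; the shared flag structure, its fixed MSMC address, and the helper names are my assumptions for illustration, not the actual project code (the real code must also handle cache coherence if this area is cached):

    ========================

    #include <stdint.h>

    /* Sketch only: a tiny handshake area at a fixed MSMC address known to
     * both executables (placeholder address). */
    typedef struct
    {
        volatile uint32_t srioReady;   /* set by core0 after region 0 is inserted  */
        volatile uint32_t ndkReady;    /* set by core3 after NDK init completes    */
        volatile uint32_t freeQ;       /* queue number carrying NDK's descriptors */
    } SharedHandoff;

    #define HANDOFF ((SharedHandoff *) 0x0C3FFF00)   /* placeholder */

    /* core0 side, after Qmss_insertMemoryRegion() for region 0 */
    static void srio_publish_and_wait (uint32_t ndkFreeQueueNum)
    {
        HANDOFF->freeQ     = ndkFreeQueueNum;
        HANDOFF->srioReady = 1;
        while (HANDOFF->ndkReady == 0);   /* wait for NDK initialization */
    }

    /* core3 side, before starting the NDK stack */
    static uint32_t ndk_wait_and_fetch (void)
    {
        while (HANDOFF->srioReady == 0);  /* wait for region 0 to be ready */
        /* ...register HANDOFF->freeQ inside the QMSS LLD structures... */
        HANDOFF->ndkReady = 1;
        return HANDOFF->freeQ;
    }

    ========================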
    As you can see, this is not preferable from a software design point of view, but I had to follow this approach to share the same region among cores using separate executables. If NDK could use a region other than region 0, say region 1, that would be best, but as you know, I can't do that because of the memory leaking issue in QMSS.
    I regret that you could not suggest a solution for that.
    Regards,
    Kawada