Dedicated Memory Region

FeruzM

Intellectual 740 points

Hi,

How can I allocate memory for a variable so that all cores can access that data when the same code runs on every core?

P.S. The variables would preferably be arrays or pointers.

Thanks,

over 13 years ago

0 twentz over 13 years ago

Intellectual 435 points

There are a couple of ways to do this.

Does the allocation need to be static (global array) or dynamic (malloc)?

Are you using a linker file? Are you using an RTSC cfg file?

Are you using SYS/BIOS? Are you using IPC?

How is your memory set up? Do you already have a region of shared memory? (typically MSMCSRAM or DDR). Is the code running out of core-local L2SRAM for each core?

0 FeruzM over 13 years ago in reply to twentz

Intellectual 740 points

Hi,

Does the allocation need to be static (global array) or dynamic (malloc)?

A: dynamic (malloc)

Are you using a linker file? Are you using an RTSC cfg file?

A: linker file

Are you using SYS/BIOS? Are you using IPC?

A: Neither, but I can use them if needed.

How is your memory set up? Do you already have a region of shared memory? (typically MSMCSRAM or DDR). Is the code running out of core-local L2SRAM for each core?

A: All memory areas are defined in the linker file.

Thanks!

Regards,

Feruz

0 twentz over 13 years ago in reply to FeruzM

Intellectual 435 points

I am having difficulty imagining how you would need to use dynamic memory allocation with data shared between cores. You can either have the situation where:

1) One core allocates a previously-unknown size of memory. Then all cores access this memory. In this case, a large static buffer works just as well and makes things simpler.

2) Each core allocates a chunk of memory from shared memory. The allocated chunks may or may not be shared between cores.

In both cases:

You need to have a small scratch buffer in shared memory so that cores can communicate addresses of the buffers after they have been allocated.
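As a rough illustration of that scratch buffer, here is a minimal sketch in portable C. The names `SharedMailbox`, `mailbox_publish` and `mailbox_try_read` are made up for this example. It assumes the struct itself is placed in shared memory (MSMCSRAM or DDR) through the linker file, and on the real device the writer would also need to write back its cache and the reader would need to invalidate its cache around these accesses, which is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mailbox; on the device this object would live at a fixed
 * location in shared memory that every core links against. */
typedef struct {
    volatile uint32_t  ready;  /* 0 = nothing published yet, 1 = valid */
    volatile uintptr_t addr;   /* address of the allocated buffer */
    volatile uint32_t  size;   /* size of the buffer in bytes */
} SharedMailbox;

/* Called by the core that performed the allocation. */
void mailbox_publish(SharedMailbox *mb, void *buf, uint32_t size)
{
    mb->addr  = (uintptr_t)buf;
    mb->size  = size;
    mb->ready = 1u;            /* set the flag last so readers see valid data */
}

/* Called by the other cores; returns NULL until something is published. */
void *mailbox_try_read(const SharedMailbox *mb, uint32_t *size_out)
{
    if (mb->ready == 0u) {
        return NULL;
    }
    *size_out = mb->size;
    return (void *)mb->addr;
}
```

The other cores would poll `mailbox_try_read` (or wait on some notification) until it returns a non-NULL address.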

In the first case:

Using SYS/BIOS, you can change the default heap instance used by BIOS, so you can still call malloc() without changing your code (I believe this is the same as relocating .sysmem). Alternatively, you can create a SYS/BIOS heap object (HeapBuf, HeapMem, HeapMultiBuf) at a designated memory address and then call Memory_alloc on that heap object.
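To make the idea of "a heap at a designated memory address" concrete, here is a small sketch in plain C. This is not the SYS/BIOS API; `RegionHeap`, `region_init` and `region_alloc` are hypothetical names, and the allocator is a simple bump allocator with no free. It only shows the concept of handing out aligned blocks from a region whose base address you choose, which is roughly what a heap object gives you.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bump allocator over a caller-supplied memory region. */
typedef struct {
    uint8_t *base;  /* start of the designated region */
    size_t   size;  /* region size in bytes */
    size_t   used;  /* bytes consumed so far, including padding */
} RegionHeap;

void region_init(RegionHeap *h, void *base, size_t size)
{
    h->base = (uint8_t *)base;
    h->size = size;
    h->used = 0;
}

/* Returns an aligned block, or NULL if align is not a power of two
 * or the region does not have enough space left. */
void *region_alloc(RegionHeap *h, size_t bytes, size_t align)
{
    uintptr_t start, aligned;
    size_t offset;

    if (align == 0 || (align & (align - 1)) != 0) {
        return NULL;
    }
    start   = (uintptr_t)h->base + h->used;
    aligned = (start + (align - 1)) & ~(uintptr_t)(align - 1);
    offset  = (size_t)(aligned - (uintptr_t)h->base);

    if (offset > h->size || bytes > h->size - offset) {
        return NULL;
    }
    h->used = offset + bytes;
    return h->base + offset;
}
```

On the device, the region passed to `region_init` would be a buffer that the linker file places in the memory you want (for example MSMCSRAM). Only one core should allocate from it, since nothing here protects against concurrent use.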

In the second case:

You'll need to use Heap*MP objects since multiple cores need to perform allocations from the same memory pool.

There are more details, but I'm not certain what your needs are.

0 FeruzM over 13 years ago in reply to twentz

Intellectual 740 points

All right, I understand your points.

Let's say I have a large variable or array (I mean a block of memory) and I need to perform some calculation on it.

What do you think is the best way to do that efficiently?

Keep it in shared memory? What if it doesn't fit there?

Allocate it in DDR so that all cores can access it? From an efficiency point of view, that would be slower.

Also, I don't know whether there is a way to exchange parts of an array between cores (for example, the first core does some calculation on its part and sends it to the others, and at the end one core collects the parts from all cores and combines the result).

Thanks,

0 twentz over 13 years ago in reply to FeruzM

Intellectual 435 points

Both of the issues you raise are fundamental problems of any computer architecture, not just DSPs.

667x has 512KB * # of cores of shared memory. It's not huge, but it's not tiny.

Variables in DDR are cached in L2 and L1 unless caches are turned off. However, when data is cached, it must be written back to memory in order for another core to read the updated value.

If only one core is working on an independent set of data for a long period of time, then caching data in L2 and L1 should be decent. If multiple cores are frequently sharing data, then the data should be located in faster memory such as MSMCSRAM or core-local L2SRAM (as you are discussing in the other thread). If your data is too large for this, then you would need to partition the problem and move parts of the data from DDR to shared memory, work on shared memory, and then move the data back to DDR and bring in another part of the data.
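The partitioning described above can be sketched roughly as follows. This is a host-side illustration only: `fast_buf` stands in for a buffer in MSMCSRAM or L2SRAM, `big` stands in for the array in DDR, and `memcpy` stands in for whatever transfer mechanism you would really use (on the device that would more likely be a DMA transfer). The function name and the chunk size are assumptions for the example.

```c
#include <stddef.h>
#include <string.h>

#define FAST_BUF_ELEMS 1024u   /* assumed capacity of the fast buffer */

/* Stand-in for a buffer placed in fast memory by the linker file. */
static float fast_buf[FAST_BUF_ELEMS];

/* Scales a large array by staging it through the fast buffer in chunks. */
void scale_in_chunks(float *big, size_t n, float factor)
{
    size_t done = 0;

    while (done < n) {
        size_t chunk = n - done;
        size_t i;

        if (chunk > FAST_BUF_ELEMS) {
            chunk = FAST_BUF_ELEMS;
        }
        /* Bring one part of the data into fast memory. */
        memcpy(fast_buf, big + done, chunk * sizeof(float));

        /* Work on the copy in fast memory. */
        for (i = 0; i < chunk; i++) {
            fast_buf[i] *= factor;
        }
        /* Move the result back and continue with the next part. */
        memcpy(big + done, fast_buf, chunk * sizeof(float));
        done += chunk;
    }
}
```

A real implementation would usually double-buffer, so that the next chunk is being transferred while the current one is being processed.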

0 RandyP over 13 years ago in reply to twentz

TI__Guru* 84110 points

twentz,

twentz said:

667x has 512KB * # of cores of shared memory. It's not huge, but it's not tiny.

I would like to clarify this on some points:

1. 667x is used confusingly by us at TI, and usually means the devices for which x = 1, 2, 4, 8. In FeruzM's other thread, the device is specified as C6670. When x=0 for the C6670, each CorePac has 1MB L2; when x=1-8, each CorePac has .5MB L2.

2. The MSMC SRAM is shared memory and the DDR3 is shared memory. Each CorePac's L2 can be accessed by all of the other CorePacs, but I do not usually consider it to be shared memory, for what that is worth. On older TI multicore devices, there was a speed penalty when accessing another core's L2. I have not measured this myself on the C667x devices, but there is probably a speed penalty here, too, when trying to use another core's L2 as shared memory.

3. For C667x when x=0, the MSMC SRAM is 2MB; when x=1-8, the MSMC SRAM is 4MB.

FeruzM,

As another approach to how twentz is trying to help, I would ask: What type of processing do you want to do with the shared memory that you would allocate?

Regards,
RandyP

0 FeruzM over 13 years ago in reply to RandyP

Intellectual 740 points

Hi,

Thank you for the valuable information.

As an example, let's consider a simple matrix multiplication: shared memory holds the larger matrices, each core takes its part and does its calculation, and then writes the final result back to shared memory.
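To make the example concrete, here is a minimal sketch of one way such a split could look, with each core computing a contiguous band of rows of the result. The function names are made up for this illustration. It assumes all three matrices are in memory every core can reach and are stored row-major, and that `core_id` comes from something like the DNUM register on the device. Cache writeback of the result rows is not shown.

```c
#include <stddef.h>

/* Computes the half-open row range [first, last) owned by one core,
 * spreading any remainder rows over the lowest-numbered cores. */
void row_range(size_t rows, unsigned num_cores, unsigned core_id,
               size_t *first, size_t *last)
{
    size_t base  = rows / num_cores;
    size_t rem   = rows % num_cores;
    size_t extra = (core_id < rem) ? core_id : rem;

    *first = (size_t)core_id * base + extra;
    *last  = *first + base + ((core_id < rem) ? 1u : 0u);
}

/* C = A * B for rows [first, last) only.
 * A is n x k, B is k x m, C is n x m, all row-major. */
void matmul_rows(const float *a, const float *b, float *c,
                 size_t k, size_t m, size_t first, size_t last)
{
    size_t i, j, p;

    for (i = first; i < last; i++) {
        for (j = 0; j < m; j++) {
            float sum = 0.0f;
            for (p = 0; p < k; p++) {
                sum += a[i * k + p] * b[p * m + j];
            }
            c[i * m + j] = sum;
        }
    }
}
```

Each core would call `row_range` with its own core number and then `matmul_rows` on that range. Because the row bands do not overlap, the cores never write to the same element of C.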

Thanks,

Feruz
