Hyperlink latency question

Heinrich Schwarz

Hello,

We are building a framework for distributed image processing and would like to implement communication between the cores on top of Hyperlink, using the IPC peripheral (by memory-mapping the IPC peripheral registers of the other DSP) and by mapping the other core's memory.

I am currently a bit worried about latency. Running the memoryMappedExample project, I can see that Hyperlink has a round-trip latency of 500-1000 cycles, so the one-way latency is presumably somewhere between 250 and 500 cycles.

One one-way latency is incurred when accessing the IPC peripheral of the remote core; after that, the remote core starts to access the memory of the local core.
What I wonder: does every cache-line fetch (64 bytes) incur a full round-trip latency?
If so, would using EDMA3 for accesses larger than a cache line allow the data transfer to happen in one go, without additional round trips in between?
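To make the concern concrete, here is a rough back-of-envelope model in C. It is only a sketch under stated assumptions: the first function assumes every 64-byte cache-line fill pays one full round trip, the second assumes a bulk transfer pays the round trip once and then streams the payload. The function names and the bytes-per-cycle streaming rate are hypothetical placeholders, not measured HyperLink or EDMA3 figures.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model: each cache-line fill costs one full round trip. */
uint64_t cycles_per_line_fetch(uint64_t bytes, uint64_t line_bytes,
                               uint64_t round_trip_cycles)
{
    assert(line_bytes > 0);
    uint64_t lines = (bytes + line_bytes - 1) / line_bytes; /* round up */
    return lines * round_trip_cycles;
}

/* Assumed model: one bulk transfer pays the round trip once, then
 * streams the payload at a fixed (hypothetical) rate. */
uint64_t cycles_bulk_transfer(uint64_t bytes, uint64_t round_trip_cycles,
                              uint64_t bytes_per_cycle)
{
    assert(bytes_per_cycle > 0);
    uint64_t stream = (bytes + bytes_per_cycle - 1) / bytes_per_cycle;
    return round_trip_cycles + stream;
}
```

With a 500-cycle round trip, a 4096-byte block would cost 64 x 500 = 32000 cycles under the first model, and 500 + 1024 = 1524 cycles under the second if the link streamed 4 bytes per cycle. Whether the hardware actually behaves like either model is exactly the open question.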

Thx

over 11 years ago

lding over 11 years ago

TI__Guru* 95265 points

Can you provide more information? Which processor are you running the Hyperlink program on? Do you intend to move data between cores on the same chip or between different chips? I thought Hyperlink was meant for the latter. For the test project, which MCSDK/PDK release version are you using?

Regards, Eric

Heinrich Schwarz over 11 years ago in reply to lding

Intellectual 450 points

Hi Eric,

We found that in the "hyplnk_exampleProject" example project shipped with the PDK, most of the time was spent in the Cache_wb and Cache_inv functions (we are running two C6678 devices connected over four 12.5 Gbps lanes).
The latency seems to be more like 100 ns for writes and 200 ns for uncached reads, which is OK. We will try to design our data structures so that data is located where it is read, to avoid round trips caused by reads.

However I have one very specific question:

Are writes through HyperLink guaranteed to be executed on the destination device in the order they were generated on the source device?
We have code like the following snippet:

memcpy(hyperlinkMem, localMem, 64); // Write to remote MSM through uncached Hyperlink
hyperlinkIpcRegs->IPCGR[n] = flags; // Trigger Interrupt on remote device through Hyperlink mapped IPC Registers
// Interrupt handler depends on the data written by memcpy before.

Do we need to execute any flush between the write to the uncached remote memory and the write (also through Hyperlink) to the IPC registers of the remote core that generates the interrupt?
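For illustration, one defensive pattern that does not depend on the answer is to let the interrupt handler validate the payload before using it. The following is a minimal sketch, not the PDK's API: hl_msg_t, hl_checksum, hl_send and hl_receive are hypothetical names, the checksum is a simple rotate-and-XOR chosen for brevity, and ordinary pointers stand in for the HyperLink-mapped memory and the IPCGR register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_BYTES 60u   /* 60 payload bytes + 4 checksum bytes = 64 */
#define MSG_WORDS     16u   /* 64 bytes as 32-bit words */

typedef struct {
    uint8_t  payload[PAYLOAD_BYTES];
    uint32_t checksum;
} hl_msg_t;

/* Rotate-and-XOR checksum; the non-zero seed keeps an all-zero
 * (never written) buffer from validating. */
uint32_t hl_checksum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0x5A5A5A5Au;
    for (size_t i = 0; i < n; ++i)
        sum = ((sum << 1) | (sum >> 31)) ^ p[i];
    return sum;
}

/* Sender: write payload plus checksum word by word, then ring the doorbell. */
void hl_send(volatile uint32_t *remote_words, volatile uint32_t *doorbell,
             const uint8_t data[PAYLOAD_BYTES], uint32_t flags)
{
    hl_msg_t msg;
    uint32_t words[MSG_WORDS];

    memcpy(msg.payload, data, PAYLOAD_BYTES);
    msg.checksum = hl_checksum(msg.payload, PAYLOAD_BYTES);
    memcpy(words, &msg, sizeof msg);

    for (size_t i = 0; i < MSG_WORDS; ++i)
        remote_words[i] = words[i];
    *doorbell = flags;   /* stands in for hyperlinkIpcRegs->IPCGR[n] = flags */
}

/* Receiver: returns 1 and copies the payload only if the checksum matches. */
int hl_receive(const volatile uint32_t *local_words, uint8_t out[PAYLOAD_BYTES])
{
    hl_msg_t msg;
    uint32_t words[MSG_WORDS];

    for (size_t i = 0; i < MSG_WORDS; ++i)
        words[i] = local_words[i];
    memcpy(&msg, words, sizeof msg);

    if (hl_checksum(msg.payload, PAYLOAD_BYTES) != msg.checksum)
        return 0;        /* data incomplete or stale: do not consume */
    memcpy(out, msg.payload, PAYLOAD_BYTES);
    return 1;
}
```

The handler would call hl_receive and ignore or retry the message when it returns 0. This costs four bytes per message and a short loop on each side, and it only detects a payload that arrived late; it does not replace a real ordering guarantee from the hardware.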

Thx

lding over 11 years ago in reply to Heinrich Schwarz

TI__Guru* 95265 points

If the cache is disabled, the writeback/invalidate operations aren’t needed.

I am checking whether writes through HyperLink are "guaranteed to be executed on the Destination device in the order they were generated on the source device" and will let you know.

Regards, Eric

Heinrich Schwarz over 11 years ago in reply to lding

Intellectual 450 points

Hi,

Is there any update on this?

lding over 11 years ago in reply to Heinrich Schwarz

TI__Guru* 95265 points

Sorry for the late reply. As long as the data has been written to the Hyperlink before the interrupt is triggered, the remote side will look exactly the same.

Regards, Eric

Heinrich Schwarz over 11 years ago in reply to lding

Intellectual 450 points

Thanks!
