Atomic operations

I am seeking information on what operations/instructions can be considered atomic on a C6657.

My interest is primarily from a software point of view, so apologies if this seems off topic here.  I tried to get some information in the TI compiler forum, but it seemed that my question was made difficult because it involves details of both the architecture and the compiler.  The other discussion is:

http://e2e.ti.com/support/development_tools/compiler/f/343/p/296210/1033219.aspx

 

  • Can you be more specific regarding what you're concerned about happening.  

    Is this in regard to Shared Memories that are cached?  In general, accesses to these are not Atomic across cores, because when the memory is cached, a write by CorePacX only updates its own cached copy.  An eviction or cache writeback would need to occur to push that data out to the Shared Memory.  Also, CorePacY, which would be reading this memory, would need to invalidate its cached copy before reading (assuming the location is in its cache) so that it grabs the latest Shared Memory value.

    You can use the Semaphore HW Module for ownership control, together with caching operations, if you need to handle this in a guaranteed Atomic fashion.

    Best Regards,

    Chad
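The writeback / invalidate sequence described above can be sketched as a small host-side simulation. This is only an illustration under stated assumptions: the "caches", the "semaphore", and every `sim_*`, `producer_*`, and `consumer_*` name are hypothetical stand-ins written for this example, not the TI Semaphore module or any TI cache API. On a real device the equivalent steps would be done with the hardware semaphore and the device's cache writeback and invalidate operations.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LINE_WORDS 4  /* words per simulated cache line (illustrative size) */

/* Simulated shared memory: one cache line's worth of words. */
static uint32_t shared_mem[LINE_WORDS];

/* Simulated private cache of one core: a copy of the line plus state. */
typedef struct {
    uint32_t line[LINE_WORDS];
    int valid;  /* line currently holds a copy of shared_mem */
    int dirty;  /* line was modified and not yet written back */
} sim_cache_t;

/* Hypothetical stand-in for a hardware semaphore: 0 = free, 1 = taken. */
static int sim_hw_sem;

static int sim_sem_acquire(void)
{
    if (sim_hw_sem) return 0;
    sim_hw_sem = 1;
    return 1;
}

static void sim_sem_release(void) { sim_hw_sem = 0; }

/* Load the line from shared memory into the private cache. */
static void sim_fill(sim_cache_t *c)
{
    memcpy(c->line, shared_mem, sizeof shared_mem);
    c->valid = 1;
    c->dirty = 0;
}

/* Cached read: served from the private copy once the line is valid. */
uint32_t sim_read(sim_cache_t *c, int idx)
{
    assert(idx >= 0 && idx < LINE_WORDS);
    if (!c->valid) sim_fill(c);
    return c->line[idx];
}

/* Cached write: only the private copy changes until a writeback. */
void sim_write(sim_cache_t *c, int idx, uint32_t v)
{
    assert(idx >= 0 && idx < LINE_WORDS);
    if (!c->valid) sim_fill(c);
    c->line[idx] = v;
    c->dirty = 1;
}

/* Push a dirty line out to shared memory. */
void sim_writeback(sim_cache_t *c)
{
    if (c->valid && c->dirty) {
        memcpy(shared_mem, c->line, sizeof shared_mem);
        c->dirty = 0;
    }
}

/* Drop the private copy so the next read refetches from shared memory. */
void sim_invalidate(sim_cache_t *c)
{
    c->valid = 0;
    c->dirty = 0;
}

/* Writer core: own the semaphore, write, write back, release. */
int producer_publish(sim_cache_t *c, int idx, uint32_t v)
{
    if (!sim_sem_acquire()) return 0;
    sim_write(c, idx, v);
    sim_writeback(c);
    sim_sem_release();
    return 1;
}

/* Reader core: own the semaphore, invalidate, read, release. */
int consumer_fetch(sim_cache_t *c, int idx, uint32_t *out)
{
    if (!sim_sem_acquire()) return 0;
    sim_invalidate(c);
    *out = sim_read(c, idx);
    sim_sem_release();
    return 1;
}
```

In this model a reader that calls `sim_read` without first invalidating keeps seeing its old copy even after the writer has written back, which is the stale-data case described above.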

  • Thanks for the response, Mr. Courtney.

    My concerns are a little nebulous.  I wasn't particularly concerned with shared memory that was cached, although that's probably only because I hadn't got around to thinking of how caches play a role.

    Here is a scenario:

    A task is writing an (aligned) 4-byte value to a location in memory.  At the same time there is an interrupt (or any context switch in general) that could lead to the location being read.  Would I ever have to be concerned about the read seeing an incoherent value, e.g. 2 bytes reflect the state of the memory before the write and 2 bytes reflect the state of the memory after the write?  Does the answer change if instead we are dealing with reading/writing a single byte?

    The HW Semaphore module sounds interesting; I will have to do some reading up on it.

  • Any store (write) instructions would complete prior to an interrupt being able to access the location.  From that perspective it would be Atomic.

    That said, if the code is storing an array (i.e. many successive stores in a row) and you interrupt it, then only the part of the array where the stores had already occurred would have the updated information.

    Please note that this is still Atomic from an instruction perspective, but not from a function call perspective.  In which case you'd have to disable interrupts to prevent interruption or have some sort of handshaking in place to let the Interrupt Service Routine know that there may be data in flight.

    I'm not aware of any device that would be atomic at a function call level.
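The handshaking idea mentioned above can be sketched roughly as follows. This is a minimal single-core illustration, assuming the reader is an ISR that preempts the task and runs to completion before the task resumes. All names (`table`, `update_in_flight`, `task_*`, `isr_try_snapshot`) are hypothetical and chosen for this example. The alternative is to disable interrupts around the whole update using whatever mechanism the platform provides.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_LEN 4

/* Data updated by the task with several successive stores. */
static volatile uint32_t table[TABLE_LEN];

/* Handshake flag: 1 while the task is in the middle of an update.
 * It is a single aligned 32-bit word, so each store to it is one store. */
static volatile uint32_t update_in_flight;

/* Task side, split into steps so a test can "interrupt" between them. */
void task_begin_update(void) { update_in_flight = 1; }

void task_store(int idx, uint32_t v)
{
    assert(idx >= 0 && idx < TABLE_LEN);
    table[idx] = v;
}

void task_end_update(void) { update_in_flight = 0; }

/* ISR side: take a snapshot only when no update is in flight.
 * Returns 1 on success, 0 if the data may be partially updated. */
int isr_try_snapshot(uint32_t out[TABLE_LEN])
{
    int i;
    if (update_in_flight) return 0;
    for (i = 0; i < TABLE_LEN; i++) {
        out[i] = table[i];
    }
    return 1;
}
```

Each individual store is still a single store; the flag only tells the ISR that the group of stores as a whole is not finished yet, so it can skip or defer its read.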

    If you have a 32-bit data type (such as an INT or SP), then the stores (writes) will be done as a 32-bit access (or possibly a 64-bit access if you have successive stores) to memory.  They will not be broken up into multiple single-byte writes.  Even if the destination is an external memory that only supports 8-bit accesses, the full word would be pushed to the buffer.

    I hope this helps your understanding.

    Let me know if you have more questions or if there's anything I mentioned that needs clarification.

  • Mr. Courtney, great info, thank you.

    One remaining question is whether you have any suggested references where I could look up this type of information on my own rather than using forums.

  • Hi,

    The C6000 EABI §2.3 says that:

    Scalar variables are aligned such that they can be loaded and stored using the native instructions appropriate for their type: LDB/STB for bytes, LDH/STH for halfwords, LDW/STW for words, and so on

    So reading or writing data up to 64 bits wide (for C66xx, which has LDDW/STDW) should be implemented as a single load/store, which is atomic.

    If data is not aligned (due to some pack directive), the compiler uses the non-aligned single-access instructions, so the access should be atomic too, at least at the single-core level.

    Since the C66xx doesn't have a single 128-bit load/store, 128-bit types should not be considered atomic, even if they appear to be "native" in some places.

    In general, bitfield stores are not to be considered atomic.

    Every variable that is meant to be accessed atomically should always be declared volatile.

    If the destination location is not in DDR or internal memory, but is a device on the EMIF16 bus with an external competitor (for instance an FPGA), I think it depends on the device's data bus size, since the single write could be broken into multiple writes.
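To illustrate why a bitfield store should not be treated as atomic: updating a field inside a word is a load, a modify, and a store, and an interrupt can land between the load and the store. The sketch below spells those steps out by hand on a host machine. The register layout, `cfg_reg`, and all function names are hypothetical and made up for this example; this is not actual compiler output.

```c
#include <stdint.h>

/* Hypothetical 32-bit configuration word: MODE in bits 3:0, ENABLE in bit 4. */
#define MODE_MASK   0x0000000Fu
#define ENABLE_MASK 0x00000010u

/* Stand-in for a memory-mapped register or shared word. */
static volatile uint32_t cfg_reg;

/* The three steps that a store to the MODE field expands to. */
uint32_t rmw_load(void) { return cfg_reg; }

uint32_t rmw_set_mode(uint32_t word, uint32_t mode)
{
    return (word & ~MODE_MASK) | (mode & MODE_MASK);
}

void rmw_store(uint32_t word) { cfg_reg = word; }

/* "ISR" that updates the other field in the same word. */
void isr_set_enable(void) { cfg_reg |= ENABLE_MASK; }

/* Interrupt lands between the load and the store: the stale word is
 * written back and the ISR's ENABLE bit is overwritten. */
uint32_t demo_interrupt_mid_update(void)
{
    uint32_t tmp;
    cfg_reg = 0;
    tmp = rmw_load();
    isr_set_enable();            /* simulated interrupt */
    tmp = rmw_set_mode(tmp, 5);
    rmw_store(tmp);
    return cfg_reg;
}

/* Interrupt lands after the store: both updates survive. */
uint32_t demo_interrupt_after_update(void)
{
    uint32_t tmp;
    cfg_reg = 0;
    tmp = rmw_load();
    tmp = rmw_set_mode(tmp, 5);
    rmw_store(tmp);
    isr_set_enable();            /* simulated interrupt */
    return cfg_reg;
}
```

In the first case the final word contains only the MODE value and the ENABLE bit is lost, even though every individual load and store was a single 32-bit access.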

  • Thanks Albert,

    Those are good points from a 'what the compiler may generate' perspective.  A 128-bit data type such as a long double float would not be Atomic, as there's no native 128-bit store.

    Bitfield modifications (usually performed on configuration registers) would not in general be considered Atomic, but these registers should also only be configured by CorePac0 or by the Host during boot.  There may be other circumstances, but those would be rare.

    I'll need to check on the buffer width of EMIF16.  Though the EMIF data width is narrower, the buffer feeding it would not break up a write as it comes into the buffer.  So a CPU write should not be broken up, but writes from other IPs that write directly and have different data bus sizes may be broken up.

  • What about the multi-core issues (on DDR and MSMC)? Can we assume that:

    • non-cacheable regions: atomic up to 64-bit accesses, both for aligned and non-aligned data (LDNW...), even if a non-aligned access can span two writes on the internal bus (256 bits, if I remember well)
    • cacheable: single cache line atomicity. If core0 invalidates and reads while another core is writing back, will the cache line read by core0 be consistent? Well, anyway, the application should use some other mechanism to ensure coherence...
  • Noncacheable - The compiler doesn't create arrays on alignments that don't match the minimum size of the data type within. The compiler normally uses the non-aligned instructions when you're accessing multiple smaller-sized data elements at once (i.e. using LDNW/STNW for 4 bytes, where the data type is Byte), and this would still be Atomic.  Non-alignment is really only a concern when you're using hand assembly and non-aligned data.  128-bit data is an exception to this.

    Cacheable - The cache line would be consistent.  The full cache line in the writeback would be written out in its entirety first, and Core0 would stall until that is complete in this situation.