Transactional memory seems to be gaining momentum these days. But what is behind the term?

Let me try to explain it in my own simple words. For me, 'Transactional Memory' is basically a non-blocking atomic access whose coherency is checked at store time. It is typically implemented with three instructions and a few flags in memory. The three instructions are:

  • Load Linked Word from Memory (LL)
  • Store Linked Word to Buffer (SL)
  • Commit Store Linked Word to Memory Conditionally (CMTL)

So how does it work?

The LL instruction reads a word from memory and, as a side effect, sets a link valid flag and starts monitoring the address. If any other process stores to that address, the link valid flag is cleared. The SL instruction buffers the word that is to be stored to memory by the CMTL instruction; it does not commit the change. Finally, the CMTL instruction checks the link valid flag. If the flag is still true, the data buffered by the SL instruction is written to memory. If it is not, the commit fails and the whole update must be retried, starting again with LL.
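To make the sequence above concrete, here is a minimal sketch in C that models the link valid flag in plain software. It is only an illustration under stated assumptions: the names link_state_t, ll, sl, cmtl, foreign_store and increment are hypothetical, the "other process" is simulated by an ordinary function call, and nothing here corresponds to real instructions or to any Keystone API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the link state described in the text. */
typedef struct {
    const uint32_t *linked_addr; /* address monitored since the last LL   */
    bool link_valid;             /* cleared when someone else stores there */
    uint32_t buffered;           /* word held by SL until CMTL commits it  */
} link_state_t;

/* LL: read the word, set the link valid flag, start monitoring the address. */
static uint32_t ll(link_state_t *ls, const uint32_t *addr) {
    ls->linked_addr = addr;
    ls->link_valid = true;
    return *addr;
}

/* SL: buffer the new word; memory is not modified yet. */
static void sl(link_state_t *ls, uint32_t value) {
    ls->buffered = value;
}

/* CMTL: write the buffered word only if the link is still valid. */
static bool cmtl(link_state_t *ls, uint32_t *addr) {
    if (!ls->link_valid || ls->linked_addr != addr) {
        return false;            /* commit failed, caller must retry */
    }
    *addr = ls->buffered;
    ls->link_valid = false;      /* the link is consumed by the commit */
    return true;
}

/* Simulated store by another process: clears the link on a monitored address. */
static void foreign_store(link_state_t *ls, uint32_t *addr, uint32_t value) {
    *addr = value;
    if (ls->linked_addr == addr) {
        ls->link_valid = false;
    }
}

/* Increment *addr with retry. If `interfere` is set, one foreign store is
 * injected between SL and CMTL so that the first commit fails.
 * Returns the number of attempts that were needed. */
static int increment(link_state_t *ls, uint32_t *addr, bool interfere) {
    int attempts = 0;
    for (;;) {
        uint32_t old = ll(ls, addr);
        sl(ls, old + 1);
        attempts++;
        if (interfere) {
            foreign_store(ls, addr, 100);
            interfere = false;
        }
        if (cmtl(ls, addr)) {
            return attempts;
        }
    }
}
```

Without interference, increment commits on the first attempt. With one injected foreign store, the first commit is rejected and the second attempt works on the value the other writer left behind, so no update is lost and nobody ever waits on a lock.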

Because of its non-blocking nature, 'Software Transactional Memory' (STM) allows for lock-free programming. Traditional lock-based synchronization has well-known pitfalls: a process waiting on a lock makes no progress, and code that uses locks is prone to deadlock. Lock-free programming allows shared data to be updated concurrently without the need for critical sections protected by locks managed by the operating system.

Looking at our latest Keystone family, you'll notice that we do not directly support this mechanism. But our Multicore Navigator is a very flexible piece of hardware that can do a lot of neat things. One of them is to enable a more generic 'Software Transactional Memory' (STM) method via the 'Queue Manager'. The Queue Manager is a hardware block that maintains 8192 linked lists and manages atomic access for all cores.

One of the standard examples of lock-free programming is a global counter that is updated locally on each core. We have proof-of-concept example code running on our TMS320C6678 Keystone Multicore DSP, along with a short application report, on our Embedded Processors Wiki.
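For readers who want the general idea of such a counter without the hardware at hand, the following is a minimal sketch using standard C11 atomics. It is not the proof-of-concept code mentioned above and does not use the Queue Manager; the compare-and-swap from <stdatomic.h> stands in for the conditional commit as an assumption, and the name counter_add is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Add `delta` to a shared counter without taking a lock.
 * Returns the counter value right after this update.
 * Hypothetical helper; each core or thread would call it with the
 * amount it has accumulated locally. */
static uint32_t counter_add(_Atomic uint32_t *counter, uint32_t delta) {
    /* Read the current value (the "load linked" step in spirit). */
    uint32_t seen = atomic_load(counter);

    /* Try to commit seen + delta. If another writer got in first (or the
     * weak form fails spuriously), `seen` is refreshed with the current
     * value and the update is simply retried. */
    while (!atomic_compare_exchange_weak(counter, &seen, seen + delta)) {
        /* nothing to do here, the loop condition performs the retry */
    }
    return seen + delta;
}
```

A core would typically count events in a private variable and call counter_add once per batch, for example counter_add(&global_counter, local_count), which keeps traffic on the shared word low.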

The other typical use case for lock-free programming is sharing simple data structures such as LIFO stacks and FIFO queues. Here I do not need to show a code example, since this is natively supported by the architecture: the Queue Manager within the Multicore Navigator maintains 8192 linked lists in hardware and manages atomic access for all cores. For details, please see the Multicore Navigator User's Guide.

Since lock-free programming provides some significant advantages to real-time systems, such as avoiding priority inversion and deadlocks, it is an interesting domain, especially in our multicore environment. Still, lock-free programming isn't used often at present. What's your opinion on transactional memory and lock-free programming in multicore environments?

Kind regards,

one and zero

P.S.:  Take care and don't let yourself get locked in ...