Atomicity of N2HET instructions

Hi.

I have three questions regarding the N2HET module of the TMS570LS31x devices.

First question:
Assume that the following N2HET instruction is part of an N2HET program.

...
; edge counter
L02: ECNT { next = END, cond_addr = END, reqnum = 0, reg = none, request = GENREQ, event = FALL, pin = 0, data = 0 }
...


Assume that the following two actions should be executed in the SAME VCLK2-cycle:
  *) This ECNT instruction should be executed by the N2HET time micromachine.
  *) The CPU wants to do a write access to the Data Field of this ECNT instruction.
Since spnu499b.pdf, page 773, states “N2HET accesses to its own internal RAM are given priority over accesses from an external host ...”, it follows that the execution sequence is:

1. The micromachine executes the instruction.
2. The CPU does the write access.


Question 1: Is this correct?



Second and third question:
Assume that the following N2HET instructions are part of an N2HET program.

...
; count up
Q09: CNT   { next = END, reg = NONE, max = 0x1FFFFFF, data = 0}
; count down
Q10: SUB   { next = END, src1 = REM, src2 = IMM, dest = NONE, rdest = REM, remote = Q09, data = 0x1 }
...


The idea of this program: The Data Field of the CNT instruction is modified in two ways:
*) It’s incremented by the CNT instruction itself.
*) It’s decremented by the SUB instruction.

Assume that the following two actions should be executed in the SAME VCLK2-cycle:
*) The SUB instruction should be executed by the N2HET time micromachine.
*) The CPU wants to do a write access to the Data Field of the CNT instruction.
From spnu499b.pdf, page 841, Table 20-56, I gather that the SUB instruction with this configuration needs 3 VCLK2-cycles. (No line in this table matches the above configuration exactly; I think line 6 is the closest match.)
Since the SUB instruction needs 3 VCLK2-cycles, there are IMHO two possibilities how this situation will be handled by the TMS:
First possibility:
It’s not important how many cycles an N2HET instruction needs. I.e. an N2HET instruction is always atomically executed. In this case the execution sequence is:

1. The micromachine executes the complete SUB instruction (i.e. all 3 cycles).
2. The CPU does the write access.



Second possibility:
N2HET instructions are not atomic. In this case the following question arises: Which parts of the SUB instruction are done in which of the 3 cycles? I assume the following partition (referring to spnu499b.pdf, pages 842 to 844):
1st cycle: SOURCE OPERAND DECODING STAGE
2nd cycle: ARITHMETIC / LOGICAL OPERATION STAGE and SHIFT STAGE
3rd cycle: WRITE REGISTER DESTINATION STAGE, WRITE REMOTE DESTINATION STAGE, and UPDATE FLAGS STAGE
From this partition it follows that the SUB instruction is a read-modify-write access.
Assume that the 2nd cycle of the SUB instruction and the CPU write access are executed in the SAME VCLK2-cycle. In this case the execution sequence is:

1. SUB instruction, 1st cycle: The value of the Data Field of the CNT instruction is read. Let’s call it OLD value.
2. SUB instruction, 2nd cycle: The OLD value is decremented.
3. The CPU does the write access: The Data Field of the CNT instruction will be updated with the NEW value.
4. SUB instruction, 3rd cycle: The Data Field of the CNT instruction will be updated with the decremented OLD value. I.e. the NEW value was overwritten.
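
The four steps above can be replayed in a toy model. This is plain Python, not N2HET code; the function name and the example values are made up for illustration, and the split of the SUB instruction into cycles is the assumption stated in this post, not something the manual confirms.

```python
# Toy model of the suspected hazard; not a cycle-accurate N2HET simulation.
def sub_with_cpu_write(old_value, cpu_value):
    """Replay the four steps: SUB read, SUB modify, CPU write, SUB write-back."""
    cnt_data = old_value   # Data Field of the CNT instruction (Q09)

    latched = cnt_data     # 1st cycle: SUB reads the remote data field (OLD value)
    result = latched - 1   # 2nd cycle: SUB decrements its latched copy
    cnt_data = cpu_value   # CPU write lands between the read and the write-back
    cnt_data = result      # 3rd cycle: SUB writes back, the NEW value is lost
    return cnt_data


print(sub_with_cpu_write(old_value=100, cpu_value=500))  # prints 99
```

In this model the CPU value never survives, whatever it is, which is exactly the overwrite problem asked about in Question 3.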


Question 2: Is this correct?
In case it is correct: Question 3: Do you have a solution for this overwrite problem?


Thank you and regards
Oliver.

  • Oliver,

    You really have no ability to precisely time writes to the N2HET RAM, so we've never gotten into this level of detail as you can't work with the information anyway.

    I think you are trying to avoid the read-modify-write problem inside the multiple-cycle SUB.   You can't avoid this hazard, so you need to come up with a scheme that works around it.

    You'd typically do something like guard a MOV32 instruction with a DJZ.   The MOV32 is where the CPU writes its data, and the type would in this case be IMMTOREG&REM where the remote address = Q09  (your CNT instruction).   Think about this instruction like you might think about a 'shadow' register or 'buffer' register in a hardware timer.

    Then you can guard the MOV32 with a DJZ instruction so it only executes on a one-shot basis.

    i.e. something like:

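    ; one-shot guard: branches to Q13 (skipping the MOV32) while its data field is 0
    Q11:   DJZ  {reg=NONE, cond_addr = Q13, data=0}

    ; shadow/buffer: the CPU writes the new CNT value into this data field
    Q12:   MOV32 {type=IMMTOREG&REM, reg=NONE, remote=Q09, data=0, hr_data=0}

    Q13:   ... your next instruction...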

    To write to CNT now in a controlled fashion, have the CPU first write the data value to the MOV32 instruction's data field, and then write a '1' to the DJZ instruction's data field ('1' means 0x80 since bits 6:0 are reserved).
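
The "'1' means 0x80" remark can be spelled out as a small helper. This is only a sketch: the helper name is hypothetical, and the 25-bit field width is an assumption taken from max = 0x1FFFFFF in the CNT instruction together with the statement that bits 6:0 are reserved.

```python
DATA_FIELD_SHIFT = 7          # bits 6:0 are reserved, per the post above
DATA_FIELD_MASK = 0x1FFFFFF   # assumed 25-bit field, matching max = 0x1FFFFFF


def encode_data_field(value):
    """Return the 32-bit word the CPU would write for a given data value."""
    return (value & DATA_FIELD_MASK) << DATA_FIELD_SHIFT


print(hex(encode_data_field(1)))  # prints 0x80
```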

    Now, by positioning Q11, Q12, Q13 above either before or after your SUB instruction, you can at least control which 'sticks' at the end of the loop (SUB or CPU write) and you avoid the R-M-W hazard.
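
The one-shot behavior of the guard can be checked with a toy model. This is plain Python rather than N2HET code; the class and method names are made up, and it assumes DJZ branches to cond_addr when its data field is 0 and otherwise decrements and falls through to the next instruction.

```python
# Toy model of the DJZ-guarded MOV32 scheme; assumed semantics, not a real simulator.
class GuardedWrite:
    def __init__(self, cnt_data=0):
        self.cnt_data = cnt_data   # Data Field of the CNT instruction (Q09)
        self.djz_data = 0          # Q11 guard: 0 means "nothing pending"
        self.mov32_data = 0        # Q12 shadow value written by the CPU

    def cpu_request(self, value):
        """CPU side: fill the shadow first, then arm the guard."""
        self.mov32_data = value
        self.djz_data = 1

    def run_guard(self):
        """One pass of Q11/Q12 inside the N2HET program loop."""
        if self.djz_data == 0:
            return                        # DJZ branches to Q13, MOV32 is skipped
        self.djz_data -= 1                # DJZ decrements and falls through to Q12
        self.cnt_data = self.mov32_data   # MOV32 copies the shadow into Q09


het = GuardedWrite(cnt_data=5)
het.run_guard()                    # nothing pending, CNT untouched
het.cpu_request(20)
het.run_guard()                    # one-shot transfer into CNT
het.run_guard()                    # guard is back at 0, no second transfer
print(het.cnt_data, het.djz_data)  # prints 20 0
```

Because the transfer into CNT is done by the N2HET program itself, its position relative to the SUB is fixed by the program order rather than by CPU timing.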

    Hope that helps.

    Thanks for pointing out the deficiency in the execution cycle count table, I'll look into that one.

     

  • Hi Anthony.

    Thank you for your very fast response!

    Questions 2 and 3 (regarding the SUB instruction) are answered --> Nothing more to do on your side.

    But my question 1 is not yet answered in a way that I completely understand: I read your first sentence, i.e. "You really have no ability to precisely time writes to the N2HET RAM, ...", but even this is not 100% clear to me. Perhaps I didn't explain in enough detail what information I need. So I'll try to explain it in more detail:

    This is already stated in my first post (I just repeat it for simplicity): Assume that the following two actions should be executed in the SAME VCLK2-cycle:
      *) This ECNT instruction should be executed by the N2HET time micromachine.
      *) The CPU wants to do a write access to the Data Field of this ECNT instruction.

    Note: The ECNT instruction has one big difference from, e.g., the SUB instruction: The ECNT instruction always needs only ONE execution cycle (see spnu499b.pdf, page 827, Table 20-49).

    It is now important for me to understand which of the following two scenarios could happen, or whether both of them could happen:

    First scenario:

    The complete ECNT instruction is executed and afterwards the CPU write access is done. Or first the CPU write access is done, and then afterwards the ECNT instruction is executed. I.e. the two operations are NOT intermixed.

    Second scenario:

    The two operations are intermixed, i.e. the following three steps could happen:

    1. In a first step the first "half" of the ECNT instruction will be executed. Let's assume the first "half" contains the first three if-statements stated in spnu499b.pdf, page 864, sub-chapter "Execution". (And let's also assume that the first if-statement, i.e. "If (event occurs)", evaluates to true.)

    2. In a second step the CPU write access, to the Data Field, of the ECNT instruction is done.

    3. Now the second "half" of the ECNT instruction will be executed. I.e. the execution is going on with the instruction "Immediate Data Field = Immediate Data Field + 1;" (on page 864).

    It would be great if you could say whether both scenarios, or only one of them, can happen.

    Thank you and regards

    Oliver.

  • Hi Oliver,

    In this case the hardware is going to only read Immediate Data Field once during the clock cycle and the registers & data field will be written back with the same value.   i.e. there is no chance that the statements:

    Selected register[31:7] = Immediate Data Field + 1;   

    Immediate Data Field = Immediate Data Field + 1;

    would evaluate "Immediate Data Field + 1" differently.    I can see why this might be a concern, because the pseudo-code is written in a C-like sequential manner but the hardware implementation is parallel.

     

  • Hi Anthony.

    I really appreciate that you answer so fast. But I really want to make sure that I understood your answer correctly. That's the reason why I have to bother you again:

    Again the same example:
    Assume that the following two actions should be executed in the SAME VCLK2-cycle:
      *) An ECNT instruction should be executed by the N2HET time micromachine.
      *) The CPU wants to do a write access to the Data Field of this ECNT instruction. And the ECNT instruction only needs 1 VCLK2-cycle.

    I understand the following from your last post:
    *) The Immediate Data Field of the ECNT instruction will only be read ONCE per cycle. Let's call this the "ECNT read access".
    *) Then the calculation is done (i.e. Immediate Data Field + 1). Let's call this the "ECNT modify access".
    *) Then the calculated result will be written back to the selected register and to the Immediate Data Field. Let's call this the "ECNT write access". I understand the same value will be written to the register and to the Data Field.
    That's fine.

    So these three ECNT accesses are IMHO a read-modify-write access. Also fine. But what I don't understand: Is this read-modify-write access of the ECNT instruction atomic or not?
    Because if this read-modify-write access is not atomic, the following can happen:
    1. "ECNT read access". Assume that the value 5 is read.
    2. "ECNT modify access". The result is 5+1 = 6.
    3. CPU does a write access to the ECNT Data field. Assume that the value 20 will be written to the ECNT Data Field.
    4. "ECNT write access". The value 6 will be written to the selected register and the ECNT Data Field. I.e. the value 20 in the Data Field will be overwritten!

    Thank you and regards
    Oliver.

  • Oliver,

    For the ECNT instruction the sequence that you describe above won't occur.

    But let's assume you have the ECNT and the CPU write to the ECNT executing at about the same time.

    The results could be:

      - ECNT ends up with value of 21  (if the CPU write occurs first)

        Register ends up with a value of 21

     

      - ECNT ends up with a value of 20 (if the CPU write occurs 2nd)

       Register ends up with a value of 6


    The outcome depends only on the order of execution, even with an atomically executed ECNT.
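
Both outcomes can be reproduced with a small model. This is plain Python, not N2HET code; the function names and the initial register value are made up for illustration. It treats ECNT as atomic and uses the values from the example above (Data Field 5, CPU writes 20).

```python
def ecnt(state):
    """Atomic ECNT on an event: one read, the same value goes to both destinations."""
    new_value = state["data"] + 1
    state["data"] = new_value
    state["reg"] = new_value


def cpu_write(state, value):
    """CPU write to the ECNT Data Field; the register is not touched."""
    state["data"] = value


def run(order):
    state = {"data": 5, "reg": 0}
    for step in order:
        if step == "cpu":
            cpu_write(state, 20)
        else:
            ecnt(state)
    return state["data"], state["reg"]


print(run(["cpu", "ecnt"]))  # prints (21, 21)
print(run(["ecnt", "cpu"]))  # prints (20, 6)
```

Neither ordering loses the CPU write, but the Data Field and the register disagree when the CPU write comes second.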

    So you really need to implement a synchronization method if you want to control the result.

     

  • Hi Anthony.

    Again, thank you for the fast response! Now it's clear to me.

    And also thank you for the hint that the content of the Data Field and the register depends on when the CPU write access is performed. We had already thought about this possible situation. In our case we are in the lucky situation that the ECNT instruction does not use a register. But to be on the safe side, this fact is documented in the N2HET program.

    Regards

    Oliver.