SK-AM64B: AM64x PRU assembler, how does this bloody carry work?

michel GUENEGO


Part Number: SK-AM64B
Other Parts Discussed in Thread: OMAP-L138

Hi,

In PRU assembly language, the ADD instruction is defined as:

Syntax:

ADD REG1, REG2, OP(255)

Operation:

REG1 = REG2 + OP(255)
carry = (( REG2 + OP(255) ) >> bitwidth(REG1)) & 1

The SUC instruction (subtract with carry) should be defined as follows (I say "should" because SPRUIJ2, April 2018, shows + where - is meant):

Syntax:

SUC REG1, REG2, OP(255)

Operation:

REG1 = REG2 - OP(255) - carry
carry = (( REG2 - OP(255) - carry ) >> bitwidth(REG1)) & 1

I tried this code sequence:

LDI r0.w0, 65534
ADD r0.w0, r0.w0, 1 ; no carry expected
SUC r0.w0, r0.w0, 0 ; so should not change the result, 65535 expected

LDI r0.w0, 65535
ADD r0.w0, r0.w0, 1 ; carry expected
SUC r0.w0, r0.w0, 0 ; so should subtract 1 and return to 65535

Surprisingly, after each SUC instruction the result is the opposite of what I expected, as if the subtraction used an inverted carry: in the first case the result is 65534, and in the second case the result is 0.
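For reference, the two sequences can be traced with a small Python model of the definitions as quoted in this post. This is only a sketch of the documented semantics as read here, not of the hardware; the helper names add_doc and suc_doc are made up for illustration.

```python
# Model of ADD and SUC exactly as the quoted definitions read.
# add_doc and suc_doc are hypothetical helper names, not TI APIs.

def add_doc(a, b, width):
    """Return (result, carry) for ADD as documented."""
    total = a + b
    return total & ((1 << width) - 1), (total >> width) & 1

def suc_doc(a, b, carry, width):
    """Return (result, carry) for SUC as documented: a - b - carry."""
    diff = a - b - carry
    return diff & ((1 << width) - 1), (diff >> width) & 1

# Sequence 1: 65534 + 1 does not overflow 16 bits, so carry is 0.
r, c = add_doc(65534, 1, 16)
seq1, _ = suc_doc(r, 0, c, 16)

# Sequence 2: 65535 + 1 wraps to 0 and sets carry.
r, c = add_doc(65535, 1, 16)
seq2, _ = suc_doc(r, 0, c, 16)

print(seq1, seq2)  # values the documented definitions predict
```

Under this reading both sequences end at 65535, which is not what the hardware returned.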

Could TI explain this behavior and correct the PRU Assembly Instruction User Guide (SPRUIJ2)?

Thanks in advance.

over 1 year ago

Ki, over 1 year ago

Hello,

I have brought this thread to the attention of the compiler experts. Please note that due to the local holiday it may take until Wednesday for a response.

Thank you for your patience.

Nick Saulnier, over 1 year ago

Hello Michel,

I am reaching out to the PRU hardware designer to get clarification on the expected behavior for the carry bit. Please ping the thread if I do not have a response by the middle of next week.

Regards,

Nick

Nick Saulnier, over 1 year ago, in reply to Nick Saulnier

Out of curiosity, does it change the behavior if you use different registers for your input and output?

e.g.,

LDI r0.w0, 65534
ADD r1.w0, r0.w0, 1 ; no carry expected
SUC r0.w0, r1.w0, 0 ; so should not change the result, 65535 expected

and

LDI r0.w0, 65535
ADD r1.w0, r0.w0, 1 ; carry expected
SUC r0.w0, r1.w0, 0 ; so should subtract 1 and return to 65535

Regards,

Nick

michel GUENEGO, over 1 year ago, in reply to Nick Saulnier

The result is the same if I change the registers (r2 everywhere, or r2 and r3, ...). I tried many variations; the sequence above was one of them.

The subtraction is probably implemented as an addition with the second operand negated, in which case the carry comes out inverted:

0 - 1 = 0 + (-1) = 0 + FF = FF and carry is 0 (but should be 1).

1 - 1 = 1 + (-1) = 1 + FF = 0 and carry is 1 (but should be 0).

The problem is in the documentation, which suggests that the shared carry flag works the same way for both addition and subtraction.
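The two 8-bit cases can be checked with a short Python sketch. It assumes the adder computes a - b as a + ~b + 1 (the usual two's complement trick) and simply reports the carry out of that addition; the function name sub_via_add is made up for illustration and says nothing certain about the actual PRU logic.

```python
# 8-bit illustration: compute a - b as a + ~b + 1 and report the
# carry out of the adder. sub_via_add is a hypothetical name.

def sub_via_add(a, b, width=8):
    mask = (1 << width) - 1
    total = a + (~b & mask) + 1   # ~b + 1 is the two's complement of b
    return total & mask, (total >> width) & 1

for a, b in [(0, 1), (1, 1)]:
    result, carry = sub_via_add(a, b)
    print(f"{a} - {b} = 0x{result:02X}, adder carry = {carry}")
```

With this scheme the adder's carry out is 0 exactly when a borrow is needed, which is the inversion described in the two cases.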

Michel

Nick Saulnier, over 1 year ago, in reply to michel GUENEGO

Hello Michel,

Apologies for the delayed response here. Thank you for getting Guillaume to bring this thread back to my attention; my emails with the PRU developers got lost in my inbox.

Upon taking another look, you are testing with 16-bit numbers. Does the behavior change when using 32-bit numbers instead?

It sounds like my team members expect 32-bit carry to work, but it does not sound like 16-bit carry has been tested.

Regards,

Nick

michel GUENEGO, over 1 year ago, in reply to Nick Saulnier

Hi Nick

The data size changes nothing.

Executing this on the SK-AM64B with the integrated emulator in Code Composer Studio 12 (note: it is not easy to make it work with an HS-FS AM64x ...):

zero &r0, 120 ; r0..r29 = 0 (120 bytes = 30 registers)

LDI r0.w0, 65534
ADD r1.w0, r0.w0, 1 ; no carry expected
SUC r1.w0, r1.w0, 0 ; so should not change the result, 65535 expected

LDI r2.w0, 65535
ADD r3.w0, r2.w0, 1 ; carry expected
SUC r3.w0, r3.w0, 0 ; so should subtract 1 and return to 65535

LDI32 r4, 0xfffffffe
ADD r5, r4, 1 ; no carry expected
SUC r5, r5, 0 ; so should not change the result, 0xffffffff expected

LDI32 r6, 0xffffffff
ADD r7, r6, 1 ; carry expected
SUC r7, r7, 0 ; so should subtract 1 and return to 0xffffffff

gives

R0: 0x0000FFFE
R1: 0x0000FFFE
R2: 0x0000FFFF
R3: 0x00000000
R4: 0xFFFFFFFE
R5: 0xFFFFFFFE
R6: 0xFFFFFFFF

The carry seems inverted for subtract.
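A small Python model built on that hypothesis gives the same R1, R3 and R5 values as the listing. It assumes ADD produces an ordinary carry while SUC computes a + ~b + carry, so the flag acts as "no borrow"; this is a guess at the behavior, not TI's implementation, and the function names are illustrative.

```python
# Hypothetical model: ADD produces a normal carry, while SUC treats the
# carry flag as "no borrow" (result = a + ~b + carry). Names are illustrative.

def add(a, b, width):
    total = a + b
    return total & ((1 << width) - 1), (total >> width) & 1

def suc(a, b, carry, width):
    mask = (1 << width) - 1
    total = a + (~b & mask) + carry
    return total & mask, (total >> width) & 1

def add_then_suc(start, width):
    """ADD 1 to start, then SUC 0, as in the test sequences."""
    r, c = add(start, 1, width)
    r, _ = suc(r, 0, c, width)
    return r

for name, start, width in [("R1", 0xFFFE, 16), ("R3", 0xFFFF, 16),
                           ("R5", 0xFFFFFFFE, 32), ("R7", 0xFFFFFFFF, 32)]:
    print(f"{name}: 0x{add_then_suc(start, width):08X}")
```

The R7 line is only the model's prediction, since R7 is not in the listing.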

Could a TI PRU designer confirm this, and could someone correct the PRU Assembly Instruction User Guide (SPRUIJ2)?

Best Regards,

Michel

Nick Saulnier, over 1 year ago, in reply to michel GUENEGO

Hello Michel,

Thanks for the additional tests. Agreed that is not the behavior I would expect. I assume the R7 result was also 0x00000000?

I have asked the IP designer to double check the source code based on your testing. Based on those tests, it is not clear to me if the SUC command is the issue, or if there is something else going on (e.g., if ADD sets the carry bit, is that bit cleared with the ZERO command? If the bit is not cleared, and then you do an ADD that does NOT set the carry bit, does the second add overwrite the carry bit value to 0? Or is it left as 1 from the first ADD instruction?)

Please feel free to ping the thread if I do not have a response for you by Wednesday.

Regards,

Nick

Nick Saulnier, over 1 year ago, in reply to Nick Saulnier

Hello Michel,

I'm not super familiar with the source code language. However, when I looked at the source code, it tentatively seems like SUC uses the same logic as ADC (where the carry bit is added to the output instead of subtracted). I am having some other team members double check my work before I file a ticket to update the assembly instruction docs. Please ping the thread if I have not provided another update within a couple of business days.

Regards,

Nick

Nick Saulnier, over 1 year ago, in reply to Nick Saulnier

Hello Michel,

Thanks for your patience here. I've got a meeting with the associated HW folks later this week. Hoping to close out everything by Friday.

Regards,

Nick

Nick Saulnier, over 1 year ago, in reply to Nick Saulnier

Hello Michel,

Can you confirm the PRU compiler that you are using? The HW folks will run some tests on their side to validate behavior, and will come back to us in a couple of days.

Regards,

Nick

michel GUENEGO, over 1 year ago, in reply to Nick Saulnier

Hi Nick,

I use TI CGT PRU 2.3.3.

Sorry, I forgot to copy R7 in the previous 32-bit example.

Yes, R7 is 0x00000000 at the end.

Best Regards,

Michel

Nick Saulnier, over 1 year ago, in reply to michel GUENEGO

Edited Aug 29 2023: This is NOT actually a bug in the assembly instructions. However, the docs still need to be clarified. See the next response for more details.

Hello Michel,

Thank you for reporting this, and for your patience through this process.

I am able to confirm that there is a bug in both SUC and RSC assembly commands for all currently released processors with a PRU (should affect AM18/OMAP-L138, AM335x, AM437x, AM57x, AM62x, AM64x, AM65x, and any J7 devices available as of Aug 2023 that have a PRU subsystem).

I am working with other teams to get our documentation updated, PRU compiler updated, etc. Exact documentation changes are TBD, but currently we are planning to deprecate those two instructions for all existing PRU devices.

Please let us know if any additional discussion is needed.

Regards,

Nick

Nick Saulnier, over 1 year ago, in reply to Nick Saulnier

Hello Michel,

We have some updates! Thanks to Dimitar for commenting at https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1262131/sk-am64b-pru-deprecation-of-suc-and-rsc

It looks like the PRU core designer copied the ARM design concept of implementing subtraction with 2's complement logic. SUB uses a carry bit value of 1 to implement "normal" subtraction. This means that for the output of SUB, carry_bit = 1 means no carry occurred, while carry_bit = 0 means there is a carry bit that needs to be subtracted. For more details on the ARM logic, see https://stackoverflow.com/questions/41253124/i-cant-understand-some-instructions-in-arm-sbc-rsc

So it looks like we need to think about the carry bit as being negative or positive. Instructions that assume a "positive" carry bit as an input (e.g., ADC) will only work with instructions that give a "positive" carry bit as an output (e.g., ADD), while instructions that assume a "negative" carry bit (SUC / RSC) will only work with instructions that give a "negative" carry bit as an output (e.g., SUB).

So SUC/RSC are actually working as designed. We will still work to clarify our documentation, since we do not discuss the above concepts anywhere.
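To illustrate the pairing, here is a Python sketch of a 64-bit subtraction built from a SUB on the low words followed by a SUC on the high words. It assumes the ARM-style semantics described in this post (SUB as a + ~b + 1, SUC as a + ~b + carry); the function names are illustrative and the model has not been checked against the PRU source.

```python
# Sketch of the pairing rule, assuming ARM-style carry semantics.
# All function names are illustrative, not TI APIs.
MASK32 = 0xFFFFFFFF

def sub(a, b):
    """SUB: a + ~b + 1; a carry out of 1 means no borrow."""
    total = a + (~b & MASK32) + 1
    return total & MASK32, total >> 32

def suc(a, b, carry):
    """SUC: a + ~b + carry, consuming the 'negative' carry from SUB."""
    total = a + (~b & MASK32) + carry
    return total & MASK32, total >> 32

def sub64(a, b):
    """64-bit subtract: SUB on the low words, then SUC on the high words."""
    lo, carry = sub(a & MASK32, b & MASK32)
    hi, _ = suc(a >> 32, b >> 32, carry)
    return (hi << 32) | lo

print(hex(sub64(0x1_0000_0000, 1)))  # borrow propagates into the high word
```

In this model, feeding SUC the carry from an ADD instead of a SUB gives the off-by-one results reported earlier in the thread.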

Regards,

Nick

michel GUENEGO, over 1 year ago, in reply to Nick Saulnier

Hi Nick,

I am happier with this new answer than with the previous one.

If you reread my original question at the beginning of this discussion, this is what I saw: the carry seems to work inverted for subtraction. I only asked for confirmation of that, because the documentation says it is the same carry, not inverted for subtraction. I also stressed this in the later examples, with 32 bits as well as 16.

So thank you for taking the time to shed light on this point.

Best Regards,

Michel
