TMS320F28377D: VCU-II Error in calculation

Part Number: TMS320F28377D
Other Parts Discussed in Thread: C2000WARE

Hello,

I am trying to create a custom assembly routine that leverages the VCU-II for CRC computations.

I used the existing routines as examples and would like the prototype to be the following:

void CRC_calc (uint16_t *data, uint16_t length, uint16_t *result);

I made sure that all the values are passed correctly to the assembly routine and that VCRC has the correct starting value.
Regardless, the computation from the VCU seems off.

With the following prototype, I see no issues:

void CRC_calc (void *data, uint32_t length, uint32_t*result);

What are the requirements for calling the VCU library functions? Do the function prototypes have to use uint32_t rather than uint16_t?

Thanks,

Jesus Chung

  • Hi,

    I think there must be a mistake in your assumptions about which registers and values are passed when making the call. Can you take a look at the following document?

    https://www.ti.com/lit/an/sprac71a/sprac71a.pdf?ts=1616053259317

    I do not think there is any limitation on using uint32 vs. uint16. I will check that and see if there are any issues. Also, let me know which compiler version you are using.

    Thanks

    Aravindhan

  • Aravindhan,

    Thanks for the quick reply. I narrowed it down to passing in a uint16_t *result where the result is the seed to be passed on to VCRC via
    VMOV32       VCRC, mem32

    If I use a uint32_t *result, and do a VMOV32       VCRC, *XAR5, this seems to work.

    When passing a uint16_t *result, *XAR5 holds the result (0xFFFF 0xXXXX).
    To get the correct format, I made sure to put *result in a 32-bit register, and I see VCRC updating to 0x0000FFFF (where 0xFFFF is my seed).
    Regardless, after executing VCRC16P2L_1  *XAR4++ (octet 1), where *XAR4 is 0x0049, the result comes back as if the seed were 0x00000000.

    Is there an issue with, or a specific sequence required for, the value loaded into VCRC?

    Thanks,

    Jesus
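
A host-side software reference can help confirm whether a hardware result really corresponds to a seed of 0x0000 rather than 0xFFFF. The following is a minimal sketch, not TI code: it assumes that VCRC16P2 uses polynomial 0x1021, that bits are processed MSB first (CRCMSGFLIP cleared), and that no final XOR is applied. The name `crc16_p2_ref` is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-16 reference: MSB first, no reflection, no final XOR.
 * ASSUMPTION: polynomial 0x1021 is what VCRC16P2 implements. */
static uint16_t crc16_p2_ref(const uint8_t *bytes, size_t n, uint16_t seed)
{
    uint16_t crc = seed;
    for (size_t i = 0; i < n; i++) {
        /* Fold the next message byte into the top of the CRC register. */
        crc ^= (uint16_t)((uint16_t)bytes[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

Running the same message through this function with both seeds gives two candidate values to compare against what VCRC holds after the calculation. Note that VCRC16P2L_1 consumes only the low byte of each 16-bit word, so the byte array passed here should contain just those low bytes.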

  • Hi,

    I will probably need your sequence of assembly instructions to look further into what is going wrong. Is it possible to share it?

    Thanks

    Aravindhan

  • void CRC_VCU_8(uint16_t *msg_ptr, uint16_t length, uint16_t *result_crc);

    *XAR4 contains msg_ptr
    ACC contains the length

    *XAR5 contains the result_crc

    I initialize result_crc to 0xFFFF;

        VCRCCLR                                      ; Clear out the CRC result register
        MOVL         XAR0, ACC                       ; Store the length on XAR0
        MOVL         ACC, *XAR5                      ; Move the Seed to ACC
        MOV          AR3, AH                        ; Take the top 16-bit (where Seed is)
        VMOV32       VCRC, XAR3                     ; Load seed value into the CRC result register

    I check VCRC and it is 0x0000FFFF, but after executing VCRC16P2L_1  *XAR4++, the result is the same as with a seed of 0x0000.

    If I load *XAR5 directly into VCRC, VCRC is wrong because the load reads a 32-bit value, so VCRC is 0xFFFFXXXX, where XXXX is the adjacent 16-bit value:

    VMOV32       VCRC, *XAR5

    I am using compiler 6.2.9
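
The 0xFFFFXXXX value described above is consistent with how, as far as I understand the C28x memory model, a 32-bit access treats 16-bit-word memory: the access is aligned to an even word address, with the low word at the even address. The sketch below models that on a host; `read32_model` is a hypothetical helper, and the memory contents are made up for illustration.

```c
#include <stdint.h>

/* Host-side model of a 32-bit read from 16-bit-word memory.
 * ASSUMPTION: the access is aligned down to an even word address and the
 * word at the even address becomes the low half of the 32-bit value. */
static uint32_t read32_model(const uint16_t *mem, uint32_t word_addr)
{
    uint32_t even = word_addr & ~(uint32_t)1;
    return (uint32_t)mem[even] | ((uint32_t)mem[even + 1] << 16);
}
```

Under this model, a uint16_t seed stored at an even address lands in the low half of the value read (neighbor in the high half), while a seed stored at an odd address lands in the high half, which would produce the 0xFFFFXXXX pattern. Either way the neighboring word is pulled in, so a 16-bit seed needs to be widened before it is used as a mem32 operand.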

  • Thanks. Can you share the entire snippet up to the instruction VCRC16P2L_1  *XAR4++? That would be helpful. There is source code under the folder libraries\dsp\VCU\c28\source\vcu2\crc in C2000Ware. Can you also see if any of it can be used directly without having to write new code?

    Thanks

    Aravindhan

  • I'm using the same code as in the library but want to use a specific input format that does not use the CRC_Handle which is more generic than our current use/implementation.

    One of the main changes is to have the input seed a uint16_t.
    From the documentation, what does mem32 mean in the following, and how can I pass a uint16_t into VCRC?
    VMOV32 VCRC, mem32

    It seems that VMOV32 VCRC, *XAR5 works only if the input is a uint32_t pointer to the seed,

    but VMOV32 VCRC, XAR5 does not work when XAR5 holds the seed. I see VCRC update, but the calculations still behave as if VCRC were 0.

    My goal is to have a uint16_t (whether as the input or as a variable) as input to the function that can be used to initialize VCRC.

    _CRC_VCU_8:
        CRC_CONTEXT_SAVE
       
        VCRCCLR                                      ; Clear out the CRC result register
        MOVL         XAR0, ACC                       ; Store the length on XAR0
        VMOV32       VCRC, *XAR5                     ; Load seed value into the CRC result register   <---- this is uint16_t *seed
    _CRC_run16BitPoly1_Loop:
        MOV          AL, AR0
        MOV          AH, @AL
        AND          AL, #0xFFF8                     ; Check to see if length greater than 8 bytes
                                                     ; if true, handle the <8 bytes in a loop
                                                     ; AL is now a multiple of 8
        SBF          _CRC_run16BitPoly1_LT8BytesLeft, EQ
        LSR          AL, #3                          ; loop in 8 bytes at a time
        MOV          AR1, AL                         ; move count into AR1
        SUB          AR1, #1                         ; subtract 1, accounts for the RPTB instruction i.e. it loops
                                                     ; N + 1 times
        .align       2                               ; align at 32-bit boundary to remove penalty
                                                     ; loop through the message 8 bytes at a time
        RPTB         _CRC_run16BitPoly1_RepeatBlock, AR1
        VCRC16P2L_1  *XAR4++    ; octet 1
        VCRC16P2L_1  *XAR4++    ; octet 2
        VCRC16P2L_1  *XAR4++    ; octet 3
        VCRC16P2L_1  *XAR4++    ; octet 4
        VCRC16P2L_1  *XAR4++    ; octet 5
        VCRC16P2L_1  *XAR4++    ; octet 6
        VCRC16P2L_1  *XAR4++    ; octet 7
        VCRC16P2L_1  *XAR4++    ; octet 8
    _CRC_run16BitPoly1_RepeatBlock:
    _CRC_run16BitPoly1_LT8BytesLeft:
        ANDB        AH, #07h     ; Get the number of 0-7 bytes left
        MOVB        AL, #8       ; Move 8 to AL
        SUB         AL, AH       ; Find the number of instructions we need (8 - bytes left)
       
        ; AL Now has the number of bytes left in instructions. Now multiply by how much each instruction takes
        MPY          ACC, AL,#(__CRC_run16BitPoly1_LT8BytesLeft_end - _CRC_run16BitPoly1_LT8BytesLeft_start)
       
        MOVL         XAR7, #_CRC_run16BitPoly1_LT8BytesLeft_start
        ADDL         XAR7, ACC  ; Add the number of instructions to jump to
       
        LB          *XAR7       ; jump to the correct place for the number of bytes left
        ;.align 2
    _CRC_run16BitPoly1_LT8BytesLeft_start:
        VCRC16P2L_1  *XAR4++        ; octet 1
    __CRC_run16BitPoly1_LT8BytesLeft_end:
        VCRC16P2L_1  *XAR4++        ; octet 2
        VCRC16P2L_1  *XAR4++        ; octet 3
        VCRC16P2L_1  *XAR4++        ; octet 4
        VCRC16P2L_1  *XAR4++        ; octet 5
        VCRC16P2L_1  *XAR4++        ; octet 6
        VCRC16P2L_1  *XAR4++        ; octet 7
        VCRC16P2L_1  *XAR4++        ; octet 8
       
    _CRC_run16BitPoly1_End:
        VMOV32       ACC, VCRC     ; Save the result to the structure
        CRC_CONTEXT_RESTORE
        LRETR                       ; Return value is in AL
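
For the stated goal of accepting a uint16_t seed while still giving VMOV32 VCRC, mem32 a full 32-bit operand, one option is a thin C wrapper that zero-extends the seed into a uint32_t local and passes its address to the 32-bit-prototype routine that was reported to work. This is only a sketch under that assumption: `crc_calc_u16`, `crc32_seed_fn`, and `crc_stub` are hypothetical names, and `crc_stub` is a host-only stand-in that does not compute a CRC.

```c
#include <stdint.h>

/* Signature of the 32-bit-seed routine reported as working in this thread. */
typedef void (*crc32_seed_fn)(void *data, uint32_t length, uint32_t *result);

/* Host-only stand-in (hypothetical): records the seed word it was handed
 * and returns a placeholder value. It does NOT compute a CRC. */
static uint32_t stub_seed_seen;
static void crc_stub(void *data, uint32_t length, uint32_t *result)
{
    (void)data;
    (void)length;
    stub_seed_seen = *result;   /* what a 32-bit load of the seed would see */
    *result = 0x00001234u;      /* placeholder result */
}

/* Wrapper: widen the 16-bit seed so the routine reads 0x0000SSSS. */
static uint16_t crc_calc_u16(crc32_seed_fn routine, uint16_t *data,
                             uint16_t length, uint16_t seed)
{
    uint32_t crc = seed;                 /* zero-extended, 32-bit storage */
    routine(data, length, &crc);
    return (uint16_t)(crc & 0xFFFFu);    /* CRC-16 lives in the low half */
}
```

On the target, the assembly routine with the `void *data, uint32_t length, uint32_t *result` prototype would be passed in place of `crc_stub`. The wrapper keeps the 16-bit interface for callers without changing the assembly.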

  • Hi,

    I will check with the design team on whether this instruction can operate on 16-bit values. It probably is not possible, but I will confirm.

    Thanks

    Aravindhan

  • Hi,

    Mem32 is documented in our Extended Instruction Set Ref Manual - https://www.ti.com/lit/ug/spruhs1c/spruhs1c.pdf. I am not sure why you have to use uint16_t in your code. Is there a specific reason?

    Also, you mentioned that "VMOV32 VCRC, XAR5 would not work where XAR5 has the seed. I see VCRC update but the calculations are still as if VCRC is 0." Can you add a couple of NOPs after this instruction and also after VCRCCLR and see if there is any impact? Also put a couple of NOPs at the beginning of _CRC_run16BitPoly1_End: and let me know what happens.

    Thanks

    Aravindhan 

  • Hi,

    Have you been able to make any further progress? Did you try the above suggestions?

    Thanks

    Aravindhan

  • Hello Aravindhan,

    Thanks for your comments. I did include a NOP before doing a VMOV32 ACC, VCRC and this solved the issue.


    However, I still do not understand why this is needed.

    Based on www.ti.com/.../spruhs1c.pdf page 469, 4.5.2 General Guidelines for VCRC Pipeline Alignment

    For example - read of VCRC after CRC calculation (Legal scenario):
        VCRC16P1L_1 *XAR7++
        VMOV32 *XAR6++, VCRC

    Why is the NOP required, and will the documentation be updated to state that a NOP is needed between a VCRC calculation and a VMOV32 that reads VCRC?

    Thanks,

    Jesus

  • Great. I will look into the documentation and see where this is mentioned. 

    Thanks

    Aravindhan

  • Hi,

    The need for NOPs is mentioned on page 469 of the document https://www.ti.com/lit/ug/spruhs1c/spruhs1c.pdf. Can you let me know if this documentation is clear or needs any improvement?

    Thanks

    Aravindhan

  • Hello Aravindhan,

    This is the same document I refer to above. In that document, it reads the following:

    1. All fixed polynomial VCRC instructions are executed in single cycle. However if fixed polynomial VCRC instructions is followed by an instruction which updates VCRC register then a NOP is necessary before update of VCRC register

    ..

    For example - read of VCRC after CRC calculation (Legal scenario):
    VCRC16P1L_1 *XAR7++
    VMOV32 *XAR6++, VCRC

    Why is this a legal scenario? Should it read:

    For example - read of VCRC after CRC calculation (Legal scenario):
    VCRC16P1L_1 *XAR7++
    NOP
    VMOV32 *XAR6++, VCRC

    Based on the description, if we update the VCRC register, then a NOP is necessary before reading the VCRC register.
    In that case, the above "Legal scenario" is inaccurate and needs to be updated to include a NOP.

    Thanks

    Jesus

  • Hi,

    I will share these findings with the design folks and confirm if there is anything missing.

    Thanks

    Aravindhan

  • Hi,

    The architects' response is below. Did you have to add NOPs that are not covered by the following details? Let us know.

    thanks

    Aravindhan

    ---------------------------------------------------------------------------------------------- 

    The documentation is correct. See rule 6 of section 9.2 in the spec: 

    1. All polynomial specific CRC instructions update the VCRC register at the end of the E2 (and not E1) phase of the pipeline (one cycle late compared to a strict single-cycle instruction). However, if the VCRC register is used as a source by some other VCU instruction immediately after a CRC instruction, the VCRC register value is “forwarded” and hence the sequence is allowed.

    On the other hand, if the VCRC register is the destination of a single-cycle instruction immediately after a CRC instruction, both instructions will try to update the VCRC register in the same cycle, and hence this sequence is illegal. For example:

    Legal scenario (read of VCRC after CRC calculation):

    VCRC16P1L_1        *XAR7++
    VMOV32             *XAR6++, VCRC

    Illegal scenario (write of VCRC after CRC calculation):

    VCRC16P1L_1        *XAR7++
    VMOV32             VCRC, *XAR6++

    To make the above legal, insert a NOP:

    VCRC16P1L_1        *XAR7++
    NOP
    VMOV32             VCRC, *XAR6++

    ----------------------------------------------------------------------------------------------

  • Hi,

    Can you look at my last comment and see if the documentation is good enough or if we are missing something? If you are fine with it, let me know so we can close this thread.

    Thanks

    Aravindhan

  • Aravindhan,

    The documentation seems clear about the pipeline alignment rules for reading and writing the VCRC register and whether these can be performed back to back.

    Legal scenario (read of VCRC after CRC calculation):

    VCRC16P1L_1        *XAR7++
    VMOV32             *XAR6++, VCRC

    However, this does not seem to hold in practice: I had to add a NOP between these two instructions, per your comments.
    Is there an alignment issue? It seems to occur when I have an odd number of VCRC calculations performed. Or does it depend on having an odd number of instructions, so that the pipeline is aligned?

    thanks,

    Jesus

  • Jesus,

    Thanks for pointing this out. This should not be the case. I will check with the design folks and confirm.

    Thanks

    Aravindhan

  • Jesus,

    The design team would like to get your code. The one that fails and the changes you made to make it work. Both of these. It will be very helpful for us to look and see if we have missed something. Appreciate your help.

    Thanks

    Aravindhan

  • ****************************************************************************************************
    * Function:     Crc_16_init
    *
    * Description:  Ensures that the CRCMSGFLIP bit is cleared, this ensures that the input
    *               is interpreted in normal bit-order.
    *               Workaround to the silicon issue of first VCU calculation on power up being
    *               erroneous:
    *               Details: Due to the internal power-up state of the VCU module, it is possible
    *               that the first CRC result will be incorrect. This condition applies to the
    *               first result from each of the eight CRC instructions.
    *               This rare condition can only occur after a power-on reset, but will not
    *               necessarily occur on every power on. A warm reset will not cause this condition
    *               to reappear.
    *               Workaround(s) The application can reset the internal VCU CRC logic by
    *               performing a CRC calculation of a single byte in the initialization routine.
    *               This routine only needs to perform one CRC calculation and can use any of the
    *               CRC instructions
    *
    * Parameters:   N/A
    *
    * Returns:  N/A
    *
    * Globals:  N/A
    *
    * Assumptions: MUST be called before other functions are used
    *
    * Register Usage:   XAR7
    *              XAR7 assigned to start a conversion
    *
    ****************************************************************************************************
    _Crc_16_init:
        VCLRCRCMSGFLIP
        MOVB      XAR7, #0
        VCRC8L_1  *XAR7
        VCRCCLR
        LRETR
        
    ****************************************************************************************************
    * Function:     Crc_16_calc
    *
    * Description:  Calculates a 16-bit CRC of a 16-bit stream of data
    *
    * Parameters:   uint16_t *msg_ptr is stored XAR4
    *               uint16_t length is stored in AL
    *               uint16_t seed is stored in AH
    *
    * Returns:      uint16_t CRC, in AL
    *
    * Globals:      N/A
    *
    * Assumptions: Crc_16_init MUST be called before
    *
    * Register Usage:   ACC(AH & AL),P(PH & PL) AR0, XAR4, XAR7, VCRC
    *               AL   assigned to the length
    *               P    copies the CRC seed so it can be pushed to the VCRC register.
    *               AR0  is used for the loop of
    *               XAR4 is assigned to the input data pointer
    *               XAR7 is used to jump to the correct starting point in the chunk calculation
    *
    ****************************************************************************************************
    * uint16_t Crc_16_calc(uint16_t *msg_ptr, uint16_t Length, uint16_t seed)
    _Crc_16_calc:
        VCRCCLR                                      ; Clear out the CRC result register
        MOV          PL, AH
        MOVL         *SP++, P                        ; Store the seed on the stack so it can be accessed as mem32
        VMOV32       VCRC, *--SP                     ; Load seed value into the CRC result register
    
    _CRC_run16BitPoly3_Loop:
        MOV          AH, @AL
        LSR          AL, #2                          ; AL = number of complete 4-word blocks
                                                     ; if zero (EQ), fewer than 4 words remain:
                                                     ; skip the block loop and handle the remainder
        SBF          _CRC_run16BitPoly3_LT8BytesLeft, EQ
        MOV          AR0, AL                         ; move count into AR0
        SUB          AR0, #1                         ; subtract 1, accounts for the RPTB instruction i.e. it loops
                                                     ; N + 1 times
    
        .align       2                               ; align at 32-bit boundary to remove penalty
                                                     ; loop through the message 8 bytes at a time
        RPTB         _CRC_run16BitPoly3_RepeatBlock, AR0
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
    _CRC_run16BitPoly3_RepeatBlock:
    _CRC_run16BitPoly3_LT8BytesLeft:
        ANDB        AH, #03h     ; Get the number of 0-3 words left
        MOVB        AL, #4       ; Move 4 words to AL
        SUB         AL, AH       ; Find the number of instructions we need (4 - words left)
    
        ; AL Now has the number of bytes left in instructions. Now multiply by how much each instruction takes
        MPY          ACC, AL,#(_CRC_run16BitPoly3_LT8BytesLeft_end - _CRC_run16BitPoly3_LT8BytesLeft_start)
        MOVL         XAR7, #_CRC_run16BitPoly3_LT8BytesLeft_start
        ADDL         XAR7, ACC  ; Add the number of instructions to jump to
        LB          *XAR7       ; jump to the correct place for the number of bytes left
    
    _CRC_run16BitPoly3_LT8BytesLeft_start:
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
    _CRC_run16BitPoly3_LT8BytesLeft_end:
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        VCRC16P2H_1  *XAR4
        VCRC16P2L_1  *XAR4++
        ; Set a NOP so we have at least an instruction before we save the result
        NOP
    
    _CRC_run16BitPoly3_End:
        VMOV32       ACC, VCRC     ; Save the result to the structure
        LRETR                       ; Return value is in AL
    

    crc_vcu2_for_TI.asm

    Aravindhan,

    Attached is the code snippet. It is well commented and shows where the NOP is added to ensure that the VMOV32       ACC, VCRC that saves the VCRC value reads a valid result.

    Let me know if you require any additional information.

    Thanks,

    Jesus

  • Thanks, Jesus. I will check with the design team and get back to you.

    Thanks

    Aravindhan