Last iteration done separately to avoid possible pointer overrun into undefined memory

In the FPU 1.50 library there is the below function.

The comment at the end is:

";--- Last iteration done separately to avoid possible pointer overrun into
; undefined memory"

I can see that the last iteration just drops the ++ post-increment of the address pointer. I assume the concern is that the incremented pointer could end up pointing at 'undefined' memory, which I take to mean an address that doesn't exist in the memory map (for example, one that isn't defined in the linker command file).

What exact error would that cause? If I am exiting the function anyway, there is no real harm in the address pointer holding an invalid address, is there? So my question is: why have they gone to the effort of preventing the address pointer from incrementing?

;;#############################################################################
;;! \file source/vector/mpy_SP_RVxCV.asm
;;!
;;! \brief  C-Callable multiplication of a real vector and a complex vector
;;! \author David M. Alter
;;! \date   07/14/11
;;
;; HISTORY:
;;   07/14/11 - original (D. Alter)
;;
;; DESCRIPTION: C-Callable multiplication of a real vector and a complex vector
;;              y_re[i] = x[i]*w_re[i]
;;              y_im[i] = x[i]*w_im[i]
;;
;; FUNCTION:
;;   extern void mpy_SP_RVxCV(complex_float *y, const complex_float *w,
;;                            const float32 *x, const Uint16 N)
;;
;; USAGE:       mpy_SP_RVxCV(y, w, x, N);
;;
;; PARAMETERS:  complex_float *y = result complex array
;;              complex_float *w = input complex array
;;              float32       *x = input real array
;;              Uint16         N = length of w, x, and y arrays
;;
;; RETURNS:     none
;;
;; BENCHMARK:   5*N + 15 cycles (including the call and return)
;;
;; NOTES:
;;   1) N must be at least 2.
;;   2) The inputs and return value are of type 'complex_float':
;;
;;      typedef struct {
;;         float32 dat[2];
;;      } complex_float;
;;
;;  Group:            C2000
;;  Target Family:    C28x+FPU32
;;
;; Copyright (C) 2015 Texas Instruments Incorporated - http://www.ti.com/
;; ALL RIGHTS RESERVED
;;#############################################################################
;;$TI Release: C28x Floating Point Unit Library V1.50.00.00 $
;;$Release Date: Jun 2, 2015 $
;;#############################################################################

        .global _mpy_SP_RVxCV
        .text

_mpy_SP_RVxCV:
        MOVL    XAR6, *-SP[4]       ;XAR6 = &x
        ADDB    AL, #-2             ;Subtract 2 from N since RPTB is 'n-1'
                                    ;times, and last iteration done separately
        MOV32   R0H, *XAR5++        ;load first w_re value

;---Main loop
        RPTB    end_loop, @AL
        MOV32   R1H, *XAR6++        ;load next x value
        MPYF32  R2H, R1H, R0H       ;y_re[i] = x[i]*w_re[i]
     || MOV32   R0H, *XAR5++        ;load next w_im
        MPYF32  R3H, R1H, R0H       ;y_im[i] = x[i]*w_im[i]
     || MOV32   R0H, *XAR5++        ;load next w_re
        MOV32   *XAR4++, R2H        ;store y_re[i]
        MOV32   *XAR4++, R3H        ;store y_im[i]
end_loop:

;--- Last iteration done separately to avoid possible pointer overrun into
;    undefined memory
        MOV32   R1H, *XAR6          ;load next x value
        MPYF32  R2H, R1H, R0H       ;y_re[i] = x[i]*w_re[i]
     || MOV32   R0H, *XAR5          ;load next w_im
        MPYF32  R3H, R1H, R0H       ;y_im[i] = x[i]*w_im[i]
        MOV32   *XAR4++, R2H        ;store y_re[i]
        MOV32   *XAR4, R3H          ;store y_im[i]

;Finish up
        LRETR                       ;return

;end of function _mpy_SP_RVxCV()
;*********************************************************************

        .end

;;#############################################################################
;;  End of File
;;#############################################################################
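
For readers who want the arithmetic without the pipelining, the computation described in the file header can be written as a plain C loop. This is only a sketch of what the routine computes, not the library's implementation: the name `mpy_SP_RVxCV_ref` is hypothetical, `float` and `unsigned` stand in for the library's `float32` and `Uint16` typedefs, and the ordering `dat[0]` = real, `dat[1]` = imaginary is an assumption based on the load order in the listing.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the typedef quoted in the file header.
   Assumption: dat[0] holds the real part, dat[1] the imaginary part. */
typedef struct {
    float dat[2];
} complex_float;

/* Hypothetical reference model; not part of the TI FPU library. */
void mpy_SP_RVxCV_ref(complex_float *y, const complex_float *w,
                      const float *x, unsigned n)
{
    assert(y != NULL && w != NULL && x != NULL);

    for (unsigned i = 0; i < n; i++) {
        y[i].dat[0] = x[i] * w[i].dat[0];   /* y_re[i] = x[i]*w_re[i] */
        y[i].dat[1] = x[i] * w[i].dat[1];   /* y_im[i] = x[i]*w_im[i] */
    }
}
```

Written this way, every index stays inside 0..n-1, so the overrun question only arises once loads are issued ahead of time, as in the assembly version.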

  • As far as I know, there's no problem having an invalid pointer value in an XARn register as long as you don't use it to access memory.
    If you were writing similar code in small memory model, you might be concerned about changing the upper 6 bits of the XARn registers (I'm not going to go into this here), but this code is clearly meant for large memory model, so I don't see why it would matter.
  • If you look at the code, you will see that the last iteration must be done separately, or there could potentially be an access into undefined memory.

    The main loop looks like this:

            MPYF32      R2H, R1H, R0H   ;y_re[i] = x[i]*w_re[i]
            || MOV32    R0H, *XAR5++    ;load next w_im

            MPYF32      R3H, R1H, R0H   ;y_im[i] = x[i]*w_im[i]
            || MOV32    R0H, *XAR5++    ;load next w_re

    The last MOV32 is actually a load in advance for the next loop iteration. After the last iteration there is no next iteration, so that load must not be performed. The last iteration is written in the code like this:

            MPYF32      R2H, R1H, R0H   ;y_re[i] = x[i]*w_re[i]
            || MOV32    R0H, *XAR5      ;load next w_im

            MPYF32      R3H, R1H, R0H   ;y_im[i] = x[i]*w_im[i]

    First, we see that the first MOV32 doesn't increment the XAR5 pointer. That isn't really important; we could increment the pointer with no adverse effects, but it is good practice not to increment a pointer when it doesn't need to be. Second, you can see that the second MOV32 is missing from the code.

    If we hadn't written the last iteration separately and instead had executed the loop code an additional time, pointer XAR5 would have been incremented again by the first MOV32, and then an access would be made to that potentially undefined memory by the second MOV32.

    Writing the last iteration separately (it covers two multiplies, the real and the imaginary part of the final element) avoids this potential problem.

    Regards,

    David
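
The loop structure described in this answer can be modelled in plain C. The following is a minimal sketch under stated assumptions, not the library's code: the function name `mpy_rvxcv_pipelined` and the returned load counter are hypothetical, the complex arrays are treated as flat `float` arrays with interleaved re/im values, and the locals `r0` to `r3` merely stand in for the R0H to R3H registers. It shows the preload before the loop, the look-ahead load at the end of each pass, and the peeled last iteration that omits it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of the pipelined loop; w and y hold 2*n floats
   (re, im, re, im, ...), x holds n floats.
   Returns the number of loads made from w, so a caller can confirm that
   exactly 2*n elements were read and nothing beyond the array. */
size_t mpy_rvxcv_pipelined(float *y, const float *w, const float *x, unsigned n)
{
    size_t w_loads = 0;

    assert(n >= 2);                     /* note 1 in the file header */

    float r0 = *w++;  w_loads++;        /* preload w_re[0] before the loop */

    for (unsigned i = 0; i + 1 < n; i++) {   /* n-1 pipelined passes */
        float r1 = *x++;                /* load x[i] */
        float r2 = r1 * r0;             /* y_re[i] = x[i]*w_re[i] */
        r0 = *w++;  w_loads++;          /* load w_im[i] */
        float r3 = r1 * r0;             /* y_im[i] = x[i]*w_im[i] */
        r0 = *w++;  w_loads++;          /* look-ahead: w_re[i+1] for the next pass */
        *y++ = r2;                      /* store y_re[i] */
        *y++ = r3;                      /* store y_im[i] */
    }

    /* Peeled last iteration: same arithmetic, but no look-ahead load,
       so the element at index 2*n (one past the end of w) is never read. */
    float r1 = *x;                      /* load x[n-1] */
    float r2 = r1 * r0;                 /* y_re[n-1] */
    r0 = *w;  w_loads++;                /* load w_im[n-1], the last valid element */
    float r3 = r1 * r0;                 /* y_im[n-1] */
    *y++ = r2;
    *y = r3;

    return w_loads;
}
```

If the loop body were simply run n times instead, the final look-ahead load would read index 2*n of w, which is the access the separate last iteration is there to avoid.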