Can you please help us identify the C64x+ linear assembly equivalent of the MAX function?

Other Parts Discussed in Thread: TMS320C40

We are trying to find the C64x+ equivalent of R7 = MAX(R5, R6).

The TMS320C40 instructions that find the maximum of two operands are shown below. Can you please provide the equivalent C64x+ linear assembly code? Thanks.

        LDF        R6, R7
        CMPF       R5, R7
        LDFLT      R5, R7

  • I recommend you write the equivalent operation in C, then see what code the compiler generates for it. Use the option --src_interlist to keep the assembly file and have some useful comments inserted into it. You'll see that the C64x+ does not have floating-point instructions. For details on the instructions used, look them up in the C64x/C64x+ CPU manual.

    Thanks and regards,

    -George
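
A minimal sketch of the kind of C source George's suggestion refers to. The function name max_f32 is hypothetical (it does not appear in the thread); the idea is only to hand the compiler a plain C version of R7 = MAX(R5, R6) and read the interlisted assembly it produces.

```c
/* Hypothetical helper: plain C equivalent of R7 = MAX(R5, R6).
 * Compile it with --src_interlist to inspect the generated assembly. */
float max_f32(float a, float b)
{
    /* Same selection as the C40 sequence: keep b unless a is greater. */
    return (a > b) ? a : b;
}
```

Because the C64x+ has no floating-point hardware, the comparison in the generated code would be expected to show up as a call into the run-time support library rather than as a single instruction.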

  • Thanks for your support. We tried this but were not able to make progress. Can you please point us to the TI C64x+ assembly programming guide?
  • As George mentioned, the TI C64x+ does not have floating-point instructions, so you'll need to call an RTS routine to do a floating-point comparison.  Here's how it would look in linear assembly:

            MV         a, A4
            MV         b, B4
            CALLP      __c6xabi_gtf, B3
            MV         A4, pred
            MV         b, result
     [pred] MV         a, result
    
  • When we single-step the linear assembly code in the TI CCS debugger, it does not seem to reflect the correct behaviour. For example, the values in various registers are not what we expect when we inspect them at breakpoints placed at the relevant places. Can you please help us with this issue?

    We can also see that different variables (.reg R1, R2) were mapped to the same register (A5) internally. Can we change the mapping of these variables in TI CCS for C64x+? Thanks in advance.
  • hsg ken said:
    While debugging the linear assembly code and when we single step it using TI CCS debugger, it doesn't seem to reflect the correct behaviour.

    That's probably because the results of the instruction you are stopped at are a few cycles away from completing. An LDW, for instance, has 4 delay slots. So, when you are stopped at the LDW, the result of the load does not appear in the destination register until 4 cycles later.

    hsg ken said:
    Also it can be seen that various variables (.reg R1,R2) were mapped to same register

    This is normal behavior. If the lifetime of two variables does not overlap, then they can be allocated to the same register.  This is good in that it reduces the number of registers used, and thereby increases the number of variables which are allocated to registers.  The best policy is to give your variables descriptive names like count or sum, then let the tools allocate them to registers.

    Thanks and regards,

    -George
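
The point about non-overlapping lifetimes can be illustrated with a small C sketch. The function and variable names are hypothetical; whether the tools actually place two variables in the same register is up to the allocator.

```c
/* Hypothetical example of non-overlapping variable lifetimes. */
int scale_then_offset(int x)
{
    int doubled = x * 2;   /* lifetime of 'doubled' begins here          */
    int y = doubled + 1;   /* last use of 'doubled': its lifetime ends   */

    int squared = y * y;   /* 'squared' begins after 'doubled' is dead,  */
                           /* so both may be given the same register     */
    return squared - y;    /* 'y' overlaps both and needs its own        */
}
```

Seeing doubled and squared share one register in the debugger would therefore not indicate a bug, only that the first value was no longer needed.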

  • I'm sorry, my example was wrong. Here is a complete example:

            .def       fmax
            .ref       __c6xabi_gtf
    fmax    .cproc     a, b
            .reg       pred, result
            .call      pred=__c6xabi_gtf(a, b)
            MV         b, result
     [pred] MV         a, result
            .return    result
            .endproc
    
  • Thanks for the reply. We now need help turning on the assembly optimizer. We have a .sa file containing assembly code with no functional units specified, and we want the settings that convert this .sa file into a final .asm file with functional units assigned and the best possible cycle counts. Thanks in advance.
  • The compiler does that work. You pass the .sa file to the compiler just like a .c file, and the compiler does all of those things. You can see the .asm file by using the --keep_asm command line option.
  • Thanks for the reply. However, we want to understand which options to set in CCS to get the maximum possible optimization when our .sa file is converted to a .asm file. Right now we have set the Optimization level to 3 and, under Advanced optimization, we have set 5, but we are not getting the required optimization; the result is not even better than writing the code directly in C with TI intrinsics. Can you please tell us the correct settings in the CCS Properties window for this file so that the assembly optimizer performs the required optimizations? Thanks in advance.
  • Also, can you please help us understand whether we need to do context saving at the beginning of the .sa file and context restoring at the end, or whether the TI compiler tools do this automatically when we include the .sa file in the project?

    Also, can you help us with some code showing how to work with the current pointer/frame pointer, for example how to access and read values at frame pointer + 20? Thanks in advance.
  • hsg ken said:
    if we need to context saving in the .sa file at the beginning and context restoring at the end of the .sa file

    Organize your functions with the .cproc directive, and the tools take care of this detail.  Read about .cproc in the C6000 compiler manual.

    hsg ken said:
    can you help us with some code on how to do operation with the help of the current pointer/frame pointer

    Again, if your function uses .cproc, then the tools take care of this detail.  Create all the variables you need with .reg.  If any of them need to temporarily be on the stack, the tools take care of that for you.

    Thanks and regards,

    -George

  • You're trying to do a floating-point operation on a device that does not have a hardware floating-point comparison instruction. It's not going to get significantly better than the function call. Setting the optimization level to 3 and setting the silicon version (the --silicon_version option) to match your device should lead to the best possible code.