Compiler/TM4C1290NCPDT: Byteswap

Part Number: TM4C1290NCPDT

Tool/software: TI C/C++ Compiler

Hi,

I want to make use of the byte-swap instruction of the TM4C129 / Cortex-M4 device. I know that with gcc this can be done using inline assembler as follows:

inline uint32_t Rev16(uint32_t a)
{
  asm ("rev16 %0, %1"
          : "=r" (a)
          : "r" (a));
  return a;
}

I cannot find how to translate this into valid syntax for the TI compiler for the TM4C device. Any help is welcome.

Regards

Micky

  • Hello Michael,

    Just to clarify, you are looking to translate this into Code Composer Studio as an inline assembly function to use with .c code? Which CCS version are you using?
  • That is correct, I want to do this as a C inline.

    I use CCS7 under Linux.

    ti-cgt-arm_16.9.1.LTS -version gives:

    TI ARM C/C++ Compiler                   v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    
    TI ARM EABI C/C++ Parser                v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM C/C++ File Merge                 v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM C/C++ Optimizer                  v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM C/C++ Codegen                    v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Assembler                        v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Embed Utility                    v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM C Source Interlister             v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Linker                           v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Absolute Lister                  v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Strip Utility                    v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM XREF Utility                     v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM C++ Demangler                    v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Hex Converter                    v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Name Utility                     v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Object File Display              v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D
    TI ARM Archiver                         v16.9.1.LTS
    Build Number 1QM7P-2LI-VATAQ-TAR-C08D

  • May I note that (both) PRO IDEs (IAR & Keil) provide such "in-line" ASM support. (while unlikely to prove "exact" - clues may present which guide & assist.)

    "Free - code-size limited versions" - are available for download - and the VAST "User Manuals" provide "great detail" - which (may) operate to your advantage, here.
  • I do not want to change the toolchain or the compiler.
    My guess is that this can be done either as a C inline or (not as good) as an assembler function. But I have no idea how to do so with the TI ARM compiler. gcc can do this as mentioned above.
  • I was not suggesting that you "change" your toolchain - instead I believe that you may "gain insight" - by seeing how (others) have implemented your objective...

    Reinventing the "wheel" - as you've discovered - is not always "quick/easy."      (thus my suggestion)

  • Your compiler has ARM instruction intrinsics.

    I am looking at the manual (my manual is for 16.9.0.STS but I doubt intrinsics would change between minor versions) and table 5-3 in section 5.13 shows a list of these intrinsics.

    Calls to these intrinsics are written in code with the same syntax as a call to a C function but the compiler replaces it with the assembly instruction.

    See the intrinsic __rev16().

    Screenshot of the document page:

    There are additional pages with many more intrinsics.

    Not sure where the link to this manual is... Maybe someone else can post that link here. :-)
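
    As a rough sketch of what using the intrinsic could look like: the wrapper name `swap16` is made up for illustration, and the check on `__TI_ARM__` assumes the TI ARM compiler predefines that macro. Note that the REV16 instruction swaps the bytes within each 16-bit half of a 32-bit register, so the result is cast back to 16 bits here. The plain C fallback is only there so the same file also builds on a host compiler.

```c
#include <stdint.h>

/* Hypothetical wrapper (the name swap16 is not from the TI manual).
 * On the TI ARM compiler the call to __rev16() should be replaced by a
 * single REV16 instruction; on any other compiler a plain C fallback
 * is used so the code can still be built and tested on a PC. */
static inline uint16_t swap16(uint16_t x)
{
#if defined(__TI_ARM__)
    /* Assumption: __TI_ARM__ is predefined and __rev16() takes/returns int. */
    return (uint16_t)__rev16(x);
#else
    return (uint16_t)((x << 8) | (x >> 8));
#endif
}
```

    It would be called like any C function, e.g. `uint16_t y = swap16(0x1234u);`, which should give 0x3412.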

  • CMSIS provides that. So there are two options.

    1. Go CMSIS: I posted a set of files to do that. It works with CCS. Or you can simply use the CMSIS files for MSP432.

    2. Or simply copy and paste the rbit implementation from the CMSIS files. It is fairly independent.

    Both are simple to do.
  • I've found this "example of ASM" under (both) IAR and CCS.

    #if defined(ewarm) || defined(DOXYGEN)   // note that this is under IAR  (ewarm)
    static long
    MainLongMul(long lX, long lY)
    {
    //
    // The assembly code to efficiently perform the multiply (using the
    // instruction to multiply two 32-bit values and return the full 64-bit
    // result).
    //
    __asm(" smull r0, r1, r0, r1\n"
    " lsrs r0, r0, #16\n"
    " orr r0, r0, r1, lsl #16\n"
    " bx lr");

    //
    // This return is never reached but is required to avoid a compiler
    // warning.
    //
    return(0);
    }
    #endif

    #if defined(ccs)   //   And this is under CCS
    static long
    MainLongMul(long lX, long lY)
    {
    //
    // The assembly code to efficiently perform the multiply (using the
    // instruction to multiply two 32-bit values and return the full 64-bit
    // result).
    //
    __asm(" smull r0, r1, r0, r1\n"
    " lsrs r0, r0, #16\n"
    " orr r0, r0, r1, lsl #16\n"
    " bx lr\n");

    //
    // This is needed to keep the TI compiler from optimizing away the code.
    //
    return(lX * lY);
    }
    #endif

    You may note that the differences are (very) slight (the trailing \n and different return values, I now note). The advantage of the PRO IDEs (which well pre-existed the one here) is their "depth of User Manual explanation", their robustness and, of course, their "Vendor Agnostic" capability. (one and only one - PuhLease....)

  • Michael Schuster said:
    My guess is that this can be done either as a C inline or (not as good) as an assembler function.

    You may be dismissing this option too quickly. A good compiler may well be able to translate the C code into the equivalent (or better) of the assembly. Only drop to assembly if for some reason you need an exact sequence of assembly instructions or performance testing shows first that you need an improvement and second that assembly provides it.

    Check to see if something like the following (warning: unchecked code)

    uint16_t x;
    
    x = ((x & (uint16_t)0xFFu) << 8u) | ((x & (uint16_t)0xFF00u) >> 8u);

    performs as well or better than your assembly code with the additional advantage of not being compiler dependent.
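
    To make that comparison easier, the expression can be wrapped in small functions, as in the sketch below. The function names are invented for this example, and the 32-bit variant is an extension of the same idea rather than something from the original post. Whether a given compiler turns these into a single REV16/REV instruction is something to check in the generated assembly listing; it is not guaranteed.

```c
#include <stdint.h>

/* Portable 16-bit byte swap (hypothetical name), same masks and shifts
 * as the one-line expression above. */
static inline uint16_t bswap16_portable(uint16_t x)
{
    return (uint16_t)(((x & 0x00FFu) << 8) | ((x & 0xFF00u) >> 8));
}

/* Portable 32-bit byte swap (hypothetical name): move each of the four
 * bytes to its mirrored position. */
static inline uint32_t bswap32_portable(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}
```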

    Robert

  • Thanks a lot for your explanation. I can use the intrinsic function (as suggested by twelve12pm), but this might be useful for further problems.
  • Thanks, that was what I was originally searching for!
  • Just a note: Referring to ntohs in lwIP, I suggest an improvement:

    Change def.h in lwIP as follows to speed up the ntohs-style functions, and define LWIP_PLATFORM_BYTESWAP as 1 to enable it.

    /*
     * Copyright (c) 2001-2004 Swedish Institute of Computer Science.
     * All rights reserved. 
     * 
     * Redistribution and use in source and binary forms, with or without modification, 
     * are permitted provided that the following conditions are met:
     *
     * 1. Redistributions of source code must retain the above copyright notice,
     *    this list of conditions and the following disclaimer.
     * 2. Redistributions in binary form must reproduce the above copyright notice,
     *    this list of conditions and the following disclaimer in the documentation
     *    and/or other materials provided with the distribution.
     * 3. The name of the author may not be used to endorse or promote products
     *    derived from this software without specific prior written permission. 
     *
     * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED 
     * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
     * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT 
     * SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 
     * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT 
     * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 
     * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 
     * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING 
     * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY 
     * OF SUCH DAMAGE.
     *
     * This file is part of the lwIP TCP/IP stack.
     * 
     * Author: Adam Dunkels <adam@sics.se>
     *
     */
    #ifndef __LWIP_DEF_H__
    #define __LWIP_DEF_H__
    
    /* arch.h might define NULL already */
    #include "lwip/arch.h"
    #include "lwip/opt.h"
    
    #ifdef __cplusplus
    extern "C" {
    #endif
    
    #define LWIP_MAX(x , y)  (((x) > (y)) ? (x) : (y))
    #define LWIP_MIN(x , y)  (((x) < (y)) ? (x) : (y))
    
    #ifndef NULL
    #define NULL ((void *)0)
    #endif
    
    /* Endianess-optimized shifting of two u8_t to create one u16_t */
    #if BYTE_ORDER == LITTLE_ENDIAN
    #define LWIP_MAKE_U16(a, b) ((a << 8) | b)
    #else
    #define LWIP_MAKE_U16(a, b) ((b << 8) | a)
    #endif 
    
    #ifndef LWIP_PLATFORM_BYTESWAP
    #define LWIP_PLATFORM_BYTESWAP 0
    #endif
    
    #ifndef LWIP_PREFIX_BYTEORDER_FUNCS
    /* workaround for naming collisions on some platforms */
    
    #ifdef htons
    #undef htons
    #endif /* htons */
    #ifdef htonl
    #undef htonl
    #endif /* htonl */
    #ifdef ntohs
    #undef ntohs
    #endif /* ntohs */
    #ifdef ntohl
    #undef ntohl
    #endif /* ntohl */
    #define htons(x) lwip_htons(x)
    #define ntohs(x) lwip_ntohs(x)
    #define htonl(x) lwip_htonl(x)
    #define ntohl(x) lwip_ntohl(x)
    #endif /* LWIP_PREFIX_BYTEORDER_FUNCS */
    
    #if BYTE_ORDER == BIG_ENDIAN
    #define lwip_htons(x) (x)
    #define lwip_ntohs(x) (x)
    #define lwip_htonl(x) (x)
    #define lwip_ntohl(x) (x)
    #define PP_HTONS(x) (x)
    #define PP_NTOHS(x) (x)
    #define PP_HTONL(x) (x)
    #define PP_NTOHL(x) (x)
    #else /* BYTE_ORDER != BIG_ENDIAN */
    #if LWIP_PLATFORM_BYTESWAP
    #define lwip_htons(n) __rev16(n)
    #define lwip_ntohs(n) __rev16(n)
    #define lwip_htonl(n) __rev(n)
    #define lwip_ntohl(n) __rev(n)
    #define LWIP_PLATFORM_HTONS(x) htons(x)
    #define LWIP_PLATFORM_NTOHS(x) ntohs(x)
    #define LWIP_PLATFORM_HTONL(x) htonl(x)
    #define LWIP_PLATFORM_NTOHL(x) ntohl(x)
    #else /* LWIP_PLATFORM_BYTESWAP */
    u16_t lwip_htons(u16_t x);
    u16_t lwip_ntohs(u16_t x);
    u32_t lwip_htonl(u32_t x);
    u32_t lwip_ntohl(u32_t x);
    #endif /* LWIP_PLATFORM_BYTESWAP */
    
    /* These macros should be calculated by the preprocessor and are used
       with compile-time constants only (so that there is no little-endian
       overhead at runtime). */
    #define PP_HTONS(x) ((((x) & 0xff) << 8) | (((x) & 0xff00) >> 8))
    #define PP_NTOHS(x) PP_HTONS(x)
    #define PP_HTONL(x) ((((x) & 0xff) << 24) | \
                         (((x) & 0xff00) << 8) | \
                         (((x) & 0xff0000UL) >> 8) | \
                         (((x) & 0xff000000UL) >> 24))
    #define PP_NTOHL(x) PP_HTONL(x)
    
    #endif /* BYTE_ORDER == BIG_ENDIAN */
    
    #ifdef __cplusplus
    }
    #endif
    
    #endif /* __LWIP_DEF_H__ */
    
    

  • Thank you for posting this code.

    Perhaps you should make this suggestion on the lwIP mailing list so that others would benefit from it as well?

    One caveat: This will only work when the compiler supports such __rev() and __rev16() intrinsics.
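
    One way to soften that caveat might be to select the implementation at compile time, roughly as sketched below. The names `port_bswap16` and `port_bswap32` are made up for this example and are not part of lwIP; the sketch assumes that the TI ARM compiler predefines `__TI_ARM__` and that GCC-compatible compilers provide `__builtin_bswap16`/`__builtin_bswap32`. Anything else falls back to plain shifts and masks.

```c
#include <stdint.h>

/* Hypothetical helpers (not lwIP names) that pick a byte-swap
 * implementation depending on which compiler is in use. */
#if defined(__TI_ARM__)
/* Assumption: TI ARM compiler, __rev16()/__rev() intrinsics available. */
static inline uint16_t port_bswap16(uint16_t n) { return (uint16_t)__rev16(n); }
static inline uint32_t port_bswap32(uint32_t n) { return (uint32_t)__rev(n); }
#elif defined(__GNUC__)
/* GCC and Clang builtins. */
static inline uint16_t port_bswap16(uint16_t n) { return __builtin_bswap16(n); }
static inline uint32_t port_bswap32(uint32_t n) { return __builtin_bswap32(n); }
#else
/* Plain C fallback for compilers without a known intrinsic. */
static inline uint16_t port_bswap16(uint16_t n)
{
    return (uint16_t)(((n & 0x00FFu) << 8) | ((n & 0xFF00u) >> 8));
}
static inline uint32_t port_bswap32(uint32_t n)
{
    return ((n & 0x000000FFu) << 24) | ((n & 0x0000FF00u) << 8) |
           ((n & 0x00FF0000u) >> 8)  | ((n & 0xFF000000u) >> 24);
}
#endif
```

    The lwip_htons()/lwip_htonl() macros in def.h could then point at helpers like these instead of naming the TI intrinsics directly.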