Compiler/TM4C1290NCPDT: Byteswap

Michael Schuster

Part Number: TM4C1290NCPDT

Tool/software: TI C/C++ Compiler

Hi,

I want to make use of the byte-swap instruction of the TM4C129 / Cortex-M4 device. I know that with gcc this can be done in inline assembler as follows:

inline uint32_t Rev16(uint32_t a)
{
  /* %0 is the output (destination) operand, %1 the input (source) operand */
  asm ("rev16 %0, %1"
          : "=r" (a)
          : "r" (a));
  return a;
}

I cannot find how to translate this into valid syntax for the TI compiler for the TM4C device. Any help is welcome.

Regards

Micky

over 7 years ago

0 Ralph Jacobi over 7 years ago

TI Guru 135005 points

Hello Michael,

Just to clarify, you are looking to translate this into Code Composer Studio as an inline assembly function to use with .c code? Which CCS version are you using?

0 Michael Schuster over 7 years ago in reply to Ralph Jacobi

Intellectual 796 points

That is correct, I want to do this as a C inline.

I use CCS7 under Linux.

ti-cgt-arm_16.9.1.LTS -version gives:

TI ARM C/C++ Compiler                   v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D

TI ARM EABI C/C++ Parser                v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM C/C++ File Merge                 v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM C/C++ Optimizer                  v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM C/C++ Codegen                    v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Assembler                        v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Embed Utility                    v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM C Source Interlister             v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Linker                           v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Absolute Lister                  v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Strip Utility                    v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM XREF Utility                     v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM C++ Demangler                    v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Hex Converter                    v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Name Utility                     v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Object File Display              v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D
TI ARM Archiver                         v16.9.1.LTS
Build Number 1QM7P-2LI-VATAQ-TAR-C08D

0 cb1_mobile over 7 years ago in reply to Michael Schuster

Guru 117855 points

May I note that (both) PRO IDEs (IAR & Keil) provide such "in-line" ASM support. (while unlikely to prove "exact" - clues may present which guide & assist.)

"Free - code-size limited versions" - are available for download - and the VAST "User Manuals" provide "great detail" - which (may) operate to your advantage, here.

0 Michael Schuster over 7 years ago in reply to cb1_mobile

Intellectual 796 points

I do not want to change the toolchain or the compiler.
My guess is that this can be done either as a C inline or (not as good) as an assembler function. But I have no idea how to do so with the TI ARM compiler. gcc can do this as mentioned above.

0 cb1_mobile over 7 years ago in reply to Michael Schuster

Guru 117855 points

I was not suggesting that you "change" your toolchain - instead I believe that you may "gain insight" - by seeing how (others) have implemented your objective...

Reinventing the "wheel" - as you've discovered - is not always "quick/easy." (thus my suggestion)

0 twelve12pm over 7 years ago

Genius 4455 points

Your compiler has ARM instruction intrinsics.

I am looking at the manual (my manual is for 16.9.0.STS but I doubt intrinsics would change between minor versions) and table 5-3 in section 5.13 shows a list of these intrinsics.

Calls to these intrinsics are written in code with the same syntax as a call to a C function but the compiler replaces it with the assembly instruction.

See the intrinsic __rev16().

Screenshot of the document page:

There are additional pages with many more intrinsics.

Not sure where the link to this manual is... Maybe someone else can post that link here. :-)
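
As a rough illustration of the intrinsic suggestion above, here is a minimal sketch of a 16-bit swap wrapper. The name swap16 is hypothetical, and it assumes that the TI ARM compiler predefines __TI_ARM__ and accepts an integer argument for __rev16(); check both against the manual for your compiler version. Other compilers fall through to plain C so the file still builds.

```c
#include <stdint.h>

/* swap16 is a hypothetical helper name, not part of any TI header. */
static inline uint16_t swap16(uint16_t x)
{
#if defined(__TI_ARM__)
    /* Assumption: __rev16() is the TI intrinsic that maps to REV16.
       REV16 swaps the bytes in each halfword of a 32-bit register,
       so the cast keeps only the low halfword. */
    return (uint16_t)__rev16((uint32_t)x);
#else
    /* Plain C fallback for compilers without the TI intrinsic. */
    return (uint16_t)((x << 8) | (x >> 8));
#endif
}
```

Called as swap16(value), this reads like any other C function; on the TI compiler the call should be replaced by the single instruction.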

0 Danny F over 7 years ago in reply to Michael Schuster

Genius 3850 points

CMSIS provides that. So two options:

1. Go CMSIS: I posted a set of files to do that. It works with CCS. Or you can simply use the CMSIS files for MSP432.

2. Or simply copy and paste the rbit implementation in the CMSIS files. It is fairly independent.

Both are simple to do.

0 cb1_mobile over 7 years ago in reply to Michael Schuster

Guru 117855 points

I've found this "example of ASM" under (both) IAR and CCS.

#if defined(ewarm) || defined(DOXYGEN) // note that this is under IAR (ewarm)
static long
MainLongMul(long lX, long lY)
{
    //
    // The assembly code to efficiently perform the multiply (using the
    // instruction to multiply two 32-bit values and return the full 64-bit
    // result).
    //
    __asm(" smull r0, r1, r0, r1\n"
          " lsrs r0, r0, #16\n"
          " orr r0, r0, r1, lsl #16\n"
          " bx lr");

    //
    // This return is never reached but is required to avoid a compiler
    // warning.
    //
    return(0);
}
#endif

#if defined(ccs) // And this is under CCS
static long
MainLongMul(long lX, long lY)
{
    //
    // The assembly code to efficiently perform the multiply (using the
    // instruction to multiply two 32-bit values and return the full 64-bit
    // result).
    //
    __asm(" smull r0, r1, r0, r1\n"
          " lsrs r0, r0, #16\n"
          " orr r0, r0, r1, lsl #16\n"
          " bx lr\n");

    //
    // This is needed to keep the TI compiler from optimizing away the code.
    //
    return(lX * lY);
}
#endif

You may note that the differences are (very) slight. (\n and different return values - I now note) The advantage of the PRO IDEs (which well pre-existed the one here) is their "Depth of User Manual explanation" - their robustness - and of course - their "Vendor Agnostic" capability. (one and only one - PuhLease....)

0 Robert Adsett over 7 years ago in reply to Michael Schuster

Guru 27665 points

Michael Schuster said:
My guess is that this can be done either as a C inline or (not as good) as an assembler function.

You may be dismissing this option too quickly. A good compiler may well be able to translate the C code into the equivalent (or better) of the assembly. Only drop to assembly if for some reason you need an exact sequence of assembly instructions or performance testing shows first that you need an improvement and second that assembly provides it.

Check to see if something like the following (warning: unchecked code)

uint16_t x;

x = ((x & (uint16_t)0xFFu) << 8u) | ((x & (uint16_t)0xFF00u) >> 8u);

performs as well or better than your assembly code with the additional advantage of not being compiler dependent.

Robert
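
To compare like with like against the Rev16() in the original question, which takes a 32-bit word, the same idea can be sketched for 32 bits. The function names below are hypothetical. Whether a given compiler reduces these to single REV16/REV instructions is something to check in the generated assembly, not something this sketch guarantees.

```c
#include <stdint.h>

/* Portable equivalent of REV16: swap the bytes inside each 16-bit half. */
static inline uint32_t rev16_portable(uint32_t a)
{
    return ((a & 0x00FF00FFu) << 8) | ((a & 0xFF00FF00u) >> 8);
}

/* Portable equivalent of REV: reverse all four bytes.
   Swaps the bytes in each half first, then swaps the two halves. */
static inline uint32_t rev32_portable(uint32_t a)
{
    a = rev16_portable(a);
    return (a << 16) | (a >> 16);
}
```

For example, rev16_portable(0x11223344) yields 0x22114433 and rev32_portable(0x11223344) yields 0x44332211.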

0 Michael Schuster over 7 years ago in reply to cb1_mobile

Intellectual 796 points

Thanks a lot for your explanation. I can use the intrinsic function (as suggested by twelve12pm), but this might be useful for further problems.

0 Michael Schuster over 7 years ago in reply to twelve12pm

Intellectual 796 points

Thanks, that was what I was originally searching for!

0 Michael Schuster over 7 years ago in reply to Danny F

Intellectual 796 points

Just a note: Referring to ntohs in lwIP, I suggest an improvement:

Change def.h in lwIP as follows to enhance the ntohs functions, and define LWIP_PLATFORM_BYTESWAP:

/*
 * Copyright (c) 2001-2004 Swedish Institute of Computer Science.
 * All rights reserved. 
 * 
 * Redistribution and use in source and binary forms, with or without modification, 
 * are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright notice,
 *    this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright notice,
 *    this list of conditions and the following disclaimer in the documentation
 *    and/or other materials provided with the distribution.
 * 3. The name of the author may not be used to endorse or promote products
 *    derived from this software without specific prior written permission. 
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR IMPLIED 
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF 
 * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT 
 * SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 
 * EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT 
 * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 
 * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING 
 * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY 
 * OF SUCH DAMAGE.
 *
 * This file is part of the lwIP TCP/IP stack.
 * 
 * Author: Adam Dunkels <adam@sics.se>
 *
 */
#ifndef __LWIP_DEF_H__
#define __LWIP_DEF_H__

/* arch.h might define NULL already */
#include "lwip/arch.h"
#include "lwip/opt.h"

#ifdef __cplusplus
extern "C" {
#endif

#define LWIP_MAX(x , y)  (((x) > (y)) ? (x) : (y))
#define LWIP_MIN(x , y)  (((x) < (y)) ? (x) : (y))

#ifndef NULL
#define NULL ((void *)0)
#endif

/* Endianess-optimized shifting of two u8_t to create one u16_t */
#if BYTE_ORDER == LITTLE_ENDIAN
#define LWIP_MAKE_U16(a, b) ((a << 8) | b)
#else
#define LWIP_MAKE_U16(a, b) ((b << 8) | a)
#endif 

#ifndef LWIP_PLATFORM_BYTESWAP
#define LWIP_PLATFORM_BYTESWAP 0
#endif

#ifndef LWIP_PREFIX_BYTEORDER_FUNCS
/* workaround for naming collisions on some platforms */

#ifdef htons
#undef htons
#endif /* htons */
#ifdef htonl
#undef htonl
#endif /* htonl */
#ifdef ntohs
#undef ntohs
#endif /* ntohs */
#ifdef ntohl
#undef ntohl
#endif /* ntohl */
#define htons(x) lwip_htons(x)
#define ntohs(x) lwip_ntohs(x)
#define htonl(x) lwip_htonl(x)
#define ntohl(x) lwip_ntohl(x)
#endif /* LWIP_PREFIX_BYTEORDER_FUNCS */

#if BYTE_ORDER == BIG_ENDIAN
#define lwip_htons(x) (x)
#define lwip_ntohs(x) (x)
#define lwip_htonl(x) (x)
#define lwip_ntohl(x) (x)
#define PP_HTONS(x) (x)
#define PP_NTOHS(x) (x)
#define PP_HTONL(x) (x)
#define PP_NTOHL(x) (x)
#else /* BYTE_ORDER != BIG_ENDIAN */
#if LWIP_PLATFORM_BYTESWAP
#define lwip_htons(n) __rev16(n)
#define lwip_ntohs(n) __rev16(n)
#define lwip_htonl(n) __rev(n)
#define lwip_ntohl(n) __rev(n)
#define LWIP_PLATFORM_HTONS(x) htons(x)
#define LWIP_PLATFORM_NTOHS(x) ntohs(x)
#define LWIP_PLATFORM_HTONL(x) htonl(x)
#define LWIP_PLATFORM_NTOHL(x) ntohl(x)
#else /* LWIP_PLATFORM_BYTESWAP */
u16_t lwip_htons(u16_t x);
u16_t lwip_ntohs(u16_t x);
u32_t lwip_htonl(u32_t x);
u32_t lwip_ntohl(u32_t x);
#endif /* LWIP_PLATFORM_BYTESWAP */

/* These macros should be calculated by the preprocessor and are used
   with compile-time constants only (so that there is no little-endian
   overhead at runtime). */
#define PP_HTONS(x) ((((x) & 0xff) << 8) | (((x) & 0xff00) >> 8))
#define PP_NTOHS(x) PP_HTONS(x)
#define PP_HTONL(x) ((((x) & 0xff) << 24) | \
                     (((x) & 0xff00) << 8) | \
                     (((x) & 0xff0000UL) >> 8) | \
                     (((x) & 0xff000000UL) >> 24))
#define PP_NTOHL(x) PP_HTONL(x)

#endif /* BYTE_ORDER == BIG_ENDIAN */

#ifdef __cplusplus
}
#endif

#endif /* __LWIP_DEF_H__ */

0 twelve12pm over 7 years ago in reply to Michael Schuster

Genius 4455 points

Thank you for posting this code.

Perhaps you should make this suggestion on the lwIP mailing list so that others would benefit from it as well?

One caveat: This will only work when the compiler supports such __rev() and __rev16() intrinsics.
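
One way to soften that caveat is to select the byte-swap implementation at compile time. The following is only a sketch: BSWAP16 and BSWAP32 are hypothetical macro names, __TI_ARM__ is assumed to be predefined by the TI ARM compiler, and __builtin_bswap16/__builtin_bswap32 are the GCC/Clang builtins (older GCC releases may lack the 16-bit one).

```c
#include <stdint.h>

#if defined(__TI_ARM__)
  /* Assumption: TI ARM compiler intrinsics as used in the def.h above. */
  #define BSWAP16(x) ((uint16_t)__rev16((uint32_t)(x)))
  #define BSWAP32(x) ((uint32_t)__rev((uint32_t)(x)))
#elif defined(__GNUC__)
  /* GCC and Clang builtins. */
  #define BSWAP16(x) __builtin_bswap16((uint16_t)(x))
  #define BSWAP32(x) __builtin_bswap32((uint32_t)(x))
#else
  /* Plain C fallback for any other compiler. */
  #define BSWAP16(x) ((uint16_t)((((x) & 0x00FFu) << 8) | \
                                 (((x) & 0xFF00u) >> 8)))
  #define BSWAP32(x) ((uint32_t)((((x) & 0x000000FFul) << 24) | \
                                 (((x) & 0x0000FF00ul) << 8)  | \
                                 (((x) & 0x00FF0000ul) >> 8)  | \
                                 (((x) & 0xFF000000ul) >> 24)))
#endif
```

The TI branch is tested first because some compilers also define __GNUC__ for compatibility.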
