TMS320C6000 C/C++ CODE GENERATION TOOLS
7.0.5 Release Notes
May 2011

===============================================================================
Release Notes
===============================================================================

1. Defect History

The list of defects fixed in this release, as well as known issues, can
be found in the file DefectHistory.txt.

2. Compiler Documentation Errata

Errata for the "TMS320C6000 Optimizing Compiler User's Guide" and the
"TMS320C6000 Assembly Language User's Guide" are available online at the
Texas Instruments Embedded Processors CG Wiki:

http://tiexpressdsp.com/wiki/index.php?title=Category:CGT

under the 'Compiler Documentation Errata' link.

This Wiki has been established to assist developers in using TI Embedded
Processor Software and Tools.  Developers are encouraged to browse the
articles; registered users can update missing or incorrect information.

3. TI E2E Community

Questions concerning TI Code Generation Tools can be posted to the TI E2E
Community forums.  The "Development Tools" forum can be found at:

http://e2e.ti.com/support/development_tools/f/default.aspx

4. C67x Fast Run-Time Support (RTS) Library

The fast RTS libraries are no longer included in the C6000 Code Generation
Tools.  They are available for download at the following Texas Instruments page:

http://focus.ti.com/docs/toolsw/folders/print/sprc060.html

Or search the ti.com site for the tag: tms320c67x fastrts library

5. Defect Tracking Database

Compiler defect reports can be tracked in the Development Tools bug
database, SDOWP.  The login page for SDOWP, as well as a link to create
an account with the defect tracking database, can be found at:

https://cqweb.ext.ti.com/pages/SDO-Web.html

A my.ti.com account is required to access this page.  To find an issue
in SDOWP, enter its bug ID in the "Find Record ID" box once logged in.
To view tables of all compiler issues, click the queries under the folder:

"Public Queries" -> "Development Tools" -> "TI C-C++ Compiler"

With your SDOWP account you can save your own queries in your
"Personal Queries" folder.


===============================================================================
Contents
===============================================================================
1) Support for C6000 EABI
   1.1) Features
   1.2) Usage
   1.3) Table-Driven Exception Handling
   1.4) Compiler Version 6.1.0 Compatibility
   1.5) Migration from COFF to ELF
2) Support for Program Cache Layout 
   2.1) Background and Motivation
   2.2) What's New?
   2.3) Program Cache Layout Development Flow
   2.4) Comma-Separated Values (CSV) Files with WCG Information
   2.5) Linker Command File Operator - unordered()
   2.6) Cache Layout Tool Tutorial
   2.7) Things to Be Aware Of
3) Support for Dynamic Linking
   3.1) Importing and Exporting Symbols
      3.1.1 Using source 
      3.1.2 Import/Export using the ELF linkage macros defined in elf_linkage.h
      3.1.3 Import/Export using compiler options
         3.1.3.1 --visibility option
         3.1.3.2 --import_undef option
         3.1.3.3 Importing compiler helper functions
         3.1.3.4  Using linker options
   3.2) Bare Metal Dynamic Linking Support
      3.2.1 Building a dynamic executable
      3.2.2 Building a dynamic library
   3.3) C6x Linux Dynamic Linking Support
      3.3.1 Building c6x Linux executable
      3.3.2 Building c6x Linux DSO
4) Support for Building ROM Modules
   4.1) Building a ROM File
   4.2) ROM Masking a ROM File
   4.3) Stripping ROM Sections
   4.4) Building Application and Linking Against ROM File(s)
5) Support for Tesla ISA
   5.1) Architectural Characteristics
   5.2) Programming Characteristics
   5.3) Usage
   5.4) Features / Fixed Defects
6) Defect Reporting

-------------------------------------------------------------------------------
1) Support for C6000 EABI (Embedded Application Binary Interface)
-------------------------------------------------------------------------------

The C6000 7.0.1 compiler supports a new ABI (Application Binary Interface), 
C6000 EABI, in addition to the current COFFABI.

************************** IMPORTANT NOTICE *******************************
Using version 7.0 to build for EABI is not a practical reality today for most 
users.  All code in an EABI application must be built for EABI, and EABI 
versions of C6000 libraries are not generally available.  The principal purpose 
of supplying EABI in 7.0.x is so C6000 library suppliers can begin creating 
EABI products.  Please see the following wiki page for full details:
http://tiexpressdsp.com/index.php/EABI_Support_in_C6000_Compiler 
***************************************************************************

1.1) Features
-------------

C6000 EABI is an ABI for the C6000 architecture based on the ELF object file
format.  The major features are:

 - ELF object format (prerequisite for dynamic linking)
 - DWARF3 debug format
 - Improved "GPP parity"
 - Dynamic Linking support

The advantage of EABI is that it provides a modern, well-documented, and
well-supported ABI.  Many newer C++ language features, such as template
instantiation and extern inline, can be implemented more effectively.

Future development of TI's C6000 compiler will be primarily implemented within
EABI.  We encourage users to consider switching to EABI at their earliest
convenience.


1.2) Usage
----------
To compile using EABI, use the shell option --abi=eabi.  

Compile:
  > cl6x --abi=eabi <src_file> 

Compile and link:
  > cl6x --abi=eabi <src_file> -z -l<lnk.cmd>

When no RTS library is specified in the command line, the linker will choose
the right RTS library, provided the RTS library search path environment
variable (C6X_C_DIR) is set.

The following RTS libraries are included in the package for C6000 EABI:

    rts6200_elf.lib 
    rts6200_elf_eh.lib 
    rts6200e_elf.lib 
    rts6200e_elf_eh.lib
    rts6400_elf.lib 
    rts6400_elf_eh.lib 
    rts6400e_elf.lib 
    rts6400e_elf_eh.lib
    rts6700_elf.lib 
    rts6700_elf_eh.lib 
    rts6700e_elf.lib 
    rts6700e_elf_eh.lib
    rts6740_elf.lib 
    rts6740_elf_eh.lib 
    rts6740e_elf.lib 
    rts6740e_elf_eh.lib
    rts64plus_elf.lib 
    rts64plus_elf_eh.lib 
    rts64pluse_elf.lib 
    rts64pluse_elf_eh.lib
    rts67plus_elf.lib 
    rts67plus_elf_eh.lib 
    rts67pluse_elf.lib 
    rts67pluse_elf_eh.lib


1.3) Table-Driven Exception Handling
----------------------------------------- 

C6x compiler version 7.0.1 supports table-driven exception handling for ELF
executables.  Table-driven exception handling is more time-efficient than
setjmp/longjmp support when exceptions are not thrown.

The user interface is almost entirely the same as the COFFABI support; use the
option --exceptions and link with an RTS which has exception handling enabled.
For EABI only, if you have any 'extern "C"' functions which need to throw
exceptions, you must use the option "--extern_c_can_throw".
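
As a hypothetical illustration (the function below is invented for this
example), code like the following would need to be built with both
--exceptions and --extern_c_can_throw under EABI:

```cpp
#include <stdexcept>

// Invented example: an extern "C" entry point whose body can throw a C++
// exception.  Under EABI, propagating an exception through an extern "C"
// function requires the --extern_c_can_throw build option.
extern "C" int checked_div(int num, int den)
{
    if (den == 0)
        throw std::runtime_error("divide by zero");
    return num / den;
}
```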

The underlying implementation is very different, and ELF object files with
exception handling support built with old alpha versions of the compiler will
not work with this version.  You must re-compile C++ files and use the newest
RTS library.


1.4) Compiler Version 6.1.0 Compatibility
----------------------------------------- 

C6x compiler version 7.0's COFFABI support is compatible with C6x compiler
version 6.1.0, so projects currently using version 6.1.0 can update to
version 7.0 without issues.  The version 7.0 compiler generates COFFABI
output by default.


1.5) Migration from COFF to ELF
-------------------------------

To take advantage of C6000 EABI, users must explicitly request it using the
"--abi=eabi" command-line option.  Object files generated with this option
will not be compatible with object files generated for COFFABI.  However, most
C source files will not require modification, and most assembly source files
will require only trivial modification.

The major differences between EABI and COFFABI are:

 - Different name mangling 
   - C names are not mangled by prepending an underscore
   - C++ names are mangled using the IA64 name mangling scheme
 - Small struct arguments are passed to functions by value
 - C type 'long' is 32 bits
 - Different bit-field layout
 - Different C auto-initialization

However, there are many more differences.  Review the C6000 EABI Migration
document for full migration details.

Users need to take specific steps to migrate their projects to EABI:

 - COFF objects and libraries are not compatible with EABI objects; programs
   and libraries must be recompiled from source.  There is no COFF to ELF
   conversion utility.

 - C code which assumes the C type 'long' is 40 bits will need source
   modification to be migrated to EABI, where 'long' is 32 bits.

    - 40-bit arithmetic should be performed using 64-bit "long long" variables
      and expressions.  New intrinsics have been added to perform 40-bit
      arithmetic with values from "long long" expressions.

    - Users can temporarily use --long_precision_bits=40 option to make the
      long type 40 bits in EABI mode.  Please note that the 40-bit long RTS
      libraries are not provided as part of this release.  Users can build the
      RTS using --long_precision_bits=40 option if needed.

 - Assembly language routines that interface C/C++ functions need modification
   to remove underscores from identifiers defined or referred to in C.
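
As a sketch of the 'long' migration described above (the function is
invented for illustration; the new 40-bit intrinsics are not shown):

```cpp
// Hypothetical sketch: an accumulation that relied on COFFABI's 40-bit
// 'long' moves to 64-bit 'long long' under EABI, where 'long' is 32 bits.
long long dot_product(const short *a, const short *b, int n)
{
    long long acc = 0;                  // was 'long' (40 bits) under COFFABI
    for (int i = 0; i < n; ++i)
        acc += (long long)a[i] * b[i];  // widen before accumulating
    return acc;
}
```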

Please refer to the C6000 EABI Migration document for details.


-------------------------------------------------------------------------------
2) Support for Program Cache Layout
-------------------------------------------------------------------------------

This release of the C6000 7.0 Code Generation Tools includes a new
capability to help you build better program cache efficiency into your
applications.  Program cache layout is the process of placing code in
memory to minimize the occurrence of conflict misses in the program cache.


2.1) Background and Motivation
------------------------------

Problem Description:

- Effective utilization of the L1P instruction cache is an important part
  of getting the best performance from your processor.

- In the C6x family of processors, L1P instruction cache misses can 
  cause significant overhead because the L1P cache is direct-mapped.

- Some applications (an H.264 codec, for example) can spend 30% or more
  of the processor's time in L1P stall cycles due to L1P cache misses.

- A program cache miss happens when an instruction fetch fails to
  read an instruction from the program cache and the processor is
  required to access the instruction from the next level of memory.
  A request to L2 or external memory has a much higher latency than 
  an access from the first level instruction cache.
  
Approach:

- Many L1P cache misses are conflict misses.  

- Conflict misses occur when the cache has recently evicted a block of
  code that is now needed again.  In a program cache this often occurs
  when two frequently executed blocks of code (usually from different
  functions) interleave their execution and are mapped to the same 
  cache line.

  For example, suppose there is a call to function B from inside a
  loop in function A.  Suppose also that the code for function A's loop 
  is mapped to the same cache line as a block of code from function B 
  that is executed every time that B is called.  Each time B is called 
  from within this loop, the loop code from function A will be evicted 
  from the cache by the code in B that is mapped to the same cache line.  
  Even worse, when B returns to A, the loop code in A will evict the 
  code from function B that is mapped to the same cache line.

  Every iteration of the loop will thus cause two program cache conflict
  misses.  If the loop is heavily traversed, the number of processor
  cycles lost to program cache stalls can become quite large.

- Many program cache conflict misses can be avoided with more intelligent
  placement of functions that are active at the same time.  Program cache 
  efficiency can be significantly improved using code placement strategies
  that utilize dynamic profile information that is gathered during the run
  of an instrumented application.

- In this release of the C6000 7.0 code generation tools, a new
  cache layout tool, clt6x, is included.  clt6x will take dynamic 
  profile information in the form of a weighted call graph (WCG)
  and create a preferred function order command file that can be
  input into the linker to guide the placement of function subsections.
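
The call-interleaving scenario described above can be sketched as follows
(the functions are invented; whether their code actually shares a cache
line depends on where the linker places it):

```cpp
// Illustrative only: if the hot loop in A() and the body of B() are mapped
// to the same direct-mapped L1P cache line, every loop iteration evicts the
// other function's code, costing two conflict misses per iteration.
static int B(int x)
{
    return 2 * x + 1;   // imagine this block sharing a cache line with A's loop
}

int A(int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)   // hot loop: calls B() on every iteration
        sum += B(i);
    return sum;
}
```

Placing A and B so that their hot blocks fall on different cache lines,
which is what the cache layout tooling aims to do, removes both misses
from each iteration.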
  
Goal:

- Use the cache layout tool to help improve your program locality and 
  reduce the number of L1P cache conflict misses that occur during the 
  run of your application, thereby improving your application's 
  performance.

*** IMPORTANT! PLEASE NOTE! ***

What level of performance improvements can you expect to see?

If your application does not suffer from inefficient usage of the L1P cache,
then the new program cache layout capability will NOT have any effect on the
performance of your application.  Before you invest development time in
applying the program cache layout tooling to your application, analyze your
application's usage of the L1P cache.

* Evaluating L1P Cache Usage

  Spending some time evaluating the L1P cache usage efficiency of your 
  application will not only help you determine whether or not your 
  application might benefit from using program cache layout, but it also
  gives you a rough estimate as to how much performance improvement you 
  can reasonably expect from applying program cache layout. 

  You can use Code Composer Studio (CCS) connected to an emulator with 
  trace enabled to collect profile information about the number of CPU 
  stall cycles that occur due to L1P cache misses during the run of an 
  application. For further details about how to gather information about 
  CPU stall cycles due to L1P cache misses, please use the CCS "Help" 
  browser and search for "Cache Event Profile". 

  The number of CPU stall cycles that occur due to L1P cache misses 
  gives you a reasonable upper bound estimate of the number of CPU 
  cycles that you may be able to recover with the use of the program
  cache layout tooling in your application. Please be aware that the 
  performance impact due to program cache layout will tend to vary for
  the different data sets that are run through your application.


2.2) What's New?
----------------

  The C6x CGT v7.0 introduces some new features and capabilities that
  can be used in conjunction with the cache layout tool, clt6x.  The
  following is a summary:

  2.2.1) Path Profiler
    
      * Analysis Options
      
        --analyze=callgraph -- Instructs the compiler to generate WCG
                               analysis information.
        --analyze=codecov   -- Instructs the compiler to generate code
                               coverage analysis information.  This option
                               replaces the previous --codecov option.
        --analyze_only      -- Halts compilation after the generation of
                               analysis information is complete.

        o Behavior

          1. pprof6x will append code coverage/weighted call graph (WCG) 
             analysis information to existing CSV files that contain the
             same type of analysis information.

          2. pprof6x will now check to make sure that an existing CSV file
             contains analysis information that is consistent with the type 
             of analysis information it is being asked to generate (whether
             it be code coverage or WCG analysis).  Attempts to mix code 
             coverage and WCG analysis information in the same output CSV 
             file will be detected and pprof6x will emit a fatal error and
             abort.

      * Environment Variables

        To assist with the management of output CSV analysis files, pprof6x 
        now supports two new environment variables:

          TI_WCGDATA - Allows the user to specify a single output CSV file
             for all WCG analysis information.  New information will be
             appended to the CSV file identified by this environment
             variable, if the file already exists.

          TI_ANALYSIS_DIR - Specifies the directory in which the output
             analysis files will be generated.  The same environment
             variable can be used for both code coverage information and
             weighted call graph information (all analysis files generated
             by pprof6x will be written to the directory specified by the
             TI_ANALYSIS_DIR environment variable).

             NOTE: The existing TI_COVDIR environment variable is still
                   supported when generating code coverage analysis, but is
                   overridden in the presence of a defined TI_ANALYSIS_DIR
                   environment variable.
 

  2.2.2) Cache Layout Tool (clt6x)

      clt6x <CSV files w/ WCG info> -o forder.cmd

                            -- Create a preferred function order command
                               file from input WCG information.

  2.2.3) Linker

      * --preferred_order Option

        --preferred_order=<function specification>
      
                            -- Prioritize the placement of a function 
                               relative to others based on the order in
                               which --preferred_order options are 
                               encountered during the linker invocation.

      * unordered() Linker Command File (LCF) Operator

        unordered()         -- This operator relaxes the placement
                               constraints on an output section whose
                               specification includes an explicit list
                               of the input sections it contains.

                               Please see section 2.5 for a detailed
                               description of the unordered() operator
                               and its uses.

  Please see the following sections for details on how these features,
  options, tools, and linker command file operators are used.


2.3) Program Cache Layout Development Flow
------------------------------------------

This section presents a development flow that incorporates the use of the 
program cache layout tooling. To get started using the program cache layout
capability, it is recommended that you read this section and then proceed 
to the simple cache layout tool tutorial in section 2.6 below.

 2.3.1) Gather Dynamic Profile Information

   The cache layout tool, clt6x, relies on the availability of dynamic 
   profile information in the form of a weighted call graph (WCG) in 
   order to produce a preferred function order command file that can be
   used to guide function placement at link-time when your application
   is re-built.

   There are several ways in which this dynamic profile information can 
   be collected.  For example, if you are running your application on 
   hardware, you may have the capability to collect a PC discontinuity 
   trace.  The discontinuity trace can then be post-processed to construct
   WCG input information for the clt6x.

   The method for collecting dynamic profile information that is presented
   here relies on the path profiling capabilities in the C6000 code
   generation tools.  Here is how it works:
        
   1. Build an instrumented application

      Build the instrumented application using the --gen_profile_info
      option:

      Compile:
        > cl6x <options> --gen_profile_info <src_file(s)> 

      Compile and link:
        > cl6x <options> --gen_profile_info <src_file> -z -l<lnk.cmd>

      Use of --gen_profile_info instructs the compiler to embed counters 
      into the code along the execution paths of each function.

   2. Run instrumented application to generate .pdat file

      When the application runs, the counters embedded into the application 
      by --gen_profile_info keep track of how many times a particular 
      execution path through a function is traversed.  The data collected 
      in these counters is written out to a profile data file named 
      pprofout.pdat.

      The profile data file is automatically generated.  For example,
      if you are using the C64+ simulator under CCS, you can load and
      run your instrumented program, and you will see that a new 
      pprofout.pdat file is created in your working directory (where 
      the instrumented application is loaded from).

   3. Decode profile data file

      Once you have a profile data file, the file is decoded by the 
      profile data decoder tool, pdd6x, as follows:

        > pdd6x -e=<instrumented app out file> -o=pprofout.prf pprofout.pdat

      pdd6x produces a .prf file, which is then fed into the re-compile
      of the application that uses the profile information to generate
      WCG input data.

   4. Use decoded profile information to generate WCG input

      A new compiler option, --analyze, instructs the compiler to generate
      WCG or code coverage analysis information.  Its syntax is as follows:

        --analyze=callgraph -- Instructs the compiler to generate WCG
                               information.
        --analyze=codecov   -- Instructs the compiler to generate code
                               coverage information.  This option replaces
                               the previous --codecov option.  

      The compiler also supports a new --analyze_only option which
      instructs the compiler to halt compilation after the generation
      of analysis information has been completed.  This option replaces
      the previous --onlycodecov option.
        
      To make use of the dynamic profile information that you gathered,
      re-compile the source code for your application using the 
      --analyze=callgraph option in combination with the 
      --use_profile_info option:

        > cl6x <options> -mo --analyze=callgraph \
               --use_profile_info=pprofout.prf <src_file(s)>

      Use of -mo instructs the compiler to generate code for each
      function into its own subsection.  This option provides the
      linker with the means to directly control the placement of the
      code for a given function.

      The compiler generates a CSV file containing WCG information for
      each source file that is specified on the command line.  If such a
      CSV file already exists, then new call graph analysis information
      will be appended to the existing CSV file.  These CSV files are 
      then input to the cache layout tool (clt6x) to produce a preferred
      function order command file for your application.

      For more details on the content of the CSV files (containing WCG
      information) generated by the compiler, please see section 2.4 below.

 2.3.2) Generate Preferred Function Order from Dynamic Profile Information

      At this point, the compiler has generated a CSV file for each
      C/C++ source file specified on the command line of the re-compile
      of the application.  Each CSV file contains weighted call graph
      information about all of the call sites in each function defined
      in the C/C++ source file.

      The new cache layout tool, clt6x, collects all of the WCG 
      information in these CSV files into a single, merged WCG.  The
      WCG is processed to produce a preferred function order command 
      file that is fed into the linker to guide the placement of the 
      functions defined in your application source files.  This is how 
      to use clt6x:

        > clt6x *.csv -o forder.cmd

      The output of clt6x is a text file containing a sequence of 
      --preferred_order=<function specification> options.  By default,
      the name of the output file is "forder.cmd", but you can specify
      your own file name with the -o option.  The order in which 
      functions appear in this file is their preferred function order
      as determined by the clt6x.

      In general, the proximity of one function to another in the 
      preferred function order list is a reflection of how often the
      two functions call each other.  If two functions are very close
      to each other in the list, then the linker interprets this
      as a suggestion that the two functions should be placed very 
      near to one another.  Functions that are placed close together 
      are less likely to create a cache conflict miss at runtime when
      both functions are active at the same time.  The overall effect
      should be an improvement in program cache efficiency and 
      performance.

 2.3.3) Utilize Preferred Function Order in Re-Build of Application

      Finally, the preferred function order command file that is
      produced by the clt6x is fed into the linker during the re-build
      of the application, as follows:

        > cl6x <options> -z *.obj forder.cmd -l<lnk.cmd>

      The preferred function order command file, forder.cmd, contains
      a list of --preferred_order=<function specification> options.
      The linker prioritizes the placement of functions relative
      to each other in the order that the --preferred_order options
      are encountered during the linker invocation.

      Each --preferred_order option contains a function specification.
      A function specification can be simply the name of a global
      function, or it can include the path name and source file name
      where the function is defined.  A function specification that
      contains path and file name information is used to distinguish
      one static function from another that has the same name.

      As mentioned earlier, the --preferred_order options are 
      interpreted by the linker as suggestions to guide the placement
      of functions relative to each other.  They are not explicit
      placement instructions.  If an object file or input section
      is explicitly mentioned in a linker command file SECTIONS
      directive, then the placement instruction specified in the
      linker command file takes precedence over any suggestion from 
      a --preferred_order option that is associated with a function 
      that is defined in that object file or input section.  
      
      This precedence can be relaxed by applying the unordered() 
      operator to an output specification as described in section 
      2.5 below.


2.4) Comma-Separated Values (CSV) Files with WCG Information
------------------------------------------------------------

The format of the CSV files generated by the compiler under the
"--analyze=callgraph --use_profile_info" option combination is as
follows:

"caller","callee","weight" [CR][LF]
<caller spec>,<callee spec>,<call frequency> [CR][LF]
<caller spec>,<callee spec>,<call frequency> [CR][LF]
<caller spec>,<callee spec>,<call frequency> [CR][LF]
...

- Line 1 of the CSV file is the header line.  It specifies the meaning
  of each field in each line of the remainder of the CSV file.  In the
  case of CSV files that contain weighted call graph information, each
  line will have a caller function specification, followed by a callee
  function specification, followed by an unsigned integer that provides
  the number of times a call was executed during run time.
  
- There may be instances where the caller and callee function 
  specifications are identical on multiple lines in the CSV file.
  This will happen when a caller function has multiple call sites to
  the callee function.  In the merged WCG that is created by the clt6x,
  the weights of each line that has the same caller and callee function
  specifications will be added together.
  
- The CSV file that is generated by the compiler using the path profiling
  instrumentation will not include information about indirect function
  calls or calls to runtime support helper functions (like _remi or _divi).
  However, you may be able to gather information about such calls with 
  another method (like the PC discontinuity trace mentioned earlier).

- The format of these CSV files is in compliance with the RFC-4180
  specification of Comma-Separated Values (CSV) files.  For more details 
  on this specification, please see the following URL:
      
        http://tools.ietf.org/html/rfc4180
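
The weight-merging rule described above can be sketched as follows (this
is an illustration of the rule only, not clt6x's actual implementation):

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Rows with identical (caller, callee) pairs -- e.g. a caller with several
// call sites to the same callee -- have their weights summed into one edge.
using Edge = std::pair<std::string, std::string>;   // (caller, callee)
using Row  = std::tuple<std::string, std::string, unsigned>;

std::map<Edge, unsigned> merge_wcg(const std::vector<Row> &rows)
{
    std::map<Edge, unsigned> wcg;
    for (const auto &[caller, callee, weight] : rows)
        wcg[Edge{caller, callee}] += weight;   // duplicate edges accumulate
    return wcg;
}
```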
        

2.5) Linker Command File Operator - unordered()
-----------------------------------------------

  2.5.1) Basics

    A new unordered() operator is now available for use in a linker 
    command file.  The effect of this operator is to relax the placement 
    constraints placed on an output section specification in which the
    content of the output section is explicitly stated.

    Consider an example output section specification:

    SECTIONS
    {
      .text:
      {
        file.obj(.text:func_a)
        file.obj(.text:func_b)
        file.obj(.text:func_c)
        file.obj(.text:func_d)
        file.obj(.text:func_e)
        file.obj(.text:func_f)
        file.obj(.text:func_g)
        file.obj(.text:func_h)
        *(.text)

      } > PMEM

      ...

    }

    In the above SECTIONS directive, the specification of '.text'
    explicitly dictates the order in which functions are laid out in
    the output section.  That is, by default, the linker will lay out
    func_a through func_h in exactly the order specified, regardless
    of any other placement priority criteria (such as a preferred
    function order list enumerated by --preferred_order options).

    The unordered() operator can be used to relax this constraint on
    the placement of the functions in the '.text' output section so that
    placement can be guided by other placement priority criteria.

    The unordered() operator can be applied to an output section as follows:

    SECTIONS
    {
      .text: unordered()
      {
        file.obj(.text:func_a)
        file.obj(.text:func_b)
        file.obj(.text:func_c)
        file.obj(.text:func_d)
        file.obj(.text:func_e)
        file.obj(.text:func_f)
        file.obj(.text:func_g)
        file.obj(.text:func_h)

        *(.text)

      } > PMEM

      ...

    }

    Now, given a list of --preferred_order options as follows:

      --preferred_order="func_g"
      --preferred_order="func_b"
      --preferred_order="func_d"
      --preferred_order="func_a"
      --preferred_order="func_c"
      --preferred_order="func_f"
      --preferred_order="func_h"
      --preferred_order="func_e"

    The placement of the functions in the '.text' output section will then
    be guided by this preferred function order list.  This placement is
    reflected in the linker-generated map file, as follows:

SECTION ALLOCATION MAP

 output                                  attributes/
 section   page    origin      length       input sections
 --------  ----  ----------  ----------   ----------------
 .text      0    00000020    00000120
                   00000020    00000020     file.obj (.text:func_g:func_g)
                   00000040    00000020     file.obj (.text:func_b:func_b)
                   00000060    00000020     file.obj (.text:func_d:func_d)
                   00000080    00000020     file.obj (.text:func_a:func_a)
                   000000a0    00000020     file.obj (.text:func_c:func_c)
                   000000c0    00000020     file.obj (.text:func_f:func_f)
                   000000e0    00000020     file.obj (.text:func_h:func_h)
                   00000100    00000020     file.obj (.text:func_e:func_e)

                   ...
 

  2.5.2) About DOT Expressions in the Presence of unordered()

    Another aspect of the unordered() operator that should be taken into
    consideration is that even though the operator causes the linker to
    relax constraints imposed by the explicit specification of an output
    section's contents, the unordered() operator will still respect the
    position of a DOT expression within such a specification.

    Consider the following output section specification:

    SECTIONS
    {
      .text: unordered()
      {
        file.obj(.text:func_a)
        file.obj(.text:func_b)
        file.obj(.text:func_c)
        file.obj(.text:func_d)

        . += 0x100;

        file.obj(.text:func_e)
        file.obj(.text:func_f)
        file.obj(.text:func_g)
        file.obj(.text:func_h)

        *(.text)

      } > PMEM

      ...

    }

    In the above specification of '.text', a DOT expression, ". += 0x100;",
    separates the explicit specification of two groups of functions in
    the output section.  In this case, the linker will honor the specified
    position of the DOT expression with respect to the functions on either
    side of the expression.  That is, the unordered() operator will allow
    the preferred function order list to guide the placement of func_a
    through func_d relative to each other, but none of those functions
    will be placed after the hole that is created by the DOT expression.
    Likewise, the unordered() operator allows the preferred function order
    list to influence the placement of func_e through func_h relative to
    each other, but none of those functions will be placed before the
    hole that is created by the DOT expression.
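
    The same toy model extends to the DOT expression case (again an
    illustration, not the linker's real algorithm): each group of input
    sections on one side of the hole is reordered independently, and no
    function crosses the hole.

```python
# Toy model of unordered() with a DOT expression: the two groups of input
# sections are reordered independently and nothing crosses the hole.
preferred = ["func_g", "func_b", "func_d", "func_a",
             "func_c", "func_f", "func_h", "func_e"]

def place_group(spec, preferred):
    # Preferred functions first, in preferred order; the rest keep
    # their specification order.
    ordered = [f for f in preferred if f in spec]
    return ordered + [f for f in spec if f not in ordered]

before_hole = ["func_a", "func_b", "func_c", "func_d"]
after_hole  = ["func_e", "func_f", "func_g", "func_h"]

layout = (place_group(before_hole, preferred)
          + ["<0x100-byte hole>"]
          + place_group(after_hole, preferred))
print(layout)
```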

 
  2.5.3) GROUPs and UNIONs

    The unordered() operator can only be applied to an output section.
    This includes members of a GROUP or UNION directive.  For example,

    SECTIONS
    {
      GROUP
      {
        .grp1:
        {
          file.obj(.grp1:func_a)
          file.obj(.grp1:func_b)
          file.obj(.grp1:func_c)
          file.obj(.grp1:func_d)
        } unordered()

        .grp2:
        {
          file.obj(.grp2:func_e)
          file.obj(.grp2:func_f)
          file.obj(.grp2:func_g)
          file.obj(.grp2:func_h)
        }

        .text:  { *(.text) }

      } > PMEM

      ...

    }

    The above SECTIONS directive applies the unordered() operator to the
    first member of the GROUP.  The '.grp1' output section layout can
    then be influenced by other placement priority criteria (like the 
    preferred function order list), whereas the '.grp2' output section
    will be laid out as explicitly specified.

    The unordered() operator cannot be applied to an entire GROUP or UNION.
    Attempts to do so will result in a linker command file syntax error and
    the link will be aborted.


2.6) Cache Layout Tool Tutorial
-------------------------------

As a means of familiarizing yourself with the cache layout tool development
flow, you can walk through this guided tour of the development of a simple
application.  Included in the release distribution, you will find a
sub-directory, clt_tutorial.  This sub-directory contains the following
files:

        clt_tutor.txt
        main.c
        lots.c
        rare.c

To begin the tutorial, change your location to the clt_tutorial sub-
directory or copy the source files to your own working directory.
Then consider the following ...

  2.6.1) Introduction to the Source Files

    main.c:

    - defines main()
    - main() calls rare() once
    - main() calls main.c:local() 4 times

    - defines static local()
    - main.c:local() calls lots() 80 times

    lots.c:

    - defines lots(); globally visible
    - lots() calls lots.c:local() 100+ times

    - defines lots.c:local()

    rare.c:

    - defines rare(); globally visible

  2.6.2) Build an Instrumented Application

    > rm -f *.pdat
    > cl6x --abi=eabi -mv64+ --gen_profile_info main.c lots.c rare.c \
                -z -llnk.cmd -o app.out -m app.map

  2.6.3) Gather Dynamic Profile Information

    > Load and run app.out with CCS C64+ Simulator

    You should have a pprofout.pdat file in your working directory after
    completing this step.

  2.6.4) Decode Profile Data File

    > rm -f *.prf
    > pdd6x pprofout.pdat -eapp.out -o=pprofout.prf

    You should have a pprofout.prf file in your working directory after
    completing this step.

  2.6.5) Use Profile Information in Re-Compile of Application

    > cl6x --abi=eabi -mv64+ --use_profile_info=pprofout.prf \
           --analyze=callgraph -mo main.c lots.c rare.c

    You should have 3 CSV files in your working directory after completing
    this step: main.csv, lots.csv, rare.csv.  Their contents should be 
    as follows:

    main.csv:

    "caller","callee","weight"
    main,rare,1
    main,main.c:local,4
    main,printf,1
    main,fflush,1
    main.c:local,printf,4
    main.c:local,lots,80

    lots.csv:

    "caller","callee","weight"
    lots,lots.c:local,80
    lots,lots.c:local,28
    lots.c:local,printf,108

    rare.csv:

    "caller","callee","weight"

  2.6.6) Generate Preferred Function Order Command File

    > rm app_pfo.cmd
    > clt6x main.csv lots.csv rare.csv -o app_pfo.cmd

    You should have an app_pfo.cmd in your working directory containing:

    --preferred_order="printf"
    --preferred_order="lots.c:local"
    --preferred_order="lots"
    --preferred_order="main.c:local"
    --preferred_order="main"
    --preferred_order="rare"
    --preferred_order="fflush"

    This is the preferred function order list.  Note that the two 
    versions of local() are distinguished by their source file names:
    
      "main.c:local"
      "lots.c:local"

  2.6.7) Re-Link Application Incorporating Preferred Function Order

    > cl6x --abi=eabi -mv64+ -z main.obj lots.obj rare.obj app_pfo.cmd \
           -llnk.cmd -o app_opt.out -m app_opt.map

    You should have an app_opt.map file in your working directory.  If
    you open it and look at the contents of the .text output section, you
    should see that the placement of the functions specified in the
    app_pfo.cmd file matches their order in that file.


2.7) Things to Be Aware Of
--------------------------

There are some behavioral characteristics and limitations of the program
cache layout development flow that you should bear in mind:

  2.7.1) Generation of Path Profiling Data File (.pdat)

    When running an application that has been instrumented to collect
    path-profiling data (using the --gen_profile_info compiler option
    during the build), the application uses functions in the runtime
    support library to write information out to the path-profiling data
    file (pprofout.pdat in the above tutorial).  If a path-profiling data
    file already exists when the application starts to run, any new path
    profiling data will be appended to the existing file.

    To prevent combining path profiling data from separate runs of an
    application, you will need to either rename the path profiling data
    file from the previous run of the application or remove it before
    running the application again.

  2.7.2) Indirect Calls Not Recognized by Path Profiling Mechanisms

    When using available path profiling mechanisms to collect weighted
    call graph information from the path profiling data, pprof6x does
    not recognize indirect calls.  An indirect call site will not be
    represented in the CSV output file that is generated by pprof6x.
    
    You can work around this limitation by introducing your own
    information about indirect call sites into the relevant CSV file(s).
    If you take this approach, please be sure to follow the format of the
    callgraph analysis CSV file ("caller", "callee", "weight").

    If you are able to get weighted call graph information from a PC
    trace into a callgraph analysis CSV, this limitation will no longer
    apply (as the PC trace can always identify the callee of an 
    indirect call).
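
    As a sketch of the manual workaround, the following hypothetical helper
    (not a TI tool) appends a hand-made edge for an indirect call site to a
    callgraph analysis CSV; the callee name below is made up for
    illustration.

```python
import csv
import io

# Hypothetical helper (not a TI tool): append an edge for an indirect
# call site to a callgraph analysis CSV, following the
# "caller","callee","weight" layout shown in the tutorial.
def add_indirect_edge(csv_text, caller, callee, weight):
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows.append([caller, callee, str(weight)])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

main_csv = '"caller","callee","weight"\nmain,rare,1\n'
print(add_indirect_edge(main_csv, "main", "handler_via_fptr", 16))
```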

  2.7.3) Multiple --preferred_order Options Associated with Single Function

    There may be cases in which you want to give the linker more than one
    preferred function order command file during the link of an
    application.  For example, you may have developed or received a
    separate preferred function order command file for one or more of the
    object libraries used by your application.

    In such cases, it is possible that one function may be specified
    in multiple preferred function order command files.  If this 
    happens, the linker will honor only the first instance of the 
    --preferred_order option in which the function is specified.
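
    The first-instance-wins rule can be modeled as follows (an illustrative
    sketch; the function names are made up):

```python
# Sketch of the stated rule: when one function appears in several
# --preferred_order lists, only the first instance is honored.
def merge_preferred(*order_lists):
    seen, merged = set(), []
    for order_list in order_lists:
        for fn in order_list:
            if fn not in seen:          # later duplicates are ignored
                seen.add(fn)
                merged.append(fn)
    return merged

app_order = ["main", "lots", "printf"]
lib_order = ["printf", "fflush"]        # "printf" also appears above
print(merge_preferred(app_order, lib_order))
```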
    

-------------------------------------------------------------------------------
3) Support for Dynamic Linking
-------------------------------------------------------------------------------
The dynamic linking support in the 7.0 compiler is alpha capability.

3.1) Importing and Exporting Symbols
-----------------------------------
In a dynamic linking system the user can build dynamic modules that are loaded
and relocated by a dynamic loader at run time. The dynamic loader can also
perform dynamic symbol resolution: resolve references from dynamic modules with
the definitions from other dynamic modules.

Only symbols that are imported or exported participate in dynamic symbol
resolution. A dynamic object is a dynamic library (DLL or DSO) or a
dynamic executable; a dynamic object is also called a module in this
readme. A dynamic object imports a symbol when its references to that
symbol are resolved by a definition from another dynamic object. The
dynamic object that has the definition and makes it visible during dynamic
symbol resolution is said to export the symbol.

A function or a global variable can be imported or exported by controlling
the ELF symbol visibility attribute. This can be done by using source code 
annotations or compiler options.

3.1.1) Using source code annotations
--------------------

- Explicitly import/export symbols in source code

  - To Export: Use __declspec(dllexport) qualifier with symbol declaration or
               definition

                __declspec(dllexport) int foo() { }

  - To Import: Use __declspec(dllimport) qualifier with symbol declaration

                __declspec(dllimport) int bar();

  - Typically, an API is exported by a module and is imported by another
    module. __declspec() can be added to the API header file.  

  - The linker uses the most restrictive visibility for symbols.
    For example, consider the following:
    
    - if foo() is declared dllimport in a.c, and
    - if foo() is declared plain (no dllimport) in b.c, and
    - if a.c and b.c are compiled into ab.dll

    Then, the symbol foo will NOT be imported in ab.dll and the linker will
    report an error indicating that the reference to foo() is unresolved.
    
  - Use __declspec() in header files to avoid link time errors
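
The "most restrictive visibility wins" behavior can be sketched with a toy
model using ELF visibility names (the ranking below is an assumption for
illustration, not the linker's implementation):

```python
# Toy model of "the most restrictive visibility wins" when the linker sees
# the same symbol with different visibilities across object files.
# The ranking is an illustrative assumption.
RANK = {"default": 0, "protected": 1, "hidden": 2, "internal": 3}

def merge_visibility(*visibilities):
    # Keep the most constraining visibility seen across all object files.
    return max(visibilities, key=lambda v: RANK[v])

print(merge_visibility("default", "hidden"))     # hidden wins
```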

3.1.2) Import/Export using the ELF linkage macros defined in elf_linkage.h
-------------------------------------------------------------------------

The 7.0 compiler includes the header file elf_linkage.h which defines the 
following macros to control the symbol visibility:

TI_IMPORT symbol_declaration
   Imports the symbol.  This cannot be applied to symbol definitions.
   Examples: 
      TI_IMPORT int foo(void);
      extern TI_IMPORT long global_variable;
   
TI_EXPORT symbol_definition|symbol_declaration
   Exports the symbol. This can be applied to both declarations and
   definitions.
   Examples:
      TI_EXPORT int foo(void);
      TI_EXPORT long global_variable;

TI_PATCHABLE symbol_definition
   The definition is visible outside the module. Other modules can import 
   this definition. Also, a reference to this symbol can be patched to a 
   different definition if needed. Such calls go through an indirection to
   enable patching. This is also called symbol preemption.
   Examples:
      TI_PATCHABLE int foo(void);
      TI_PATCHABLE long global_variable;

TI_DEFAULT symbol_definition|symbol_declaration
   Symbol can be either imported or exported. The definition 
   is visible outside the module. Other modules can import this
   definition. The reference to the symbol can also be patched.

TI_PROTECTED symbol_definition|symbol_declaration
   The definition is visible outside the module. Other modules can import
   this definition. However, a reference to the symbol can never be
   patched (it is non-preemptable).

TI_HIDDEN symbol_definition|symbol_declaration
   The definition is not visible outside the module.
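
A toy model of the preemption described above (illustrative only; the
tables and lookup are assumptions, not the dynamic loader's
implementation): a preemptable reference is resolved through the global
dynamic symbol table first, so it can be redirected to a patched
definition, while a non-preemptable reference binds to the module's own
definition.

```python
# Toy model of symbol preemption (illustrative only).
def resolve(symbol, module_defs, global_table, preemptable):
    # A preemptable reference is looked up in the global dynamic symbol
    # table first, so it can be redirected to a patched definition.
    if preemptable and symbol in global_table:
        return global_table[symbol]
    # Otherwise the reference binds to the module's own definition.
    return module_defs[symbol]

module_defs = {"foo": "module:foo"}
patch_table = {"foo": "patch:foo"}
print(resolve("foo", module_defs, patch_table, preemptable=True))
print(resolve("foo", module_defs, patch_table, preemptable=False))
```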

3.1.3) Import/Export using compiler options
------------------------------------------
The following compiler options can be used to control the symbol visibility
of global symbols.  Symbols whose visibility is controlled by source code
annotations are not affected by these options.

3.1.3.1) --visibility option
--------------------------
The 7.0 compiler supports the --visibility=<default_visibility> option to 
specify the default visibility for global symbols. Note that this option
does not affect the visibility of symbols that use __declspec() or TI_xxx
macros to specify a visibility in the source code.

 The <default_visibility> argument is one of the following:

 hidden    - The global symbols are NOT imported or exported. This is the 
             default compiler behavior.
 fhidden   - All the function definitions are hidden. All other global symbols
             are imported or exported.
 default   - All global symbols are imported, exported and patchable.
 protected - All global symbols are exported.

3.1.3.2) --import_undef option
-----------------------------
This option makes all global symbol references imported. It can be
combined with the --visibility option. For example, the following option
combination makes all definitions exported and all references imported:
 --import_undef --visibility=protected

Note that --import_undef takes precedence over the --visibility option.

3.1.3.3) Importing compiler helper functions
-----------------------------------------
- The compiler generates calls to functions in the RTS. For example, to
  perform an unsigned long division in user code, the compiler generates a
  call to __c6xabi_divul (_divul in COFF ABI). Since there is no
  declaration and the user does not call these functions directly, the
  __declspec() annotation cannot be used. This prevents the user from
  importing such functions from an RTS built as a dynamic library. To
  support this use case, the compiler provides the following option:

  --import_helper_function

  When this option is specified, all the compiler generated calls to RTS 
  functions are treated as imported.

3.1.3.4) Using linker options
-----------------------------

- To import/export symbols when the source cannot be updated, use the
  following linker options:

  - Linker option --export=<symbol>

    - adds <symbol> to the dynamic symbol table as exported definition
    - searches the archive libs to find the definition
    - generates an error if <symbol> is not defined

  - Linker option --import=<symbol>

    - adds <symbol> to the dynamic symbol table as imported reference
    - If a definition of <symbol> is included in the current link --import is
      ignored with a warning

NOTE: These options cannot be used when building C6x Linux executable or DSO.
      See 3.3.


3.2) Bare Metal Dynamic Linking Support
---------------------------------------
The C6x compiler version 7.0 supports dynamic linking under EABI. 

As described in section 3.1, a dynamic linking system lets the user build
dynamic modules that are loaded and relocated by a dynamic loader at run
time.  The 7.0 compiler supports the creation of such dynamic modules,
specifically dynamic executables and dynamic libraries.

A dynamic executable :

 - Will have dynamic segments
 - Can export/import symbols
 - Is optionally relocatable (Can contain dynamic relocations)
 - Must have an entry point
 - Can be created using -c/-cr compiler options
 - Should use far DP or absolute access for imported data, but can use near DP
   access for its own data

A dynamic library :

 - Will have dynamic segment
 - Can export/import symbols 
 - Is relocatable
 - Does not require an entry point
 - Cannot be created using -c/-cr compiler option
 - Cannot use near DP access, must use far DP or absolute addressing to access
   own and imported data


3.2.1) Building a dynamic executable
-----------------------------------

To build a dynamic executable, use linker options:

- --dynamic (short for --dynamic=exe) tells the linker to generate a
    dynamic executable.

    % cl6x --abi=eabi <src_files>.c 
    % cl6x -z <src_files>.obj <libs>.lib --dynamic -o base_img.exe 
 
    The above commands build the dynamic executable base_img.exe. All the 
    symbols exported in the source files become exported in the file
    base_img.exe and can be referenced by other modules. 

    In case the source file cannot be annotated, the linker options
    --import/--export can be used.

    % cl6x -z <src_files>.obj <libs>.lib --dynamic -o base_img.exe
      --export=MEM_alloc --export=MEM_free

    Note that all the --export/--import options can be placed in a text
    file that is then specified on the command line. For example, the
    above link step can be replaced by

    % cl6x -z <src_files>.obj <libs>.lib --dynamic -o base_img.exe 
      export_list.txt

    % cat export_list.txt
      --export=MEM_alloc
      --export=MEM_free
    
3.2.2) Building a dynamic library
--------------------------------

To build a dynamic library, use linker option

- --dynamic=lib option to generate a bare-metal dynamic library. The
  dynamic library is relocatable at run time by default.

- If source doesn't import/export symbols
  - --export/--import options to control symbol import/export respectively

- --soname option to specify a unique name to the library (default is the
    output file name)

- --mem_model:data=far to access all data using absolute addressing

    % cl6x --abi=eabi <src_files>.c --mem_model:data=far
    % cl6x -z <src_files>.obj <libs>.lib --dynamic=lib -o gsm_node.dll 
      --soname="UUID" --export=gsm_encode --import=MEM_alloc 
      --import=MEM_free base_img.exe

    Here MEM_alloc() and MEM_free() are imported, so a library or exe that
    exports these symbols should be specified (base_img.exe).


3.3) C6x Linux Dynamic Linking Support
--------------------------------------
The C6x 7.0 tools support building C6x Linux executable and DSO.

3.3.1) Building a C6x Linux executable
-----------------------------------
To build a C6x Linux executable, compile the source using --linux compiler 
option and link the objects using the --sysv linker option.

    % cl6x --abi=eabi --linux <src_files>.c 
    % cl6x -z <src_files>.obj <libs>.lib --sysv -o a.out 
 
    The above commands build the C6x Linux executable a.out.

3.3.2) Building a C6x Linux DSO
----------------------------
To build a C6x Linux DSO, 
 a) compile the source using the following options
    --linux - Generate code for linux
    --pic   - Generate position independent code

 b) link the objects using the following linker options
    --sysv - Generate SysV ELF output file 
    --shared - Generate Dynamic Shared Object (DSO)

 % cl6x --abi=eabi --linux --pic <src_files>.c 
 % cl6x -z <src_files>.obj <libs>.lib --sysv --shared -o codec.so
 
The above commands build the C6x Linux DSO codec.so

-------------------------------------------------------------------------------
4) Support for Building ROM Modules
-------------------------------------------------------------------------------
The ROM module building support in the 7.0 compiler is alpha capability.

The C6x 7.0 compiler makes it easy to build modules intended to be ROMed
and to link applications against such a ROMed image. This support is based
on ELF dynamic linking and hence is not supported in COFF ABI. That is,
this support is available only when using the --abi=eabi option.

The C6x 7.0 compiler allows creating a ROM file. This file contains the
sections that are ROM masked and the RAM sections they refer to. This ROM
file can be used as a normal .out file in the ROM masking flow. The hex
converter tool will accept this ROM file.

This ROM file can then be linked against when building an application.
If the ROM code needs to be stripped before delivery to customers for
application builds, the strip6x utility can be used to strip the ROM code.

4.1) Building a ROM File
------------------------

Use the linker option --rom to build a ROM file:

    % cl6x g729*.obj -z rom.cmd --rom -o g729.rom

Only the exported symbols are visible outside the ROM file. Make sure all
the symbols that an application may reference are exported from the ROM
file.  See section 3.1 for details on importing and exporting symbols.

The linker command file can specify which sections go into the ROM using the
following new type keywords:

type=READONLY
  Indicates that the output section contains true read-only data. A
  subsequent link cannot change the contents of this section. The static
  linker ensures there are no relocations associated with these sections
  in the output file. When a section is marked READONLY, the linker makes
  sure all the references from this section are resolved to their final
  values.

type=BOUND
  Indicates that the section's address is bound to its final address and
  cannot change. During a subsequent link, this also indicates that the
  address range is allocated and not available for other allocations.

Now consider the linker command file used above, rom.cmd:

   % cat rom.cmd

   MEMORY {
       G729_ROM : o = 0x00030000, l = 0x00020000 
       G729_RAM : o = 0x00100000, l = 0x00003000 
   } 

   SECTIONS {
    .text: > G729_ROM , type=READONLY, type=BOUND
    .switch: > G729_ROM , type=READONLY, type=BOUND
    .far: > G729_RAM , type=BOUND
   } 

In this ROM file, the .text and .switch sections are placed in ROM. Since
the addresses of these sections are bound to their final values, use the
type=BOUND keyword. Also, the contents of these sections, once generated
in this link step, can never change, so use the type=READONLY keyword to
mark these sections read-only.

The .far section is referenced by the .text section. Since the .text section
is readonly, the references in the .text section should be final. This means
the address of the .far section cannot change once we resolve the .far 
references in the .text section. So, the .far section should use the
type=BOUND keyword to indicate it is bound to its final address. 


4.2) ROM Masking a ROM File
---------------------------

The ROM file can be used in your current ROM masking process.

   % hex6x -t g729.rom


4.3) Stripping ROM Sections
--------------------------

Once the ROM file is ROM masked, the ROM sections can be stripped using 
the strip utility:

   % strip6x --rom g729.rom

The strip utility strips the sections that are READONLY and BOUND when the
--rom option is specified.  Once the ROM sections are stripped, the ROM
file can be used in the application build. The ROM creator is expected to
strip the ROM contents and deliver the stripped ROM file for application
builds.


4.4) Building application linking against ROM file
-------------------------------------------------

The ROM file is used when building applications to resolve the
application's references to ROM symbols. The RAM sections in the ROM file
must also be linked in when building the application.

The application can be built by specifying the ROM file as an input:

  % cl6x -z app*.obj g729.rom applnk.cmd -o app.out


-------------------------------------------------------------------------------
5) Support for Tesla ISA
-------------------------------------------------------------------------------

For updates on the Tesla project, please refer to the project Wiki page:

    http://syntaxerror.dal.design.ti.com/wiki/index.php/ProjInfo:Mini64

This package also includes Runtime Support (RTS) libraries built for little and
big endian variants:

  COFF:
    ./rtstesla_le_coff.lib     (little endian)
    ./rtstesla_be_coff.lib     (big endian)
    ./rtstesla_le_coff_eh.lib  (little endian, exception handling)
    ./rtstesla_be_coff_eh.lib  (big endian, exception handling)

  ELF:
    ./rtstesla_le_elf.lib      (little endian)
    ./rtstesla_be_elf.lib      (big endian)
    ./rtstesla_le_elf_eh.lib   (little endian, exception handling)
    ./rtstesla_be_elf_eh.lib   (big endian, exception handling)


5.1) Tesla Architectural Characteristics
----------------------------------------

As a derivative of the C64+ (Joule) CPU, the Tesla CPU architecturally
contains a single B-side datapath and a B-side register file consisting of
32 registers (B0-B31).  The B-side high registers (B16-B31) are
mapped as A-side low registers (A0-A15) in order to support existing Joule
calling conventions and also to minimize Joule-to-Tesla porting efforts.  The
programmer is restricted to referring to the full register set as A0-A15 and
B0-B15, excluding control registers.  During encoding of the assembly,
the (A0-A15) registers are encoded as B-side high registers (B16-B31). This
is also true of predicate registers A0-A2, which are actually interpreted as 
registers B16-B18.


5.2) Tesla Programming Characteristics
--------------------------------------

The Tesla CPU is a derivative of the C64+ (Joule) CPU and contains a scaled
down, 4-issue VLIW execution engine.  A single datapath is exposed to the
programmer with 4 available function units: FU_L, FU_S, FU_D, and FU_M.  There
are a total of 32 registers available that are divided into 16 "A" registers
(A0-A15) and 16 "B" registers (B0-B15).  The registers are divided into
"A" and "B" registers to facilitate support for Joule calling conventions. Note
that even though there are "A" and "B" registers, there is only one register
file and one datapath on which both sets can be utilized.

  5.2.1) Porting of Joule Linear Assembly and C Source Code

    It is possible to take Joule linear assembly and compile it for Tesla
    successfully.  Function unit placement directives in linear assembly are
    ignored in order to facilitate easy porting from Joule.  However, linear
    assembly that has been manually optimized for Joule may not perform
    equivalently for Tesla.  This especially includes manually unrolled loops
    that are optimal for Joule.  There is likely to be increased register
    pressure due to the decreased number of available registers on Tesla, and
    this may inhibit software pipelining and may lead to severe performance
    degradation.  

    This behavior may also manifest in compiled C code in which loop
    pragmas, such as UNROLL, MUST_ITERATE, and PROB_ITERATE, are used to
    manually optimize loops in C code.  These pragmas should be adjusted
    or removed for Tesla.


5.3) Usage
----------

The '-mv=tesla' command-line option has been added to the C6x shell to enable
builds for the Tesla CPU.  Note: The default ABI is EABI.

ELF (default)

  Compile a file for Tesla:
  - cl6x -mv=tesla <source>

  Compile and Link an executable file for Tesla:
  - cl6x -mv=tesla <source> -z -lrtstesla_le_elf.lib -l<lnk.cmd>

COFF

  Compile a file for Tesla:
  - cl6x -mv=tesla --abi=coffabi <source>

  Compile and Link an executable file for Tesla:
  - cl6x -mv=tesla --abi=coffabi <source> -z -lrtstesla_le_coff.lib -l<lnk.cmd>


5.4) Features / Fixed Defects
-----------------------------

- Changed Parser/Assembler Predefined symbols
  The symbols that are enabled/defined by the assembler and the parser 
  have been changed to reflect Tesla's external visibility.

  - The assembler will no longer enable the following symbols for Tesla:
      .TMS320C6000_LITE 

    The assembler will now enable the following symbol for Tesla:
      .TI_C6X_TESLA 

  - The C parser will no longer define the following symbols for Tesla: 
      _TMS320C6000_LITE

    The C parser will now define the following symbol for Tesla:
      _TI_C6X_TESLA

- EABI is now the default ABI for Tesla
  To compile using COFFABI, use the shell option --abi=coffabi.  Please refer 
  to section 1 for details about EABI support in the C6000 7.0 compiler.

- RTS library names have been renamed:

  ** Old:                    ** New:
  rtstesla.lib           --> rtstesla_le_coff.lib
  rtstesla_eh.lib        --> rtstesla_le_coff_eh.lib
  rtstesla_elf.lib       --> rtstesla_le_elf.lib
  rtstesla_elf_eh.lib    --> rtstesla_le_elf_eh.lib
 
  rtsteslae.lib          --> rtstesla_be_coff.lib
  rtsteslae_eh.lib       --> rtstesla_be_coff_eh.lib
  rtsteslae_elf.lib      --> rtstesla_be_elf.lib
  rtsteslae_elf_eh.lib   --> rtstesla_be_elf_eh.lib


- Workaround MSVC 2003 Compiler bug (Defect SDSCM00029586)
  A bug in the MSVC 2003 compiler affected PC builds of the Tesla compiler
  and caused the compiler to generate incorrect code under alchemy
  program optimization.


- Disabled Joule Intrinsics 
  Joule EFI intrinsics are not applicable to Tesla and are now disallowed 
  by the parser and assembler.

  - EFI C intrinsics:
      _efi_cmd_s1
      _efi_cmd_s2
      _efi_cmd_m1
      _efi_cmd_m2
      _efi_rdw_s1
      _efi_rdw_s2
      _efi_rw_s1
      _efi_rw_s2
      _efi_sdw_l1
      _efi_sdw_l2
      _efi_sw_l1
      _efi_sw_l2
      _efi_sll_l1
      _efi_sll_l2

  - EFI assembly intrinsics:
      EFCMD
      EFRDW
      EFRW
      EFSDW
      EFSW

  - EFI linear assembly intrinsics:
      EFI_CMD_S1
      EFI_CMD_S2
      EFI_CMD_M1
      EFI_CMD_M2
      EFI_RDW_S1
      EFI_RDW_S2
      EFI_RW_S1
      EFI_RW_S2
      EFI_SDW_L1
      EFI_SDW_L2
      EFI_SW_L1
      EFI_SW_L2
     
- Disabled Joule Instructions
  Joule SPLOOP instructions are not applicable to Tesla and are now disallowed 
  by the assembler.

  - C64x+ SPLOOP assembly instructions:
      SPLOOP
      SPLOOPD
      SPLOOPW
      SPKERNEL
      SPKERNELR
      SPMASK
      SPMASKR

- Unitless BNOP instruction (Defect SDSCM00025588)
  The Tesla Assembler will now allow you to place a unitless BNOP instruction 
  in parallel with another S unit instruction.

- If Conversion of pseudo instruction (Defect SDSCM00025854)
  This defect pertained to the if-conversion of a pseudo instruction "move"
  of a large address or large constant to a register, where the instruction
  is predicated with the same register and later rewritten into two
  instructions.  The fix ensures that a new predicate register is created
  for such instructions during if-conversion so that they no longer modify
  their own predicate register.


-------------------------------------------------------------------------------
6) Defect Reporting
-------------------------------------------------------------------------------

All C6000 7.0 defect reports should be submitted using the SDO ClearQuest
database at the following URL: http://cqweb.itg.ti.com/SDO. 


