    How to load single-precision data in linear assembly code?

    This question is answered
    Posted by may may92122 on Mar 28 2012 04:30 AM
    Expert, 1030 points

    I wrote a linear assembly function as follows:

                .def      _test

    _test:      .cproc    a_0
                .reg      val_a0
                LDW       *a_0++, val_a0
                .return   val_a0
                .endproc

    Then, in another file in the project, I call this function:

    void main()
    {
        int   a = 1;
        float b = 1.0;
        int   temp1 = 0;
        int   temp2 = 0.0;

        temp1 = test(&a);
        printf("%d\n", temp1);

        temp2 = test(&b);
        printf("%f\n", temp2);
    }

    I ran the above code on EVM6678, and the following result appeared in the console window:

    1

    1065353216.00000

    It seems that LDW doesn't work correctly with single-precision data. Can somebody tell me why?

    All Replies
    • Chad Courtney
      Posted by Chad Courtney on Mar 28 2012 10:04 AM
      Mastermind, 22595 points

      You have temp2 declared as an int, so when the value is returned, either the assignment to temp2 or the printf is converting between int and float.

      Best Regards,
      Chad

    • may may92122
      Posted by may may92122 on Mar 29 2012 01:50 AM
      Expert, 1030 points

      Chad, thanks, but I checked and found that I mistyped the code here. The original code is as follows:

                  .def      _test

      _test:      .cproc    a_0
                  .reg      val_a0
                  LDW       *a_0++, val_a0
                  .return   val_a0
                  .endproc

      Then, in another file in the project, I call this function:

      void main()
      {
          int   a = 1;
          float b = 1.0;
          int   temp1 = 0;
          float temp2 = 0.0;

          temp1 = test(&a);
          printf("%d\n", temp1);

          temp2 = test(&b);
          printf("%f\n", temp2);
      }

      I ran the above code on EVM6678, and the following result appeared in the console window:

      1

      1065353216.00000

      It seems that LDW doesn't work correctly with single-precision data. Can somebody tell me why?

    • Allen Lee
      Posted by Allen Lee on Mar 29 2012 04:49 AM
      Genius, 3500 points

      Hi May,

      Did you declare the prototype of the function? Such as:

      float test(void *a);

      If not, the return value of test(&b) will be treated as an integer and converted to float by the INTSP instruction.

      Allen

    • Chad Courtney
      Posted by Chad Courtney on Mar 29 2012 08:08 AM
      Mastermind, 22595 points

      I agree with Allen's comments.

      That said, back to the basic question of the LDW assembly instruction. It's just going to return the 32-bit value that's stored at the location being pointed to. It doesn't care if it's a float, an int, two packed 16-bit values, etc. It simply returns the 32 bits exactly as they're stored in memory. It's your type casting and declarations in C that affect how this data is treated.

      If you want to, single-step into the assembly code and find the register that holds a_0 (it's going to be A4, since the first argument of a function is passed in A4). Look at the memory location pointed to by A4: display it as an SP float in a memory window and see what you observe, then display it again as a plain hex value. Step through the code until the LDW has executed; the data lands in the register 4 single steps after the LDW (I assume it would be the B4 register, but you'll have to look at the code in the disassembly to see). You'll see that this is exactly the same 32-bit hex data as was in memory, and this is what gets returned.

      Best Regards,
      Chad

    • may may92122
      Posted by may may92122 on Mar 30 2012 20:21 PM
      Expert, 1030 points

      Thanks Allen and Chad,

      With your help, I got the right result. On the other hand, I'm disappointed with the outcome. I studied linear assembly in order to speed up my code, but I found that I failed.

      The length of the array in my test is 264. With no optimization level selected, the C code takes 11,923 CPU cycles and the linear assembly code takes 8,350. With the o2 optimization level selected, the C code takes 440 cycles and the linear assembly code takes 962, which is about twice the C code! Does this mean the code is really that hard to optimize? Following spru187t, I tried the optimization methods in section 3 and section 4, but apart from the optimization levels that come with the compiler, none of them helped. If I badly need to optimize it further, what can I do?

      Best regards,

      May

    • Allen Lee
      Posted by Allen Lee on Mar 30 2012 21:14 PM
      Genius, 3500 points

      Hi May,

      In this situation, I think you need hand-written assembly. You would assign the registers and schedule the pipeline yourself in order to use the computational resources as fully as possible. It will be more complex and time-consuming than linear assembly, but also more effective.

      Allen

    • RandyP
      Posted by RandyP on Apr 01 2012 23:45 PM
      Guru, 60190 points

      May,

      This thread shows a specific linear assembly test routine and a specific C-code benchmarking routine. Your original questions and the insightful answers were all for those specific code examples.

      may may92122 wrote:
      "The length of the array in my test is 264, ... after o2 optimization level  was chosen , the CPU cycle of c code is 440, and the CPU cycle of linear assembly code is 962, ..."

      You are now talking about completely different program code, both the linear assembly and the main() function in C. The linear assembly example was a trivial one that you would never use in a real application.

      And now you seem to have two versions of the same routine, one in C and one in linear assembly. These have not been shown in any of your posts in this thread.

      It is no longer clear what your question is, at least not to me. Chad and Allen may know exactly what you are doing, but I do not.

      Regards,
      RandyP

    • Chad Courtney
      Posted by Chad Courtney on Apr 02 2012 08:45 AM
      Verified Answer (verified by may may92122)
      Mastermind, 22595 points

      Randy,

      I'd have to concur, it's difficult to tell specifically what's being referenced since it's not the code that was originally being discussed here.

      May,

      You may want to start another thread about the optimization, but you'll want to do so in the C/C++ Compiler Forum, which also covers the assembler and linear assembly. That said, I'll note that linear assembly still requires you to 'unroll' the loop to give the tools the flexibility to build optimal code. The compiler itself is designed to generate highly optimized code if given the freedom to do so, and it's recommended not to go to assembly or linear assembly unless necessary, to keep your code as portable as possible.

      Best Regards,

      Chad

    • may may92122
      Posted by may may92122 on Apr 09 2012 01:53 AM
      Expert, 1030 points

      I see now, thanks everyone.
