Starterware on beaglebone is going slow?

This question is answered
Posted by Karl Albertsson
on Apr 11 2012 05:31 AM

Hello!

I have some code running on the BeagleBone, but performance seems poor. I am currently running a timer interrupt, and the code inside it is not executing especially fast. One function toggles a GPIO pin. The difference between writing the register directly and writing it through a function is 1.5 µs. In other words, a single function call costs 1.5 µs of processing time. That seems very bad.

Is there anything I'm missing here? Is the default clock rate of the BeagleBone with StarterWare not set to 500 or 720 MHz? I've been troubleshooting this for several hours and I just cannot find what's wrong, because surely a function call should not take 1.5 µs?

Regards

Karl

All Replies
  • Karl Albertsson
    Posted by Karl Albertsson
    on Apr 11 2012 09:47 AM

    I have done some more tests and enabled the caches as in the demo application. Even then, an application that takes 0.3 seconds to run on a Blackfin at 400 MHz takes about 5 seconds on the BeagleBone running StarterWare... There must be something wrong. Does anybody have any ideas?

  • Madhvapathi Sriram
    Posted by Madhvapathi Sriram
    on Apr 11 2012 12:10 PM

    Hmm... this problem sounds interesting, though I know it's paining you :-)

    Just putting my understanding of the problem here.

    You have a function which does some task in the context of the timer interrupt (which is unusual, but let's have it that way).

    You set/reset a GPIO on entering the interrupt handler, and reset/set the GPIO before exiting. You see that the pulse duration is 5 secs.

    Are you doing any intensive computation? Like a lot of divisions, or complex math etc?

    Though I have never worked on Blackfin processors, my two cents:

    1. I see that the Blackfin has an integrated DSP. Could that be one reason the task finishes faster, and thus the rate of toggling is higher? The Sitara processors are ARM only and use runtime libraries. Did you try different compiler toolchains (TI, GCC, IAR) on the Sitara?
    2. How about comparing the two processors with just an empty handler? That way we can analyze the interrupt response times and plain interrupt latencies. Would that be a better way of comparing?

    While I continue to think, hope this helps..

    Regards,

    Madhvapathi Sriram

     

     

  • Karl Albertsson
    Posted by Karl Albertsson
    on Apr 11 2012 12:41 PM

    Hi Madhvapathi!

    Sorry about the confusion, I will try to clarify. I currently have two programs.


    The first program consists of basically just a timer interrupt (dmtimer2) running at 50 kHz. The code inside the interrupt is not especially complex: it toggles a GPIO on and off and does some other calculations in between. While looking at it with my analyzer I noticed that the interrupts did not occur at a steady 50 kHz, but rather at about 30 kHz. So I began stripping down the code. While doing this I noticed that just removing one function call and replacing it with its body reduced the computation time of the interrupt by 1.5 µs, which seemed awfully long for a function call.
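
To put those numbers in perspective, here is a rough sketch of the arithmetic, assuming a 500 MHz core clock (the thread never confirms the actual rate). The helper names are hypothetical and are not part of StarterWare.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a duration in nanoseconds to CPU cycles, for a clock given in MHz. */
static uint64_t ns_to_cycles(uint64_t ns, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0);
    return ns * cpu_mhz / 1000u;
}

/* Period of a periodic interrupt in nanoseconds, for a rate given in Hz. */
static uint64_t irq_period_ns(uint32_t irq_hz)
{
    assert(irq_hz > 0);
    return 1000000000ull / irq_hz;
}
```

Under that assumption, a 50 kHz interrupt leaves `irq_period_ns(50000)` = 20000 ns, or `ns_to_cycles(20000, 500)` = 10000 cycles per period, and a 1.5 µs call corresponds to `ns_to_cycles(1500, 500)` = 750 cycles. A cost of that size points to something slower than call overhead alone.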

    So I went back to the other program I wrote earlier (it is just a bunch of integer computations, with very few floats). I had never benchmarked it before; I had just made sure it ran properly. Now, running the program, I noticed that its execution time is very slow as well. The same program running on a Blackfin is around 40 times faster.

    By now I'm basically thinking that it is not the interrupt in particular that is slow, but the processor as a whole. So I looked around on the forums and saw that some people were able to improve performance by enabling the caches as done in the demo application. I took that code and applied it to my programs. The first program with the timer is now able to run at 50 kHz, but replacing a single function call with its body still shows that function calls are very slow (>1 µs), so it did not offer that much of an improvement.
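
For illustration, the "replace the call with its body" experiment can be approximated with `static inline` helpers, which give the compiler the option of removing the call. This is a minimal sketch, not the poster's code: the helper names are hypothetical, and the register offsets are the AM335x GPIO offsets as I recall them, so check them against the TRM. Whether the compiler actually inlines depends on the toolchain and optimization level.

```c
#include <stdint.h>

/* Assumed AM335x GPIO register offsets (verify against the TRM). */
#define GPIO_CLEARDATAOUT 0x190u
#define GPIO_SETDATAOUT   0x194u

/* Drive one pin high: a single write to the set register, no read-modify-write. */
static inline void gpio_pin_set(volatile uint32_t *gpio_base, unsigned pin)
{
    gpio_base[GPIO_SETDATAOUT / 4u] = (uint32_t)1u << pin;
}

/* Drive one pin low: a single write to the clear register. */
static inline void gpio_pin_clear(volatile uint32_t *gpio_base, unsigned pin)
{
    gpio_base[GPIO_CLEARDATAOUT / 4u] = (uint32_t)1u << pin;
}
```

On target, `gpio_base` would point at the GPIO module's base address; on a host it can point at a plain array for testing.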

    I then applied the cache code to my second program. This helped a bit and basically cut the execution time in half (from 10 seconds down to 5), but the Blackfin is still 20 times faster.

    Now I'm starting to wonder whether the processor is running at full speed. Is the bootloader setting it to 500 MHz? Are there some other pipelining issues at hand? It just doesn't seem right that a single function call takes over 1 µs, or that the processor is at least 20 times slower than I would expect.
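
One way to check the clock question is to read back the MPU PLL settings and compute the resulting frequency. The sketch below assumes the usual relation CLKOUT = CLKINP × M / ((N + 1) × M2), as I recall it from the AM335x TRM, and a 24 MHz input crystal. The function name and the example divider values are hypothetical, and reading the actual M, N, and M2 fields from the clock registers is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Output frequency of a PLL with multiplier m, pre-divider n, and post-divider m2,
 * assuming CLKOUT = CLKINP * m / ((n + 1) * m2). */
static uint64_t dpll_out_hz(uint64_t clkinp_hz, uint32_t m, uint32_t n, uint32_t m2)
{
    assert(m2 > 0);
    return clkinp_hz * m / (((uint64_t)n + 1u) * m2);
}
```

With these assumed values, `dpll_out_hz(24000000, 500, 23, 1)` gives 500 MHz and `dpll_out_hz(24000000, 720, 23, 1)` gives 720 MHz. If the values read back produce a much lower number, the bootloader did not configure the expected rate.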

    So far I have only used CCS and the TMS470 compiler. I have not dug into any optimization options (although there do not seem to be many). I have tried both running the code via the debugger and booting from a memory card, and the performance seems to be the same. To me it seems like some sort of initialization is missing, because surely the performance must be greater...

    Thanks for the help!

    Regards

    Karl

  • Karl Albertsson
    Posted by Karl Albertsson
    on Apr 12 2012 02:49 AM
    Verified Answer
    Verified by Karl Albertsson

    Ok, I finally fixed it!

    It seems the demo application did not use the D-cache, only the I-cache. When I enabled the D-cache (I used the code from the uartEdma_Cache project), things sped up drastically. It is now 50% faster than the Blackfin :)
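
For context: on ARMv7-A cores such as the Cortex-A8, the instruction cache, data cache, and MMU are controlled by separate bits of the CP15 System Control Register (SCTLR), and data accesses are not cached while the MMU is off. That would explain why enabling only the I-cache helped so little. Below is a minimal sketch for decoding an SCTLR value to see which of the three are on. The struct and function names are hypothetical, and reading SCTLR on the target (a privileged CP15 read) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* SCTLR bit positions defined by the ARMv7-A architecture. */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

struct cache_state {
    bool mmu;
    bool dcache;
    bool icache;
};

/* Split a raw SCTLR value into the three enable flags. */
static struct cache_state decode_sctlr(uint32_t sctlr)
{
    struct cache_state s;
    s.mmu    = (sctlr & SCTLR_M) != 0;
    s.dcache = (sctlr & SCTLR_C) != 0;
    s.icache = (sctlr & SCTLR_I) != 0;
    return s;
}

/* The data cache only takes effect when both the C bit and the MMU are enabled. */
static bool dcache_effective(uint32_t sctlr)
{
    struct cache_state s = decode_sctlr(sctlr);
    return s.mmu && s.dcache;
}
```

A value with only bit 12 set matches the situation described here: I-cache on, D-cache off.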

    Thanks for the help!
