TI E2E Community » Support Forums » Digital Signal Processors (DSP) » C6000 Multicore DSP » C64x Multicore DSP Forum
    Multi-core programming in C6472 using shared memory

    Posted by AC53351 on Jan 10 2011 05:39 AM

    Hello,

    I want to build a multicore application on the C6472 using the shared memory. I am using CCS v4 to build the code. I have read the document SPRUEG5C. The arbitration logic seems quite complex to me. In this regard I have some questions (I'm sorry if they appear silly).

    1. Does using optimization level 3 in CCS v4 imply that shared memory access is always configured as prefetchable? In that case I think I cannot use the atomic access monitor at optimization level 3, because SPRUEG5C says "Atomic access should go only to non-prefetchable address spaces." Am I correct?

    2. Other than configuring the prefetchable/non-prefetchable parts of the shared memory, the power-down issues and the fault indications, do I need to use the SMC memory-mapped registers manually from my code, or are they used by the arbitration logic hardware only?

    3. Reading the document for the SMC controller, it seems to me that the user 'talks' to the atomic access monitor and the atomic access monitor controls the arbitration logic hardware. The user cannot directly control the arbitration logic hardware without using the atomic access monitor. Am I right?

    4. While programming, do I need to specify somehow the per-bank SMC controller through which I am trying to access the shared memory, or does the location of the shared memory I'm trying to access itself tell the hardware which per-bank controller the request should go through?

    5. If I want to access a shared memory location from C code with a declaration like "#define VALUE (*((volatile unsigned int *) 0x00200000))" followed by "VALUE=0x1234", will the compiled code automatically pass the write (or read) request through the atomic access monitor (by using the LL, SL and CMTL instructions in assembly), or do I have to take some other step to ensure atomic access? If so, what are those steps?

    6. Can anyone please send me an example project where shared memory is used by multiple cores preferably without using DSP-BIOS?

     

    Regards,

    AC.

     

     

    All Replies
    • Posted by Viswanath L on Jan 10 2011 05:57 AM

      I don't know if you are already aware of this, but here is the link to SMMQT: http://software-dl.ti.com/dsps/dsps_registered_sw/sdo_sb/targetcontent/MQT/index.html

      It does use DSP/BIOS, though, but the source code might answer many of your questions.

    • Posted by Shreyas Prasad on Jan 13 2011 17:02 PM

      AC,

      Is there a particular reason that you would like to stay away from BIOS for your multi-core application?  If not, TI offers a product called 'IPC' that facilitates developing multicore BIOS applications on devices including C6472.  Low-level hardware operations like interacting with Atomic Access Monitors on C6472 are abstracted away by IPC modules such as GateMP, which is used for protection of shared resources including shared memory.  Other functionality on offer includes inter-processor notifications, multicore heaps and data structures.

      Regarding your questions,  I'll have to get back to you regarding questions #1 & 2.

      Regarding question #3, yes you are right.  The user only interacts with the atomic access monitor.  When using IPC, the user interacts with the GateMP module, which itself interacts with the AAMs at a lower level.

      Regarding question #4, the location of the shared memory itself determines the mapping to hardware.

      Regarding question #5, no--operations on volatile variables aren't automatically made atomic between multiple processors.  You would have to protect these operations using atomic access monitors (or GateMP if you are using IPC).

      Regarding question #6--IPC ships with a couple multi-core applications that can be built for C6472.  However, these applications do use BIOS.

      Regards,

      Shreyas

    • Posted by Viswanath L on Jan 14 2011 01:49 AM

      Hello Shreyas,

      Do you have any particular reason to recommend IPC over SMMQT? SMMQT also uses the AAMs for intercore arbitration, and provides simplified API calls via the BIOS MSGQ module.

      I would be greatly interested to know about any differentiation between the two choices, as I have already made some progress with SMMQT.

      Regards,

      Viswa.

    • Posted by AC53351 on Jan 14 2011 02:54 AM

      Thanks a lot, Viswanath L and Shreyas Prasad, for your prompt and precise replies. I am going through the SMMQT examples to decipher them. As Shreyas has asked, the reasons I am interested in building the application without DSP/BIOS are:

      1. I want to learn and observe what is going on at the register level. I know that sometimes sounds a bit impractical and time-inefficient, but I want to do it nonetheless. I previously stumbled in the same way when firing an ISR on a single core without using DSP/BIOS, but later I managed to do that by mixing some C and assembly code. I am also able to generate an interprocessor interrupt from one core and service it from another core without BIOS, using the IPCGR registers and the corresponding event numbers. So now I am targeting shared memory access.

      2. I want the whole code to be visible to me (as much as possible). So I am trying to avoid API-based abstractions as much as possible.

      3. I want to avoid any overhead in the application code. DSP/BIOS is said to come with little overhead, but if the application is manageable without BIOS, I would prefer that.

      Thanks once again for the replies. Regards,

      AC.

    • Posted by Shreyas Prasad on Jan 14 2011 11:42 AM

      Viswanath, the main difference between IPC and SMMQT is that SMMQT works with BIOS 5.x and IPC works with BIOS 6.x.  Also, SMMQT has limited functionality and device support compared to IPC.  It does offer message passing via MSGQ and it ships with code to use Atomic Access Monitors on C6472.  IPC is also more portable between multiple devices since hardware details (i.e. atomic access monitors, hardware semaphores, etc) are abstracted away in top-level modules.

      AC, I understand your motivations to avoid BIOS and operate at the register level. FYI (regarding point #2), BIOS6 and IPC are both open source and ship with the source code as well.

    • Posted by AC53351 on Jan 18 2011 05:07 AM

      Thanks Shreyas for the information.

      1. Can you please tell me where the open source libraries for BIOS or IPC can be found?

      2. What is the concept of memory bank? Is it there inside the Shared  Memory Controller only(seems like that from the indication of SMC boundary in figure 2 of SPRUEG5C)? Or the whole shared L2 RAM (768KB) is divided into 4 physically separate address spaces called banks in case of C6472(seems like that from sect. 4.2 of SPRUEG5C which says: "SMC divides SL2 RAMs address space into 4 physical pages.")? What is the meaning of "256 bits wide memory bank" mentioned in the SMC controller user guide(figure 2, SPRUEG5C)?

      3. If the memory banks are really 4 segments of the physical memory, then

          (i) What are the address boundaries? Are the banks (bank-0, bank-1, bank-2, bank-3) equally spaced within the total 0xBFFFF locations of the SL2 RAM of C6472, or is it something else?

          (ii) Can four different cores read/write different SL2 RAM locations in/through four different banks at the same time(assuming there is no previous request pending)? Or the arbitration logic will come into picture to resolve the conflict and sequentially arrange the requests and give the cores a feeling that the read/write are done simultaneously? Assuming the accesses are not made atomic because they are accessing different locations.

       

      Regards,

      AC.

    • Posted by Shreyas Prasad on Jan 18 2011 10:18 AM

      You can download SYS/BIOS at http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/sysbios/index.html and IPC at http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/ipc/index.html.  Note that you will also need XDCTools since both BIOS and IPC depend on this product.  XDCTools can be downloaded at http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/rtsc/index.html.

      All 3 products contain both built libraries and source code.

      I'm not familiar enough with the memory architecture of C6472 to answer your remaining questions.  I will forward this post to someone more knowledgeable about this topic.

      Regards,

      Shreyas

       

    • Posted by Shreyas Prasad on Jan 21 2011 13:25 PM

      I forwarded your remaining questions and obtained the following details regarding the C6472 SMC:

      C6472 SMC controls 4 banks of memory.  Each bank is 256 bits wide and the banks are interleaved as follows:
      - bank 0: base address + 0:31, 128:159, ...
      - bank 1: base address + 32:63, 160:191, ...
      - bank 2: base address + 64:95, 192:223, ...
      - bank 3: base address + 96:127, 224:255, ...
       
      SMC allows 4 concurrent accesses to 4 different banks.  If there is a bank conflict, it will select one access and let the others wait.


      Regards,
      Shreyas
    • Posted by Estian Malan on May 17 2011 05:45 AM

      Hi AC.

      I have EXACTLY the same issues as you with the understanding of the SMC. I have been studying the SMC (SPRUEG5C) as well as the CSL user's guide with slow progress towards understanding how to use the SMC and ensure atomicity. There is clearly also a shortage in proper simple example projects that work 'out-of-the-box'.

      My reason for avoiding SYS/BIOS is that my multi-core application uses the SRIO peripheral in DirectIO mode, and this mode is not supported under SYS/BIOS. Only message-passing mode is supported under SYS/BIOS. There are MANY other perfectly legitimate and sensible reasons to avoid an OS. Generally you can get much better optimization and performance out of an application by programming it bare-board (i.e. NO SYS/BIOS), especially if that application involves the execution of repetitive single tasks, regardless of whether it is multi-core or not. As soon as your application becomes more multi-task oriented, it is strongly advised to move to an OS of some sort (like SYS/BIOS).

      I have a bare-board (i.e. NO SYS/BIOS) multi-core application in which all the cores try to access (i.e. read AND write) a common integer variable in SL2 RAM. This variable is used as a semaphore between the cores to arbitrate access to other resources, and simply has the value 1 for a 'busy' condition and 0 for 'non-busy'. Its purpose is to indicate to the cores that certain resources are blocked from being accessed because another core is busy working on them. However, if atomic access to this variable is not guaranteed, it could obviously happen that two cores simultaneously read the value as 0 (non-busy), then simultaneously set the variable to 1 (busy), and then simultaneously access the other resources, thus defeating the purpose of exclusive access to the resources. According to the specifications of the SL2 controller, it supports atomic access monitoring, but I have had no success in understanding how this is accomplished.

       

      My question is the following:

      * Is the atomic access to SL2 RAM supposed to be transparent to the end-user, or does it have to be controlled manually?

      In my opinion it should be transparent.

       

      I have been successful in developing this code for the C6474 by using the on-chip SEMAPHORE module, but now I want to migrate the code to the C6472 for performance comparison purposes.

      Have you had any success in the meantime? Do you have any advice on how the (seemingly simple) task mentioned above can be accomplished/guaranteed? Maybe a different approach? Your help would be greatly appreciated.

       

      Regards.

      Estian.

       

       

    • Posted by one and zero on May 17 2011 06:24 AM

      Hi,

      atomic access to the SL2 RAM is supported via the instruction set of the device. You'll find an explanation and examples in the TMS320C64x/C64x+ DSP CPU and Instruction Set Reference Guide (SPRU732), in the chapter "C64x+ CPU Atomic Operations".


      We offer 3 instructions that work together with shared L2 memory on the TMS320C6472:

      • LL — Load Linked Word from Memory

      • SL — Store Linked Word to Buffer

      • CMTL — Commit Store Linked Word to Memory Conditionally

      How does it work?

      The LL instruction reads a word from memory and prepares for a following SL instruction. As a side effect of the read, a link valid flag is set to true and the address is monitored. If any other process stores to that address, the link valid flag is cleared. The link valid flag is also cleared if the SL instruction is executed with a different address. The SL instruction buffers a word to be stored to memory by the CMTL instruction. It does not commit the change. Finally, the CMTL instruction reads the value of the link valid flag. If the link valid flag is true, the data buffered by the SL instruction is written to memory. If the commit fails, the update must be retried.

       

      I hope that helps.

       

      Kind regards,

      one and zero

       

      Please click the Verify Answer button on this post if it answers your question.

      You can also follow me on Twitter: http://twitter.com/oneandzeroTI

      Do you want to read interesting multicore articles? Check out our Multicore Mix

       

    • Posted by AC53351 on May 17 2011 15:21 PM

      Hi,

      As per my understanding (developed by reading and by the discussion in this thread), the user 'talks' to the atomic access monitor and the atomic access monitor controls the arbitration logic hardware. The user cannot directly control the arbitration logic hardware without using the atomic access monitor. In my application more than one core was trying to read a location simultaneously, but none was attempting a write, so I could bypass the requirement of atomicity. I used the shared memory location as a simple memory-mapped register and that worked. Unless told otherwise, the code generation tools do not generate LL, SL and CMTL instructions (in my case I did not want atomicity, so the SL2 accesses used simple load and store instructions). The post by Shreyas Prasad also helped me understand the interleaved memory structure, which is helpful for a VLIW architecture. For a large chunk of data (~2000) I measured the time required for different cores to write this chunk to different non-intersecting regions in SL2. The overall time is almost the same when 1, 2, 3 or 4 cores write simultaneously. For more cores the time increases. This reconfirms the 4-bank structure.

      But as per my understanding, echoing what one and zero said, if DSP/BIOS is to be avoided then the only way to ensure atomicity is to use LL, SL and CMTL. The next question is whether to embed these assembly instructions inside the C code using 'asm'. I was advised in the forum not to do so, because doing that requires other registers as well, and I don't know whether the compiled C code is already using those registers. That raises the question of pushing and popping registers on the stack, or else of knowing how the code generation tools handle the registers, and things get complicated. So I think writing functions with these assembly instructions and calling them from C code to maintain atomicity will be a better option. In that case there are also some restrictions on register usage (they can be found in the 'optimizing compiler' document), but I think (I have not tried it) that will be the simpler way.

      Regards,

      AC.

    • Posted by one and zero on May 18 2011 06:40 AM

      Hi AC,

      you're absolutely right. You shouldn't use asm() in your C code. What you can do is copy the examples into an .asm file. You can then call the assembler functions from C.

      For more info on how to mix C and assembly, please have a look at TMS320C6000 Optimizing Compiler (SPRU187), Chapter 7.5.

      Kind regards,

      one and zero

       

    • Posted by Estian Malan on May 19 2011 01:41 AM

      Hi One and Zero / AC,

      Thank you for your prompt replies and advice!

      Wow! I really can't believe that things can be so complicated for something so simple. Atomicity is certainly much simpler with a hardware semaphore like the C6474...

      Have the TI developers considered developing a CSL API that would make this process a little simpler, and perhaps include in a new release of the CSL?

       

      Ok, so before I spend a few weeks attempting to get this working the way you suggest, I would like to know:

      1. Is there perhaps another way of developing a semaphore that guarantees atomicity towards common resources for the C6472? I'll have you know that I have studied all the available documents on the TI website that address this, but to me, none of them seem as sound and reliable as a hardware solution.
      2. Which 'examples' are you referring to before, and where exactly can I get hold of them?

      Regards.

      Estian.

       

    • Posted by Estian Malan on May 26 2011 02:33 AM

      Hi.

       

      Ok, so based on the examples in the documentation you suggested, I have attempted a very simple approach and it seems I am stumbling at the very first hurdle. Here is the simple C code for my main routine:

       

      #include <stdio.h>

      extern void asmfunc(void);   /* implemented in asmfunc.asm */

      void main(void)
      {
          asmfunc();
      }

       

      And here is the simple assembly function (.asm file) I created, and simply included into my CCS 4.0 project:

       

      .global _asmfunc

      _asmfunc:

      NOP 4

       

      I am getting the following assembler error:

       

      "../asmfunc.asm", ERROR!   at line 1: [E0002] Illegal mnemonic specified

      .global _asmfunc


       

      What am I doing wrong? I am also unsure as to which Build Options settings I have to fiddle with in CCS 4.0. There are so many parameters, it's making my head spin... :)

       

      Please help!

      Estian.

       

       

       

    • Posted by one and zero on May 26 2011 07:19 AM

      Hi Estian,

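      please try it like this, with the .global directive indented. The assembler treats anything that starts in column 1 as a label, so directives and instructions must not start in the first column:

           .global _asmfunc

      _asmfunc:
           NOP 4
           B B3          ; return to the caller (the return address is in B3)
           NOP 5         ; fill the five branch delay slots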

       

      Kind regards,

      one and zero

       
