BCACHE API

Other Parts Discussed in Thread: OMAP-L137

I'm using an OMAP-L137 on a custom board. I am using PROC, POOL, and NOTIFY components of DSPLink. I'm just now debugging the use of fake data. That is, some of the developers do not have access to the board that provides input data to the DSP. For those people, the app creates a large pool area, notifies the DSP of the pointer, and then populates the test data buffer.

The DSP code gets the pointer, and the data is good. I run the "easy" algorithms on the test data with no problem. When I get to the part of the algorithm that uses more of the test data buffer (140000+ bytes of data), something bad happens. I've been debugging this in CCS using the Kernel/Object Viewer, breakpoints, etc. The "real" code (the part of the code that receives data from the SPI device) works fine. When I switch to the Fake Data buffer as the source for the data, it doesn't work. When the bad thing happens, the "TSK" entry in the Kernel/Object Viewer changes from holding 4 tasks to holding none. The code in the debugger continues to step through the loop as expected, but eventually I find the code in the idle loop, never returning to the other tasks.

I have several tasks running and I'm trying to figure out which one is the troublemaker. The stacks are fine and nowhere close to overflowing, so that isn't it. I would like to review the functionality of the BCACHE calls that I'm using because I'm afraid that I might be abusing them. None of the links I've found for SPRU403 work; the file seems to be gone. I've had a similar problem in the past with other documents. Can anyone point me to a good copy of the file? Or perhaps attach it someplace on the forum?

  • BIOS bundles its documentation in a "docs" folder that gets installed along with the actual libraries.  Alternatively, here's the "home page" for most of our software:

    http://software-dl.ti.com/dsps/dsps_public_sw/sdo_sb/targetcontent/index.html

    Go to DSP/BIOS -> DSP/BIOS 5.x -> DSP/BIOS <latest version> -> Release notes.  Then, under "Documentation", you'll find links to all the docs.

    Also, here's a wiki page with tips on finding issues that might result in software instability:

    http://wiki.davincidsp.com/index.php?title=DSP_BIOS_Debugging_Tips

    In terms of the BCACHE APIs, you should note that cache operations always occur on entire LINES of data.  The L2 cache line size is 128 bytes.  If you have any buffers/structures/data on which you are performing cache operations you need to be sure that the data is aligned to a 128 byte boundary and that it is a multiple of 128 bytes.  In other words, you need to make sure there is no other data "sharing" the cache line.
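
The alignment and size rule above can be checked with a few small helpers. This is only a sketch: the names `L2_LINE_SIZE`, `cache_round_up`, `cache_is_aligned`, and `cache_owns_lines` are hypothetical and not part of DSP/BIOS. They just compute whether a buffer starts on a 128-byte boundary and covers a whole number of lines.

```c
#include <stddef.h>
#include <stdint.h>

/* L2 cache line size quoted in the reply above (hypothetical macro name). */
#define L2_LINE_SIZE 128u

/* Round a byte count up to a whole number of cache lines. */
static size_t cache_round_up(size_t nbytes)
{
    return (nbytes + L2_LINE_SIZE - 1u) & ~(size_t)(L2_LINE_SIZE - 1u);
}

/* Nonzero if ptr starts on a cache-line boundary. */
static int cache_is_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & (L2_LINE_SIZE - 1u)) == 0;
}

/* Nonzero if the buffer starts on a line boundary and spans whole lines,
 * so no unrelated data can share a cache line with it. */
static int cache_owns_lines(const void *ptr, size_t nbytes)
{
    return cache_is_aligned(ptr) && (nbytes % L2_LINE_SIZE) == 0;
}
```

For example, a 140000-byte test buffer would have to be allocated as `cache_round_up(140000)`, which is 140032 bytes, and start on a 128-byte boundary before cache operations on it are safe from side effects on neighboring data.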

  • Flamingo,

    Here are some hints that might help you (or not :) ):

    1) I had problems with DSPLink when the DSPLink build I used for the DSP side differed from the one I used for the Linux side. By version I do not mean the version number; rather, if you rebuilt DSPLink on the Linux side, it can conflict with the DSP side. If you are not using CCS to build, and you are building everything in Linux, then this is probably not the issue. If you are using CCS to build and you have DSPLink on your Windows machine, then that could be the problem.

    2) Make sure that nothing on the Linux side conflicts with the DSP side: disable everything in the Linux kernel that could conflict with the DSP. If you let me know everything you are using on the DSP side, I can give you some hints.

    3) Some DSP programs do pin mux configuration. Make sure to track these writes and check that they do not conflict with pin mux configuration previously done by the ARM. You might need to check whether the DSP program uses an "OR" or an "=" when it sets the pin mux. For example, if your DSP program does:

    PINMUX_REG = 0x01010101; // this might overwrite what was done in the arm side and cause conflicts

    you can try changing it to:

    PINMUX_REG |= 0x01010101; // the OR makes sure that it does not overwrite.
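
One caveat with a plain OR is that it can only set bits, so it cannot replace a field that already holds a nonzero value. A masked read-modify-write handles that case. The sketch below assumes 4-bit pinmux fields, and `pinmux_set_field` is a hypothetical helper, not a TI API; check the field layout against the device documentation.

```c
#include <stdint.h>

/* Update one pinmux field without disturbing the others.
 * Assumption: each field is 4 bits wide and starts at bit 'shift'. */
static inline void pinmux_set_field(volatile uint32_t *reg,
                                    unsigned shift, uint32_t value)
{
    uint32_t mask = 0xFu << shift;   /* bits belonging to this field     */
    uint32_t v = *reg;               /* read the current configuration   */

    v &= ~mask;                      /* clear only the field we own      */
    v |= (value << shift) & mask;    /* insert the new field value       */
    *reg = v;                        /* write back, other fields intact  */
}
```

If the register holds 0x22222222 and the field at bit 8 should become 1, this gives 0x22222122, whereas `reg |= 0x00000100` would give 0x22222322.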

  • I'm sure that some dsp code makes pinmux changes. I've learned more since the pinmux post. Arnie at TI provides this information:

    "Just wanted to touch base regarding the use of the KICK registers lock/unlock capability.  There seems to be a fundamental race condition w/ the KICK registers and they can’t be safely used.  The approach that is being taken is to "unlock forever" and never touch them again.  Future revisions of the Si will disable the KICK lock-out functions thus writes to the KICK registers will be ignored.

    I just wanted to make sure it's clear that this is a silicon limitation not an issue/limitation of any of the OMAPL137 SDK software (i.e. DSPLink, DSP/BIOS, etc) that application developer need to be aware of."

    So, if you are writing software that changes the pinmux on the DSP side after DSPLink is initialized, you would want to avoid the use of the KICK registers. Or at least that's how I interpret Arnie's advice.

  • This issue is specifically with the unlock/lock feature of the KICK register.  Here is a scenario:

    Arm:
    *((unsigned int *)KICK0) = 0x83e70b13; // unlock
    *((unsigned int *)KICK1) = 0x95a4f1e0; // unlock
    //write to the config registers

    DSP:
    *((unsigned int *)KICK0) = 0x83e70b13; // unlock
    *((unsigned int *)KICK1) = 0x95a4f1e0; // unlock
    //write to the config registers


    - The ARM unlocks the KICK registers and starts writing to the cfg area.
    - The DSP starts unlocking the KICK registers. During the unlock sequence, the KICK registers are locked until both KICK register writes have happened.
    - The ARM's write to the cfg area fails without notice while the KICK registers remain locked.

    Solution:
    - The boot code unlocks these registers, and after that the application does not touch them at all.

    We will soon be adding this to our errata for the OMAPL137 device, which can be found at:
    http://www-s.ti.com/sc/techlit/sprz291
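
The "unlock at boot, then never touch again" approach described above might look like the sketch below. The unlock values are the ones quoted in this reply. The function name `kick_unlock_at_boot` and the idea of passing the register addresses in as pointers are assumptions made so the code does not depend on a particular header.

```c
#include <stdint.h>

/* Unlock values quoted in the scenario above. */
#define KICK0_UNLOCK 0x83e70b13u
#define KICK1_UNLOCK 0x95a4f1e0u

/* Hypothetical helper: call exactly once from boot code, before either
 * core starts writing to the cfg area. Nothing else should write to the
 * KICK registers afterwards, and no code should ever re-lock them. */
static void kick_unlock_at_boot(volatile uint32_t *kick0,
                                volatile uint32_t *kick1)
{
    *kick0 = KICK0_UNLOCK;   /* first half of the unlock sequence  */
    *kick1 = KICK1_UNLOCK;   /* second half; registers now unlocked */
}
```

Because only one core runs this, and only once, the interleaving in the scenario above cannot occur.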

  • Hi,

    "So, if you are writing software that changes the pinmux on the DSP side after DSPLink is initialized, you would want to avoid the use of the KICK registers. Or at least that's how I interpret Arnie's advice."

    Just one clarification: Not just after DSPLink is initialized, but even before that, you should never write to kick registers. This is because the kick registers are unlocked initially, and then are not touched at all by all of TI software. All of the TI software expects the kick registers to be unlocked (i.e. we don't follow the process of unlock-use-lock due to the race conditions mentioned by Arnie). In case your software uses the KICK registers even before DSPLink is initialized and leaves them in a locked state (by following the unlock-use-lock approach), DSPLink would be in trouble because its IPC interrupts would get lost. Similarly, any other software that writes to the CFG registers would run into the same issue. The IPC interrupts happen to be in the CFG registers, and hence DSPLink is impacted; however any other software that uses anything in the CFG registers would be similarly impacted.

    Regards,
    Mugdha

  • I vaguely remember that EDMA wants things on 128 byte boundaries also. As a result, everything that I was using for a source or destination for EDMA transfers was on a 128 byte boundary (unless it was a HW register, in which case it was what it was).

    In one case, I had a single 32 bit value that was to be written to the SPI transmit register every time I got a GPIO falling trigger event. This code was 128 byte aligned, but not cached by me. I don't know if there was any hidden caching going on, but everything worked fine, so I didn't explore it too much.

    When you mentioned multiples of 128 bytes also, I thought that I might want to put this single value in an array. I would still transfer only 4 bytes and it would still be non-incrementing for both the source and destination, so the effect would be minor (or so I thought). Instead, the single change of converting the data from 4 bytes to 128 barfed up the SPI timing completely.

    What seems to have happened is that the variable moved from internal memory to far memory when it became an array. I have 6 nanosecond far memory, so I didn't believe that moving the variable to far memory would be enough to cause the timing problem, so I converted it back to a 4-byte value, but added "far" in front. Again the SPI timing barfed. Please note that the receive SPI data is written to cached far memory. Even without making the 4-byte value into an array, the forced alignment of other arrays did "automatically" result in the 4-byte value having a 128 byte memory slot in the map when I put it in far memory. (In other words, although it isn't actually 128 bytes long, nothing appears in the map for the whole 128 bytes, so it isn't a cache corruption issue causing the SPI timing barf.)

    I'm still guessing that the move to far memory might somehow be causing cache thrashing, but I'm not sure how to verify that. I'm concerned because I can't add "near" in front of the 4 bytes and I'm worried that a later build might move this to far memory and break the SPI timing again.

    If anyone has any insights into the failure mode, please let me know. As I said, the code works as long as I leave it as a 4 byte value in internal memory, so this isn't an emergency. I'm just curious and a little "concerned".
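
One possible way to keep a later build from relocating the value is to pin it to a named section with the TI compiler's `DATA_SECTION` and `DATA_ALIGN` pragmas. This is only a sketch, not something confirmed in this thread. The variable name `spi_tx_word` and the section name `.spi_tx_word` are hypothetical, and the linker command file would still have to map that section into internal memory.

```c
#include <stdint.h>

/* The pragmas are TI-compiler specific, so they are guarded here.
 * Assumption: the linker command file places ".spi_tx_word" in
 * internal RAM; the pragma alone does not choose the memory. */
#ifdef __TI_COMPILER_VERSION__
#pragma DATA_SECTION(spi_tx_word, ".spi_tx_word")
#pragma DATA_ALIGN(spi_tx_word, 128)
#endif

/* Single 32-bit value used as the non-incrementing EDMA source. */
volatile uint32_t spi_tx_word = 0;
```

With the placement fixed in the linker command file, the map file can be checked after each build to confirm the address is still in internal memory.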