c6474 free() problems

r_robotics


Hello,

I have been experiencing some odd behavior related to malloc() and free(). Basically, if my program crashes (thus freezing the DSP), the free() function spinlocks when I reload a program (different or same program). The only workaround I've found is to physically reset the board, or perform a hard power cycle. Sometimes, only a hard power cycle will fix the issue.

So my question is, is there a way to perform some sort of a soft reset, in code, that will re-initialize the hardware so that a hard power cycle is not necessary to resume normal function?

I'm using the Spectrum Digital TMS320C6474 evaluation board, CCSv4.2, and BIOSv5.41

Thanks in advance!

over 14 years ago

0 RandyP over 14 years ago

TI__Guru* 84110 points

A soft reset happens when you branch to _c_int00 which is the entry point for your program. This will not help your problem.

Your GEL script most likely does a GEL_Reset() before loading your program. This is almost a hard reset and usually fixes things like this.

If a board-level hard reset is required to get out of this problem, it is deeper than just the free() function.

Can you give us any hints about what the program does? Do you make strong use of the hardware semaphores? Do you change cache settings or other hardware features?

Regards,
RandyP


0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

Hi Randy,

Right now I'm running a modified version of the "client" demo found in the NDK. It basically functions as a data echo server: it receives up to 6MiB of data, stores the data, and then returns the data to the sender. I first noticed the issue when I had a simple buffer overrun, which obviously caused some interesting behavior. I have noticed the issue several other times for unknown reasons. There appear to be no memory leaks or any more buffer overruns.
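As an illustration of guarding against the kind of buffer overrun mentioned above, here is a minimal sketch of a bounds-checked append into a receive buffer. It is plain standard C, not code from the NDK demo; the `EchoBuffer` type and `echo_buffer_append()` name are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical receive-buffer descriptor; the names are illustrative only. */
typedef struct {
    unsigned char *data;  /* storage, e.g. obtained from malloc() */
    size_t capacity;      /* total bytes available in data */
    size_t used;          /* bytes stored so far (always <= capacity) */
} EchoBuffer;

/* Append len bytes from src.
 * Returns 0 on success, or -1 if the arguments are invalid or the chunk
 * would not fit; in the failure case nothing is written. */
int echo_buffer_append(EchoBuffer *b, const void *src, size_t len)
{
    if (b == NULL || b->data == NULL || src == NULL)
        return -1;
    /* Compare against the remaining space so that used + len cannot overflow. */
    if (len > b->capacity - b->used)
        return -1;
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}
```

Each received chunk would go through a check like this before being stored, so an oversized transfer is rejected instead of writing past the end of the buffer.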

I basically created a new function based on the supplied "dtask_udp_echo()" function found in <CCS3.3_INSTALL_DIR>/ndk_2_0_0/packages/ti/ndk/example/tools/common/servers

This demo, as I'm sure you know, sets up the EMAC and changes the L2RAM base to "point to the shared RAM" (which I take to mean the external 128MiB of DDR2). I've changed the default configuration to allow a heap to be created in DDR2 (the L2RAM length was obviously not enough to store 6MiB) and set the Memory Section Manager to designate DDR2 as the segment for malloc() and free() operations. Lastly, to force a Gigabit link, I commented out lines 475, 480, 481, and 482 in ethdriver.c located in <CCS3.3_INSTALL_DIR>\ndk_2_0_0\packages\ti\ndk\src\hal\evm6474\eth_c6474\

Thanks for the help!

EDIT: On a possibly related note, I am unable to malloc() more than about 3MiB after the first malloc() call (regardless of size). If the first malloc() requests it, I can get up to about 8MiB (8MiB comes from 16MiB of the 128MiB of total DDR2 allocated to this program, minus 8MiB taken by the L2RAM, whose base is set as the DDR2 base...as best I can tell). Just thought that might be of some use as well.
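One way to put numbers on the "largest block I can get" observation is to probe the heap directly. The sketch below uses only standard malloc() and free(); the function name `largest_malloc()` is hypothetical. It assumes that if a request of some size fails, every larger request would fail too, which is reasonable for a simple heap but is an assumption, not something confirmed for the DSP/BIOS allocator. Note that it calls free() after every trial, so it would not be usable while free() itself is hanging.

```c
#include <stddef.h>
#include <stdlib.h>

/* Returns the largest size in [0, limit] for which malloc() currently
 * succeeds, found by binary search. Each trial block is freed right away,
 * so the heap is left as it was found. */
size_t largest_malloc(size_t limit)
{
    size_t lo = 0;      /* largest size known to succeed (0 is trivially fine) */
    size_t hi = limit;  /* largest size still worth trying */

    while (lo < hi) {
        /* Round the midpoint up so the loop always makes progress. */
        size_t mid = lo + (hi - lo + 1) / 2;
        void *p = malloc(mid);
        if (p != NULL) {
            free(p);
            lo = mid;
        } else {
            hi = mid - 1;
        }
    }
    return lo;
}
```

Calling it once at startup and once more after the first real allocation would show how far the largest available block drops, which can then be compared against the heap length configured in the tcf.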

0 RandyP over 14 years ago in reply to r_robotics

TI__Guru* 84110 points

If what you need is help with the NDK, then we need to move this to the BIOS forum under Development Tools. If what you need is help with doing a reset, then I can give you a couple of suggestions.

But I doubt you really want a fix-the-symptom solution rather than a find-the-cause solution. A CCS reset or board-level POR may get control back, but something is wrong that needs to be solved.

r_robotics said:
On a possibly related note, I am unable to malloc() more than about 3MiB after the first malloc() call (regardless of size). If the first malloc() requests it, I can get up to about 8MiB (8MiB comes from 16MiB of the 128MiB of total DDR2 allocated to this program, minus 8MiB taken by the L2RAM, whose base is set as the DDR2 base...as best I can tell).

This is not completely clear, but sounds like a software limitation, by design or by function error. I am not sure what you mean exactly by "minus 8MiB taken by the L2RAM, whose base is set as the DDR2 base". Can you help me understand what this whole paragraph is saying, please?

Regards,
RandyP

0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

Sorry if I wasn't clear. I'm a little confused myself. Perhaps this does belong in the BIOS forum; sorry if it's in the wrong location. I seem to be experiencing similar issues when compiling programs without the NDK. So I'll let you decide where this should reside.

In any case, for this specific replicable problem, I'll continue to address the NDK example specifics.

With regards to your question, I believe that the configuration file is attempting to map the L2RAM to the DDR2 external RAM chip.

So, DDR2 = 16MiB total

L2RAM length defined as 8MiB

If the config file is mapping the L2RAM to DDR2, then that would leave 8MiB left over for a heap, etc.

Or I may have misinterpreted the configuration, and it is just changing the base of the L2RAM to map to a different section of L2RAM. If this is the case, then you can likely discard the last paragraph of my previous post, the one you had the question about.

The configuration file (*.tcf) that shipped with the demo contains these lines:

// Changed the base of L2RAM to point to the shared
// RAM of all 3 cores so as to enable this application
// to be run on multiple cores simultaneously.
bios.MEM.instance("L2RAM").base = 0x00800000;

0 RandyP over 14 years ago in reply to r_robotics

TI__Guru* 84110 points

You made a lot of edits before I had a chance to read this. I do believe we are both confused. Here is a blast of comments and questions:

If you have a failing example without the NDK, it will be easier for me to help you. I am starting to suspect that the NDK is not the source of the problem, so if you can switch to the non-NDK one, please do.

It is a really bad idea to use the "L2RAM" label for anything other than L2RAM. It might be a one-line way to make a big change in your code mapping, but it is a guaranteed source of errors and is very confusing to the nice people on this side of your internet connection.

Your post above implies that the four tcf lines at the end of your post (shown in red) are part of what was shipped with the demo, but is that the case, or did you add this line and the 3 lines of comments? The Local L2 address range starting at 0x00800000 is definitely not shared RAM; in fact, that address range cannot be accessed at that address by any bus master other than the one core to which that L2RAM belongs. Each core has a unique L2RAM module at 0x00800000 that only that one core can access. The Global L2 address ranges of 0x10800000 (Core0), 0x11800000 (Core1), and 0x12800000 (Core2) are mirrored ranges for the same Local L2 memory modules that only one core can access using 0x00800000.

If you have the tcf line in red, then L2RAM only has 1MB available (you can assume I mean MiB when I say MB). If you are trying to access 8MB from the real L2RAM, that will fail.
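The Local-to-Global L2 relationship described above can be written down as a small address translation. This is a sketch built only from the numbers quoted in this thread (local base 0x00800000, global bases 0x10800000, 0x11800000, and 0x12800000, and 1MB of L2 per core); the function and macro names are hypothetical, and the device data manual remains the authority for the real memory map.

```c
#include <stdint.h>

#define L2_LOCAL_BASE   0x00800000u  /* local L2 base seen by each core */
#define L2_SIZE         0x00100000u  /* 1 MiB, the figure quoted in this thread */
#define L2_GLOBAL_BASE  0x10800000u  /* Core0 global alias of its L2 */
#define L2_CORE_STRIDE  0x01000000u  /* spacing between the per-core global bases */
#define NUM_CORES       3u

/* Translate a core-local L2 address into the global alias for that core.
 * Returns 0 if the core number or the address is out of range. */
uint32_t l2_local_to_global(uint32_t local, uint32_t core)
{
    if (core >= NUM_CORES)
        return 0;
    if (local < L2_LOCAL_BASE || local - L2_LOCAL_BASE >= L2_SIZE)
        return 0;
    return L2_GLOBAL_BASE + core * L2_CORE_STRIDE + (local - L2_LOCAL_BASE);
}
```

For example, local address 0x00800000 on Core1 maps to 0x11800000, which is the address another bus master would have to use to reach that core's L2.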

The best solution may depend on what you are really trying to do. If you want to move things around in your implementation of the NDK, and it quits working when you do that, then you need to back off from that change and find another way to do what you want. You may need to have 3 different tcf files for the 3 different cores, and use the Global L2 addresses. You will need to stay aware of how the chip will handle bootloading later on, but while you are loading the programs from CCS, the right entry points will be set automatically.

What are you really trying to do here?

Regards,
RandyP

0 RandyP over 14 years ago in reply to RandyP

TI__Guru* 84110 points

Although you have not said it, I will assume you are using the EVMC6474. Since this is a new project you are working on, I strongly encourage you to install and use CCSv4. You can use a free license with the EVM when you use the on-board emulation, just by connecting a USB cable to the board, probably the same way you are doing it with CCS 3.3.

CCSv4 is easier to use, in my opinion, with any of our multi-core devices. And it is more stable. My CCS 3.3 would crash regularly whenever I had more than 3 cores connected at a time. But CCSv4 allows you to work with all the cores from a single CCS instance.

In the Training section of TI.com, there is a training video set for the C6474. It may be helpful for you to review all of the modules. You can find the complete video set at http://focus.ti.com/docs/training/catalog/events/event.jhtml?sku=OLT110002 .

In the TI Wiki Pages, we have information from our latest DSP training class that can help you understand how to use CCSv4. From the search box at the top of the Wiki, search for C6000 Workshop or go directly to C6000_Embedded_Design_Workshop_Using_BIOS. It only deals with a single-core DSP, but it teaches you how to use CCSv4 on a C64x+ DSP which is the same core as in the C6474. Another good idea is to search on CCSv4 GSG to find some Getting Started Guides. One of the GSGs that I keep a bookmark for is GSG:Common_target_configurations; I search for 6474 on that page to remember how to configure CCSv4 to use the on-board emulator.

Regards,
RandyP

0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

Hi Randy,

In an attempted direct response to your questions/comments:

- I will see if I can reproduce the error without the NDK. If I cannot reliably reproduce the error, maybe this thread should be moved.

- I completely agree the L2RAM label should only be used as per your description. More on this below.

- You are correct, the demo shipped with these exact lines of text, comments and all.

- Makes sense.

- What I'm trying to do is start with a completely stock version of the NDK example ("client" demo) that shipped with the EVMC6474 software and build a prototype data echo server. The only changes I have made are the ones I already listed in my previous posts.

========

RandyP said:

Although you have not said it, I will assume you are using the EVMC6474. Since this is a new project you are working on, I strongly encourage you to install and use CCSv4. You can use a free license with the EVM when you use the on-board emulation, just by connecting a USB cable to the board, probably the same way you are doing it with CCS 3.3.

Actually, I clearly spelled out that I am using that board, as well as CCSv4.2; see the first post. I have simply imported the CCSv3.3 project into CCSv4.2.

Regards,

0 r_robotics over 14 years ago in reply to r_robotics

Expert 1290 points

EDIT: Removed, irrelevant

0 RandyP over 14 years ago in reply to r_robotics

TI__Guru* 84110 points

Looks like I need to change the nice adjective I used to describe myself earlier. One good thing is that I understand more about your situation from these last two posts. The bad is of course that my tact has failed me and I apologize. You can decide if it is good or bad that I prefer we continue this thread here until we figure out that it is unique to the NDK.

In your second post, you listed the following, and your later posts have cleared up some of my unknowns:

r_robotics said:
I basically created a new function based on the supplied "dtask_udp_echo()" function found in <CCS3.3_INSTALL_DIR>/ndk_2_0_0/packages/ti/ndk/example/tools/common/servers

This demo, as I'm sure you know, sets up the EMAC and changes the L2RAM base to "point to the shared RAM" (which I take to mean the external 128MiB of DDR2). I've changed the default configuration to allow a heap to be created in DDR2 (the L2RAM length was obviously not enough to store 6MiB) and set the Memory Section Manager to designate DDR2 as the segment for malloc() and free() operations. Lastly, to force a Gigabit link, I commented out lines 475, 480, 481, and 482 in ethdriver.c located in <CCS3.3_INSTALL_DIR>\ndk_2_0_0\packages\ti\ndk\src\hal\evm6474\eth_c6474\

If you took the initial copy of the dtask_udp_echo function and only did mallocs as large as the default configuration would allow (how big was that?), would it work without the lockups?

There are a lot of things that could be the problem, even outside NDK-specific issues, and since there seems to be a small set of changes, can you narrow down your changes to figure out which one causes it to break? For starters, some guesses for the list of things that could cause failures would include a limitation on the size of .sysmem used for the heap, cache coherency when the heap moves to DDR2, and hard-coded assumptions in the NDK that break.

If you stay with a smaller buffer in your malloc(), does it work without failures?
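One possible way to run the "smaller buffer versus larger buffer" experiment systematically is a size sweep outside the NDK code. The sketch below is plain standard C with hypothetical function names; it only checks that the CPU can allocate, write, read back, and free a block of each size, so it would not by itself expose cache coherency problems involving other bus masters such as the EMAC.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocate a buffer of the given size, fill it with a simple pattern,
 * read the pattern back, and free the buffer.
 * Returns 0 on success, -1 if malloc() failed, -2 on a readback mismatch. */
int check_buffer(size_t size)
{
    unsigned char *p = malloc(size);
    size_t i;
    int result = 0;

    if (p == NULL)
        return -1;
    for (i = 0; i < size; i++)
        p[i] = (unsigned char)(i * 31u + 7u);
    for (i = 0; i < size; i++) {
        if (p[i] != (unsigned char)(i * 31u + 7u)) {
            result = -2;
            break;
        }
    }
    free(p);
    return result;
}

/* Double the buffer size from start up to max.
 * Returns the first size that fails, or 0 if every size passed. */
size_t first_failing_size(size_t start, size_t max)
{
    size_t size;

    /* The size != 0 test stops the loop if start is 0 or doubling wraps around. */
    for (size = start; size != 0 && size <= max; size *= 2) {
        if (check_buffer(size) != 0)
            return size;
    }
    return 0;
}
```

Running `first_failing_size()` with the original and the modified tcf would show whether a particular size threshold separates the working case from the failing one.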

Could you put the following into a zip file and attach it to a reply, please? When you are in the reply edit window, above the previous post click on Options and then on Add for a file attachment.

- original and modified tcf files
- .map file from your output folder (usually Debug or Release). Include the original if you have it.

Right now, are you only running this on core0?

Regards,
RandyP

0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

No worries. I believe I fell victim to the same, I apologize as well :)

RandyP said:
You can decide if it is good or bad that I prefer we continue this thread here until we figure out that it is unique to the NDK.

Sounds good to me.

RandyP said:
If you took the initial copy of the dtask_udp_echo function and only did mallocs as large as the default configuration would allow (how big was that?), would it work without the lockups?

I'll see if I can narrow it down, and report my findings

RandyP said:
If you stay with a smaller buffer in your malloc(), does it work without failures?

Yes, it sure does.

RandyP said:
Right now, are you only running this on core0?

Yes sir, that's correct.

0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

RandyP said:
If you took the initial copy of the dtask_udp_echo function and only did mallocs as large as the default configuration would allow (how big was that?), would it work without the lockups?

Modified TCF: 134217448 bytes

Original TCF: 101880 bytes

Both worked without lockups. I have attached the original and modified TCF files.

Thanks Randy!

tcf.zip

0 RandyP over 14 years ago in reply to r_robotics

TI__Guru* 84110 points

What is the next step?

It looks like you have good results. Your tcf shows a large 128MiB heap in DDR space and you are able to use that for your buffer.

Regards,
RandyP

0 r_robotics over 14 years ago in reply to RandyP

Expert 1290 points

Well, since we have determined it's probably not a hardware issue (more likely related to the NDK or my own code), I suppose we should move this to another forum. However, I would prefer to investigate further myself before asking anyone else to look into the issue, thereby minimizing the chance that someone spends their time trying to fix what turns out to be my own coding mistake. If it's more appropriate to go ahead and move this thread, then by all means do so.

Randy, I appreciate your involvement, you've been very friendly and helpful with regards to isolating the issue and pointing me in the right direction.

Cheers,

0 le oanh over 13 years ago in reply to r_robotics

Prodigy 100 points

I am working with the SRIO examples for the TMS320C6474. I created a .tcf file using New->Configure BIOS/DSP. After I add the .tcf file and build the project, I have problems with RTDX_read.

Can someone help me make a .tcf for the TMS320C6474?

Thanks
