HeapMem_alloc loops forever!

Other Parts Discussed in Thread: AM3359, SYSBIOS

Hi all

I'm using the HeapMem implementation of the Heap in the following system configuration:

  • CCS 6.0.0.00190
  • SYS/BIOS 6.40.3.39
  • XDCtools 3.30.4.52
  • CPU: Sitara ARM AM3359

I need to open and close files dynamically, and I don't know the file sizes in advance.

So, I call Memory_alloc(heapMem0, size, 0, &eb) and Memory_free(heapMem0, (Ptr*)mblock, size) to manage the Heap when necessary. I know that the value of "size" is correct for each function call; heapMem0 is an instance name declared in the .cfg file as follows:

  • var HeapMem = xdc.useModule('ti.sysbios.heaps.HeapMem');
  • var heapMem0Params = new HeapMem.Params();
  • heapMem0Params.instance.name = "heapMem0";
  • heapMem0Params.size = 16777216;
  • heapMem0Params.sectionName = ".myHeapSect";
  • Program.global.heapMem0 = HeapMem.create(heapMem0Params);

At random times the task that performs the allocations hangs, looping forever inside the HeapMem_alloc() function. ROV appears to block as well, waiting for data forever. After a few minutes CCS stops responding entirely, and the only way out is to power off the device.

It seems that something in the heap's free list is corrupted, because the value of currHeader is the same as currHeader->next, so the loop keeps looking for a NULL that it will never find.

The situation when I pause the execution is:

I don't know if I'm making some mistakes using the HeapMem_alloc() or if there is some other kind of problem.

Do you kindly have any suggestions?

Thank you very much

Best Regards

Maurizio

  • Hi,

    Are you using the instrumented, non-instrumented or custom BIOS libType? If you are using custom, do you have asserts enabled (i.e. BIOS.assertsEnabled = true;)?

    Can you try the application built with instrumented or custom with assertsEnabled = true? We have checks in HeapMem_alloc/HeapMem_free that might help find the problem. It looks a little like you freed the same block twice.

    Generally how long does it take to reproduce the problem?

    Are you setting heapMem0 as your default heap?

    Does the address for the curHeader (0x8f011e00) in the expressions window you posted make sense? For example, is it in the heap's buf? (You can look at ROV before the program goes south.) What is the base address of the heap's buf?

    Todd
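
As a minimal sketch of why a double free could leave a free-list node pointing to itself (the symptom reported above), consider a simplified model of an address-ordered free list. This is not the SYS/BIOS source: `FreeHeader`, `toy_free` and `toy_list_terminates` are hypothetical names, and the real HeapMem_free also coalesces neighbouring blocks and, with asserts enabled, validates the address before inserting.

```c
#include <stddef.h>

/* Hypothetical free-block header; a simplified model, not the SYS/BIOS type. */
typedef struct FreeHeader {
    struct FreeHeader *next;
    size_t size;
} FreeHeader;

/* Insert blk into an address-ordered free list with no validity checks.
 * All blocks are assumed to come from the same array so that pointer
 * comparison is well defined. */
void toy_free(FreeHeader *head, FreeHeader *blk, size_t size)
{
    FreeHeader *cur = head;
    FreeHeader *nxt = head->next;

    while (nxt != NULL && nxt < blk) {
        cur = nxt;
        nxt = nxt->next;
    }
    blk->next = nxt;   /* on a double free nxt == blk, so blk points to itself */
    blk->size = size;
    cur->next = blk;
}

/* Walk the list for at most limit steps.
 * Returns 1 if the terminating NULL was reached within the limit, else 0. */
int toy_list_terminates(const FreeHeader *head, size_t limit)
{
    const FreeHeader *cur = head->next;

    while (limit-- > 0) {
        if (cur == NULL) {
            return 1;
        }
        cur = cur->next;
    }
    return 0;
}
```

In this model, freeing the same block twice makes `blk->next == blk`, so any unbounded walk that waits for NULL never ends. A bounded walk such as `toy_list_terminates` is one way to detect that state without hanging.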

  • Hi Todd.
    Thank you for your suggestions.

    This was my assert configuration:

    Defaults.common$.diags_ASSERT = Diags.ALWAYS_OFF;
    BIOS.libType = BIOS.LibType_Custom;
    BIOS.assertsEnabled = false;

    Following your suggestion, I changed cfg file as follows:

    Defaults.common$.diags_ASSERT = Diags.ALWAYS_ON;
    BIOS.libType = BIOS.LibType_Custom;
    BIOS.assertsEnabled = true;

    And when the failure happened, the console showed this log:

    ti.sysbios.heaps.HeapMem: line 371: assertion failure: A_invalidFree: Invalid free
    xdc.runtime.Error.raise: terminating execution

    The values of the variables are all right until the xdc_runtime_IHeap_free function:

    But the values are no longer valid once the code enters the ti_sysbios_heaps_HeapMem_free function:

    Size somehow changed from 12288 bytes (the correct value, inside xdc_runtime_IHeap_free) to 2401038880 bytes (inside ti_sysbios_heaps_HeapMem_free).

    The obj pointer inside ti_sysbios_heaps_HeapMem_free points to an unknown location.

    Any hints or workarounds?

     

    These are my allocation MAP and MMU configuration: 

    SEGMENT ALLOCATION MAP

    run origin  load origin   length   init length attrs members
    ----------  ----------- ---------- ----------- ----- -------
    80000000    80000000    000658b2   000658b2    r-x
      80000000    80000000    00000098   00000098    r-x .init
      80000098    80000098    0006581a   0006581a    r-x .text
    800658b4    800658b4    00050000   00000000    rw-
      800658b4    800658b4    00050000   00000000    rw- .stack
    800b58b8    800b58b8    000247aa   000247aa    r--
      800b58b8    800b58b8    000247aa   000247aa    r-- .const
    800da068    800da068    00001560   00000000    rw-
      800da068    800da068    00001560   00000000    rw- .data
    800db800    800db800    00000040   00000040    r--
      800db800    800db800    00000040   00000040    r-- .vecs
    800dc000    800dc000    00004000   00000000    rw-
      800dc000    800dc000    00004000   00000000    rw- ti.sysbios.family.arm.a8.mmuTableSection
    800e0000    800e0000    00000e28   00000e28    r--
      800e0000    800e0000    00000e28   00000e28    r-- .cinit
    81000000    81000000    00440828   00000000    rw-
      81000000    81000000    00440828   00000000    rw- .bss
    8f000000    8f000000    01000000   00000000    rw-
      8f000000    8f000000    01000000   00000000    rw- .myHeapSect
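
On the earlier question of whether 0x8f011e00 lies inside the heap's buffer: the map above places .myHeapSect at 0x8f000000 with length 0x01000000. Assuming the heap buffer occupies that whole section (an assumption; ROV would show the actual base), the check can be sketched as follows. `addr_in_region` is a hypothetical helper, not part of any TI API.

```c
#include <stdint.h>

/* Return 1 if addr lies in [base, base + length), else 0.
 * Written as a subtraction so that base + length cannot overflow 32 bits. */
int addr_in_region(uint32_t addr, uint32_t base, uint32_t length)
{
    return addr >= base && (addr - base) < length;
}
```

For example, `addr_in_region(0x8f011e00u, 0x8f000000u, 0x01000000u)` evaluates to 1, so under that assumption the address is at least plausible for a heap header.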

    MMU configuration

    SYS_MMU_ENTRY applMmuEntries[] = {
        {(void*)0x09000000,0},  //FPGA - Non bufferable| Non Cacheable
        {(void*)0x48300000,0},  //PWM - Non bufferable| Non Cacheable
        {(void*)0x48200000,0},  //INTCPS,MPUSS - Non bufferable| Non Cacheable
        {(void*)0x48100000,0},  //I2C2,McSPI1,UART3,UART4,UART5, GPIO2,GPIO3,MMC1 - Non bufferable| Non Cacheable
        {(void*)0x48000000,0},  //UART1,UART2,I2C1,McSPI0,McASP0 CFG,McASP1 CFG,DMTIMER,GPIO1 -Non bufferable| Non Cacheable
        {(void*)0x44E00000,0},  //Clock Module, PRM, GPIO0, UART0, I2C0, - Non bufferable| Non Cacheable
        {(void*)0x4A300000,0},  //PRUSS1 - Non bufferable| Non Cacheable
        {(void*)0x49000000,0},  //EDMA3 - Non bufferable| Non Cacheable
        {(void*)0x49800000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x49900000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x49a00000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x4A100000,0},  //CPSW - Non bufferable| Non Cacheable
        {(void*)0xFFFFFFFF,0xFFFFFFFF}
    };

    My default heap is not heapMem0. When I want to use heapMem0 I call Memory_alloc() in this way: Memory_alloc(heapMem0, sizeof(RowStruct), 0, &eb).

    For your convenience, this is the cfg file:

    *********************** CFG ***************************************

    var Defaults = xdc.useModule('xdc.runtime.Defaults');
    var Diags = xdc.useModule('xdc.runtime.Diags');
    var Error = xdc.useModule('xdc.runtime.Error');
    var Main = xdc.useModule('xdc.runtime.Main');
    var Memory = xdc.useModule('xdc.runtime.Memory');
    var SysMin = xdc.useModule('xdc.runtime.SysMin');
    var System = xdc.useModule('xdc.runtime.System');
    var Text = xdc.useModule('xdc.runtime.Text');
    var Clock = xdc.useModule('ti.sysbios.knl.Clock');
    var Swi = xdc.useModule('ti.sysbios.knl.Swi');
    var Task = xdc.useModule('ti.sysbios.knl.Task');
    var Semaphore = xdc.useModule('ti.sysbios.knl.Semaphore');
    var BIOS = xdc.useModule('ti.sysbios.BIOS');
    var Hwi = xdc.useModule('ti.sysbios.hal.Hwi');
    var HeapMem = xdc.useModule('ti.sysbios.heaps.HeapMem');
    var MailBox = xdc.useModule('ti.sysbios.knl.Mailbox');
    var GateMutex = xdc.useModule('ti.sysbios.gates.GateHwi');
    var Log = xdc.useModule('xdc.runtime.Log');
    var Timestamp = xdc.useModule('xdc.runtime.Timestamp');
    var LoggingSetup = xdc.useModule('ti.uia.sysbios.LoggingSetup');
    var Load = xdc.useModule('ti.sysbios.utils.Load');
    var LogSync = xdc.useModule('ti.uia.runtime.LogSync');
    var TimestampProvider = xdc.useModule('ti.sysbios.family.arm.a8.TimestampProvider');
    var UIABenchmark = xdc.useModule('ti.uia.events.UIABenchmark');
    var Idle = xdc.useModule('ti.sysbios.knl.Idle');
    var Cache = xdc.useModule('ti.sysbios.hal.Cache');
    var ti_sysbios_family_arm_a8_Cache = xdc.useModule('ti.sysbios.family.arm.a8.Cache');

    Program.argSize = 0x0;
    Defaults.common$.diags_ASSERT = Diags.ALWAYS_ON;
    System.maxAtexitHandlers = 4;      
    BIOS.heapSize = 0x20000;
    Program.stack = 0x50000;
    SysMin.bufSize = 0x200;
    System.SupportProxy = SysMin;
    Main.common$.diags_INFO = Diags.ALWAYS_ON;
    Program.sectionsExclude = "^\\.bss|^\\.neardata|^\\.rodata";

    BIOS.libType = BIOS.LibType_Custom;
    BIOS.assertsEnabled = true;
    Load.hwiEnabled = true;
    Load.swiEnabled = true;
    Main.common$.diags_INFO = Diags.ALWAYS_ON;
    LoggingSetup.sysbiosSwiLogging = true;
    LoggingSetup.sysbiosHwiLogging = true;
    LoggingSetup.sysbiosHwiLoggingRuntimeControl = false;
    LoggingSetup.sysbiosSwiLoggingRuntimeControl = false;
    LoggingSetup.sysbiosTaskLoggingRuntimeControl = false;
    LoggingSetup.loadLoggingRuntimeControl = false;
    LoggingSetup.sysbiosLoggerSize = 32768;
    LoggingSetup.mainLoggerSize = 4096;
    LoggingSetup.mainLoggingRuntimeControl = true;
    environment['xdc.cfg.check.fatal'] = 'false';
    LoggingSetup.enableTaskProfiler = true;
    LoggingSetup.enableContextAwareFunctionProfiler = true;
    LoggingSetup.countingAndGraphingLogging = true;
    LoggingSetup.benchmarkLogging = true;
    LoggingSetup.snapshotLogging = true;
    LoggingSetup.sysbiosSemaphoreLogging = true;
    LoggingSetup.loadTaskLogging = true;
    LoggingSetup.loadSwiLogging = true;
    LoggingSetup.loadHwiLogging = true;

    Idle.idleFxns[0] = "&mmc_file_system_handler";
    Idle.idleFxns[1] = "&console_handler";
    Idle.idleFxns[2] = null;
    Idle.idleFxns[3] = null;

    var heapMem0Params = new HeapMem.Params();
    heapMem0Params.instance.name = "heapMem0";
    heapMem0Params.size = 16777216;
    heapMem0Params.sectionName = ".myHeapSect";
    Program.global.heapMem0 = HeapMem.create(heapMem0Params);
    Program.sectMap[".myHeapSect"] = new Program.SectionSpec();
    Program.sectMap[".myHeapSect"].loadSegment = "DDR2(HIGH)";
    Program.sectMap[".myHeapSect"].type = "NOINIT";

    *****************************************************************************

    Thank you very much

    Best Regards

    Maurizio

     

  • It looks like some corruption is going on. You can try setting a breakpoint on xdc_runtime_Error_raiseX__E. I'd look to see if any task stacks are over-written (you can check the stack peak in ROV->Task->Details). 

  • Hi Todd.
    I'm sorry for the delay, but in the last few days we had to move forward with the application.

    I tried what you suggested: I was not able to set a breakpoint on xdc_runtime_Error_raiseX_E, but I checked the stack peaks in ROV when the program asserted.

    They seem to be all right (the peaks are all well below the stack sizes):


    I also tried to look at the heap in ROV, but CCS looped forever trying to acquire the data!



    These are my allocation MAP and MMU configuration:

    SEGMENT ALLOCATION MAP

    run origin  load origin   length   init length attrs members
    ----------  ----------- ---------- ----------- ----- -------
    80000000    80000000    0006d4ea   0006d4ea    r-x
      80000000    80000000    00000098   00000098    r-x .init
      80000098    80000098    0006d452   0006d452    r-x .text
    8006d4ec    8006d4ec    00050000   00000000    rw-
      8006d4ec    8006d4ec    00050000   00000000    rw- .stack
    800bd4f0    800bd4f0    000247b4   000247b4    r--
      800bd4f0    800bd4f0    000247b4   000247b4    r-- .const
    800e1ca8    800e1ca8    00001560   00000000    rw-
      800e1ca8    800e1ca8    00001560   00000000    rw- .data
    800e3400    800e3400    00000040   00000040    r--
      800e3400    800e3400    00000040   00000040    r-- .vecs
    800e4000    800e4000    00004000   00000000    rw-
      800e4000    800e4000    00004000   00000000    rw- ti.sysbios.family.arm.a8.mmuTableSection
    800e8000    800e8000    00000e10   00000e10    r--
      800e8000    800e8000    00000e10   00000e10    r-- .cinit
    81000000    81000000    0044299a   00000000    rw-
      81000000    81000000    0044299a   00000000    rw- .bss
    8f000000    8f000000    01000000   00000000    rw-
      8f000000    8f000000    01000000   00000000    rw- .myHeapSect


    MMU configuration

    SYS_MMU_ENTRY applMmuEntries[] = {
        {(void*)0x09000000,0},  //FPGA - Non bufferable| Non Cacheable
        {(void*)0x48300000,0},  //PWM - Non bufferable| Non Cacheable
        {(void*)0x48200000,0},  //INTCPS,MPUSS - Non bufferable| Non Cacheable
        {(void*)0x48100000,0},  //I2C2,McSPI1,UART3,UART4,UART5, GPIO2,GPIO3,MMC1 - Non bufferable| Non Cacheable
        {(void*)0x48000000,0},  //UART1,UART2,I2C1,McSPI0,McASP0 CFG,McASP1 CFG,DMTIMER,GPIO1 -Non bufferable| Non Cacheable
        {(void*)0x44E00000,0},  //Clock Module, PRM, GPIO0, UART0, I2C0, - Non bufferable| Non Cacheable
        {(void*)0x4A300000,0},  //PRUSS1 - Non bufferable| Non Cacheable
        {(void*)0x49000000,0},  //EDMA3 - Non bufferable| Non Cacheable
        {(void*)0x49800000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x49900000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x49a00000,0},  // EDMA in non-idle mode, Non bufferable| Non Cacheable
        {(void*)0x4A100000,0},  //CPSW - Non bufferable| Non Cacheable
        {(void*)0xFFFFFFFF,0xFFFFFFFF}
    };


    What can I check or try to do?

    Thank you very much
    Best Regards
    Maurizio

  • Why couldn't you set a breakpoint at xdc_runtime_Error_raiseX_E? (Please look in the map file for "Error_raise" to make sure the name is correct.) What networking stack, HTTP and file system code are you using (and which versions)?
  • Hi Todd.
    Failing to set a breakpoint on the function xdc_runtime_Error_raiseX__E was my mistake: I was looking for the function as C code.
    I looked up the address of the function in the MAP file, as you suggested, and set a breakpoint at that address (in the disassembly view).
    This is the situation when the error happened and the code reached the breakpoint:


     
    It seems that the stack is all right.

    My code is based on plain sdk 1.1.0.6:
    - lwip 1.4.0;
    - HTTP: lwip 1.4.0 httpserver_raw_io application example;
    - file system: fatfs09 because we needed long file name support.

    In our application we use malloc, and the problem occurs when calling ff_memfree:

    What other trials can I do to understand the problem?

    Thank you very much

    Best Regards

    Maurizio

  • I'm not sure what else I can recommend since I'm not familiar with the code you are using (i.e. lwip). malloc uses a small header before the buffer to maintain the size of the buffer. This might be getting corrupted by something. I'd look to see what is right before the block of memory that you are freeing (when you are halted on the Error_raiseX breakpoint). Is the size field being over-written? You mentioned that the size changed in IHeap_free. You have to be careful when you have optimized code: the order of execution gets tricky, and the expressions window might not be displaying the value you think it is. I'd look at the physical memory to see what the size is.

    Todd
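
To illustrate the point about the size header, here is a generic sketch of an allocator wrapper that stores the size immediately before the returned buffer. The layout is an assumption for illustration only; the actual header used by the malloc in this system is not shown in the thread. `BlockHeader`, `hdr_alloc`, `hdr_size` and `hdr_free` are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical header placed just before the user buffer.
 * The extra union members only pad the header so that the payload
 * stays suitably aligned for common types. */
typedef union {
    size_t size;
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} BlockHeader;

/* Allocate size bytes and remember the size in front of the buffer. */
void *hdr_alloc(size_t size)
{
    BlockHeader *h = malloc(sizeof(BlockHeader) + size);

    if (h == NULL) {
        return NULL;
    }
    h->size = size;
    return h + 1;               /* user buffer starts right after the header */
}

/* Read back the size stored in front of a buffer from hdr_alloc. */
size_t hdr_size(const void *p)
{
    return ((const BlockHeader *)p - 1)->size;
}

/* Release a buffer obtained from hdr_alloc. */
void hdr_free(void *p)
{
    if (p != NULL) {
        free((BlockHeader *)p - 1);
    }
}
```

With a layout like this, any write that runs past the end of the preceding buffer lands on the header, and the size seen at free time no longer matches the size requested. Inspecting the words just before the block in a memory view is how that would show up.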

  • Hi Todd.

    The code is built without any optimization.

    Below is what I saw when looking at memory at the moment the program asserts.

    Looking at the variables in the xdc_Void xdc_runtime_IHeap_free function, everything seems to be all right (green squares).

    The function call __inst->__fxns->free((void*)__inst, block, size) resolves to HeapMem_free in the module HeapMem.c.

    Looking at the same variables, named "obj", "addr" and "size" in HeapMem_free, I saw:

    1. The location (and therefore the value) of "obj" is no longer valid (red squares in the picture).

    2. The location and value of "addr" could be OK (yellow squares in the picture).

    3. The location of "size" could be OK, but its value is not right (yellow squares in the picture). In HeapMem_free, the value of "size" is changed a few lines before the assert: "if ((offset = size & (obj->minBlockAlign - 1)) != 0) { size += obj->minBlockAlign - offset; }", but obj was wrong!
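
The rounding step quoted in item 3 can be isolated as a small function. This is a sketch of the same expression with the alignment passed in as a parameter instead of being read from obj->minBlockAlign; the name `round_up_size` is hypothetical.

```c
#include <stddef.h>

/* Round size up to the next multiple of align.
 * align is assumed to be a power of two, as the mask trick requires. */
size_t round_up_size(size_t size, size_t align)
{
    size_t offset = size & (align - 1);

    if (offset != 0) {
        size += align - offset;
    }
    return size;
}
```

Since 12288 is a multiple of every power of two up to 4096, `round_up_size(12288, 8)` returns 12288 unchanged. With a sane alignment value this step alone would therefore not be expected to produce 2401038880; either the alignment was read through a bad obj pointer, or the debugger is not showing the real values.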

    The memory area around the address 0x815C935C (inside default heap) is:

    The red, yellow and green squares are the locations described above.

    The memory area around the address 0x8F006880 (inside my custom heap) is:

    The values at the green squares (0x8F006880 and 0x00001380) are wrong: they are not the expected values (which should be the data previously read from a file), but seem to be the address and the size of the block to be freed.

    We double-checked the address alignment to rule out cache issues, and everything is OK (the block to be freed starts at address 0x8F006880, which is a multiple of the cache line size).

    Thank you very much.

    Best Regards.

    Maurizio

  • Your application might be built with no optimization, but the kernel is being built with optimization. Look at the custom compiler options right below where you set the Enable Asserts.

    I don't think the size is being corrupted. There is a check higher up in HeapMem_free that would have caught that. I think you simply have the same buffer being freed twice. I'd look further up the call stack, in fs_close, http_state_free, and so on. Try to identify if there is a place where a double free could have occurred.
  • Hi Todd.
    You were right!
    We logged the sequence of HeapMem_free calls while the software was running and found that a double HeapMem_free occurs when the program asserts.
    We still haven't found which function calls HeapMem_free twice.
    We have temporarily suspended debugging because we need to complete the application, but we'll resume the search as soon as possible.

    Thank you very much
    Best Regards
    Maurizio
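
One possible way to find which caller performs the second free is to route allocation and release through wrappers that record the call site. The following is a minimal sketch under stated assumptions: standard malloc/free stand in for Memory_alloc/Memory_free on heapMem0, all names (`tracked_alloc`, `tracked_free`, `HERE`, `MAX_LIVE`, `LiveBlock`) are hypothetical, and the table is not protected against concurrent access from several tasks.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 64                       /* hypothetical table capacity */
#define STR2(x) #x
#define STR(x) STR2(x)
#define HERE __FILE__ ":" STR(__LINE__)   /* call-site tag such as "fs.c:42" */

typedef struct {
    void *ptr;          /* NULL marks an empty slot */
    const char *site;   /* call site that allocated the block */
} LiveBlock;

static LiveBlock live[MAX_LIVE];

/* Allocate a block and record it as live. Returns NULL if the table is full. */
void *tracked_alloc(size_t size, const char *site)
{
    void *p;
    int i;

    for (i = 0; i < MAX_LIVE && live[i].ptr != NULL; i++) {
        /* search for an empty slot */
    }
    if (i == MAX_LIVE) {
        return NULL;
    }
    p = malloc(size);
    if (p != NULL) {
        live[i].ptr = p;
        live[i].site = site;
    }
    return p;
}

/* Free a block only if it is recorded as live.
 * Returns 0 on success, -1 (after logging the call site) otherwise. */
int tracked_free(void *p, const char *site)
{
    int i;

    if (p == NULL) {
        return 0;
    }
    for (i = 0; i < MAX_LIVE; i++) {
        if (live[i].ptr == p) {
            live[i].ptr = NULL;
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "invalid or double free of %p at %s\n", p, site);
    return -1;
}
```

Calling `tracked_free(buf, HERE)` at each release site would report the file and line of the second release instead of corrupting the heap, which narrows the search to a specific call path.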