MessageQ SharedRegion HeapMemMP problem

Other Parts Discussed in Thread: SYSBIOS

Hi,

In my application I'm using MessageQ between cores, but sometimes I get the following error messages when I run BIOS -> Scan for Errors in ROV:

,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x80000ae8),N/A,Caught exception in view init code: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x80000ae8),totalFreeSize,Error: could not access next element linked list: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
,ti.sdo.ipc.heaps.HeapMemMP,FreeList,HeapMemMP@80000ae8-freeList,N/A,Caught exception in view init code: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.

The device is 6678, and the SharedRegion is in DDR3 (0x80000000).

Does anyone know what these error messages mean and how to fix them?

Thanks

  • Hi Johannes,

    First, could you tell me what software versions (CCS, BIOS, IPC, etc...) you're using?

    Thanks,

    Whitney

  • Hi,

    Sorry, I forgot to give that information before. The versions are:

    CCS:             5.1.0.09000
    C6000 Compiler:  7.4.0
    IPC:             1.24.3.32
    SYS/BIOS:        6.33.6.50
    XDCtools:        3.24.5.48

    My cfg file is:

    var BIOS        = xdc.useModule('ti.sysbios.BIOS');
    BIOS.heapSize   = 0x1000000;
    BIOS.heapSection = "systemHeap";
    Program.sectMap["systemHeap"] = "DDR3"; // DDR3 starts in 0x8000_0000

    /* Shared Memory base address and length */
    var SHAREDMEM           = 0x80000000;
    var SHAREDMEMSIZE       = 0x01000000;

    /*
     *  Need to define the shared region. The IPC modules use this
     *  to make portable pointers. All processors need to add this
     *  call with their base address of the shared memory region.
     *  If the processor cannot access the memory, do not add it.
     */
    var SharedRegion = xdc.useModule('ti.sdo.ipc.SharedRegion');
    //SharedRegion.numEntries =   7;
    SharedRegion.setEntryMeta(0,
        { base: SHAREDMEM,
          len:  SHAREDMEMSIZE,
          ownerProcId: 0,
          isValid: true,
          name: "Shared0",
        });

    I have a shared region and the BIOS heap in DDR3. In the .map file I see:    

    ti.sdo.ipc.SharedRegion_0
    *          0    80000000    01000000     NOLOAD SECTION
                      80000000    01000000     --HOLE--

    systemHeap
    *          0    81000000    01000000     UNINITIALIZED
                      81000000    01000000     configApp2_pe66.oe66 (systemHeap)

    And the error messages (ROV->BIOS->Scan for errors...) are:

    mod                                                    tab               inst                                                           field                      message
    ,ti.sdo.ipc.heaps.HeapMemMP,    Detailed,    (0x81000ae8),                                        N/A,                      Caught exception in view init code: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
    ,ti.sdo.ipc.heaps.HeapMemMP,    Detailed,    (0x81000ae8),                                        totalFreeSize,     Error: could not access next element linked list: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
    ,ti.sdo.ipc.heaps.HeapMemMP,    FreeList,    HeapMemMP@81000ae8-freeList,    N/A,                      Caught exception in view init code: Error: The SharedRegion 2 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.

    As you can see, the address in the 'inst' field is in my systemHeap section; if I change the location of the heap, the address in the error changes too (the trailing ae8 seems to be constant).
    The address in the 'message' field, "SharedRegion pointer 0xa9c8b7d6", is always the same too.
    Another point that I think is important to mention is that I only get this error on the last core that I use, i.e., if I use cores 0-2 the error occurs on Core2; if I add Core3, the error shows up only on Core3.

    The HeapMemMP is created as usual in Core0:

        // Create the heap that will be used to allocate messages.
        HeapMemMP_Params_init(&heapMemParams);
        heapMemParams.regionId       = 0;
        heapMemParams.name           = "SharedHeap0";
        heapMemParams.sharedBufSize  = 0xC00000;
        heapHandle = HeapMemMP_create(&heapMemParams);

        if (heapHandle == NULL) {
            System_abort("HeapMemMP_create failed\n");
        }

        // Register this heap with MessageQ
        MessageQ_registerHeap((IHeap_Handle)heapHandle, HEAPID);

    And opened in the other cores:

        do {
            status = HeapMemMP_open("SharedHeap0", &heapHandle);
            // Sleep for 1 clock tick to avoid inundating remote processor with interrupts if open failed
            if (status < 0) {
                Task_sleep(1);
            }
        } while (status < 0);

        // Register this heap with MessageQ
        MessageQ_registerHeap((IHeap_Handle)heapHandle, HEAPID);    
        
    Also, I have a data pointer sent together with the message and I use this same heap to allocate memory for the data.

    Thanks

  • Does your program crash?  It definitely sounds like something is corrupting the Heap.

    I think you need to set a hardware watchpoint on the bad address (0x81000ae8) and look for a write of the bad value (0xa9c8b7d6).

    That value definitely does not look like a SharedRegion pointer.

    Judah

  • Hi Judah,

    Sorry for the late reply.

    No, the application doesn't crash. If I put a hardware watchpoint (write) on address 0xyyyyyb08 (the address differs from the previous post because I made some changes to the code), it is hit twice:



    The error message I get in ROV  is about the 'free' properties of HeapMemMP. In all cores I create the same shared region I mentioned above.
    Also I have a HeapMemMP instance created in core0 and opened by the other cores.

    If in the static creation of the shared region I put:

    SharedRegion.translate = true;
                      
    I get the following error in ROV (now it's shared region 5 because I tested with 6 cores instead of 3 to see if the error still occurred):

    ,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x9a000b08),N/A,Caught exception in view init code: Error: The SharedRegion 5 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
    ,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x9a000b08),totalFreeSize,Error: could not access next element linked list: Error: The SharedRegion 5 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.
    ,ti.sdo.ipc.heaps.HeapMemMP,FreeList,HeapMemMP@9a000b08-freeList,N/A,Caught exception in view init code: Error: The SharedRegion 5 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid.


    As you can see, the free size info is not shown and it shows the error message "Caught exception in view init code: Error: The SharedRegion 5 of the SharedRegion pointer 0xa9c8b7d6 is currently invalid."

    If I do:

    SharedRegion.translate = false;

    I get a different error message:

    ,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x9a000b08),N/A,Caught exception in view init code: "C:/ti/xdctools_3_24_05_48/packages/xdc/rov/StructureDecoder.xs", line 517: java.lang.Exception: Target memory read failed at address: 0x3211442, length: 8
    ,ti.sdo.ipc.heaps.HeapMemMP,Detailed,(0x9a000b08),totalFreeSize,Error: could not access next element linked list: JavaException: java.lang.Exception: Target memory read failed at address: 0x3211442, length: 8
    ,ti.sdo.ipc.heaps.HeapMemMP,FreeList,HeapMemMP@9a000b08-freeList,N/A,Caught exception in view init code: JavaException: java.lang.Exception: Target memory read failed at address: 0x3211442, length: 8

    Thanks

  • Johannes,

    I'm trying to understand whether this is a real issue in your program or just something you see in ROV. If it's just an ROV thing, it could be that ROV is seeing stale data, since you are looking at shared memory. For example, if a buffer has been freed on one core but the view on another core is still looking at what was cached, then ROV could be showing stale data.

    Typically on a C6678 device you should set SharedRegion.translate = false, because all cores see memory at the same address.

    Judah

  • Hi Judah,

    I agree that it can be just a ROV issue. But I still have a question about it:

    What is the "SharedRegion 5" in the error message "Error: The SharedRegion 5 of the SharedRegion pointer"? Is it the shared region created by Core6, or is 5 the number of shared regions created?

    Thanks

  • Since you have SharedRegion.translate set to "true", it takes the bad value and computes the region id from it. In this case it thinks the region id is 5, and it's saying that 5 is not a valid SharedRegion.

    Judah
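
To make the last answer concrete, here is a minimal sketch of how a region id can be derived from a 32-bit SharedRegion pointer when translation is enabled. It assumes the region id sits in the top ceil(log2(numEntries)) bits and the offset in the remaining bits, which is my understanding of the IPC SharedRegion layout rather than something stated in this thread. The function names are hypothetical, and the numEntries values of 4 and 8 are guesses that happen to reproduce the "SharedRegion 2" and "SharedRegion 5" seen in the two ROV reports.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to index num_entries regions, i.e. ceil(log2(num_entries)). */
static unsigned region_id_bits(unsigned num_entries)
{
    unsigned bits = 0;
    while (bits < 31 && (1u << bits) < num_entries) {
        bits++;
    }
    return bits;
}

/* Hypothetical helper: region id assumed to live in the upper bits. */
static uint32_t srptr_region_id(uint32_t srptr, unsigned num_entries)
{
    unsigned id_bits = region_id_bits(num_entries);
    if (id_bits == 0) {
        return 0;               /* a single region needs no id bits */
    }
    return srptr >> (32u - id_bits);
}

/* Hypothetical helper: offset within the region (the remaining low bits). */
static uint32_t srptr_offset(uint32_t srptr, unsigned num_entries)
{
    unsigned id_bits = region_id_bits(num_entries);
    if (id_bits == 0) {
        return srptr;
    }
    return srptr & (0xFFFFFFFFu >> id_bits);
}

/* Decode the bad value from the thread under the two assumed table sizes. */
static void srptr_self_check(void)
{
    const uint32_t bad = 0xa9c8b7d6u;        /* value reported by ROV */
    assert(srptr_region_id(bad, 4) == 2);    /* top 2 bits: 10b       */
    assert(srptr_region_id(bad, 8) == 5);    /* top 3 bits: 101b      */
    assert(srptr_offset(bad, 4) == 0x29c8b7d6u);
}
```

Under these assumptions the same garbage value 0xa9c8b7d6 decodes to region 2 or region 5 depending only on the size of the region table, which would explain why the reported region number changed between runs while the pointer value did not.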