Help required on codec engine

Other Parts Discussed in Thread: OMAP3530

Dear Experts,

I hope an expert can answer my question this time. I have waited a long time and still have no answer at all, so I will try to state my question clearly.

I have the JM decoder to port to the DSP side. Since malloc() and calloc() are not supported on the DSP, I also tried MEM_alloc() and so on, but they do not work either. I do not know why so many of the SPRU guides are misleading about this. (I may not be correct, but the guides need to say more about the environment and give examples.) So I have tried the following:

My plan is to do the configuration and memory initialization on the application (ARM) side, in order to avoid too many malloc() calls on the DSP side.

But it was unsuccessful. Here is an example. I have a structure and a configuration function like the following:

typedef struct VIDDECCOPY_TI_Obj {
    IALG_Obj    alg;            /* MUST be first field of all XDAS algs */
    int intra_profile_deblocking;   //!< Loop filter usage determined by flags and parameters in bitstream
    int iDecFrmNum;
    int conceal_mode;
    int ref_poc_gap;
    int poc_gap;
    unsigned int    maxHeight;
    unsigned int    maxWidth;
    DecoderParams   *p_Dec;
    InputParameters *inParams;
} VIDDECCOPY_TI_Obj;

 

XDAS_Void Configure(InputParameters *p_Inp)
{
    memset(p_Inp, 0, sizeof(InputParameters));
    p_Inp->FileFormat = PAR_OF_ANNEXB;
    p_Inp->ref_offset = 0;
    p_Inp->poc_scale = 2;
    p_Inp->silent = FALSE;
    p_Inp->intra_profile_deblocking = 0;
    p_Inp->iDecFrmNum = 3; //dave
    p_Inp->write_uv = 1;
    // picture error concealment
    p_Inp->conceal_mode = 0;
    p_Inp->ref_poc_gap = 2;
}

So I did this: I put the following inside "Void encode_decode(VIDDEC_Handle dec, FILE *in, FILE *out)" (this function is located in ./apps/video_copy.c):

VIDDECCOPY_TI_Obj *jmdec = (Void *) dec;

Configure(jmdec->inParams); 

I kept everything else the same as the original Codec Engine example. The codec still just does the memcpy() from the example, which means it never touches the "jmdec" defined above.

But the result of the memcpy() is no longer correct.

 

My questions:

1. If the application is not allowed to set any value through the codec handle (like "dec" above), how can I use the handle to pass associated values to the DSP side?

2. Are &decInArgs and &decOutArgs meant for this purpose? If so, how do I use them?

3. The main question: how can I make the malloc() calls inside the JM code work on the Codec Engine DSP side? An example would be appreciated.

 

Best regards

David


  • dave li said:
    Since the malloc() and calloc() are not supported on DSP...

    This isn't true.  If you're having issues using malloc() on the DSP, it's likely an issue with config (maybe the rts lib needs to be configured with more memory?  Or there's an issue with interop between the rts lib and BIOS?) or build (maybe you need to link in the right rts lib?).  Unfortunately, I can't address this one, but I do know we supply libs that provide malloc() support on the DSP.

    If you want to follow up on this one, we should post it to the BIOS forum, where the DSP experts can help.

    dave li said:
    also i tried MEM_alloc() and so on. They are not functionally.

    This is also likely a config issue.  Perhaps you're not creating a MEM heap large enough for your MEM_alloc() allocations?  You most certainly can MEM_alloc() from the DSP side.

    This is another topic for the DSP experts on the BIOS forum.

    dave li said:
    So I have to try like the following:...

    To integrate your codec into the Codec Engine framework, you really should follow the XDAIS guidelines and not try to explicitly allocate memory on the DSP side.  I'm not following what you're trying to do, but...

    If your algorithm needs 'internal, worker buffers', it should ask for them via memTabs[]:

    • in IALG_Fxns->algNumAlloc() return how many memory blocks (max) you may want in the worst case
    • in IALG_Fxns->algAlloc() use the create params to decide your memory requirements and return details about your memory needs in the memTab[] array
    • the framework will allocate the memory for you
    • in IALG_Fxns->algInit() (and future fxn calls) utilize the memory that's been allocated on your behalf
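
    The four steps above can be sketched roughly as follows. This is a self-contained illustration, not real XDAIS code: `MemRec`, `DecObj`, the `DEC_*` functions and `fake_framework_create()` are hypothetical stand-ins for `IALG_MemRec`, the algorithm object, the `algNumAlloc()`/`algAlloc()`/`algInit()` entries and the framework itself. The real types come from `ialg.h`, and in a real system the framework performs step 3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for IALG_MemRec; the real type comes from ialg.h. */
typedef struct MemRec {
    size_t size;       /* bytes requested by the algorithm */
    size_t alignment;  /* requested alignment (0 = no requirement) */
    void  *base;       /* filled in by the framework once allocated */
} MemRec;

/* Hypothetical algorithm object; by XDAIS convention memTab[0] describes it. */
typedef struct DecObj {
    unsigned char *workBuf;   /* internal worker buffer */
    size_t         workSize;  /* size of workBuf in bytes */
} DecObj;

enum { DEC_MAX_RECS = 2 };

/* Step 1 (algNumAlloc): worst-case number of memory blocks. */
int DEC_numAlloc(void)
{
    return DEC_MAX_RECS;
}

/* Step 2 (algAlloc): describe the needs; returns how many records are used. */
int DEC_alloc(size_t maxWidth, size_t maxHeight, MemRec memTab[])
{
    memTab[0].size = sizeof(DecObj);
    memTab[0].alignment = 0;
    memTab[0].base = NULL;

    memTab[1].size = maxWidth * maxHeight;  /* e.g. one luma plane */
    memTab[1].alignment = 8;
    memTab[1].base = NULL;
    return 2;
}

/* Step 4 (algInit): adopt the granted memory and zero it, calloc()-style. */
int DEC_init(DecObj *obj, const MemRec memTab[])
{
    assert(memTab[1].base != NULL);
    obj->workBuf  = memTab[1].base;
    obj->workSize = memTab[1].size;
    memset(obj->workBuf, 0, obj->workSize);  /* clear the buffer, not memTab[] */
    return 0;
}

/* Step 3: in a real system the framework does this, not the algorithm.
 * malloc()'s default alignment is assumed to satisfy the requests here. */
DecObj *fake_framework_create(size_t maxWidth, size_t maxHeight)
{
    MemRec memTab[DEC_MAX_RECS];
    int i, n;

    assert(DEC_numAlloc() <= DEC_MAX_RECS);
    n = DEC_alloc(maxWidth, maxHeight, memTab);
    for (i = 0; i < n; i++) {
        memTab[i].base = malloc(memTab[i].size);
        if (memTab[i].base == NULL) {
            while (i-- > 0) free(memTab[i].base);
            return NULL;
        }
    }
    DEC_init(memTab[0].base, memTab);
    return memTab[0].base;
}

/* Releases what fake_framework_create() allocated. */
void fake_framework_delete(DecObj *obj)
{
    if (obj != NULL) {
        free(obj->workBuf);
        free(obj);
    }
}
```

    Note that the algorithm never calls an allocator itself: it only describes what it needs and later adopts the addresses it is given.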

    If, on the other hand, you need memory for [typically large] data buffers, these can be allocated on the ARM side using CE's Memory_contigAlloc() API.  A physically contiguous memory buffer will be returned, which you can fill and provide via VIDDEC's inBufs and outBufs fields (which utilize buffer pointers).

    Also, b/c you asked, this article describes how to extend XDM structures.  Note that you generally can't pass pointers in these extended fields - because the framework doesn't know they're pointers, it won't convert them from their ARM-usable virtual addresses to their DSP-usable physical addresses.  If you _really_ want to pass pointers in these extended fields, you can convert them from virt to phys addrs yourself, using Memory_getBufferPhysicalAddress().

    Chris

  • Thank you, Chris

    Your comments are very helpful.

    Right now, I am trying to put the whole decoder on the DSP side.

    Since I am using CE, I prefer to use XDAIS-compliant memory allocation now.

    Inside the JM decoder, the total number of malloc() and calloc() calls is very large (more than 300). I have to allocate them on the DSP side with "memTab[]". Here is my concern:

    How many "memTab[]" entries can I use? I tried to find "IALG_Fxns->algNumAlloc()", but it seems to be set to "NULL", and then I found "#define IALG_DEFMEMRECS 4". When I tried to redefine the default value, it seemed to have no effect. Even when I set it to "1" while already using 5 entries (up to "memTab[4]"), it still works. Any suggestions?

    The main question is: if I have 300~500 "memTab[]" entries, will it work or not?

     

    BR

     

    dave

  • IALG_Fxns->algNumAlloc() is a fxn your algorithm can optionally implement.  When initializing your IALG_Fxns table, if you set the algNumAlloc() fxn ptr to NULL, XDAIS will use 4 (IALG_DEFMEMRECS, as you've found).  Since you have more memory requests, you should provide an implementation of algNumAlloc() that returns 500... then be sure to plug your algNumAlloc() implementation into your IALG_Fxns table - something like this (modified from CE's examples/ti/sdo/ce/examples/codecs/viddec_copy/viddec_copy.c):

    #define IALGFXNS  \
        &VIDDECCOPY_TI_IALG,/* module ID */                         \
        NULL,               /* activate */                          \
        VIDDECCOPY_TI_alloc,/* alloc */                             \
        NULL,               /* control (NULL => no control ops) */  \
        NULL,               /* deactivate */                        \
        VIDDECCOPY_TI_free, /* free */                              \
        VIDDECCOPY_TI_initObj, /* init */                           \
        NULL,               /* moved */                             \
        addYourFxnHere      /* numAlloc */

    The framework will call this algNumAlloc() fxn, and see you need 500 entries.  It will dynamically create a memTab[] array with 500 entries, and pass it to your algAlloc() fxn, along with any create params.  Your algAlloc() fxn can evaluate these create params to determine how many memTab[]'s it _really_ needs - up to the worst case 500.  It should then fill the attributes (size/align/type/etc) fields of the memTab[] array, and return a "maybe less than worst case" value, indicating how many memTab[] records it really needs.

    The framework will then attempt to satisfy all those memTab[] requests.  If it can, it will complete the memTab[] with the actual memory addresses you've been granted and pass it to your algInit() fxn.

    FWIW, 500 is pretty high, but I don't know of a limit in the framework.  Typically, an algorithm combines many small memory needs into one block containing a data struct that contains the smaller memory fields.  This helps with fragmentation, as well as management of those many, little buffers.
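
    As a rough illustration of combining many small memory needs into one block, the sketch below carves calloc()-like allocations out of a single large buffer (which would be requested through one memTab[] entry). `Arena`, `arena_init()` and `arena_calloc()` are hypothetical names, not part of XDAIS, Codec Engine or the JM code, and the block is assumed to start on an 8-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One large block, handed out piece by piece (a simple bump allocator). */
typedef struct Arena {
    unsigned char *base;  /* start of the single large block */
    size_t         size;  /* total bytes in the block */
    size_t         used;  /* bytes handed out so far */
} Arena;

/* Wrap an already-granted block (e.g. one memTab[] entry's base and size). */
void arena_init(Arena *a, void *block, size_t size)
{
    a->base = block;
    a->size = size;
    a->used = 0;
}

/* calloc()-like: returns zeroed memory at an 8-byte-aligned offset from the
 * start of the block, or NULL when the block is exhausted. */
void *arena_calloc(Arena *a, size_t count, size_t elemSize)
{
    size_t bytes, start;

    if (elemSize != 0 && count > SIZE_MAX / elemSize) {
        return NULL;                          /* count * elemSize overflows */
    }
    bytes = count * elemSize;
    start = (a->used + 7u) & ~(size_t)7u;     /* round up to a multiple of 8 */
    if (start > a->size || bytes > a->size - start) {
        return NULL;                          /* not enough room left */
    }
    a->used = start + bytes;
    memset(a->base + start, 0, bytes);
    return a->base + start;
}
```

    With something like this, each malloc()/calloc() call in the ported code could become an `arena_calloc()` call during algInit(), and the whole set is released at once when the framework frees the single block. There is deliberately no per-allocation free in this sketch.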

    Just make sure you have configured your system with enough memory (BIOS sections and DSKT2 config).

    Chris

  • Thanks, Chris

    That is a very clear answer, and please accept my appreciation. Now I can set up more memTab[] entries, but I still failed to "make enough memory" available. Here is what I did (steps 1 and 2 below).

    Thank you for your help.

    David

     

     

    1. "./servers/all_codecs/all.cfg":

    var DSKT2 = xdc.useModule('ti.sdo.fc.dskt2.DSKT2');
    DSKT2.DARAM0     = "L1DHEAP";
    DSKT2.DARAM1     = "L1DHEAP";
    DSKT2.DARAM2     = "L1DHEAP";
    DSKT2.SARAM0     = "L1DHEAP";
    DSKT2.SARAM1     = "L1DHEAP";
    DSKT2.SARAM2     = "L1DHEAP";
    DSKT2.ESDATA     = "DDRALGHEAP";
    DSKT2.IPROG      = "L1DHEAP";
    DSKT2.EPROG      = "DDRALGHEAP";
    if (platform.match("evmOMAPL13[78]")) {
        DSKT2.DSKT2_HEAP = "SDRAM";
    }
    else {
        DSKT2.DSKT2_HEAP = "DDR2";
    }

    DSKT2.DARAM_SCRATCH_SIZES = [6490000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];
    DSKT2.SARAM_SCRATCH_SIZES = [6490000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0];

    DSKT2.ALLOW_EXTERNAL_SCRATCH = true;
    DSKT2.SARAM_SCRATCH_SIZES[0] =  6490000;

     

    2. Set up memTab[] inside "VIDDECCOPY_TI_alloc()":

        /* Request memory for my object */
        memTab[23].size = 8000000;  // 8M memory for incoming video data
        memTab[23].alignment = 0;
        memTab[23].space = IALG_DARAM0;
        memTab[23].attrs = IALG_SCRATCH;

    But it gives the following error:

    =========

    root@omapzoom2:/target-ce# ./app_remote.xv5T                                                        

    @0x000d423e:[T:0x4001df90] ti.sdo.ce.examples.apps.video_copy - main> ti.sdo.ce.examples.apps.videoy

    @0x000d4689:[T:0x4001df90] ti.sdo.ce.examples.apps.video_copy - App-> Application started.          

    Unable to handle kernel NULL pointer dereference at virtual address 0000013c                        

    pgd = c0004000                                                                                      

    [0000013c] *pgd=00000000                                                                            

    Internal error: Oops: 17 [#1]                                                                       

    last sysfs file: /sys/kernel/uevent_seqnum                                                          

    Modules linked in: lpm_omap3530 dsplinkk cmemk                                                      

    CPU: 0    Not tainted  (2.6.32 #1)                                                                  

    PC is at __remove_shared_vm_struct+0x28/0x8c                                                        

    LR is at 0x8000875                                                                                  

    pc : [<c008d0cc>]    lr : [<08000875>]    psr: 00000113                                             

    sp : c718fdf0  ip : c7260c28  fp : c718ffb0                                                         

    r10: 00001000  r9 : c036c234  r8 : 00000000                                                         

    r7 : 40000000  r6 : c7260bd0  r5 : c7260c28  r4 : c7260c28                                          

    r3 : 0000013c  r2 : c6c0d990  r1 : c6c0d990  r0 : c7260c28                                          

    Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user                                   

    Control: 10c5387d  Table: 87248019  DAC: 00000015                                                   

    Process avahi-daemon (pid: 561, stack limit = 0xc718e2e8)                                           

    Stack: (0xc718fdf0 to 0xc7190000)                                                                   

    fde0:                                     c7260c28 c008b2c0 40000000 c7260e38                       

    fe00: 00001465 00000000 c7260e38 c716c080 00000000 c71a2140 00000001 c008ce84                       

    fe20: c718fe28 00000000 0000007d c036c234 c716c080 00000000 c70b2180 c716c0b4                       

    fe40: c718e000 c004b768 00000000 c716c080 c70b2180 c004ecc8 c70b2180 00000037                       

    fe60: 00000000 c718e000 c70b2180 0000000b c718fee8 c004ff74 0000000b c71a2140                       

    fe80: c70b22ec 0000000b 000000dc c71e8570 c718fee8 c718e000 c718ff68 c0050414                       

    fea0: 0000000b c0058e9c c718fee8 c71e8580 c718ffb0 00000000 40223000 c718ffb0                       

    fec0: 4022402c 00000000 c718e000 00000048 00000001 c00294d8 00010000 c70b2180                       

    fee0: 00000001 c002d56c 0000000b 00000000 00030001 00000008 00000000 c01efab4                       

    ff00: 00000045 be898aac 00000048 c0026280 0000000d c00a6f54 c710a540 0004eef1                       

    ff20: 00000000 c00a767c 00000025 00000000 40037d4c 0000004e c0026f68 c718e000                       

    ff40: 00028fe0 c0052008 c03755f0 c0065884 ffffffff 00000000 40037d4c c718fe9c                       

    ff60: 00000010 c718ff1c 00000001 be8986d4 000003d8 00000000 00000000 00000045                       

    ff80: 00000000 be898adc 4006c900 ffffffff 40223000 00000007 4022402c 00000000                       

    ffa0: c718e000 00000048 00000001 c0026e0c 000563f0 00000000 00000001 4022402c                       

    ffc0: 4022409c 40223000 00000007 4022402c 4022402c 40224064 00000048 00000001                       

    ffe0: 000563a8 be898550 40169f58 40167844 60000010 ffffffff 00000000 00000000                       

    [<c008d0cc>] (__remove_shared_vm_struct+0x28/0x8c) from [<c008b2c0>] (free_pgtables+0x30/0x9c)      

    [<c008b2c0>] (free_pgtables+0x30/0x9c) from [<c008ce84>] (exit_mmap+0xc8/0x12c)                     

    [<c008ce84>] (exit_mmap+0xc8/0x12c) from [<c004b768>] (mmput+0x34/0xd0)                             

    [<c004b768>] (mmput+0x34/0xd0) from [<c004ecc8>] (exit_mm+0x10c/0x110)                              

    [<c004ecc8>] (exit_mm+0x10c/0x110) from [<c004ff74>] (do_exit+0x160/0x584)                          

    [<c004ff74>] (do_exit+0x160/0x584) from [<c0050414>] (do_group_exit+0x7c/0xa8)                      

    [<c0050414>] (do_group_exit+0x7c/0xa8) from [<c0058e9c>] (get_signal_to_deliver+0x2d4/0x304)        

    [<c0058e9c>] (get_signal_to_deliver+0x2d4/0x304) from [<c00294d8>] (do_notify_resume+0x68/0x5b8)    

    [<c00294d8>] (do_notify_resume+0x68/0x5b8) from [<c0026e0c>] (work_pending+0x1c/0x20)               

    Code: 0a000007 e593300c e593300c e2833f4f (e1932f9f)                             

    =========

  • This is a really long article, but might be worth reading through just for the background on memory usage in a DVSDK-like environment.  It also describes the steps needed to change the memory map - which you may need to do if you have very large memory requirements:

    http://processors.wiki.ti.com/index.php/Changing_the_DVEVM_memory_map

    To your direct question, captured and/or displayed video buffers are not allocated by the algorithm, but rather the [ARM-side, in your case] application.  The algorithm doesn't perform any actual I/O, and doesn't talk to device drivers.  Rather it's passive and is called by an application when the _application_ has data buffers to work on.  So, I think your need for "8M for incoming video data" is probably not needed by the alg, but may very well be needed by the ARM-side application (which is where the Linux drivers performing data capture are running).

    Typically, those large video buffers come from one of two places (both allocated on the ARM/Linux-side) - 1) the video driver itself, or 2) CMEM.

    I'm not sure why you're seeing a crash, but you can probably benefit from turning on CE_DEBUG to help narrow down the cause.

    Chris

  • Thank you, Chris, great answer.

    Yes, I agree, you are right.

    Q1:

    I already use the following to estimate the memory size, but it reports only the total memory, not the amount used.

    hServer = Engine_getServer(ce);

    Server_getNumMemSegs(hServer, &numSegs);
    for (ij = 0; ij < numSegs; ij++) {
        Server_getMemStat(hServer, ij, &memStat);
        if (strcmp(memStat.name, "DDRALGHEAP") == 0) {
            printf("DDRALGHEAP usage is %ld out of %ld available\n",
                    memStat.used, memStat.size);
        }
    }

     

    Q2:

    I want to initialize the contents of memTab[] to zero, as calloc() would, but when I use memset() it seems to have no effect at all.

    How can I clear the contents of memTab[]?

    Thank you.

     

    David