UNIVERSAL_process problem



Hi,

Board : dm8168 custom board

EZSDK 5.04.00.11

c6Accel 2.01.00.11

I am developing a custom DM8168 board.

I have written a DSP program and call the DSP function through C6Accel.
I request asynchronous mode (UNIVERSAL_processAsync), but the call appears to be performed in synchronous mode (UNIVERSAL_process).

Please tell me why UNIVERSAL_processAsync is not called.

The program is as follows.

C6Accel_setAsync(hC6);
C6accel_VLIB_extractLumaFromUYUV(hC6, pYuyvBuf, 720, 720, 480, pSrcBuf);
if (C6Accel_readCallType(hC6) == ASYNC){
     // Now wait for the callback
     printf("C6accel_VLIB_extractLumaFromUYUV() wait for callback,\n");
     ret = C6accel_waitAsyncCall(hC6);
     printf("ret(%d) = C6accel_waitAsyncCall()\n", ret);
}

The output with CE_DEBUG=2 is as follows.

[DSP] [t=+426,647 us] [tid=0xb3d031e0] ti.sdo.ce.node: [+5] NODE> 0xb3d02be8 call(algHandle=0xb3d02ce8, msg=0x9f703e00); messageId=0x00010004
[DSP] [t=+000,088 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+E] Memory_cacheInv> Enter(addr=0x96cb1000, sizeInBytes=691200)
[DSP] [t=+000,252 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+X] Memory_cacheInv> return
[DSP] [t=+000,044 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+E] Memory_cacheInv> Enter(addr=0x96c00000, sizeInBytes=345600)
[DSP] [t=+000,156 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+X] Memory_cacheInv> return
[DSP] [t=+000,045 us] [tid=0xb3d031e0] ti.sdo.ce.universal.UNIVERSAL: [+E] UNIVERSAL_process> Enter (handle=0xb3d02ce8, inBufs=0xb3d0529c, outBufs=0xb3d05360, inOutBufs=0x0, inArgs=0x9f70407c, outArgs=0x9f7040a0)
[DSP] [t=+000,104 us] [tid=0xb3d031e0] ti.sdo.ce.VISA: [+5] VISA_enter(visa=0xb3d02ce8): algHandle = 0xb3d02d20
[DSP] [t=+000,056 us] [tid=0xb3d031e0] ti.sdo.ce.alg.Algorithm: [+E] Algorithm_activate> Enter(alg=0xb3d02d20)
[DSP] [t=+000,053 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+E] DSKT2_activateAlg> Enter (scratchId=2, alg=0xb3d031c8)
[DSP] [t=+000,059 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+2] DSKT2_activateAlg> Last active algorithm 0x0, current algorithm to be activated 0xb3d031c8
[DSP] [t=+000,073 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+X] DSKT2_activateAlg> Exit
[DSP] [t=+000,040 us] [tid=0xb3d031e0] ti.sdo.ce.alg.Algorithm: [+X] Algorithm_activate> Exit
[DSP] [t=+002,595 us] [tid=0xb3d031e0] ti.sdo.ce.VISA: [+5] VISA_exit(visa=0xb3d02ce8): algHandle = 0xb3d02d20
[DSP] [t=+000,068 us] [tid=0xb3d031e0] ti.sdo.ce.alg.Algorithm: [+E] Algorithm_deactivate> Enter(alg=0xb3d02d20)
[DSP] [t=+000,056 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+E] DSKT2_deactivateAlg> Enter (scratchId=2, algHandle=0xb3d031c8)
[DSP] [t=+000,060 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+4] DSKT2_deactivateAlg> Lazy deactivate of algorithm 0xb3d031c8
[DSP] [t=+000,063 us] [tid=0xb3d031e0] ti.sdo.fc.dskt2: [+X] DSKT2_deactivateAlg> Exit
[DSP] [t=+000,041 us] [tid=0xb3d031e0] ti.sdo.ce.alg.Algorithm: [+X] Algorithm_deactivate> Exit
[DSP] [t=+000,046 us] [tid=0xb3d031e0] ti.sdo.ce.universal.UNIVERSAL: [+X] UNIVERSAL_process> Exit (handle=0xb3d02ce8, retVal=0x0)
[DSP] [t=+000,064 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+E] Memory_cacheWb> Enter(addr=0x96c00000, sizeInBytes=345600)
[DSP] [t=+000,268 us] [tid=0xb3d031e0] ti.sdo.ce.osal.Memory: [+X] Memory_cacheWb> return
[DSP] [t=+000,044 us] [tid=0xb3d031e0] ti.sdo.ce.node: [+5] NODE> returned from call(algHandle=0xb3d02ce8, msg=0x9f703e00); messageId=0x00010004


Regards,

Hideki

  • Note that the CE_DEBUG output you're showing is coming from the DSP side (notice the [DSP] prefix).

    The 'async' behavior is entirely on the Linux side.  The difference between UNIVERSAL_process() and UNIVERSAL_processAsync()/UNIVERSAL_processWait() is this:

    • UNIVERSAL_process() sends the processing job to the DSP and then waits for a reply before returning.
    • UNIVERSAL_processAsync() sends the processing job to the DSP and then returns to the user.  At some point later, the user will call UNIVERSAL_processWait() to wait for the reply.

    Your DSP-side algorithm doesn't know (nor care!) whether it's being called from a Linux-side UNIVERSAL_process() or UNIVERSAL_processAsync() call.  It just knows it received a job and has work to do.  So on the DSP-side, you won't see UNIVERSAL_processAsync() anywhere in your trace.  You will (and do in your trace output) see the DSP-side UNIVERSAL_process() call.
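The relationship described above can be sketched with a small mock. This is only an illustration under stated assumptions: `MockJob`, `mock_dsp_process()`, `mock_processAsync()`, `mock_processWait()` and `mock_process()` are hypothetical names, a POSIX thread stands in for the DSP, and none of this is Codec Engine code. The point is that the synchronous call is just "send, then wait", and the worker function is identical in both cases.

```c
#include <pthread.h>

/* Hypothetical stand-in for one job sent to the DSP (not a Codec Engine type). */
typedef struct {
    pthread_t worker;   /* plays the role of the DSP server thread */
    int input;
    int result;
} MockJob;

/* "DSP side": it only sees a job and cannot tell how the job was submitted. */
static void *mock_dsp_process(void *arg)
{
    MockJob *job = arg;
    job->result = job->input * 2;   /* placeholder for the real algorithm */
    return NULL;
}

/* Analogous to UNIVERSAL_processAsync(): send the job, return immediately. */
static int mock_processAsync(MockJob *job, int input)
{
    job->input = input;
    job->result = -1;
    return pthread_create(&job->worker, NULL, mock_dsp_process, job);
}

/* Analogous to UNIVERSAL_processWait(): block until the reply is available. */
static int mock_processWait(MockJob *job)
{
    return pthread_join(job->worker, NULL);
}

/* Analogous to UNIVERSAL_process(): send the job and wait for the reply. */
static int mock_process(MockJob *job, int input)
{
    int status = mock_processAsync(job, input);
    if (status != 0) {
        return status;
    }
    return mock_processWait(job);
}
```

In this mock, a caller would invoke `mock_processAsync(&job, 5)`, do other work on the Linux side, and later call `mock_processWait(&job)` before reading `job.result`. Either way, `mock_dsp_process()` runs exactly the same code, which is why a DSP-side trace would look the same for both submission styles.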

    Chris

  • Hi Chris,

    Thank you for your reply.

    So whether I call UNIVERSAL_process() or UNIVERSAL_processAsync(), the DSP side prints the same UNIVERSAL_process trace to the console.

    Even when UNIVERSAL_processAsync() is used, is it possible that the call does not actually run asynchronously?
    The call seems to return only after the DSP processing is complete.

    I think so because the processing time I measured was the same in both modes.

    Regards,

    Hideki

  • Hi,

    I ran the benchmark test in the "c6accel" sample application.

    The published results are:

    Benchmarks using C6Accel Asynchronous call
    C6accel_DSP_fft16x16(),592 ,C6accel_DSP_fft16x16() wait for callback,7442
    C6accel_IMG_sobel_3x3_8(),502 ,C6accel_IMG_sobel_3x3_8() wait for callback,2574
    C6accel_IMG_sobel_3x3_16(),847 ,C6accel_IMG_sobel_3x3_16() wait for callback,4726

    When I run it in my environment, I get:

    Benchmarks using C6Accel Asynchronous call
    C6accel_DSP_fft16x16(),6749 ,C6accel_DSP_fft16x16() wait for callback,162
    C6accel_IMG_sobel_3x3_8(),2552 ,C6accel_IMG_sobel_3x3_8() wait for callback,155
    C6accel_IMG_sobel_3x3_16(),4669 ,C6accel_IMG_sobel_3x3_16() wait for callback,143

    It is not running in asynchronous mode.
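To make the comparison between the two sets of numbers concrete, here is a minimal sketch that computes what share of the total time (call plus wait) is spent inside the initial call. The helpers `call_share_percent()` and `looks_synchronous()` are hypothetical and not part of C6Accel, and the 50% threshold is an arbitrary assumption chosen only for illustration.

```c
/* Hypothetical helper, not part of C6Accel: percentage of the total time
 * (call + wait) that was spent inside the initial call. Returns -1 if the
 * total is not positive. Both arguments use the same unit as the benchmark. */
static int call_share_percent(long call_time, long wait_time)
{
    long total = call_time + wait_time;
    if (total <= 0) {
        return -1;
    }
    return (int)((100 * call_time) / total);
}

/* Heuristic (an assumption, not a C6Accel rule): if more than half of the
 * time is spent in the call itself, the call probably blocked until the DSP
 * had finished, i.e. it behaved like a synchronous call. */
static int looks_synchronous(long call_time, long wait_time)
{
    return call_share_percent(call_time, wait_time) > 50;
}
```

With the C6accel_DSP_fft16x16() figures above, the published result gives `call_share_percent(592, 7442)` = 7, while the result from my environment gives `call_share_percent(6749, 162)` = 97, so nearly all of the time is spent in the call itself rather than in the wait.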

    What's wrong?

    Has anybody else seen similar symptoms?

    EZSDK v5.04.00.11

    c6accel v2.01.00.11

    CPU dm8168

    Hideki

  • Hi,

    If you look at the following Codec Engine source code, UNIVERSAL_processAsync() is present,
    and it logs "UNIVERSAL_processAsync" on entry and exit.
    I believe that if I call UNIVERSAL_processAsync() from the Linux application, this log should be output. Is that correct?

    Yet even when I call UNIVERSAL_processAsync(), the DSP outputs "[DSP] .... UNIVERSAL_process>", as if UNIVERSAL_process() had been called.

    Why?

    codec_engine_3_22_01_06

    <codec_engine>/packages/ti/sdo/ce/universal/universal_stub.c

    XDAS_Int32 UNIVERSAL_processAsync(UNIVERSAL_Handle handle,
        XDM1_BufDesc *inBufs, XDM1_BufDesc *outBufs, XDM1_BufDesc *inOutBufs,
        IUNIVERSAL_InArgs *inArgs, IUNIVERSAL_OutArgs *outArgs)
    {
        XDAS_Int32 retVal = UNIVERSAL_EFAIL;

        Bool checked = VISA_isChecked();

        Log_print5(Diags_ENTRY, "[+E] UNIVERSAL_processAsync> "
                "Enter (handle=0x%x, inBufs=0x%x, outBufs=0x%x, inArgs=0x%x, "
                "outArgs=0x%x)",
                (IArg)handle, (IArg)inBufs, (IArg)outBufs, (IArg)inArgs,
                (IArg)outArgs);

        if (handle) {
            IUNIVERSAL_Handle alg = VISA_getAlgHandle((VISA_Handle)handle);

            if (alg != NULL) {
                if (checked) {

                    /* validate inArgs and outArgs */
                    XdmUtils_validateExtendedStruct(inArgs, sizeof(inArgs),
                            "inArgs");
                    XdmUtils_validateExtendedStruct(outArgs, sizeof(outArgs),
                            "outArgs");

                    /* Validate inBufs and outBufs. */
                    XdmUtils_validateSparseBufDesc1(inBufs, "inBufs");
                    XdmUtils_validateSparseBufDesc1(outBufs, "outBufs");

                    memset((void *)((XDAS_Int32)(outArgs) + sizeof(outArgs->size)),
                            0, (sizeof(*outArgs) - sizeof(outArgs->size)));
                }

                retVal = processAsync(alg, inBufs, outBufs, inOutBufs, inArgs,
                        outArgs);
            }
        }

        Log_print2(Diags_EXIT, "[+X] UNIVERSAL_processAsync> "
                "Exit (handle=0x%x, retVal=0x%x)", (IArg)handle, (IArg)retVal);

        return (retVal);
    }

    ====  Linux application source code   =======

    int C6accel_IMG_sobel_3x3_8
    (   C6accel_Handle hC6accel,
        const unsigned char *restrict in,   /* Input image data */
        unsigned char *restrict out,        /* Output image data */
        short cols, short rows              /* Image dimensions */
    )
    {
        XDM1_BufDesc inBufDesc;
        XDM1_BufDesc outBufDesc;
        XDAS_Int32 InArg_Buf_size;
        IC6Accel_InArgs *CInArgs;
        UNIVERSAL_OutArgs uniOutArgs;
        int status;
        /* Define pointer to function parameter structure */
        IMG_sobel_3x3_8_Params *fp0;
        XDAS_Int8 *pAlloc;

        ACQUIRE_CODEC_ENGINE;

        /* Allocate the InArgs structure as it varies in size
           (needs to be changed every time we make an API call) */
        InArg_Buf_size = sizeof(Fxn_struct) +
                         sizeof(IMG_sobel_3x3_8_Params) +
                         sizeof(CInArgs->size) +
                         sizeof(CInArgs->Num_fxns);

        /* Request contiguous heap memory allocation for the extended input structure */
        pAlloc = (XDAS_Int8 *)Memory_alloc(InArg_Buf_size, &wrapperMemParams);
        CInArgs = (IC6Accel_InArgs *)pAlloc;

        /* Initialize .size fields for dummy input and output arguments */
        uniOutArgs.size = sizeof(uniOutArgs);

        /* Set up buffers to pass buffers in and out to alg */
        inBufDesc.numBufs = 1;
        outBufDesc.numBufs = 1;

        /* Fill in input/output buffer descriptor parameters and manage ARM cache */
        /* See wrapper_c6accel_i.h for more details of operation */
        CACHE_WB_INV_INPUT_BUFFERS_AND_SETUP_FOR_C6ACCEL(in, 0, cols * rows * sizeof(char));
        CACHE_INV_OUTPUT_BUFFERS_AND_SETUP_FOR_C6ACCEL(out, 0, cols * rows * sizeof(char));

        /* Initialize the extended InArgs structure */
        CInArgs->Num_fxns = 1;
        CInArgs->size = InArg_Buf_size;

        /* Set function Id and parameter pointers for first function call */
        CInArgs->fxn[0].FxnID = IMG_SOBEL_3X3_8_FXN_ID;
        CInArgs->fxn[0].Param_ptr_offset = sizeof(CInArgs->size) + sizeof(CInArgs->Num_fxns) + sizeof(Fxn_struct);

        /* Initialize pointers to function parameters */
        fp0 = (IMG_sobel_3x3_8_Params *)((XDAS_Int8 *)CInArgs + CInArgs->fxn[0].Param_ptr_offset);

        /* Fill in the fields in the parameter structure */
        fp0->indata_InArrID1 = INBUF0;
        fp0->outdata_OutArrID1 = OUTBUF0;
        fp0->Col = cols;
        fp0->Row = rows;

        /* Call the actual algorithm */
        if (hC6accel->callType == ASYNC) {
            /* Update async structure */
            if (c6accelAsyncParams.asyncCallCount != 0) {
                status = UNIVERSAL_EFAIL;
                /* The pending count is passed explicitly; the %d previously had no argument */
                printf("Async call failed as %d are still pending\n",
                       (int)c6accelAsyncParams.asyncCallCount);
            }
            else {
                /* Context Saving */
                c6accelAsyncParams.asyncCallCount++;
                memcpy(&(c6accelAsyncParams.inBufs), &inBufDesc, sizeof(XDM1_BufDesc));
                memcpy(&(c6accelAsyncParams.outBufs), &outBufDesc, sizeof(XDM1_BufDesc));
                memcpy(&(c6accelAsyncParams.inArgs), CInArgs, sizeof(UNIVERSAL_InArgs));
                memcpy(&(c6accelAsyncParams.outArgs), &uniOutArgs, sizeof(UNIVERSAL_OutArgs));
                c6accelAsyncParams.pBuf = pAlloc;
                c6accelAsyncParams.pBufSize = InArg_Buf_size;
                /* Asynchronous Call to the actual algorithm */
                printf("call UNIVERSAL_processAsync()\n");
                status = UNIVERSAL_processAsync(hC6accel->hUni, &inBufDesc, &outBufDesc, NULL, (UNIVERSAL_InArgs *)CInArgs, &uniOutArgs);
            }
        }
        else {
            /* Synchronous Call to the actual algorithm */
            printf("call UNIVERSAL_process()\n");
            status = UNIVERSAL_process(hC6accel->hUni, &inBufDesc, &outBufDesc, NULL, (UNIVERSAL_InArgs *)CInArgs, &uniOutArgs);

            /* Free the InArgs structure */
            Memory_free(pAlloc, InArg_Buf_size, &wrapperMemParams);
        }

        RELEASE_CODEC_ENGINE;

        return status;
    }

    Regards,

    Hideki