PROCESSOR-SDK-J722S: Debugging MSC lib Function For Host Emulation

Part Number: PROCESSOR-SDK-J722S

Hi, this is Yosep (Joseph) from StradVision.

We are trying to use a function defined in the library libscalar.a.
Locally, it is located at:
  • 92_j722s/vhwa_c_models/vpac3/lib/PC/x86_64/LINUX/release/libscalar.a
We are attempting to use the 'scaler_top_processing' function below
to convert an NV12 image to BGR. It is declared in scalar_core.h:
  • int scaler_top_processing(unsigned short *imgInput[2], unsigned short * imgOutput[SCALER_NUM_PIPES], Scaler_Config *config);

However, judging by the return value of '0' and the output data stored in 'imgOutput',
the function does not seem to behave as expected.

Because the source code is not available, we cannot diagnose the problem further.

I am also sharing the actual values that 'Scaler_Config *config' was set to.


                Scaler_Config settings{};
                settings.G_inWidth[0] = 1920;
                settings.G_inHeight[0] = 1536;
                settings.G_inWidth[1] = 0;
                settings.G_inHeight[1] = 0;
                settings.bitWidth = 0;
                // Assign values to coef_sp[0]
                settings.coef_sp[0][0] = 3432;
                settings.coef_sp[0][1] = 0;
                settings.coef_sp[0][2] = -2799;
                settings.coef_sp[0][3] = 21845;
                settings.coef_sp[0][4] = 3432;

                // Assign values to coef_sp[1]
                settings.coef_sp[1][0] = 0;
                settings.coef_sp[1][1] = 21302;
                settings.coef_sp[1][2] = 0;
                settings.coef_sp[1][3] = -1;
                settings.coef_sp[1][4] = -1;

                uint32_t i;
                for (i = 0; i < 32; i++) {
                    settings.coef_mp[0].matrix[i][0] = 0;
                    settings.coef_mp[0].matrix[i][1] = 0;
                    settings.coef_mp[0].matrix[i][2] = 256;
                    settings.coef_mp[0].matrix[i][3] = 0;
                    settings.coef_mp[0].matrix[i][4] = 0;
                }
                for (i = 0; i < 32; i++) {
                    settings.coef_mp[1].matrix[i][0] = 0;
                    settings.coef_mp[1].matrix[i][1] = 0;
                    settings.coef_mp[1].matrix[i][2] = 0;
                    settings.coef_mp[1].matrix[i][3] = 256;
                    settings.coef_mp[1].matrix[i][4] = 0;
                }
                /* Coefficients for Nearest Neighbor */
                for (i = 0; i < 32; i++) {
                    settings.coef_mp[2].matrix[i][0] = 0;
                    settings.coef_mp[2].matrix[i][1] = 0;
                    settings.coef_mp[2].matrix[i][2] = 256;
                    settings.coef_mp[2].matrix[i][3] = 0;
                    settings.coef_mp[2].matrix[i][4] = 0;
                }
                for (i = 0; i < 32; i++) {
                    settings.coef_mp[3].matrix[i][0] = 0;
                    settings.coef_mp[3].matrix[i][1] = 0;
                    settings.coef_mp[3].matrix[i][2] = 0;
                    settings.coef_mp[3].matrix[i][3] = 256;
                    settings.coef_mp[3].matrix[i][4] = 0;
                }

                settings.cfg_Kernel[0].Sz_height = 5;
                settings.cfg_Kernel[0].Tpad_sz = 2;
                settings.cfg_Kernel[0].Bpad_sz = 2;
                settings.cfg_Kernel[0].Ln_offset = 0;
                settings.cfg_Kernel[1].Sz_height = 0;
                settings.cfg_Kernel[1].Tpad_sz = 0;
                settings.cfg_Kernel[1].Bpad_sz = 0;
                settings.cfg_Kernel[1].Ln_offset = 0;

                settings.unitParams[0].threadMap = 0;
                settings.unitParams[0].coefShift = 8;
                settings.unitParams[0].signedData = 0;
                settings.unitParams[0].x_offset = 0;
                settings.unitParams[0].y_offset = 220;
                settings.unitParams[0].outWidth = 896;
                settings.unitParams[0].outHeight = 512;
                settings.unitParams[0].initPhaseX = 2341;
                settings.unitParams[0].initPhaseY = 2340;
                settings.unitParams[0].hzScale = 8777;
                settings.unitParams[0].vtScale = 8776;
                settings.unitParams[0].satMode = 0;
                settings.unitParams[0].uvMode = 0;
                settings.unitParams[0].sp_vs_coef_src = 0;
                settings.unitParams[0].sp_vs_coef_sel = 0;
                settings.unitParams[0].sp_hs_coef_src = 0;
                settings.unitParams[0].sp_hs_coef_sel = 0;
                settings.unitParams[0].vs_coef_sel = 0;
                settings.unitParams[0].hs_coef_sel = 0;
                settings.unitParams[0].phase_mode = 0;
                settings.unitParams[0].filter_mode = 1;

I would also be happy to share the dump files of the input and output image data,
but it seems that .nv12 and .bgr files cannot be shared
using this editor's drag-and-drop functionality.


Other Environment Details
  1. GPU : NVIDIA GeForce RTX 4080
  2. Driver Version: 570.86.15
  3. CUDA Version: 12.8  
  4. OS : Ubuntu 20.04.6
  • MSC_SCALER_IMG_DUMP.zip

    After zipping the input and output image dump data, I was able to upload it.
    Please refer to the contents of the file.

    Input image size = 1920 * 1536 * 3 / 2
    Output Image size = 896 * 512 * 3
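The two sizes above follow directly from the NV12 and BGR layouts. The sketch below is offered under assumptions rather than as confirmed usage of the library. It computes those sizes and splits a packed 8-bit NV12 buffer into two planes widened to `unsigned short`, since `scaler_top_processing` is declared with `unsigned short *imgInput[2]`. All helper names are hypothetical, and the one-sample-per-`unsigned short` layout is an assumption that this thread does not confirm.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* 8-bit NV12: a full-resolution Y plane followed by one interleaved UV
 * plane with half the number of rows. */
static size_t nv12_luma_bytes(size_t width, size_t height)
{
    return width * height;
}

static size_t nv12_chroma_bytes(size_t width, size_t height)
{
    return width * (height / 2);
}

/* Packed 8-bit BGR: three bytes per pixel. */
static size_t bgr_bytes(size_t width, size_t height)
{
    return width * height * 3;
}

/* ASSUMPTION: each 8-bit sample is stored unchanged in its own
 * unsigned short. The thread does not confirm the expected layout. */
static unsigned short *widen_plane(const uint8_t *src, size_t count)
{
    unsigned short *dst = malloc(count * sizeof *dst);
    if (dst == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i];
    }
    return dst;
}

/* Splits a packed NV12 buffer into widened Y and UV planes.
 * Returns 0 on success, -1 on allocation failure.
 * The caller frees planes[0] and planes[1]. */
static int nv12_to_u16_planes(const uint8_t *nv12, size_t width,
                              size_t height, unsigned short *planes[2])
{
    size_t luma = nv12_luma_bytes(width, height);
    size_t chroma = nv12_chroma_bytes(width, height);

    planes[0] = widen_plane(nv12, luma);
    planes[1] = widen_plane(nv12 + luma, chroma);
    if (planes[0] == NULL || planes[1] == NULL) {
        free(planes[0]);
        free(planes[1]);
        planes[0] = planes[1] = NULL;
        return -1;
    }
    return 0;
}
```

For the frames in this thread, the 1920 x 1536 NV12 input is 4,423,680 bytes and the 896 x 512 BGR output is 1,376,256 bytes.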

  • Hi Yosep, 

    Thanks for the information, I will review this. A couple of questions I have are:

    Is this being tested on the Vision-Apps SDK (Linux+RTOS) or the standard Linux SDK? Which SDK version are you using?

    I currently don't have a J722s board with me so I will try to get one asap.

    Thank you,
    Sarabesh S.

  • Hi Sarabesh,

    If I have understood your question correctly,
    we are actually using neither the VisionApps SDK nor a standard Linux SDK in this specific case.

    We are testing the library on a server system with the environment details provided above.
    One missing piece of information may be the CPU, which is an AMD Ryzen Threadripper PRO.

    What we are trying to do is, on a Linux host machine, import and use the libscalar.a library
    in our own SDK for image conversion.

    I hope this answers your question.
    Sorry for the late reply!

    Thanks,
    Yosep K.

  • Hi Yosep,

    Thanks for clarifying this. I am not sure I am the correct expert to help you here. I will redirect this ticket to the proper assignment. 

    Thank you,
    Sarabesh S.

  • Hi Yosep,

    But the scalar does not support format conversion from NV12 to BGR. It supports only the scaling operation.

    Regards,

    Brijesh

  • Hello Brijesh,
    is there a function that does the NV12 -> BGR conversion then?

    Thanks,
    Joseph.

  • Hi Joseph,

    No, not in the PC emulation mode. 

    In target mode, DSS WB path can support this conversion. 

    Regards,

    Brijesh

  • Per 7/23 weekly, 

    No, not in the PC emulation mode. 

    I will see if there is any sort of alternative to this, talking with Brijesh.

  • Thank you for the follow up.
    That would be great.

    Thank you.
    Yosep K.

  • Yoseop,

    We discussed this internally in the back half of last week, investigating whether the DSS or the GPU has such capabilities. Per that discussion, we confirmed that in PC emulation mode we do not have any library function to do this conversion. Further, in target mode we can use the DSS, but the DSS isn’t available in PC emulation mode.

    We then turned to the GPU. In target mode (meaning running directly on the board), the GPU can support the conversion, but in PC emulation mode, there is nothing in the Imagination toolkit that suggests the GPU is supported.

    Bottom line, it appears that StradVision would have to write a custom implementation to convert this format. YUV-to-RGB conversion is a straightforward 3x3 matrix multiplication, and there are many ways to convert from YUV420 to YUV444. One simple method is chroma replication.

    John 
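The approach John describes (chroma replication followed by a 3x3 matrix) can be sketched as below. This is a minimal illustration and not a TI-provided function. It assumes 8-bit NV12 with U before V in the interleaved plane, and it uses the common integer approximation of the BT.601 limited-range matrix. The thread does not say which matrix or range the target hardware uses, so the output may differ slightly from the board. The names `nv12_to_bgr` and `clamp_q8` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts a fixed-point value with 8 fractional bits to an 8-bit sample,
 * clamping to [0, 255]. */
static uint8_t clamp_q8(int v)
{
    if (v < 0) {
        return 0;
    }
    v >>= 8;
    return (uint8_t)(v > 255 ? 255 : v);
}

/* NV12 -> packed BGR.
 * Chroma replication: every 2x2 block of luma samples reuses the same
 * (U, V) pair, which upsamples YUV420 to YUV444 without interpolation.
 * ASSUMPTION: BT.601 limited-range coefficients, scaled by 256. */
static void nv12_to_bgr(const uint8_t *y_plane, const uint8_t *uv_plane,
                        size_t width, size_t height, uint8_t *bgr)
{
    for (size_t row = 0; row < height; row++) {
        /* One UV row serves two luma rows; its stride equals the width. */
        const uint8_t *uv_row = uv_plane + (row / 2) * width;

        for (size_t col = 0; col < width; col++) {
            int c = (int)y_plane[row * width + col] - 16;
            int d = (int)uv_row[(col / 2) * 2] - 128;     /* U */
            int e = (int)uv_row[(col / 2) * 2 + 1] - 128; /* V */
            uint8_t *px = bgr + (row * width + col) * 3;

            px[0] = clamp_q8(298 * c + 516 * d + 128);           /* B */
            px[1] = clamp_q8(298 * c - 100 * d - 208 * e + 128); /* G */
            px[2] = clamp_q8(298 * c + 409 * e + 128);           /* R */
        }
    }
}
```

With neutral chroma (U = V = 128), Y = 16 maps to black and Y = 235 maps to white. A different matrix (for example BT.709 or full range) only changes the constants.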

  • Hello John,
    I understand that there is no color conversion function that is supported in PC emulation mode.
    Does this mean that it is possible to get some sort of support for image scaling?

    I have tried to "mimic" or implement both Image scaling and color conversion on our Host PC and
    both are causing slight differences in the final converted image.

    If I can get some sort of support for scaling, I think it could be helpful.

    thanks,
    Joseph

  • Yoseop, 

    Please note that the assigned engineer has been out this past week and will be returning next week. 

    John 

  • Hi Joseph,

    MSC scalar is supported in the host emulation mode, but color conversion is not. 

    Regards,

    Brijesh

  • Hello Brijesh,
    thank you for the reply.

    By "MSC scalar is supported in the host emulation mode",
    1. Are you referring to the "scaler_top_processing" function, which is in the main text of this thread?
    MSC Scalar == "scaler_top_processing"?

    2. Or is there a separate function that I can test out?

    If 1 is true, we go back to the beginning and require support on what the proper inputs are.
    If 2 is true, then please shed some light along that path.

    Thank you,
    Joseph K.

  • Hello Joseph,

    scaler_top_processing is the main API for accessing the MSC scalar, but as I mentioned earlier, it can't support format conversion from NV12 to RGB.

    Regards,

    Brijesh

  • Hello Brijesh,
    So I should at least be able to scale the input to the desired dimensions using scaler_top_processing, disregarding format conversion?
    In its current state, scaler_top_processing is not functioning at all.

    Regards,
    Joseph

  • Hi Joseph,

    Yes, this is the correct function. This API takes pointers to the input and output images and a pointer to the configuration, and performs the scaling operation based on them.

    What do you mean by it's not functioning at all? Are you seeing an error or incorrect output?

    Regards,

    Brijesh

  • Hello Brijesh

    Yes, as I have mentioned in the main text of this thread, the function is returning '0', and the output image seems to be invalid.

    I have attached the scaler_config value that I used in the main text, and also in the very first comment of this thread, I have attached the input and output image data.

    Could you take a look at it and see if anything is off? Also, what does a return value of '0' indicate?

    Thank you,
    Joseph

  • Hi Joseph,

    OK, I am requesting a colleague to help here.

    Regards,

    Brijesh

  • Hi Joseph, 

    A return value of 0 indicates that the function executed successfully without any errors.

    I have reviewed your configurations. Could you please explain why the following values were set as they are? I would like to understand the reasoning behind choosing these specific configuration values.

                    // Assign values to coef_sp[0]
                    settings.coef_sp[0][0] = 3432;
                    settings.coef_sp[0][1] = 0;
                    settings.coef_sp[0][2] = -2799;
                    settings.coef_sp[0][3] = 21845;
                    settings.coef_sp[0][4] = 3432;
    
                    // Assign values to coef_sp[1]
                    settings.coef_sp[1][0] = 0;
                    settings.coef_sp[1][1] = 21302;
                    settings.coef_sp[1][2] = 0;
                    settings.coef_sp[1][3] = -1;
                    settings.coef_sp[1][4] = -1;
                    
                    settings.unitParams[0].initPhaseX = 2341;
                    settings.unitParams[0].initPhaseY = 2340;
                    settings.unitParams[0].hzScale = 8777;
                    settings.unitParams[0].vtScale = 8776;
                    settings.unitParams[0].filter_mode = 1;



    Regards,

    Suneetha.

  • Hello Gullipalli,

    In short, these values originate from the TI board.
    Another teammate who specializes in porting our SW onto the TI board
    set a break point at the function call for scaler_top_processing,
    and he gave me the values of the variables that were visible at that exact point in time.

    The only values he set for the function are as seen in the code snippet, plus input image width, height, and in_img_format as NV12.

        // create coeff param
        tivx_vpac_msc_coefficients_t coeff_params;
        // resize : Nearest neighbor
        int32_t coeff_value[TIVX_VPAC_MSC_MAX_MP_COEFF_SET][TIVX_VPAC_MSC_MAX_TAP] = {
            {0, 0, 256, 0, 0},
            {0, 0, 0, 256, 0},
            {0, 0, 256, 0, 0},
            {0, 0, 0, 256, 0},
        };
    
        for (uint32_t set_idx = 0; set_idx < TIVX_VPAC_MSC_MAX_MP_COEFF_SET; set_idx++) {
            for (uint32_t phase_idx = 0; phase_idx < TIVX_VPAC_MSC_32_PHASE_COEFF; phase_idx++) {
                memcpy(&(coeff_params.multi_phase[set_idx][phase_idx * TIVX_VPAC_MSC_MAX_TAP]), &coeff_value[set_idx][0], sizeof(int32_t) * TIVX_VPAC_MSC_MAX_TAP);
            }
        }


    So the majority of the coefficients have been somehow calculated or otherwise originated from the TI board.

    Thanks,
    Joseph.
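The scale and phase values in the configuration are not arbitrary. They are numerically consistent with ratios expressed in units of 1/4096 and an initial phase that centers the first output sample. This is an inference from the arithmetic only. Neither the thread nor the header quoted here confirms the formulas, and the helper names below are hypothetical. The vertical source height of 1097 rows is also an inferred value (it reproduces `vtScale = 8776` for a 512-row output) and does not appear anywhere in the posted configuration.

```c
#include <stdint.h>

/* ASSUMPTION: scale = round(4096 * in / out). */
static uint32_t msc_scale_q12(uint32_t in_size, uint32_t out_size)
{
    return (uint32_t)(4096.0 * (double)in_size / (double)out_size + 0.5);
}

/* ASSUMPTION: initial phase = round((in / out * 0.5 - 0.5) * 4096),
 * which is only meaningful here for downscaling (in >= out). */
static int32_t msc_init_phase_q12(uint32_t in_size, uint32_t out_size)
{
    double ratio = (double)in_size / (double)out_size;
    return (int32_t)((ratio * 0.5 - 0.5) * 4096.0 + 0.5);
}
```

Under these assumptions, `msc_scale_q12(1920, 896)` gives 8777 and `msc_init_phase_q12(1920, 896)` gives 2341, matching `hzScale` and `initPhaseX`. Likewise `msc_scale_q12(1097, 512)` gives 8776 and `msc_init_phase_q12(1097, 512)` gives 2340, matching `vtScale` and `initPhaseY`. If the inference holds, the board run scaled a 1920 x 1097 region starting at `y_offset = 220` rather than the full 1536-row frame.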

  • Hi Joseph.

    Thanks for sharing the details. Just to clarify: the function scaler_top_processing is part of the host emulation flow and normally does not require the TI board to run. From your note, it seems that your teammate set a breakpoint inside scaler_top_processing during execution on the TI board and captured the variable values at that point.

    Could you please confirm whether these values were taken during board execution or while running in host emulation mode on the PC? This distinction will help us guide you better.

    Regards,

    Suneetha.

  • Hello Gullipalli,

    I have checked with my colleague, and the values were captured during board execution.

    Thanks,
    Joseph.

  • Hi, Joseph, 

    Thank you for your confirmation. Please refer to the following example for the correct configuration to scale down an NV12 image.
    https://git.ti.com/cgit/processor-sdk/imaging/tree/kernels/hwa/test/test_vpac_msc_scale_multi_output.c?h=main#n587

    Regards,

    Suneetha.