dm6437 YUV422 to RGB888, problem

Other Parts Discussed in Thread: TVP5150

 

Hello, everyone,

I am now doing image processing using the DM6437.

I stored one frame in DDR and want to convert it from YUV (4:2:2) to RGB888, but when the image is displayed on the screen, it looks like this:

The code I used is as follows:

Int16 MedianFilter_test( )
{
    int height = 480;
    int width = 720;

    tvp5150_init();
    vpfe_init( 0x82000000, 720, 480);
    _wait(3000000);
    VPFE_CCDC_PCR=0x0000000;

    for(i = 0; i < height; i++)
    {
        k = width*i*2;
        l = width*i*2;

        for(j = 0; j < width/2; j++)
        {
            cb = *((Uint8*)0x82000000 + 2 + 4*j + k);
            y0 = *((Uint8*)0x82000000 + 1 + 4*j + k);
            y1 = *((Uint8*)0x82000000 + 3 + 4*j + k);
            cr = *((Uint8*)0x82000000 +  4*j + k);

            r0 = (1.00000 * y0) + (0.00000 * cb) + (1.40200 * cr);
            g0 = (1.00000 * y0) - (0.34414 * cb) - (0.71444 * cr);
            b0 = (1.00000 * y0) + (1.72200 * cb) + (0.00000 * cr);

            r1 = (1.00000 * y1) + (0.00000 * cb) + (1.40200 * cr);
            g1 = (1.00000 * y1) - (0.34414 * cb) - (0.71444 * cr);
            b1 = (1.00000 * y1) + (1.72200 * cb) + (0.00000 * cr);

            if(r0 > 255) r0 = 255;
            else if(r0 < 0) r0 = 0;
            if(g0 > 255) g0 = 255;
            else if(g0 < 0) g0 = 0;
            if(b0 > 255) b0 = 255;
            else if(b0 < 0) b0 = 0;
            if(r1 > 255) r1 = 255;
            else if(r1 < 0) r1 = 0;
            if(g1 > 255) g1 = 255;
            else if(g1 < 0) g1 = 0;
            if(b1 > 255) b1 = 255;
            else if(b1 < 0) b1 = 0;

            *((Uint8*)0x83000000 + 6*j + l) = r0;
            *((Uint8*)0x83000000 + 1 + 6*j + l) = g0;
            *((Uint8*)0x83000000 + 2 + 6*j + l) = b0;
            *((Uint8*)0x83000000 + 3 + 6*j + l) = r1;
            *((Uint8*)0x83000000 + 4 + 6*j + l) = g1;
            *((Uint8*)0x83000000 + 5 + 6*j + l) = b1;
        }
    }

    vpbe_init( 0x83000000, 720, 480,0);   // Setup Back-End

    return 0;
}

 

Could anyone help me?

Thank you very much!

 

Bing Lee

  • It seems that you have made a mistake in the color channels. YUV422 has many storage formats, such as planar, semi-planar, interleaved, and so on.

    Why don't you use the VLIB library to do such things?

  • I expect you need to subtract the offset from Cb and Cr.

    Instead of cb and cr, please use (cb-128) and (cr-128).

    Also, to check the order of the data, you can set the coefficients for Cb and Cr to 0 and check whether you get a proper grey image. You also need to check whether Y0 and Y1 are swapped. You can do a similar test to check whether Cb and Cr are swapped.
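A minimal sketch of the two checks suggested here, assuming full-range (JFIF-style) coefficients like the ones in the question. The type `Rgb888`, the helper names, and the `chroma_gain` parameter are hypothetical and introduced only for illustration. Setting `chroma_gain` to 0 gives the grey test; setting it to 1 gives the normal conversion with 128 subtracted from Cb and Cr.

```c
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb888;   /* hypothetical helper type */

/* Round a computed channel value and clamp it to 0..255. */
static uint8_t clamp_to_u8(double v)
{
    if (v <= 0.0)   return 0;
    if (v >= 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

/* Convert one pixel. chroma_gain = 1.0 is the normal conversion;
 * chroma_gain = 0.0 zeroes the Cb/Cr terms, so a correct byte order
 * should yield a clean grey image (R = G = B = Y). */
static Rgb888 ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr, double chroma_gain)
{
    /* Cb and Cr are stored with an offset of 128, so remove it first. */
    double u = chroma_gain * ((double)cb - 128.0);
    double v = chroma_gain * ((double)cr - 128.0);
    Rgb888 out;

    out.r = clamp_to_u8((double)y + 1.402 * v);
    out.g = clamp_to_u8((double)y - 0.344136 * u - 0.714136 * v);
    out.b = clamp_to_u8((double)y + 1.772 * u);
    return out;
}
```

If the grey image looks right but the colors do not, swapping the `cb` and `cr` arguments is a quick way to test the chroma order. Separately, the loop in the question advances the output by `width*2` bytes per row while writing 3 bytes per pixel, so the output rows would overlap; a row pitch of `width*3` would be expected for packed RGB888.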

     

    regards,

    Sang-Yong 

     

  • Dear all:

     Most image-processing algorithms work on RGB888, so the first step is a format conversion from YCbCr422 to RGB888. Why is our conversion so slow? It takes almost 1 second per frame, which rules out any real-time application. We tried the optimization levels in the build options, but the improvement was limited. Does anyone have suggestions for improving the efficiency? Our demo code is below.

     

    void ycbcr2rgb(Uint8* src, Int32 width, Int32 height, Uint8* des)

    {

            Int32 byte_count_line_rgb= width*3;

            Int32 byte_count_line_yuv=width*2;

            Int32 i,j,k;

            Uint8 temp[4];

            float p1 , p2 , p3;

     

            for(j=0 ; j <height ; j++) //convert from DDR2 buffer_in to buffer_out

            {

              k=0;

              for(i=0; i< byte_count_line_yuv ; i=i+4 )//1 line

              {

                temp[0] = *( Uint8*)(src+j*byte_count_line_yuv+i); //cb0

                temp[1] = *( Uint8*)(src+j*byte_count_line_yuv+i+1); //y0

                temp[2] = *( Uint8*)(src+j*byte_count_line_yuv+i+2); //cr0

                temp[3] = *( Uint8*)(src+j*byte_count_line_yuv+i+3); //y1

                    p1=(temp[1]-16)*1.164+(temp[2]-128)*1.596; //b0

                    p2=(temp[1]-16)*1.164-(temp[2]-128)*0.813-(temp[0]-128)*0.392; //g0

                    p3=(temp[1]-16)*1.164+(temp[0]-128)*2.017;  //r0

                    *(Uint8*)(des+j*byte_count_line_rgb+k)=(Uint8)p1 ;//b0;

                    *(Uint8*)(des+j*byte_count_line_rgb+k+1)=(Uint8)p2 ;//g0;

                    *(Uint8*)(des+j*byte_count_line_rgb+k+2)=(Uint8)p3 ;//r0;

     

                    p1=(temp[3]-16)*1.164+(temp[2]-128)*1.596; //b1

                    p2=(temp[3]-16)*1.164-(temp[2]-128)*0.813-(temp[0]-128)*0.392; //g1

                    p3=(temp[3]-16)*1.164+(temp[0]-128)*2.017;  //r1

                    *(Uint8*)(des+j*byte_count_line_rgb+k)=(Uint8)p1 ;//b1;

                    *(Uint8*)(des+j*byte_count_line_rgb+k+1)=(Uint8)p2 ;//g1;

                    *(Uint8*)(des+j*byte_count_line_rgb+k+2)=(Uint8)p3 ;//r1;

                    k=k+6;

              }

            }

     

    }

     

    Best regards,

    Alan

     

  • You may need to optimize the code further. For example, you can approximate the floating-point operations by pre-defining integer coefficients such as (int)(1.164 x 256 + 0.5) and using >>8 in the calculation. However, I guess the main bottleneck is memory read/write.

    You need to use DMA if you are not using it already. I hope other experts chime in if you don't know how to use DMA.
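The integer approximation described above could look roughly like this. It is a sketch, not tested on the DM6437: it assumes the Cb Y0 Cr Y1 byte order and the video-range coefficients from the demo code, writes R, G, B in that order, and uses hypothetical names throughout. Each coefficient is scaled by 256 and rounded, and the result is scaled back with >>8 after clamping.

```c
#include <stdint.h>

/* Coefficients scaled by 256 and rounded: (int)(c * 256 + 0.5). */
enum {
    C_Y  = 298,   /* 1.164 */
    C_RV = 409,   /* 1.596 */
    C_GV = 208,   /* 0.813 */
    C_GU = 100,   /* 0.392 */
    C_BU = 516    /* 2.017 */
};

/* v holds a channel value multiplied by 256. Clamping before the
 * shift avoids right-shifting a negative number. */
static uint8_t scaled_to_u8(int32_t v)
{
    if (v < 0)         return 0;
    if (v > 255 * 256) return 255;
    return (uint8_t)(v >> 8);
}

/* Convert one Cb Y0 Cr Y1 group (4 bytes) to R G B R G B (6 bytes). */
static void ycbcr422_pair_to_rgb888(const uint8_t in[4], uint8_t out[6])
{
    int32_t u = (int32_t)in[0] - 128;
    int32_t v = (int32_t)in[2] - 128;

    /* Chroma terms are shared by both pixels; +128 rounds the >>8. */
    int32_t r_c = C_RV * v + 128;
    int32_t g_c = -C_GV * v - C_GU * u + 128;
    int32_t b_c = C_BU * u + 128;
    int k;

    for (k = 0; k < 2; k++) {
        int32_t y = C_Y * ((int32_t)in[1 + 2 * k] - 16);
        out[3 * k + 0] = scaled_to_u8(y + r_c);
        out[3 * k + 1] = scaled_to_u8(y + g_c);
        out[3 * k + 2] = scaled_to_u8(y + b_c);
    }
}
```

With 8 fractional bits every intermediate value stays far below the 32-bit limit, so no overflow handling is needed beyond the final clamp.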

     

    regards,

    Sang-Yong

  • Dear Sang-Yong:

     To summarize what you mentioned above, there are three ways to reduce the processing time:

    1.          Use pre-defined coefficients

    2.          Use the >> operator

    3.          Use DMA

    As we understand it, items 1 and 2 reduce the computation time. But how does DMA relieve the bottleneck on memory read/write? We find DMA complicated and hard to use flexibly. Could you explain it in more detail?

    We are starting to doubt whether the DM6437 can be used for real-time image processing. Do you have any suggestions?

     

    Best regards,

    Alan

  • Alan,

    I am not an expert on DMA use.

    Could you post a new question about DMA, so that it can be answered by a different owner?

     

    regards,

    Sang-Yong

     

  • Dear Sang-Yong:

     Thanks, anyway.

     

    Best regards,

    Alan

  • Alan,

    Here's a possible hand-optimized version of your code. The main objective is to avoid doing the same computation more than once. The compiler is probably doing the same. You could probably trade off accuracy for speed by using fixed-point math. It depends on your processor.

    void ycbcr2rgb
    ( const Uint8 *src,   /* Input in YCbCr format. No alignment limitation. */
      int          width, /* Width in pixels */
      int          height,/* Height in pixels */
      Uint8       *des    /* Output in RGB888 format. No alignment limitation. */
    )
    {
      int   i;
      int   j;

      int   icb;
      int   iy0;
      int   icr;
      int   iy1;

      float fcb;
      float fy0;
      float fcr;
      float fy1;

      float frc;
      float fgc;
      float fbc;

      float fr;
      float fg;
      float fb;

      width /= 2; /* Width now means count of two pixels. */

      for(j=0; j <height; j++)
      {
         for(i=0; i< width; i++) //1 line
         {
           /* Read out two pixels cb,y0,cr,y1 */
           icb = *src++;
           iy0 = *src++;
           icr = *src++;
           iy1 = *src++;

           /* Offset values */
           icb -= 128;
           iy0 -= 16;
           icr -= 128;
           iy1 -= 16;

           /* Convert to float */
           fcb = (float)icb;
           fy0 = (float)iy0;
           fcr = (float)icr;
           fy1 = (float)iy1;

           /* Scale the Y values */
           fy0 *= 1.164F;
           fy1 *= 1.164F;

           /* Calc chroma values for RGB  */
           fbc = fcr*1.596F;
           fgc = fcr*0.813F + fcb*0.392F;
           frc = fcb*2.017F;

           /* Calc first RGB pixel */
           fb = fy0 + fbc;
           fg = fy0 - fgc;
           fr = fy0 + frc;

           /* Store first RGB pixel. */
          *des++ = (Uint8)fb;
          *des++ = (Uint8)fg;
          *des++ = (Uint8)fr;

           /* Calc second RGB pixel */
           fb = fy1 + fbc;
           fg = fy1 - fgc;
           fr = fy1 + frc;

           /* Store second RGB pixel */
          *des++ = (Uint8)fb;
          *des++ = (Uint8)fg;
          *des++ = (Uint8)fr;
        }
      }
    }

    A note about your code: the second RGB value overwrites the first. The indices should be 3, 4, 5, e.g.

                    *(Uint8*)(des+j*byte_count_line_rgb+k+3)=(Uint8)p1 ;//b1;
                    *(Uint8*)(des+j*byte_count_line_rgb+k+4)=(Uint8)p2 ;//g1;
                    *(Uint8*)(des+j*byte_count_line_rgb+k+5)=(Uint8)p3 ;//r1;

    No guarantees it will work.

     

  • Dear Norman:

     Thanks for your suggestion. I'll try it. By the way, do you have any thoughts on using the DM6437 for image processing? For example, is it suitable for image recognition tasks such as LDW and FCW?

     

     You are right about the error in the code; I made a mistake while copying it. Thanks.

     

    Best regards,

    Alan

  • Dear Norman:

     After trying it, we found that the improvement is limited. As Sang-Yong said, the bottleneck is probably memory read/write, so maybe we should focus on DMA. However, we appreciate you chiming in, thanks!

     

    Best regards,

    Alan

  • Sorry, I don't know anything about the DM6437 or image processing. I am not quite sure how DMA can help with the calculation part. DMA should help with moving data from memory to a peripheral. As far as I can tell, the DM6437 does not have a floating-point unit. Software floating-point operations are very expensive. Fixed-point code is a bit tricky. Get it wrong and you'll get weird pictures. Here's some totally untested code to illustrate fixed-point math.

    /* Fixed point 32 bit = 16 bits whole + 16 bits fraction*/
    void ycbcr2rgb
    ( const Uint8 *src,   /* Input in YCbCr format. No alignment limitation. */
      int          width, /* Width in pixels */
      int          height,/* Height in pixels */
      Uint8       *des    /* Output in RGB888 format. No alignment limitation. */
    )
    {
      int   i;
      int   j;

      Int32 icb;
      Int32 iy0;
      Int32 icr;
      Int32 iy1;

      Int32 irc;
      Int32 igc;
      Int32 ibc;

      Int32 ir;
      Int32 ig;
      Int32 ib;

      const Int32 k1_164 = 0x000129FB; /* 1.164 */
      const Int32 k1_596 = 0x00019892; /* 1.596 */
      const Int32 k0_813 = 0x0000D01F; /* 0.813 */
      const Int32 k0_392 = 0x00006459; /* 0.392 */
      const Int32 k2_017 = 0x0002045A; /* 2.017 */
      const Int32 k128   = 0x00800000; /* 128 */
      const Int32 k16    = 0x00100000; /* 16 */

      width /= 2; /* Width now means count of two pixels. */

      for(j=0; j <height; j++)
      {
         for(i=0; i< width; i++) //1 line
         {
           /* Read out two pixels cb,y0,cr,y1 */
           icb = *src++;
           iy0 = *src++;
           icr = *src++;
           iy1 = *src++;

           /* Convert from int to fixed-point */
           icb <<= 16;
           iy0 <<= 16;
           icr <<= 16;
           iy1 <<= 16;

           /* Offset values */
           icb -= k128;
           iy0 -= k16;
           icr -= k128;
           iy1 -= k16;

           /* Scale the Y values */
           iy0 *= k1_164;
           iy1 *= k1_164;

           /* Calc chroma values for RGB  */
           ibc =  (icr*k1_596)>>16;
           igc = ((icr*k0_813)>>16) + ((icb*k0_392)>>16);
           irc =  (icb*k2_017)>>16;

           /* Calc first RGB pixel */
           ib = iy0 + ibc;
           ig = iy0 - igc;
           ir = iy0 + irc;

           /* Convert from fixed point to int */
           ib >>= 16;
           ig >>= 16;
           ir >>= 16;

           /* Bound to range 0-255 */
           if(ib < 0) ib = 0; else if(ib > 255) ib = 255;
           if(ig < 0) ig = 0; else if(ig > 255) ig = 255;
           if(ir < 0) ir = 0; else if(ir > 255) ir = 255;

           /* Store first RGB pixel. */
          *des++ = (Uint8)ib;
          *des++ = (Uint8)ig;
          *des++ = (Uint8)ir;

           /* Calc second RGB pixel */
           ib = iy1 + ibc;
           ig = iy1 - igc;
           ir = iy1 + irc;

           /* Convert to int */
           ib >>= 16;
           ig >>= 16;
           ir >>= 16;

           /* Bound to range 0-255 */
           if(ib < 0) ib = 0; else if(ib > 255) ib = 255;
           if(ig < 0) ig = 0; else if(ig > 255) ig = 255;
           if(ir < 0) ir = 0; else if(ir > 255) ir = 255;

           /* Store second RGB pixel */
          *des++ = (Uint8)ib;
          *des++ = (Uint8)ig;
          *des++ = (Uint8)ir;
        }
      }
    }

  • Dear Norman:

     One important point you mentioned is fixed-point processing. The other is the large amount of image data, which has to be moved from DDR2 to internal memory for processing, with the result then moved back to DDR2. So we want to know how to use DMA to accelerate the format conversion.
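The data flow described here (DDR2 to on-chip memory, process, back to DDR2) can be sketched one line at a time as below. This is only an illustration under stated assumptions: `memcpy` stands in for a DMA transfer, the scratch buffers are ordinary static arrays (placing them in on-chip RAM is toolchain-specific and not shown), and all names are hypothetical. A real DMA version would typically use two buffer pairs so that transfers overlap with computation; with `memcpy` nothing overlaps.

```c
#include <stdint.h>
#include <string.h>

#define MAX_LINE_PIXELS 720

/* Converts one line of packed 4:2:2 input to packed RGB888 output. */
typedef void (*LineConverter)(const uint8_t *ycbcr, int width, uint8_t *rgb);

/* Trivial example converter: copies luma into R, G and B.
 * Assumes Cb Y0 Cr Y1 byte order, so Y sits at odd offsets. */
static void grey_line(const uint8_t *ycbcr, int width, uint8_t *rgb)
{
    int x;
    for (x = 0; x < width; x++) {
        uint8_t y = ycbcr[2 * x + 1];
        rgb[3 * x + 0] = y;
        rgb[3 * x + 1] = y;
        rgb[3 * x + 2] = y;
    }
}

/* Stage each line through small scratch buffers.
 * Returns 0 on success, -1 if the dimensions are not supported. */
static int convert_frame_via_scratch(const uint8_t *src_ddr, uint8_t *dst_ddr,
                                     int width, int height,
                                     LineConverter convert_line)
{
    static uint8_t in_line[MAX_LINE_PIXELS * 2];   /* 2 bytes per pixel */
    static uint8_t out_line[MAX_LINE_PIXELS * 3];  /* 3 bytes per pixel */
    size_t in_stride, out_stride;
    int row;

    if (width <= 0 || width > MAX_LINE_PIXELS || height <= 0)
        return -1;

    in_stride  = (size_t)width * 2;
    out_stride = (size_t)width * 3;

    for (row = 0; row < height; row++) {
        /* Placeholder for "DMA in": DDR2 -> scratch. */
        memcpy(in_line, src_ddr + (size_t)row * in_stride, in_stride);
        convert_line(in_line, width, out_line);
        /* Placeholder for "DMA out": scratch -> DDR2. */
        memcpy(dst_ddr + (size_t)row * out_stride, out_line, out_stride);
    }
    return 0;
}
```

The point of the structure is that the inner conversion loop only touches small buffers, which is what makes it possible to hand the bulk transfers to a DMA engine later.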

     By the way, the last code you provided does not display correctly; maybe the calculated result is wrong. We are not familiar with fixed-point math, so we do not know what is wrong with the code. Do you have any comments?

    Best regards,

    Alan

     

  • DSP architecture is still new to me. On the ARM side, the data would remain in DDR2.

    Fixed-point math is tricky because of possible overflow or underflow in the calculations. I arbitrarily chose a 16.16 format for the example. I've found that 22.10 works for YCbCr to RGB conversion on past projects. But it depends on the calculations and the range of the numbers involved. My YCbCr format, RGB format and conversion were different from yours, so I can't say exactly what the correct precision is. You could try 20.12 to see if it makes any difference. Change all the shifts from 16 to 12. Shift all the constants right by 4 bits.
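To illustrate the overflow concern: in the 16.16 example, a luma value such as (235 - 16) << 16 multiplied by the 16.16 constant for 1.164 does not fit in 32 bits. The sketch below checks that with a 64-bit product, then shows one possible variant with 10 fractional bits in which only the coefficients are scaled and the samples stay plain integers. It is an untested-on-target illustration using the coefficients from this thread; all names are hypothetical, and the output order R, G, B is an assumption.

```c
#include <stdint.h>

/* Returns 1 if a * b does not fit in a signed 32-bit integer. */
static int product_overflows_int32(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}

enum { FRAC_BITS = 10 };

/* Scale a coefficient to FRAC_BITS fractional bits, with rounding. */
#define QCONST(x) ((int32_t)((x) * (1 << FRAC_BITS) + 0.5))

/* q is a channel value scaled by 2^FRAC_BITS; clamp, then unscale. */
static uint8_t q_to_u8(int32_t q)
{
    if (q < 0)                  return 0;
    if (q > (255 << FRAC_BITS)) return 255;
    return (uint8_t)(q >> FRAC_BITS);
}

/* Convert one pixel. Samples stay plain integers, so the largest
 * product is roughly 240 * 2065, far below the 32-bit limit. */
static void ycbcr_to_rgb_q10(uint8_t y, uint8_t cb, uint8_t cr, uint8_t rgb[3])
{
    const int32_t half = 1 << (FRAC_BITS - 1);       /* rounding term */
    int32_t yy = QCONST(1.164) * ((int32_t)y - 16) + half;
    int32_t u  = (int32_t)cb - 128;
    int32_t v  = (int32_t)cr - 128;

    rgb[0] = q_to_u8(yy + QCONST(1.596) * v);                     /* R */
    rgb[1] = q_to_u8(yy - QCONST(0.813) * v - QCONST(0.392) * u); /* G */
    rgb[2] = q_to_u8(yy + QCONST(2.017) * u);                     /* B */
}
```

Keeping the samples unscaled is a design choice: the product of an 8-bit sample and a 10-bit-fraction coefficient needs only about 20 bits, which leaves plenty of headroom for the additions.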

    The need for real-time processing sounds like you are streaming from a sensor to a display. Maybe reduce the image size to a minimum. Decimate your image before the conversion. Sometimes the sensor or LCD has both YCbCr and RGB options. I've seen some hardware where the sensor can be connected directly to the LCD.
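Decimating before the conversion could be as simple as the sketch below, which halves both dimensions of a packed 4:2:2 frame by dropping every other line and every other pixel. It assumes the Cb Y0 Cr Y1 byte order, does no filtering (so some aliasing is expected), and uses a hypothetical function name. The conversion then has only a quarter of the pixels to process.

```c
#include <stddef.h>
#include <stdint.h>

/* Halve the width and height of a packed 4:2:2 frame (Cb Y0 Cr Y1).
 * width must be a multiple of 4 and height must be even.
 * dst must hold (width / 2) * (height / 2) * 2 bytes.
 * Returns 0 on success, -1 if the dimensions are not supported. */
static int decimate_422_by_2(const uint8_t *src, int width, int height,
                             uint8_t *dst)
{
    int row, x;

    if (width <= 0 || height <= 0 || (width % 4) != 0 || (height % 2) != 0)
        return -1;

    for (row = 0; row < height; row += 2) {          /* keep even rows */
        const uint8_t *in = src + (size_t)row * (size_t)width * 2;

        /* Each 8-byte group holds 4 pixels: Cb Y0 Cr Y1 Cb' Y2 Cr' Y3.
         * Keep Y0 and Y2 with the chroma of the first pair. */
        for (x = 0; x < width; x += 4) {
            *dst++ = in[0];   /* Cb */
            *dst++ = in[1];   /* Y0 */
            *dst++ = in[2];   /* Cr */
            *dst++ = in[5];   /* Y2 */
            in += 8;
        }
    }
    return 0;
}
```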

  • Dear Norman:

     Thanks for your good suggestions; we will keep them in mind.

    Best regards,

    Alan