
Question about video pixel coordinates

Could anybody tell me why a pixel has a width and height?

As far as I know, we usually use "(u,v)" to describe the location of a pixel in the image plane. But I have also read documents that describe the height and width of a pixel.  If a pixel is like a rectangle, I don't know which point "(u,v)" describes: the top-left corner of the rectangle or its center.

In geometry, a 2-D point can't be divided, so I think of a pixel as a point. Why do some documents describe it as a square or rectangle?


I'm sorry, maybe this post is not related to Davinci. I hope someone can help me resolve it.  Could anybody recommend a community for discussing video processing questions (algorithms and theory)?

Thanks.

  • Hi Lorry,

    Unfortunately, people sometimes document terms incorrectly; I have seen documents that use YUV, YCbCr, and YPbPr interchangeably when they are in fact not the same.

    To answer your question, a pixel (picture element) is a point on the screen; visibly this point can be circular or square depending on the display technology (you have to get really close to see it), but it is still a single point on the video screen.  [edit] Pixels are defined in terms of their pixel format (RGB666, RGB888, YCbCr422, YCbCr420, ...).

    I prefer to use (x,y) instead of (u,v); just as in math, these are ways of defining the LOCATION of a pixel (or point).  Using (u,v) may lead some readers to believe it has something to do with the YUV pixel format; this is why (x,y) is preferred.

    Finally, the video resolution (or size of the video window) is defined in terms of pixels; for example, NTSC (720x480) means you have 720 pixels of video data on each line and 480 lines of video data.
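    The addressing implied by "720 pixels per line, 480 lines" can be sketched in a few lines. This is only an illustration, assuming a packed RGB888 frame buffer stored row-major with (0,0) at the top-left; the format and origin are assumptions, not something fixed by the resolution itself.

```python
WIDTH, HEIGHT = 720, 480       # NTSC: pixels per line, lines per frame
BYTES_PER_PIXEL = 3            # assumed packed RGB888: 3 bytes per pixel

def pixel_offset(x, y):
    """Byte offset of pixel (x, y); (0, 0) is the top-left pixel (assumed)."""
    assert 0 <= x < WIDTH and 0 <= y < HEIGHT
    return (y * WIDTH + x) * BYTES_PER_PIXEL

print(pixel_offset(0, 0))      # 0: first byte of the frame
print(pixel_offset(719, 479))  # 1036797: start of the last pixel
```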

    I hope this helps; but please feel free to follow up with any additional questions you may have.


  • Thanks Juan. I want to find a way to understand the "inverse perspective transformation", an algorithm that transforms from the image plane to the ground plane.

    For example:

    In this image, from the observer's view, the two lanes converge at a "vanishing point" on the horizon in the image plane. But in fact these two lanes are parallel in the ground plane.

    So I want to know whether the size of a pixel will affect the transform equation.

    I understand what you said.

    thank you for your help.


  • I do not think that the pixel properties of the output will have a direct effect on the perspective projection rendering itself, though after you have performed the transformation you may have to scale the output to fit whatever display you are working with.

    Note that most digital displays use square pixels, meaning that the proportions of the resolution match the proportions of the physical display. So you would normally scale the entire output image (if you need to scale at all); if you did not have square pixels, you would just have to scale more in one direction than the other.
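    That one-directional scaling can be sketched as follows, assuming the display's pixel aspect ratio (PAR = pixel width / pixel height) is known; the 1.5 value below is made up purely for illustration.

```python
def scale_for_display(src_w, src_h, pixel_aspect_ratio):
    """Resolution to render so a square-pixel source looks undistorted.

    pixel_aspect_ratio = pixel_width / pixel_height (1.0 = square).
    Wider-than-tall pixels (PAR > 1) each cover more physical width,
    so fewer of them are needed across the same picture.
    """
    return round(src_w / pixel_aspect_ratio), src_h

print(scale_for_display(720, 480, 1.0))  # (720, 480): square pixels, no change
print(scale_for_display(720, 480, 1.5))  # (480, 480): scale horizontally only
```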

  • Lorry,

    The pixel format should not affect the transformation equation; you can think of an equation as operating on points (or pixels), and it does not define how those pixels are represented.  However, as far as memory manipulation is concerned (you will likely need to manipulate memory as part of the equation), you will be much better off with 16-bit or even 24-bit pixels as opposed to something that is not byte-aligned (e.g. 12-bit pixels).  Our Davinci VPSS uses DMA transfers to read and write 32 bytes at a time from memory, so 16-bit pixels are optimal for capturing/displaying; these are just system-level things to consider.
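    The byte-alignment point can be illustrated with a toy calculation (the numbers are illustrative, not tied to any particular device): in a packed scanline, 16-bit pixels always start on a byte boundary, while packed 12-bit pixels start mid-byte every other pixel, forcing shift-and-mask access.

```python
def starts_on_byte_boundary(x, bits_per_pixel):
    """True if pixel x of a packed scanline begins on a byte boundary."""
    return (x * bits_per_pixel) % 8 == 0

print([starts_on_byte_boundary(x, 16) for x in range(4)])
# [True, True, True, True]   -- 16-bit: every pixel is byte aligned
print([starts_on_byte_boundary(x, 12) for x in range(4)])
# [True, False, True, False] -- 12-bit: every other pixel is misaligned
```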

  • Thanks for your suggestions.

  • Hi Juan,

        That is to say, in "perspective projection" what I have to consider is the equations x = f(u,v) and y = h(u,v), in which (x,y) represents coordinates in the ground plane and (u,v) represents coordinates in the image plane.

        And I don't need to worry about the width or height of a pixel.

  • Correct; you can think of (x,y) and (u,v) as sets of coordinates defining a 2-D plane; the equations are just a way to convert pixels from one plane to another.

    As Bernie suggested, the shape of the pixels is likely square (common these days) and should not matter.
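    To make x = f(u,v), y = h(u,v) concrete: for a flat ground plane, the image-to-ground mapping is commonly modelled as a planar homography, which operates purely on coordinates and never references pixel width or height. A minimal sketch; the 3x3 matrix would normally come from camera calibration, and the identity used below is only a placeholder.

```python
def ipm(u, v, H):
    """Map image coordinates (u, v) to ground coordinates (x, y)
    through a 3x3 homography H (a projective transform)."""
    xw = H[0][0] * u + H[0][1] * v + H[0][2]
    yw = H[1][0] * u + H[1][1] * v + H[1][2]
    w  = H[2][0] * u + H[2][1] * v + H[2][2]
    return xw / w, yw / w  # the divide by w is the perspective part

# Placeholder homography: image plane and ground plane coincide.
H_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(ipm(10.0, 20.0, H_identity))  # (10.0, 20.0)
```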


  • Assuming that this is the sort of thing you are trying to do, I believe that pixel shape will not matter as long as the source matches the display. If the pixels were too wide, you would just end up with a wider image; as long as the vanishing point was aligned properly, it would not matter whether the road was very wide or very narrow, and different pixel shapes would essentially just give you a wider or narrower road. That said, as long as your output display uses the same pixel shape, you will get the proper image out; if it does not, you can scale before or after your transform and still get the proper image.

  • Thanks Bernie, this IEEE document is very helpful. :)

  • I recommend a document about perspective projection; it is a US patent, Patent No. 5933544. This document uses a new way to describe how to calculate from 2D to 3D. I hope it will help anyone who is interested in this algorithm.

  • Thank you for contributing back to this community. :)