This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

Multi-buffering questions

Anonymous
Anonymous

Normal 0 false false false EN-US ZH-CN X-NONE

Hi All,

 

I would like to ask a question on multi-buffering:

  1. When is it needed?
  2. How many buffers are needed?

 

First, the processing time (interrupt service routine execution time) needed for each frame varies between applications, and might also be different between frames for a program. For convenience, let’s assume it to be a constant number (time period).

 

With this convenient assumption, we discuss three cases:

 

Case 1: if this number is very small, quantitatively, smaller than the interval between an EAV (end of frame) and the next SAV (start of frame), for example, only takes half of this interval, then the processing of the last frame can be finished before the next frame starts. In this case, no buffering is needed.

 

Case 2: let this number be a bit larger, quantitatively, larger than the EAV-to-SAV interval but smaller than one frame time, say, takes only 0.8 frame. Then two buffering is enough. This limit ( t < one frame time) is very important. With this satisfied, we can run the program for an infinitely long video stream (frame) input without any backlog and hazardous effect.

 

Case 3: Let this number exceed the time for one frame. How many buffers do we then need?

 

First, it is evident that in this case we can only process a finite amount of frames. If we let the program run infinitely long, since one frame of input cannot be processed in one frame time, DDR2 memory will be filled and “explode”.

 

Therefore we can, at each time, process only a finite number of frames. When the number of frames dealt with each time is fixed, how many buffers are needed?

 

As an abstract discussion of principle, let’s assume there is only a single core in the processor so no parallelism exists.

 

For each frame, let’s assume the time needed for its processing is P frames, and there are in total N frames to be processed. In the following table which is an illustration, we let P = 3.



Normal 0 false false false EN-US ZH-CN X-NONE

The leftmost column is the time axis based on frame count. Within each line, every cell represents the address needed to store a single frame in the memory.

 

Yellow background marks the entry of a new frame into the memory.

 

Red and green means the current frame is being processed, but green marks the last “frame count” for its processing. Because we have made P=3, each frame takes three frames’ time to process.

 

The number in each cell comprises two parts. The part in normal size is its index, corresponding to its number in the frame sequence. The subscript denotes, after the latest frame count (it can either have just been processed or not), how many frames’ processing time it still needs to for the processing to finish. Every unprocessed frame has subscript 3. For example, there is 13 at the left corner, and after one frame time’s processing, it become 12, next 11, and eventually 10, which we call 1finish. After processing has finished on frame 1, its space in the memory is now freed and the new frame 5 will be stored in its position.

 

This table describes the procedure of execution for a total of 14 (15) frames’ periods. It is largely self-evident.

 

An argument I would like to make is that for a sequence of N frames, each frame requiring P frames’ time to process, the buffering needed is approximately N × (P-1) / P. N is usually not very small, so the fraction part can be ignored for this approximation.

 

This is because, after every new P frames come in, the processing on one frame has finished and one frame space is freed. The “net” space occupied by this new P frames is therefore P-1 frames, and this is how the formula  N × (P-1) / P comes from.

 

When P=1, which corresponds to case 2 of the forgoing classification, N × (1-1) / 1 =0 still makes sense. Zero basically means that no buffering is needed, and has we have seen in case 2, N can be infinite (N ) but still just one frame buffering space is needed for alternating double buffering (Ping-Pong).

 

When



Normal 0 false false false EN-US ZH-CN X-NONE

So this is a tentative quantitative discussion on the number of buffers needed. It is merely a simplified study on the principle which could be different from the practice.

 

Regarding the logic, is it correct? I welcome any one’s opinion on this.

 

 


Sincerely,

Zheng

  • Zheng,

    In its simplest form you have the following...

    1 buffer will cause 'tearing', which is where the buffer reading process and the buffer writing process 'overlap' each other.

    2 buffers can eliminate tearing, but can cause excessive 'stutter', which is where either the reading process has to wait for the writing process (hence repeat a frame) or the writing process has to wait for the reading process (hence skipping a frame). Excessive stutter may be caused though where the two processes 'fight' against each other.

    3 buffers will always allow no tearing and minimum stutter since there is always a buffer of 'slack' between the read and write processes. Buffer management still needs to be handled correctly, but it is simply a case of either 2) above (If one process 'catches up' with the other process then use the same buffer again) but the extra buffer slack means that the two processes are never fighting against each other.

    This triple buffering works for all scenarios where the read and write frame rates are not exactly the same, even cases where the rates are very different.

    In this discussion the writing and reading processes can be anything. For example the writing could be from a camera and the reading could be the display, or the writing could be from the video decoder streaming video over Ethernet and the reading could be the display, or the writing could be a camera and the reading could be a video encoder etc....

    BR,

    Steve

  • Anonymous
    0 Anonymous in reply to Steve Clynes

    Normal 0 false false false EN-US ZH-CN X-NONE

    Normal 0 false false false EN-US ZH-CN X-NONE

    Dear Steve,

     

    I need to study each of your points. But,

     

    "This triple buffering works for all scenarios where the read and write frame rates are not exactly the same, even cases where the rates are very different."

     

    How do you define "triple buffering"? If each frame input data takes 10 frames' time to process, doesn't 10 frame input require 10 x 9/10 = 90 frames' intermediate storage? I don't think there is any way to squeeze it anymore.

     

     

    Zheng

  • Zheng,

    A buffer is a complete entity of data to be processed.

    Now, if you need to process every single input frame (or buffer) and are able to effectively control the output rate such that the net input frame rate and output frame rate are identical, then this is a different analysis. If you can control either the input or the output rate then you only need 2 buffers since you can then make sure that the input and output are aligned hence don't fight each other.

    I was assuming that the source (input) frame rate and output frame rate were completely independent, hence asynchronous to each other, and that skipping and/or repeating is OK.

    If the input and output frame rates are asynchronous and un-locked, and you want to process all frames, then you would need an infinite number of buffers if you do not do skip and repeat. If you do skip and repeat, then you need 3 buffers.

    What exactly is the problem you are trying to solve?

    BR,

    Steve