GLESv2 Texture streaming + rotation

Other Parts Discussed in Thread: DM3730

Hi,

I am using a board based on the DM3730.

Kernel : 2.6.37

Graphics SDK : 4.06.00.01

X server : 1.9.0

I patched your bc-cat-0.2.0 example code to display four CMEM buffers (1280x720 each) in fullscreen using the GL extension "GL_IMG_texture_stream2".

My screen size is 1280x720 in RGB24.

In addition, I call glFinish() before eglSwapBuffers(), with FlushBehaviour=2 set in /etc/powervr.ini, to flush the commands completely.

Without rotation, the GL commands take 16 or 17 ms.

But with VRFB rotation (90° or 270°), the GL commands take 38 or 39 ms.

Can you explain this difference?
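
For reference, a per-frame timing of this kind can be taken with a small helper along the following lines. This is only a minimal sketch, not the code from the test application: `elapsed_ms` and `time_frame_ms` are hypothetical names, and the callback is assumed to wrap the draw calls followed by glFinish() and eglSwapBuffers().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict C modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Milliseconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec) * 1000.0
         + (double)(end->tv_nsec - start->tv_nsec) / 1.0e6;
}

/* Hypothetical helper: runs frame_fn(ctx) once and returns its duration in ms.
 * In the application, frame_fn is assumed to issue the GL draw calls and then
 * call glFinish() and eglSwapBuffers(). */
static double time_frame_ms(void (*frame_fn)(void *), void *ctx)
{
    struct timespec t0, t1;

    assert(frame_fn != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    frame_fn(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsed_ms(&t0, &t1);
}
```

CLOCK_MONOTONIC is used so that wall-clock adjustments do not distort the measurement. Averaging over many frames gives a steadier figure than a single sample.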

Regards,

Yoann

  • How are you using VRFB under X? Are you also seeing increased CPU load when rotation is enabled?

  • I load the kernel module omapfb.ko with the parameters "vrfb=y rotate=1".

    Then I load the modules drm, pvrsrvkm and bufferclass_ti.

    Finally, I launch the X server.

    So the framebuffer is rotated:

    ~ # fbset

    mode "1280x720-60"
            # D: 72.432 MHz, H: 44.656 kHz, V: 60.265 Hz
            geometry 1280 720 1280 6480 32
            timings 13806 190 120 13 3 32 5
            accel false
            rgba 8/16,8/8,8/0,0/0
    endmode

    / # fbset

    mode "720x1280-52"
            # D: 72.432 MHz, H: 68.204 kHz, V: 52.424 Hz
            geometry 720 1280 720 1280 32
            timings 13806 190 120 13 3 32 5
            accel false
            rgba 8/16,8/8,8/0,0/0
    endmode

    I can launch a binary that displays the CMEM buffers with GLESv2.

    CPU load without rotation: binary 5% and X 3%

    CPU load with 90° rotation: binary 2% and X 2%

    Also, when I initialize the data in the CMEM buffers, I observe a visual problem with rotation: some "stairs" appear.

  • One possibility is that the larger VRFB stride introduces memory and bandwidth latencies during the transfer.
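
To put rough numbers on the stride difference, the sketch below compares how much of each line's stride actually carries pixels. It assumes the 8192-byte VRFB stride and the 2560-byte linear stride that appear in the omaplfb logs further down this thread for the 640x480, 32 bpp case. The function names are hypothetical, and the result is only an address-span comparison, not a measurement of bandwidth or latency.

```c
#include <assert.h>
#include <stdint.h>

/* Address range one frame spans: stride (bytes per line) times line count. */
static uint64_t frame_span_bytes(uint32_t stride_bytes, uint32_t height_lines)
{
    return (uint64_t)stride_bytes * height_lines;
}

/* Percentage (rounded down) of each line's stride that holds visible pixels. */
static unsigned stride_utilization_pct(uint32_t width_px,
                                       uint32_t bytes_per_pixel,
                                       uint32_t stride_bytes)
{
    uint64_t visible = (uint64_t)width_px * bytes_per_pixel;

    assert(stride_bytes > 0 && visible <= stride_bytes);
    return (unsigned)(visible * 100u / stride_bytes);
}
```

For the linear 640x480 buffer, `frame_span_bytes(2560, 480)` is 1228800 bytes and every byte of each line is in use. For the rotated 480x640 buffer, `frame_span_bytes(8192, 640)` is 5242880 bytes, with about 23% of each 8192-byte line holding pixels.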

  • How can I verify this possibility?

    And how can I reduce the latencies?

  • Without rotation, I get no errors.

    But with 90° rotation, I get these errors:

    (EE) pvr(0): [dri] PVRDRI2CreateFlipChain: Couldn't create flip chain
    (EE) pvr(0): [dri] PVRDRI2AssignAndExportBuffers: Couldn't create flip chain

    In addition, when I set the screen size to 640x480 in RGB24, I observe an odd result.

    The GL commands take 16 ms without rotation and 12 ms with rotation, and I observe "stairs" when rotated.

    Without VRFB, the GL commands seem to be synchronized with the vertical sync.

    But with VRFB, something is broken or bypassed.

  • I installed the Graphics SDK in debug mode, and I observe that when VRFB is enabled the maximum number of swap chain buffers is 1.

    How can I enable VRFB and still keep the swap chain buffer mechanism, so that I have vertical synchronization?

    Regards,

  • Hi,

    Can you please share your bootargs?

    Also, what are your observations for 180 degrees? Is it the same?

    We tried running unit test app at 90 degree rotation (set through bootargs) and do not see the error message you have mentioned.

    Logs below-

    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x# cat /proc/cmdline
    console=ttyO0,115200n8 noinitrd rw ip=dhcp root=/dev/nfs nfsroot=172.24.132.46:/home1/prathap/nfs/targetNFS45,nolock mem=128M vram=32M omapfb.rotate=1 omapfb.vrfb=y omapfb.debug=y omapfb.vram=0:24M,1:0M,2:0M
    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x#
    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x#

    //The unit test app below shows 2 triangles flipping rapidly. The display looks fine & we do not see any errors.
    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x# ./xgles1test1  -f 3000
    --------------------- started ---------------------
    (II) pvr(0): [dri] Drawable 0x342610 - Creating buffer (att 1, 240 x 240, f 0) at 0x3429e0
    (II) pvr(0): [dri] Drawable 0x342610 - Creating buffer (att 0, 240 x 240, f 0) at 0x342cd0
    (II) pvr(0): [dri] Drawable 0x342610 - Destroying buffer (att 1, 240 x 240) at 0x3429e0
    (II) pvr(0): [dri] Drawable 0x342610 - Destroying buffer (att 0, 240 x 240) at 0x342cd0
    --------------------- finished ---------------------
    (II) Keyboard: Close
    (II) Main Touch Screen: Close
    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x# (II) pvr(0): [DRI2] Setup complete
    (II) pvr(0): [DRI2]   DRI driver: pvr
    (II) EXA(0): Driver allocated offscreen pixmaps
    (II) EXA(0): Driver registered support for the following operations:
    (II)         Solid
    (II)         Copy
    (II)         Composite (RENDER acceleration)
    (II)         UploadToScreen
    (==) pvr(0): DPMS enabled
    (==) pvr(0): Direct rendering enabled
    (EE) pvr(0): PVRDisplayCommandNoArgs: drmCommandWrite failed (-22)
    (EE) pvr(0): PVRDisplayScreenInitFinalize: PVRDisplayCommandNoArgs failed (-22)
    (==) RandR enabled
    (EE) AIGLX error: dlopen of /usr/local/XSGX/lib/dri/pvr_dri.so failed (/usr/local/XSGX/lib/dri/pvr_dri.so: cannot open shared object file: No such file or directory)
    (EE) AIGLX: reverting to software rendering
    (II) AIGLX: Screen 0 is not DRI capable
    (II) AIGLX: Loaded and initialized /usr/local/XSGX/lib/dri/swrast_dri.so
    (II) GLX: Initialized DRISWRAST GL provider for screen 0
    (**) Keyboard: always reports core events
    (**) Keyboard: Device: "/dev/input/event0"
    (II) Keyboard: Found keys
    (II) Keyboard: Configuring as keyboard
    (**) Main Touch Screen: always reports core events
    (**) Main Touch Screen: Device: "/dev/input/event1"
    (II) Main Touch Screen: Found absolute axes
    (II) Main Touch Screen: Found x and y absolute axes
    (II) Main Touch Screen: Found absolute touchscreen
    (II) Main Touch Screen: Configuring as touchscreen
    (**) Main Touch Screen: YAxisMapping: buttons 4 and 5
    (**) Main Touch Screen: EmulateWheelButton: 4, EmulateWheelInertia: 10, EmulateWheelTimeout: 200
    (II) Main Touch Screen: initialized for absolute axes.

    root@am37x-evm:/opt/gfxlibraries/gfx_rel_es3.x#

  • Hi,

    ~ # cat /proc/cmdline
    console=ttyO0,115200n8 consoleblank=0 mpurate=auto mem=414M@0x80000000 root=/dev/mmcblk1p3 rw rootfstype=ext4 rootwait omapfb.vram=0:32M omapfb.mode=dvi:1280x720MR-32@60 omapfb.vrfb=y omapfb.rotate=1

    When I launch your test "./xgles1test1  -f 3000", I get no errors. However, I can see that the demo is slower than without rotation.

    My test uses "texture streaming" and your test doesn't use this extension.

    You can find my test in this zip file :

    0083.bc-cat-0.2.0-fullscreen.zip

  • Hi,

    We measured the performance of the xgles1test1 unit test app with and without rotation. We did not see any performance drop.

    We will run the tests with texture streaming as well, to see whether the issue is specific to texture streaming with rotation enabled.

    Do you see the issue only at 90 degrees, or at 180 as well?

    Meanwhile, can you also try the following with rotation enabled in bootargs (omapfb.rotate=1 for 90 degrees):

    /etc/init.d/rc.pvr stop

    root@am37x-evm:~# fbset -xres 480 -yres 640 -vxres 480 -vyres 640
    root@am37x-evm:~# fbset

    mode "480x640"
        geometry 480 640 480 640 16
        timings 0 0 0 0 0 0 0
        rgba 5/11,6/5,5/0,0/0
    endmode

    /etc/init.d/rc.pvr start

    Now run your test app & see if you still see the error.

    Thanks,

    Prathap.

  • I see the issue at both 90 and 180 degrees.

    fbset is correct.

  • Hi,

    Thanks. We need your input on the further tests below:

    1) Increase the vram size in bootargs, i.e. omapfb.vram=0:48M, and let us know if you still see the flip chain creation failure.

    2) Edit xorg.conf and set the option "FlipChain" to "false" in the Device section. Do this before starting the X server, then check the result.

    3) Set the width and height explicitly in your app. For example, in the unit test app you can pass command line arguments like ./xgles1test1 -w 480 -h 640.

    4) Try a non-Xorg build and let us know your observations/results.
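
For step 2, the change would look roughly like the fragment below. This is a sketch only: the Identifier value is a placeholder and the Driver name "pvr" is inferred from the `pvr(0)` prefix in the X log, so keep whatever your existing Device section already contains and add just the Option line.

```
Section "Device"
    Identifier "Video Device"          # placeholder: keep your existing value
    Driver     "pvr"                   # assumed from the pvr(0) log prefix
    Option     "FlipChain" "false"
EndSection
```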

    Thanks,

    Prathap.

  • Hi,

    1) It does not change anything. I still see the flip chain creation failure.

    2) I no longer see the flip chain creation failure, but I still have the same problem.

    3) I don't understand, because I don't see the problem with your test.

    4) We work on a system with Xorg, and we don't want to remove it.

    Did you test with my app?


    Regards,

    Yoann

  • Hi Yoann,

    Thanks. I want to rule out any setup/environment issues here.

    I want to match our environments as closely as possible. I am running the tests on the default LCD. Can you also run your app on the default LCD?

    Please also use exactly the same bootargs as I am using below.

    Can you enable the debug build and get the number of swap chain buffers? It should not be 1.

    For example, without VRFB, with 8MB of vram on my EVM:

    setenv bootargs 'console=ttyO0,115200n8 noinitrd rw ip=dhcp root=/dev/nfs nfsroot=<hostip>:/home1/prathap/nfs/targetNFS45,nolock mem=128M vram=8M omapfb.vram=0:8M'

    I get this output when inserting the gfx drivers:

    [  228.609039]  omaplfb: Device 0: Framebuffer virtual address: 0xc9000000
    [  228.609069] omaplfb: Device 0: Framebuffer size: 8388608
    [  228.609069] omaplfb: Device 0: Framebuffer virtual width: 480
    [  228.609100] omaplfb: Device 0: Framebuffer virtual height: 1920
    [  228.609100] omaplfb: Device 0: Framebuffer width: 480
    [  228.609100] omaplfb: Device 0: Framebuffer height: 640
    [  228.609130] omaplfb: Device 0: Framebuffer stride: 960
    [  228.609130] omaplfb: Device 0: LCM of stride and page size: 61440
    [  228.663330] omaplfb: Device 0: Maximum number of swap chain buffers: 13
    [  228.670379] omaplfb: Device 0: PVR Device ID: 1

    With VRFB, 8MB VRAM setting:

    console=ttyO0,115200n8 noinitrd rw ip=dhcp root=/dev/nfs nfsroot=<hostip>:/home1/prathap/nfs/targetNFS45,nolock mem=128M vram=8M omapfb.rotate=2 omapfb.vrfb=y omapfb.vram=0:8M

    [  287.812316] omaplfb: Device 0: Framebuffer physical address: 0x70000000
    [  287.812347] omaplfb: Device 0: Framebuffer virtual address: 0xd1800000
    [  287.812377] omaplfb: Device 0: Framebuffer size: 7864320
    [  287.812377] omaplfb: Device 0: Framebuffer virtual width: 480
    [  287.812408] omaplfb: Device 0: Framebuffer virtual height: 1920
    [  287.812408] omaplfb: Device 0: Framebuffer width: 480
    [  287.812408] omaplfb: Device 0: Framebuffer height: 640
    [  287.812438] omaplfb: Device 0: Framebuffer stride: 4096
    [  287.812438] omaplfb: Device 0: LCM of stride and page size: 4096
    [  287.866607] omaplfb: Device 0: Maximum number of swap chain buffers: 3
    [  287.873565] omaplfb: Device 0: PVR Device ID: 1

    Can you match the above outputs, then run your app on the default LCD (480x640) and let us know your observations on the debug prints and the application behaviour with and without VRFB?

    Also capture the fbset output before and after running your app in both scenarios, i.e. with and without VRFB. Let me know if you see a difference in vyres between these outputs.
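
If it is easier to capture these values from inside the application than with fbset, the standard Linux framebuffer ioctls return the same geometry. The following is a minimal sketch, assuming the display is exposed as /dev/fb0; `read_fb_info` and `print_fb_info` are hypothetical helper names. Note that `smem_len` is the size userspace sees and may not be identical to the framebuffer size the omaplfb debug prints report.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fb.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Fills var/fix from the framebuffer device at dev_path.
 * Returns 0 on success, -1 if the device cannot be opened or queried. */
static int read_fb_info(const char *dev_path,
                        struct fb_var_screeninfo *var,
                        struct fb_fix_screeninfo *fix)
{
    int fd;
    int rc = 0;

    assert(dev_path != NULL && var != NULL && fix != NULL);
    fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (ioctl(fd, FBIOGET_VSCREENINFO, var) < 0 ||
        ioctl(fd, FBIOGET_FSCREENINFO, fix) < 0)
        rc = -1;
    close(fd);
    return rc;
}

/* Prints the fields relevant to this thread: geometry, stride and size. */
static void print_fb_info(const struct fb_var_screeninfo *var,
                          const struct fb_fix_screeninfo *fix)
{
    printf("xres=%u yres=%u vxres=%u vyres=%u bpp=%u\n",
           (unsigned)var->xres, (unsigned)var->yres,
           (unsigned)var->xres_virtual, (unsigned)var->yres_virtual,
           (unsigned)var->bits_per_pixel);
    printf("line_length=%u smem_len=%u\n",
           (unsigned)fix->line_length, (unsigned)fix->smem_len);
}
```

Calling `read_fb_info("/dev/fb0", &var, &fix)` followed by `print_fb_info(&var, &fix)` before and after the test run makes any change in vyres or line_length visible in the application log.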

    Thanks,

    Prathap.

  • Hi Prathap,


    Without VRFB, with 32MB of vram on my DM3730 board:

    ~ # cat /proc/cmdline
    console=ttyO0,115200n8 consoleblank=0 mpurate=auto mem=414M@0x80000000 root=/dev/mmcblk1p3 rw rootfstype=ext4 rootwait omapfb.vram=0:32M omapfb.mode=dvi:640x480MR-32@60

    omaplfb: Device 0: Framebuffer physical address: 0x97e00000
    omaplfb: Device 0: Framebuffer virtual address: 0xdb000000
    omaplfb: Device 0: Framebuffer size: 33554432
    omaplfb: Device 0: Framebuffer virtual width: 640
    omaplfb: Device 0: Framebuffer virtual height: 480
    omaplfb: Device 0: Framebuffer width: 640
    omaplfb: Device 0: Framebuffer height: 480
    omaplfb: Device 0: Framebuffer stride: 2560
    omaplfb: Device 0: LCM of stride and page size: 20480
    omaplfb: Device 0: Maximum number of swap chain buffers: 27
    omaplfb: Device 0: PVR Device ID: 1

    / # fbset
    mode "640x480-60"
            # D: 25.198 MHz, H: 31.498 kHz, V: 59.996 Hz
            geometry 640 480 640 480 32
            timings 39685 48 16 33 10 96 2
            accel false
            rgba 8/16,8/8,8/0,0/0
    endmode

    With VRFB, 32MB VRAM setting:

    ~ # cat /proc/cmdline
    console=ttyO0,115200n8 consoleblank=0 mpurate=auto mem=414M@0x80000000 root=/dev/mmcblk1p3 rw rootfstype=ext4 rootwait omapfb.vram=0:32M omapfb.mode=dvi:640x480MR-32@60 omapfb.vrfb=y omapfb.rotate=1

    omaplfb: Device 0: Framebuffer physical address: 0x73000000
    omaplfb: Device 0: Framebuffer virtual address: 0xdb000000
    omaplfb: Device 0: Framebuffer size: 5242880
    omaplfb: Device 0: Framebuffer virtual width: 480
    omaplfb: Device 0: Framebuffer virtual height: 640
    omaplfb: Device 0: Framebuffer width: 480
    omaplfb: Device 0: Framebuffer height: 640
    omaplfb: Device 0: Framebuffer stride: 8192
    omaplfb: Device 0: LCM of stride and page size: 8192
    omaplfb: Device 0: Maximum number of swap chain buffers: 1
    omaplfb: Device 0: PVR Device ID: 1

    / # fbset
    mode "480x640-57"
            # D: 25.198 MHz, H: 39.373 kHz, V: 57.478 Hz
            geometry 480 640 480 640 32
            timings 39685 48 16 33 10 96 2
            accel false
            rgba 8/16,8/8,8/0,0/0
    endmode

    So, my maximum number of swap chain buffers is 1.

    Why do you have 3 buffers ?

    Regards,

    Yoann

  • Hi Yoann,

    Thanks for the logs. As you can see from them, with VRFB enabled in bootargs you get a fixed frame buffer size regardless of the vram size you set in bootargs.

    omaplfb: Device 0: Framebuffer size: 5242880

    Without VRFB in bootargs, the frame buffer size follows the vram size passed in bootargs.

    omaplfb: Device 0: Framebuffer size: 33554432

    You can try setting the vram size to different values such as 16 MB or 8 MB. Without VRFB, the printed frame buffer size will match the vram size you passed in bootargs. With VRFB, you will always get a constant value. So this looks like an issue with the kernel frame buffer driver, which reports a constant frame buffer screen size when VRFB is enabled.

    The number of buffers in the gfx driver is calculated as frame buffer size / buffer size (line 1045, also pasted below for your reference).

    See the code in GFX_Linux_KM/services4/3rdparty/dc_omapfb3_linux/omaplfb_displayclass.c:

    psDevInfo->sDisplayInfo.ui32MaxSwapChainBuffers = (IMG_UINT32)(psDevInfo->sFBInfo.ulFBSize / psDevInfo->sFBInfo.ulRoundedBufferSize);

    The frame buffer size is read from the screen_size member of the fb_info structure in the kernel, and screen_size is set according to the vram size configured in bootargs.

    Buffer size = stride * height (where stride = width * bytes per pixel)

    Based on this logic, with VRFB disabled and a 32 MB vram setting:

    Number of buffers = vram size/(stride*height) = 33554432/(2560*480) = ~27 buffers   (here stride = width * bytes per pixel = 640 * 4 = 2560)

    With VRFB enabled (here the frame buffer size and stride are fixed, as you can see from your logs):

    Number of buffers = frame buffer size/(stride*height) = 5242880/(8192*640) = 1 buffer (here stride = width * bytes per pixel = 2048 * 4 = 8192; the VRFB hardware uses a fixed width of 2048 pixels).

    So you are getting only 1 buffer with VRFB enabled.

    I am getting 3 buffers at the same resolution because I am using the default LCD, which is 16 bpp (RGB565): the VRFB stride is then 2048 * 2 = 4096, and 7864320/(4096*640) = 3.
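
The arithmetic above can be checked with a small sketch. It assumes a 4096-byte page and assumes that the driver's rounded buffer size is stride * height rounded up to a multiple of the logged "LCM of stride and page size"; that rounding rule is inferred from the debug prints, not confirmed here. The helper names are hypothetical and are not the driver's own functions.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor (Euclid). */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Least common multiple, widened to avoid overflow. */
static uint64_t lcm_u32(uint32_t a, uint32_t b)
{
    return (uint64_t)a / gcd_u32(a, b) * b;
}

/* Assumed rule: stride * height rounded up to a multiple of
 * LCM(stride, page size). */
static uint64_t rounded_buffer_size(uint32_t stride, uint32_t height,
                                    uint32_t page_size)
{
    uint64_t raw = (uint64_t)stride * height;
    uint64_t unit = lcm_u32(stride, page_size);

    return (raw + unit - 1) / unit * unit;
}

/* Mirrors ulFBSize / ulRoundedBufferSize from the driver line quoted above. */
static uint32_t max_swap_chain_buffers(uint64_t fb_size, uint32_t stride,
                                       uint32_t height, uint32_t page_size)
{
    assert(stride > 0 && height > 0 && page_size > 0);
    return (uint32_t)(fb_size / rounded_buffer_size(stride, height, page_size));
}
```

With the values from the logs in this thread, `max_swap_chain_buffers(33554432, 2560, 480, 4096)` gives 27 and `max_swap_chain_buffers(5242880, 8192, 640, 4096)` gives 1 on the DM3730 board, while the 16 bpp EVM cases give 13 for `(8388608, 960, 640, 4096)` and 3 for `(7864320, 4096, 640, 4096)`.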

    You can try setting the resolution to 320x240 so that you get more than 1 buffer.

    We will explain this kernel frame buffer driver limitation in the sgxdbg guide, and also document that vsync mode is supported in the Xorg driver only without VRFB.

    Thanks,

    Prathap.