TDA2SX: What is the GPU performance, such as bandwidth and rendering capability?

Part Number: TDA2SX


Hi team,

Could you kindly give some info about the GPU performance? The customer's app breaks down when OpenGL renders eight images to the framebuffer. Each image is a .jpg file with a resolution of 1280x720 and a size of about 450 KB, and OpenGL renders each image into an icon-sized area (150 pixels wide, 100 pixels high).

Almost every time, the app breaks down as soon as it calls glDrawArrays to draw an image. The render rate is about 25 fps. Since the amount of data is not very large, the customer thinks the OpenGL workload is not heavy. So the question is why the app breaks down, and what the limits of GPU performance and resources are.

BTW, these are the steps when OpenGL renders the screen:
1. Load 8 pictures from the TF card mounted on the board: load one picture, render one icon, and repeat this procedure.
2. The JPG image is not resized before OpenGL renders it; the resizing is done by OpenGL.
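For scale: the 450 KB file size is the compressed size, and a JPEG has to be decoded before it can be uploaded as a texture. A rough sketch of the arithmetic follows, assuming the decoder outputs 4 bytes per pixel (RGBA8888) with no mipmaps, and that the load step runs on every rendered frame. The thread confirms neither assumption, and the function names are hypothetical.

```cpp
#include <cstdint>

// Size in bytes of one decoded, uncompressed image.
// ASSUMPTION: 4 bytes per pixel (RGBA8888); the thread does not state the format.
constexpr std::uint64_t decoded_bytes(std::uint32_t width, std::uint32_t height,
                                      std::uint32_t bytes_per_pixel = 4) {
    return static_cast<std::uint64_t>(width) * height * bytes_per_pixel;
}

// Bytes decoded and uploaded per second if every frame reloads every image.
// ASSUMPTION: the load-and-render loop runs once per frame.
constexpr std::uint64_t reload_bytes_per_second(std::uint64_t bytes_per_image,
                                                std::uint32_t image_count,
                                                std::uint32_t frames_per_second) {
    return bytes_per_image * image_count * frames_per_second;
}

// Values from the post: 1280x720 sources, 150x100 icons, 8 images, about 25 fps.
constexpr std::uint64_t kSourceBytes = decoded_bytes(1280, 720);  // 3,686,400
constexpr std::uint64_t kIconBytes   = decoded_bytes(150, 100);   // 60,000
constexpr std::uint64_t kPerSecond   =
    reload_bytes_per_second(kSourceBytes, 8, 25);                 // 737,280,000
```

Under these assumptions each source image occupies about 3.5 MiB once decoded, roughly 61 times the 60,000 bytes a 150x100 icon would need, so pre-scaling the icons offline or loading each one only once would cut the memory traffic considerably.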

Could you help check this case? Thanks.

Best Regards,

Cherry

  • Hi,

    May I know if there are any updates?

    Thanks and Best Regards,

    Cherry

  • Cherry,

    Sorry, the issue description is not very clear.

    What exactly do you mean by the app breaking down once OpenGL calls glDrawArrays?

    Is there a reference to a testcase that can be reproduced on the TI EVM? Or some logs to indicate the problem?

    We need a reference testcase or logs or a clear description of the problem to proceed.

    Regards

    Karthik

  • Hi Karthik,

    The updates from the customer are as follows:


    1. The use case (in part) that the app uses:
        Capture -> Dup
        Dup -> Sync_disp -> Gate_camera_display -> Merge_display -> SgxFrmcpy (A15) -> Display

        In the SgxFrmCpy link, the customer put a frameRender function inside the SgxFrmCpy processFrame function. That is, whenever the SgxFrmCpy link gets new data from the previous link, it calls frameRender to render the screen: the video captured by the cameras plus the UI widgets drawn by OpenGL.

    2. The app handles the user's touch operations on the touch screen and switches to the corresponding UI, which means frameRender can render different UI widgets when the SgxFrmCpy link receives new image data.

    3. The app breaks down almost every time frameRender tries to render the settings UI widgets.

    4. When the app breaks down, the customer found that it is calling the OpenGL glDrawArrays function to render some icons.

    5. Each icon comes from a JPG image file on the TF card mounted on the board. The procedure is as follows:
        1. Read the image from the TF card;
        2. Bind the image to tex (a data structure used to render images and text);
        3. Call the OpenGL API to render the image as an icon, i.e. resize the image and draw it to the framebuffer.

    6. There are 8 icons to render, but the app is always killed by the Linux OS, so the customer wants to know the details and performance of the GPU.
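If the read-and-bind steps in item 5 run each time an icon is drawn, one common way to avoid repeated decoding and texture allocation is to cache the texture handle per file path. The sketch below is only an illustration of that idea: the thread does not establish that repeated loading is the cause, and TextureCache, Loader and Releaser are hypothetical names, not the customer's code. The decode/upload step is injected as a callback so the block has no OpenGL dependency; in a real app the loader would wrap the JPEG decode plus glGenTextures/glTexImage2D, and the releaser would call glDeleteTextures.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper: loads each texture once and reuses the handle afterwards.
class TextureCache {
public:
    using Handle   = unsigned int;                              // e.g. a GL texture id
    using Loader   = std::function<Handle(const std::string&)>; // decode + upload
    using Releaser = std::function<void(Handle)>;               // free the texture

    TextureCache(Loader load, Releaser release)
        : load_(std::move(load)), release_(std::move(release)) {
        assert(load_ && release_);
    }
    ~TextureCache() { clear(); }

    TextureCache(const TextureCache&) = delete;
    TextureCache& operator=(const TextureCache&) = delete;

    // Returns the cached handle, invoking the loader only on the first request.
    Handle get(const std::string& path) {
        auto it = handles_.find(path);
        if (it != handles_.end()) return it->second;
        const Handle handle = load_(path);
        handles_.emplace(path, handle);
        return handle;
    }

    // Releases every cached texture exactly once.
    void clear() {
        for (const auto& entry : handles_) release_(entry.second);
        handles_.clear();
    }

    std::size_t size() const { return handles_.size(); }

private:
    Loader load_;
    Releaser release_;
    std::unordered_map<std::string, Handle> handles_;
};
```

With a cache like this, rendering the same 8 icons on every frame triggers 8 loads in total rather than 8 per frame, and every texture is released when the cache is destroyed.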

    BTW, this is the function that renders the icons:

    void className::fixed_paint_tex(TEX tex,float px,float py,float lx,float ly,float xx,float yy,float winWidth,float winHeight,float width,float height,bool stretch,float angle,float opacity)
    {

        GLboolean  cull_enable=glIsEnabled(GL_CULL_FACE );
        glEnable(GL_CULL_FACE);
        glCullFace(GL_BACK);

        float x0,y0,x1,y1;
        if(stretch){
            x0=(xx-winWidth/2)/(winWidth/2);
            y0=(winHeight/2-yy)/(winHeight/2);
            x1=(xx+width-winWidth/2)/(winWidth/2);
            y1=(winHeight/2-(yy+height))/(winHeight/2);
        }else{
            x0=(xx-winWidth/2)/(winWidth/2);
            y0=(winHeight/2-yy)/(winHeight/2);
            x1=(xx+lx-winWidth/2)/(winWidth/2);
            y1=(winHeight/2-(yy+ly))/(winHeight/2);
        }

        //////////////////////////////////////////////////////////////////
        float vertices[30+1]={
            x0, y0, -0.1,  px/(float)tex.tex_width, py/(float)tex.tex_height,      //   0        2/3
            x0, y1, -0.1,  px/(float)tex.tex_width, (py+ly)/(float)tex.tex_height, //
            x1, y0, -0.1, (px+lx)/(float)tex.tex_width, py/(float)tex.tex_height,      //
            x1, y0, -0.1, (px+lx)/(float)tex.tex_width, py/(float)tex.tex_height,      //
            x0, y1, -0.1,  px/(float)tex.tex_width, (py+ly)/(float)tex.tex_height, //
            x1, y1, -0.1, (px+lx)/(float)tex.tex_width, (py+ly)/(float)tex.tex_height,  //   1/4       5
            0
        };

        float arc_angle=angle*M_PI/180.0f;

        float  mid_x=(x0+x1)/2.0;
        float  mid_y=(y0+y1)/2.0;

        if(angle > 0.000001 ||  angle < -0.000001)   /* skip the rotation when angle is effectively 0 */
        {
            for(int i=0;i<30;i+=5)
            {
                xx=(vertices[i]-mid_x)*cos(arc_angle)-(vertices[i+1]-mid_y)*sin(arc_angle)+mid_x;
                yy=(vertices[i]-mid_x)*sin(arc_angle)+(vertices[i+1]-mid_y)*cos(arc_angle)+mid_y;
                vertices[i]=xx;
                vertices[i+1]=yy;
            }
        }


        GLuint program = shaderObj->get_program();
        glUseProgram(program);
        glUniform1i(glGetUniformLocation(program, "sTexture"), 0);
        glUniform1f(glGetUniformLocation(program, "opacity"), opacity);

        glActiveTexture(GL_TEXTURE0);
        glBindTexture(GL_TEXTURE_2D, tex.texid);

        glDisable(GL_DEPTH_TEST);

        if(b_blend)
        {
            glEnable(GL_BLEND);
            glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
        }
        else
            glDisable(GL_BLEND);

        glBindBuffer(GL_ARRAY_BUFFER, 0);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
        glEnableVertexAttribArray(0);
        glEnableVertexAttribArray(1);

        glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 20, vertices);
        glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 20, vertices+3);


        glDrawArrays(GL_TRIANGLES, 0, 6);   /* the app breaks down during this call */

        glDisableVertexAttribArray(0);
        glDisableVertexAttribArray(1);

        if(cull_enable)
            glEnable(GL_CULL_FACE);
        else
            glDisable(GL_CULL_FACE);
    }
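The coordinate setup at the top of fixed_paint_tex maps a pixel rectangle (origin at the top-left, y pointing down) into normalized device coordinates (origin at the centre, y pointing up, range [-1, 1]). A minimal standalone sketch of that mapping, useful for checking values off-target, is shown here. NdcRect and pixel_rect_to_ndc are hypothetical names, and the window size must be positive.

```cpp
// Corners of a rectangle in normalized device coordinates.
struct NdcRect {
    float x0, y0;  // top-left corner
    float x1, y1;  // bottom-right corner
};

// Same formulas as in fixed_paint_tex: (x, y) is the top-left pixel of the
// rectangle, (w, h) its size in pixels, (win_w, win_h) the window size.
constexpr NdcRect pixel_rect_to_ndc(float x, float y, float w, float h,
                                    float win_w, float win_h) {
    return NdcRect{
        (x - win_w / 2.0f) / (win_w / 2.0f),
        (win_h / 2.0f - y) / (win_h / 2.0f),
        (x + w - win_w / 2.0f) / (win_w / 2.0f),
        (win_h / 2.0f - (y + h)) / (win_h / 2.0f),
    };
}
```

A rectangle that covers the whole window maps to the corners (-1, 1) and (1, -1), which is a quick sanity check for the sign conventions.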

    Thanks and Best Regards,

    Cherry

  • Hi,

    May I know if there are any updates on the additional info above?

    Thanks and Best Regards,

    Cherry

  • Hi Cherry,

    What does "app breakdown" mean?

    Are you getting a GPU crash or a Linux kernel panic?

    Do you have some logs with the errors?

    Regards,
    Stanley

  • Hi Stanley,

    Thanks for your support here!
    Below is the log from the app breakdown.

    apps_jh6.out is the app name.
    When the app calls glDrawArrays(...), it breaks down and the log below is printed.

    The Linux kernel is unstable when the app breaks down: sometimes the kernel also breaks down and restarts, and sometimes the app halts but Linux keeps working.

    [ 1858.599018] apps_jh6.out invoked oom-killer: gfp_mask=0x24000c4, order=0, oom_score_adj=0

    [ 1858.600066] apps_jh6.out cpuset=/ mems_allowed=0

    [ 1858.600679] CPU: 0 PID: 400 Comm: apps_jh6.out Tainted: G        W  O    4.4.84 #5
    [ 1858.601641] Hardware name: Generic DRA74X (Flattened Device Tree)
    [ 1858.602417] Backtrace:
    [ 1858.602755] [<c00131c4>] (dump_backtrace) from [<c00133c0>] (show_stack+0x18/0x1c)
    [ 1858.603715]  r7:ee353478 r6:60070013 r5:00000000 r4:c0842ad0
    [ 1858.604479] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
    [ 1858.605406] [<c025209c>] (dump_stack) from [<c0120608>] (dump_header+0x5c/0x1ac)
    [ 1858.606346]  r7:ee353478 r6:00000000 r5:edf47adc r4:ee353000
    [ 1858.607103] [<c01205ac>] (dump_header) from [<c00d5e38>] (oom_kill_process+0x2fc/0x448)
    [ 1858.608118]  r10:c082b8f8 r9:000104a4 r8:000000a2 r7:ee353478 r6:0006414c r5:edf47adc
    [ 1858.609149]  r4:ee353000
    [ 1858.609487] [<c00d5b3c>] (oom_kill_process) from [<c00d62e4>] (out_of_memory+0x2f0/0x32c)
    [ 1858.610523]  r10:c082b8f8 r9:000104a4 r8:c082b8f8 r7:c082bb78 r6:0006414c r5:edf47adc
    [ 1858.611551]  r4:ee353000
    [ 1858.611890] [<c00d5ff4>] (out_of_memory) from [<c00db1ac>] (__alloc_pages_nodemask+0x924/0x964)
    [ 1858.612992]  r10:c0867680 r9:024000c4 r8:00000000 r7:c0867690 r6:edf46000 r5:00000000
    [ 1858.614019]  r4:00000000
    [ 1858.614425] [<c00da888>] (__alloc_pages_nodemask) from [<bf0059fc>] (NewAllocPagesLinuxMemArea+0xcc/0x278 [pvrsrvkm])
    [ 1858.615769]  r10:00004000 r9:00000000 r8:00000000 r7:bf0345ec r6:000011f0 r5:0000047c
    [ 1858.616797]  r4:f26a41f0
    [ 1858.617233] [<bf005930>] (NewAllocPagesLinuxMemArea [pvrsrvkm]) from [<bf000a18>] (OSAllocPages_Impl+0xe4/0xfc [pvrsrvkm])
    [ 1858.618632]  r10:00000000 r9:ede75840 r8:00800000 r7:02014200 r6:edf47c34 r5:00000203
    [ 1858.619660]  r4:02014200
    [ 1858.620096] [<bf000934>] (OSAllocPages_Impl [pvrsrvkm]) from [<bf00892c>] (BM_ImportMemory+0x284/0x580 [pvrsrvkm])
    [ 1858.621406]  r5:00000203 r4:edee1080
    [ 1858.621987] [<bf0086a8>] (BM_ImportMemory [pvrsrvkm]) from [<bf0129dc>] (RA_Alloc+0xb8/0x29c [pvrsrvkm])
    [ 1858.623188]  r10:00000040 r9:ee368200 r8:bf0086a8 r7:edf47ca0 r6:00000040 r5:00800000
    [ 1858.624220]  r4:ee368200
    [ 1858.624661] [<bf012924>] (RA_Alloc [pvrsrvkm]) from [<bf008d00>] (BM_Alloc+0xd8/0x50c [pvrsrvkm])
    [ 1858.625786]  r10:00000040 r9:ee368200 r8:00800000 r7:c31ce608 r6:c2d39ac0 r5:edee1080
    [ 1858.626818]  r4:00000203
    [ 1858.627257] [<bf008c28>] (BM_Alloc [pvrsrvkm]) from [<bf009350>] (AllocDeviceMem+0xb4/0x194 [pvrsrvkm])
    [ 1858.628448]  r10:bf033b98 r9:edee1080 r8:00800000 r7:00000003 r6:edf47d4c r5:c31ce600
    [ 1858.629476]  r4:00000000
    [ 1858.629916] [<bf00929c>] (AllocDeviceMem [pvrsrvkm]) from [<bf009cf8>] (_PVRSRVAllocDeviceMemKM+0xb8/0x224 [pvrsrvkm])
    [ 1858.631270]  r9:eddd30c0 r8:f1a05000 r7:ede755c0 r6:ee382200 r5:edee1080 r4:00000003
    [ 1858.632396] [<bf009c40>] (_PVRSRVAllocDeviceMemKM [pvrsrvkm]) from [<bf01565c>] (PVRSRVAllocDeviceMemBW+0x194/0x40c [pvrsrvkm])
    [ 1858.633849]  r7:ede755c0 r6:00000000 r5:f1a06000 r4:00000000
    [ 1858.634707] [<bf0154c8>] (PVRSRVAllocDeviceMemBW [pvrsrvkm]) from [<bf0181dc>] (BridgedDispatchKM+0x94/0x25c [pvrsrvkm])
    [ 1858.636083]  r8:f1a06000 r7:f1a05000 r6:bf0154c8 r5:ede755c0 r4:edf47e68
    [ 1858.637076] [<bf018148>] (BridgedDispatchKM [pvrsrvkm]) from [<bf004d74>] (PVRSRV_BridgeDispatchKM+0x180/0x338 [pvrsrvkm])
    [ 1858.638475]  r8:00000040 r7:eddd30c0 r6:000000ac r5:c01c6707 r4:edf47e68
    [ 1858.639421] [<bf004bf4>] (PVRSRV_BridgeDispatchKM [pvrsrvkm]) from [<c031d888>] (drm_ioctl+0x140/0x454)
    [ 1858.640611]  r7:ee3c2c00 r6:c08a0310 r5:0000001c r4:edf47e68
    [ 1858.641367] [<c031d748>] (drm_ioctl) from [<c0133df4>] (do_vfs_ioctl+0x3f0/0x614)
    [ 1858.642316]  r10:00000000 r9:edf46000 r8:78c92e4c r7:00000010 r6:eddd3000 r5:ee32e648
    [ 1858.643349]  r4:78c92e4c
    [ 1858.643687] [<c0133a04>] (do_vfs_ioctl) from [<c0134054>] (SyS_ioctl+0x3c/0x64)
    [ 1858.644616]  r10:00000000 r9:edf46000 r8:78c92e4c r7:401c6440 r6:eddd3000 r5:00000010
    [ 1858.645644]  r4:eddd3001
    [ 1858.645985] [<c0134018>] (SyS_ioctl) from [<c000fae0>] (ret_fast_syscall+0x0/0x34)
    [ 1858.646945]  r9:edf46000 r8:c000fc84 r7:00000036 r6:401c6440 r5:78c92e4c r4:0000001c
    [ 1858.648038] Mem-Info:
    [ 1858.648370] active_anon:32705 inactive_anon:0 isolated_anon:0
    [ 1858.648370]  active_file:3983 inactive_file:4625 isolated_file:0
    [ 1858.648370]  unevictable:0 dirty:12 writeback:31 unstable:0
    [ 1858.648370]  slab_reclaimable:741 slab_unreclaimable:1912
    [ 1858.648370]  mapped:35661 shmem:0 pagetables:401 bounce:0
    [ 1858.648370]  free:314564 free_pcp:162 free_cma:31785
    [ 1858.652569] DMA free:107040kB min:1556kB low:1944kB high:2332kB active_anon:11996kB inactive_anon:0kB active_file:48kB inactive_file:40kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:753664kB managed:332080kB mlocked:0kB dirty:20kB writeback:0kB mapped:116004kB shmem:0kB slab_reclaimable:2964kB slab_unreclaimable:7648kB kernel_stack:1200kB pagetables:228kB unstable:0kB bounce:0kB free_pcp:12kB local_pcp:0kB free_cma:103476kB writeback_tmp:0kB pages_scanned:1100 all_unreclaimable? yes
    [ 1858.658174] lowmem_reserve[]: 0 0 1253 1253
    [ 1858.658780] HighMem free:1151804kB min:512kB low:3804kB high:7096kB active_anon:118824kB inactive_anon:0kB active_file:15884kB inactive_file:18460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1307648kB managed:1307648kB mlocked:0kB dirty:28kB writeback:124kB mapped:26640kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1376kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:4kB free_cma:23664kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    [ 1858.664379] lowmem_reserve[]: 0 0 0 0
    [ 1858.664894] DMA: 22*4kB (UEHC) 14*8kB (UHC) 13*16kB (HC) 4*32kB (H) 5*64kB (HC) 2*128kB (C) 0*256kB 3*512kB (C) 2*1024kB (HC) 2*2048kB (HC) 24*4096kB (C) = 107096kB
    [ 1858.667005] HighMem: 551*4kB (UMC) 258*8kB (UMC) 333*16kB (UMC) 145*32kB (UMC) 184*64kB (UMC) 118*128kB (UMC) 85*256kB (UMC) 73*512kB (UMC) 37*1024kB (UM) 29*2048kB (UMC) 233*4096kB (MC) = 1151900kB
    [ 1858.669398] 8598 total pagecache pages
    [ 1858.669880] 0 pages in swap cache
    [ 1858.670322] Swap cache stats: add 0, delete 0, find 0/0
    [ 1858.671004] Free swap  = 0kB
    [ 1858.671383] Total swap = 0kB
    [ 1858.671763] 515328 pages RAM
    [ 1858.672133] 326912 pages HighMem/MovableOnly
    [ 1858.672678] 105396 pages reserved
    [ 1858.673101] 51200 pages cma reserved
    [ 1858.673568] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
    [ 1858.674738] [  119]     0   119      566      406       4       2        0             0 sh
    [ 1858.675802] [  172]     0   172   339108    68381     406       3        0             0 apps_jh6.out
    [ 1858.676983] Out of memory: Kill process 172 (apps_jh6.out) score 162 or sacrifice child
    [ 1858.678327] Killed process 172 (apps_jh6.out) total-vm:1356432kB, anon-rss:130596kB, file-rss:142928kB
    [ 1858.679658] apps_jh6.out: page allocation failure: order:0, mode:0x24000c4
    [ 1858.680561] CPU: 0 PID: 400 Comm: apps_jh6.out Tainted: G        W  O    4.4.84 #5
    [ 1858.681524] Hardware name: Generic DRA74X (Flattened Device Tree)
    [ 1858.682297] Backtrace:
    [ 1858.682632] [<c00131c4>] (dump_backtrace) from [<c00133c0>] (show_stack+0x18/0x1c)
    [ 1858.683594]  r7:c082866c r6:60070013 r5:00000000 r4:c0842ad0
    [ 1858.684354] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
    [ 1858.684404] apps_jh6.out: page allocation failure: order:1, mode:0x26000c0
    [ 1858.6890 r6:20000013 r5:00000000 r4:c0842ad0
    [ 1858.755913] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
    [ 1858.756837] [<c025209c>] (dump_stack) from [<c00d7ee4>] (warn_alloc_failed+0xe4/0x124)
    [ 1858.757841]  r7:c0867690 r6:00000000 r5:00000001 r4:026000c0
    [ 1858.758598] [<c00d7e04>] (warn_alloc_failed) from [<c00daa4c>] (__alloc_pages_nodemask+0x1c4/0x964)
    [ 1858.759745]  r3:00040001 r2:00000000
    [ 1858.760215]  r6:ede68000 r5:00000000 r4:00000000
    [ 1858.760831] [<c00da888>] (__alloc_pages_nodemask) from [<c00db4a8>] (alloc_kmem_pages_node+0x28/0xb4)
    [ 1858.761999]  r10:6e6803f8 r9:edde8c00 r8:c000fc84 r7:6e67fec8 r6:00000001 r5:ecd5f800
    [ 1858.763030]  r4:026000c0
    [ 1858.763371] [<c00db480>] (alloc_kmem_pages_node) from [<c003655c>] (copy_process+0x124/0x14c8)
    [ 1858.764462]  r7:6e67fec8 r6:6e6803f8 r5:ecd5f800 r4:003d0f00
    [ 1858.765215] [<c0036438>] (copy_process) from [<c0037a34>] (_do_fork+0x78/0x334)
    [ 1858.766143]  r10:6e6803f8 r9:ede68000 r8:c000fc84 r7:00000000 r6:00000000 r5:0022ca98
    [ 1858.767171]  r4:003d0f00
    [ 1858.767508] [<c00379bc>] (_do_fork) from [<c0037de4>] (SyS_clone+0x28/0x30)
    [ 1858.768392]  r10:00000000 r9:ede68000 r8:c000fc84 r7:00000078 r6:00000000 r5:0022ca98
    [ 1858.769420]  r4:6e6803f8
    [ 1858.769758] [<c0037dbc>] (SyS_clone) from [<c000fae0>] (ret_fast_syscall+0x0/0x34)
    [ 1858.770985] Mem-Info:
    [ 1858.771291] active_anon:32705 inactive_anon:0 isolated_anon:0
    [ 1858.771291]  active_file:3983 inactive_file:4621 isolated_file:0
    [ 1858.771291]  unevictable:0 dirty:12 writeback:31 unstable:0
    [ 1858.771291]  slab_reclaimable:714 slab_unreclaimable:1912
    [ 1858.771291]  mapped:35661 shmem:0 pagetables:401 bounce:0
    [ 1858.771291]  free:315836 free_pcp:52 free_cma:31785
    [ 1858.775563] DMA free:111540kB min:1556kB low:1944kB high:2332kB active_anon:11996kB inactive_anon:0kB active_file:48kB inactive_file:24kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:753664kB managed:332080kB mlocked:0kB dirty:20kB writeback:0kB mapped:116004kB shmem:0kB slab_reclaimable:2856kB slab_unreclaimable:7648kB kernel_stack:1200kB pagetables:228kB unstable:0kB bounce:0kB free_pcp:204kB local_pcp:60kB free_cma:103476kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    [ 1858.781122] lowmem_reserve[]: 0 0 1253 1253
    [ 1858.781702] HighMem free:1151804kB min:512kB low:3804kB high:7096kB active_anon:118824kB inactive_anon:0kB active_file:15884kB inactive_file:18460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1307648kB managed:1307648kB mlocked:0kB dirty:28kB writeback:124kB mapped:26640kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1376kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:23664kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
    [ 1858.787335] lowmem_reserve[]: 0 0 0 0
    [ 1858.787839] DMA: 183*4kB (UMEHC) 84*8kB (UMEHC) 41*16kB (MEHC) 13*32kB (MEH) 11*64kB (UMEHC) 15*128kB (UEC) 2*256kB (U) 3*512kB (C) 2*1024kB (HC) 2*2048kB (HC) 24*4096kB (C) = 111596kB
    [ 1858.790220] HighMem: 551*4kB (UMC) 258*8kB (UMC) 333*16kB (UMC) 145*32kB (UMC) 184*64kB (UMC) 118*128kB (UMC) 85*256kB (UMC) 73*512kB (UMC) 37*1024kB (UM) 29*2048kB (UMC) 233*4096kB (MC) = 1151900kB
    [ 1858.792659] 8598 total pagecache pages
    [ 1858.793140] 0 pages in swap cache
    [ 1858.793566] Swap cache stats: add 0, delete 0, find 0/0
    [ 1858.794231] Free swap  = 0kB
    [ 1858.794927] Total swap = 0kB
    [ 1858.795300] 515328 pages RAM
    [ 1858.795671] 326912 pages HighMem/MovableOnly
    [ 1858.796215] 105396 pages reserved
    [ 1858.796638] 51200 pages cma reserved
    [ 1858.987238] virtio_rpmsg_bus virtio2: msg received with no recipient
    [ 1859.120497] virtio_rpmsg_bus virtio2: msg received with no recipient
    [ 1859.253782] virtio_rpmsg_bus virtio2: msg received with no recipient
    [ 1859.387170] virtio_rpmsg_bus virtio2: msg received with no recipient
    Thanks and Best Regards,
    Cherry
  • Hi Stanley,

    Could you check the info above?

    Thanks and Best Regards,

    Cherry

  • Hi Cherry,

    This issue doesn't have anything to do with GPU performance.

    From the trace, it looks like a memory allocation issue.
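The Mem-Info dump in the log points the same way. In the first report the DMA (lowmem) zone shows free:107040kB, but free_cma:103476kB of that sits in the CMA region, which the kernel hands out only for movable allocations. A minimal sketch of the arithmetic, with the values copied from the log, is shown here. The function name is hypothetical, and the reading assumes the pvrsrvkm request cannot be served from HighMem or CMA; the thread does not decode the gfp_mask, so treat this as an interpretation rather than a confirmed diagnosis.

```cpp
#include <cstdint>

// Free memory in a zone that lies outside the CMA region, in kB.
constexpr std::int64_t non_cma_free_kb(std::int64_t zone_free_kb,
                                       std::int64_t zone_free_cma_kb) {
    return zone_free_kb - zone_free_cma_kb;
}

// Values copied from the first Mem-Info report in the log above.
constexpr std::int64_t kDmaUsableKb     = non_cma_free_kb(107040, 103476);   // 3564
constexpr std::int64_t kHighMemUsableKb = non_cma_free_kb(1151804, 23664);   // 1128140
```

So roughly 3.5 MB of lowmem was free outside CMA at the time of the OOM kill, while more than 1.1 GB was still free in HighMem. Under the stated assumption, that is consistent with lowmem exhaustion rather than a GPU performance limit.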

    Was the SGX framecopy use case working before the customer added their changes?

    Regards,
    Stanley