This thread has been locked.

If you have a related question, please click the "Ask a related question" button in the top right corner. The newly created question will be automatically linked to this question.

memory fragmentation issue in AM335x based custom board



We are using AM335x based custom board with 256MB of RAM and 4GB of eMMC, We are using Linux Kernel 3.12.

In below mentioned use case we see that page allocation fails and then "Detected aborted journal" error observed, and after that partition is mounted read-only to avoid further data corruption.

a.       Create big rootfs containing 4k size files. Approximately 81000.
b.       Mount spare RFS partition of size 500MB.
c.       Untar big rootf.tar.gz onto spare mounted spare partition.
d.       With this use case we could reproduce issue every single time.
e.       I have attached the log where “Detected aborted journal” error is reproduced and just before that meminfo shows that 16KB memory block is not available.( refer full log in  attached logfile 1_syslog)
I have attached two log files, and shell script to create 4K size files
[47359.152553] 47145 total pagecache pages
[47359.152569] 0 pages in swap cache
[47359.152579] Swap cache stats: add 0, delete 0, find 0/0
[47359.152588] Free swap  = 0kB
[47359.152594] Total swap = 0kB
[47359.163599] 65536 pages of RAM
[47359.163610] 1738 free pages
[47359.163616] 2662 reserved pages
[47359.163624] 5665 slab pages
[47359.163631] 312658 pages shared
[47359.163639] 0 pages swap cached
[47364.483654] Mem-info:
[47364.483668] Normal per-cpu:
[47364.483682] CPU    0: hi:   90, btch:  15 usd:  28
[47364.483710] active_anon:7464 inactive_anon:20 isolated_anon:0
[47364.483710]  active_file:3168 inactive_file:41035 isolated_file:0
[47364.483710]  unevictable:422 dirty:19 writeback:0 unstable:0
[47364.483710]  free:4459 slab_reclaimable:4089 slab_unreclaimable:1111
[47364.483710]  mapped:2356 shmem:43 pagetables:244 bounce:0
[47364.483710]  free_cma:47
[47364.483771] Normal free:17836kB min:2004kB low:2504kB high:3004kB active_anon:29856kB inactive_anon:80kB active_file:12672kB inactive_file:164140kB unevictable:1688kB isolated(anon):0kB isolated(file):0kB present:260096kB managed:251496kB mlocked:1688kB dirty:76kB writeback:0kB mapped:9424kB shmem:172kB slab_reclaimable:16356kB slab_unreclaimable:4444kB kernel_stack:1160kB pagetables:976kB unstable:0kB bounce:0kB free_cma:188kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[47364.483783] lowmem_reserve[]: 0 0 0
[47364.483797] Normal: 2529*4kB (UEMRC) 893*8kB (UEMC) 16*16kB (UM) 6*32kB (M) 0*64kB 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 17836kB
[47364.483855] 44590 total pagecache pages
[47364.483871] 0 pages in swap cache
[47364.483881] Swap cache stats: add 0, delete 0, find 0/0
[47364.483888] Free swap  = 0kB
[47364.483895] Total swap = 0kB
[47364.494968] 65536 pages of RAM
[47364.494978] 4767 free pages
[47364.494985] 2662 reserved pages
[47364.494993] 5200 slab pages
[47364.495000] 310103 pages shared
[47364.495008] 0 pages swap cached
[47368.503176] mmcqd/1: page allocation failure: order:2, mode:0x200020
[47368.503216] CPU: 0 PID: 613 Comm: mmcqd/1 Tainted: P           O 3.12.10-005-ts-armv7l #1
[47368.503282] [<c0012d24>] (unwind_backtrace+0x0/0xf4) from [<c0011130>] (show_stack+0x10/0x14)
[47368.503317] [<c0011130>] (show_stack+0x10/0x14) from [<c0087548>] (warn_alloc_failed+0xe0/0x118)
[47368.503346] [<c0087548>] (warn_alloc_failed+0xe0/0x118) from [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8)
[47368.503375] [<c008a3ac>] (__alloc_pages_nodemask+0x74c/0x8f8) from [<c00b2e8c>] (cache_alloc_refill+0x328/0x620)
[47368.503398] [<c00b2e8c>] (cache_alloc_refill+0x328/0x620) from [<c00b3224>] (__kmalloc+0xa0/0xe8)
[47368.503433] [<c00b3224>] (__kmalloc+0xa0/0xe8) from [<c0212904>] (edma_prep_slave_sg+0x84/0x388)
[47368.503469] [<c0212904>] (edma_prep_slave_sg+0x84/0x388) from [<c02eeb98>] (omap_hsmmc_request+0x414/0x508)
[47368.503497] [<c02eeb98>] (omap_hsmmc_request+0x414/0x508) from [<c02d9240>] (mmc_start_request+0xc4/0xe0)
[47368.503519] [<c02d9240>] (mmc_start_request+0xc4/0xe0) from [<c02da028>] (mmc_start_req+0x2d8/0x38c)
[47368.503540] [<c02da028>] (mmc_start_req+0x2d8/0x38c) from [<c02e7310>] (mmc_blk_issue_rw_rq+0xb4/0x9d8)
[47368.503562] [<c02e7310>] (mmc_blk_issue_rw_rq+0xb4/0x9d8) from [<c02e7dd8>] (mmc_blk_issue_rq+0x1a4/0x468)
[47368.503583] [<c02e7dd8>] (mmc_blk_issue_rq+0x1a4/0x468) from [<c02e8760>] (mmc_queue_thread+0x88/0x118)
[47368.503609] [<c02e8760>] (mmc_queue_thread+0x88/0x118) from [<c004d8b8>] (kthread+0xb4/0xb8)
[47368.503633] [<c004d8b8>] (kthread+0xb4/0xb8) from [<c000e298>] (ret_from_fork+0x14/0x3c)
[47368.503644] Mem-info:
[47368.503653] Normal per-cpu:
[47368.503664] CPU    0: hi:   90, btch:  15 usd:  17
[47368.503691] active_anon:7530 inactive_anon:20 isolated_anon:0
[47368.503691]  active_file:3142 inactive_file:44610 isolated_file:0
[47368.503691]  unevictable:422 dirty:32 writeback:0 unstable:0
[47368.503691]  free:1149 slab_reclaimable:3734 slab_unreclaimable:1130
[47368.503691]  mapped:2362 shmem:43 pagetables:256 bounce:0
[47368.503691]  free_cma:207
[47368.503752] Normal free:4596kB min:2004kB low:2504kB high:3004kB active_anon:30120kB inactive_anon:80kB active_file:12568kB inactive_file:178440kB unevictable:1688kB isolated(anon):0kB isolated(file):0kB present:260096kB managed:251496kB mlocked:1688kB dirty:128kB writeback:0kB mapped:9448kB shmem:172kB slab_reclaimable:14936kB slab_unreclaimable:4520kB kernel_stack:1176kB pagetables:1024kB unstable:0kB bounce:0kB free_cma:828kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[47368.503765] lowmem_reserve[]: 0 0 0
[47368.503782] Normal: 833*4kB (UEMRC) 158*8kB (MC) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 4596kB
[47368.503833] 48144 total pagecache pages
[47368.503849] 0 pages in swap cache
[47368.503858] Swap cache stats: add 0, delete 0, find 0/0
[47368.503866] Free swap  = 0kB
[47368.503873] Total swap = 0kB
[47368.503885] SLAB: Unable to allocate memory on node 0 (gfp=0x20)
[47368.503896]   cache: kmalloc-16384, object size: 16384, order: 2
[47368.503909]   node 0: slabs: 7/7, objs: 7/7, free: 0
[47368.503932] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
[47368.503945] omap_hsmmc 481d8000.mmc: MMC start dma failure
[47368.512369] mmcblk0: unknown error -1 sending read/write command, card status 0x900
[47368.512578] blk_update_request: 371 callbacks suppressed
[47368.512591] end_request: I/O error, dev mmcblk0, sector 1371894
[47368.512613] Buffer I/O error on device mmcblk0p16, logical block 145275
[47368.512623] lost page write due to I/O error on mmcblk0p16
[47368.512892] end_request: I/O error, dev mmcblk0, sector 1371896
[47368.512906] Buffer I/O error on device mmcblk0p16, logical block 145276
[47368.512916] lost page write due to I/O error on mmcblk0p16
[47368.513011] end_request: I/O error, dev mmcblk0, sector 1371898
[47368.513024] Buffer I/O error on device mmcblk0p16, logical block 145277
[47368.513034] lost page write due to I/O error on mmcblk0p16
[47368.513107] end_request: I/O error, dev mmcblk0, sector 1371900
[47368.513179] end_request: I/O error, dev mmcblk0, sector 1371902
[47368.513251] end_request: I/O error, dev mmcblk0, sector 1371904
[47368.513322] end_request: I/O error, dev mmcblk0, sector 1371906
[47368.513405] end_request: I/O error, dev mmcblk0, sector 1371908
[47368.513478] end_request: I/O error, dev mmcblk0, sector 1371910
[47368.513549] end_request: I/O error, dev mmcblk0, sector 1371912
[47368.796212] Aborting journal on device mmcblk0p16-8.
[47368.802513] EXT4-fs error (device mmcblk0p16): ext4_journal_check_start:56: Detected aborted journal
[47368.812303] EXT4-fs (mmcblk0p16): Remounting filesystem read-only

I don't see this error if I disable journal for particular partition.
This issue is NOT reproducible in Beaglebone Black, I had changed BBB RAM size to 256MB(from kernel commandline parameter) to bring platform to same memory level .
 
Any suggestion to resolve this issue ?
 
I had posted it in BBB google groups but haven't found solution yet :(
 
  • Hi Ankur,

    Could you specify software release and kernel version?

    BR
    Tsvetolin Shulev
  • Hi Tsvetolin,

    Thank you for reply,
    We are using Linux kernel 3.12 and build system provided by timesys.

    Timesys takes TI SDK as their base for creating their build system, I will get exact TI SDK version and update the post.

    Meanwhile I have tested this issue in 4.1 kernel also but no luck.

    Thank you,

    Regards,
    Ankur
  • Ankur,

    I have two suppositions about the reason causing the issue.
    1. Change the rootfs file system to ext3 instead of ext4.
    2. Hardware problem with eMMC chip. This assumption can be verified as test the issue on other device.
    An other rough workaround for solving similar issue is to remove the journal from the filesystem. It is described at:
    www.linuxquestions.org/.../

    BR
    Tsvetolin Shulev
  • Hi Tsvetolin,

    Thank you for reply,
    1. It would be difficult at this stage to change the file system, as we are nearing shipping :(
    However just to understand, what will be advantage of using ext3 file system?
    2. Well, we need some sort of pointers for hardware problem in case any, But from log it looks like it is fragmentation issue :(
    I had already posted h/w related question below and worked on suggestions,
    e2e.ti.com/.../1974065

    3. Disabling journal causes other fs error when there is power disconnect and next boot takes lot of time to correct those errors so we don't want to go for it . Moreover if root cause is something else disabling journal might just reduce the probability of reproduction.

    We want to find out the root cause of this problem and tackle it.

    Regards,
    Ankur

  • Hi Ankur,

    I have also got similar issue as you described above.

    Nov 28 10:39:23.681 kernel: [1128309.461153] mmcqd/1: page allocation failure: order:2, mode:0x204020
    Nov 28 10:39:23.685 kernel: [1128309.468075] CPU: 0 PID: 555 Comm: mmcqd/1 Not tainted 3.14.26-3.0.20b #1
    Nov 28 10:39:24.631 kernel: [1128309.475311] Backtrace:
    Nov 28 10:39:24.631 kernel: [1128309.478084] Function entered at [<c0011300>] from [<c0011490>]
    Nov 28 10:39:24.631 kernel: [1128309.484403] r6:00000002 r5:00204020 r4:00000000 r3:00208840
    Nov 28 10:39:24.632 kernel: [1128309.490578] Function entered at [<c0011478>] from [<c03c8960>]
    Nov 28 10:39:24.632 kernel: [1128309.496898] Function entered at [<c03c8940>] from [<c00876c8>]
    Nov 28 10:39:24.632 kernel: [1128309.503217] Function entered at [<c00875cc>] from [<c008a0dc>]
    Nov 28 10:39:24.632 kernel: [1128309.509535] r3:dda3dc1c r2:00000000
    Nov 28 10:39:24.632 kernel: [1128309.513493] r7:c04eda60 r6:00000034 r5:00000002 r4:00204020
    Nov 28 10:39:24.632 kernel: [1128309.519665] Function entered at [<c0089904>] from [<c00b21d8>]
    Nov 28 10:39:24.632 kernel: [1128309.525982] r10:00008020 r9:00000000 r8:00004020 r7:00000000 r6:dd801640 r5:dd80c680
    Nov 28 10:39:24.632 kernel: [1128309.534447] r4:dd800500
    Nov 28 10:39:24.634 kernel: [1128309.537304] Function entered at [<c00b1fa4>] from [<c00b2604>]
    Nov 28 10:39:24.634 kernel: [1128309.543621] r10:00000001 r9:00000000 r8:dd8e4c10 r7:600f0113 r6:00008020 r5:dd800500
    Nov 28 10:39:24.634 kernel: [1128309.552085] r4:00008020
    Nov 28 10:39:24.634 kernel: [1128309.554942] Function entered at [<c00b2580>] from [<c01eae88>]
    Nov 28 10:39:24.634 kernel: [1128309.561260] r7:dda38000 r6:0000014b r5:dd904260 r4:00000000
    Nov 28 10:39:24.635 kernel: [1128309.567432] Function entered at [<c01eadb8>] from [<c02c56d4>]
    Nov 28 10:39:24.635 kernel: [1128309.573749] r10:dd904260 r9:00000000 r8:dd9eaa80 r7:00000000 r6:dda11a00 r5:dda11940
    Nov 28 10:39:24.636 kernel: [1128309.582214] r4:dd9ea800
    Nov 28 10:39:24.637 kernel: [1128309.585070] Function entered at [<c02c53a0>] from [<c02b01c0>]
    Nov 28 10:39:24.639 kernel: [1128309.591386] r10:00000001 r8:dda1192c r7:dda3dec4 r6:dda11a00 r5:dd9ea800 r4:dda11940
    Nov 28 10:39:24.639 kernel: [1128309.599857] Function entered at [<c02afff8>] from [<c02b14c0>]
    Nov 28 10:39:24.639 kernel: [1128309.606174] r7:dda3dec4 r6:dda11a40 r5:00000000 r4:dd9ea800
    Nov 28 10:39:24.640 kernel: [1128309.612346] Function entered at [<c02b1254>] from [<c02bed78>]
    Nov 28 10:39:24.640 kernel: [1128309.618664] r10:dd9f9800 r9:dd9f9800 r8:dda1193c r7:dda17120 r6:dda11804 r5:dda11940
    Nov 28 10:39:24.641 kernel: [1128309.627129] r4:00000000
    Nov 28 10:39:24.641 kernel: [1128309.629985] Function entered at [<c02be97c>] from [<c02bf744>]
    Nov 28 10:39:24.641 kernel: [1128309.636303] r10:dd9ea800 r9:dda11800 r8:dda2fa28 r7:dda11800 r6:dda11804 r5:dd9f9800
    Nov 28 10:39:24.641 kernel: [1128309.644768] r4:dda17120
    Nov 28 10:39:24.642 kernel: [1128309.647625] Function entered at [<c02bf360>] from [<c02c0074>]
    Nov 28 10:39:24.642 kernel: [1128309.653941] r10:00000000 r9:00000000 r8:dda2fa28 r7:122c8001 r6:dda3c000 r5:dda1180c
    Nov 28 10:39:24.642 kernel: [1128309.662406] r4:dda11804
    Nov 28 10:39:24.644 kernel: [1128309.665263] Function entered at [<c02bffd4>] from [<c004ab6c>]
    Nov 28 10:39:24.645 kernel: [1128309.671579] r8:00000000 r7:c02bffd4 r6:dda11804 r5:00000000 r4:dda15e00 r3:dd9c2400
    Nov 28 10:39:24.645 kernel: [1128309.679957] Function entered at [<c004aa98>] from [<c000e978>]
    Nov 28 10:39:24.646 kernel: [1128309.686274] r7:00000000 r6:00000000 r5:c004aa98 r4:dda15e00
    Nov 28 10:39:24.646 kernel: [1128309.692441] Mem-info:
    Nov 28 10:39:24.647 kernel: [1128309.695018] Normal per-cpu:
    Nov 28 10:39:24.651 kernel: [1128309.698149] CPU 0: hi: 186, btch: 31 usd: 169
    Nov 28 10:39:24.651 kernel: [1128309.703408] active_anon:11184 inactive_anon:28 isolated_anon:0
    Nov 28 10:39:24.652 kernel: [1128309.703408] active_file:48083 inactive_file:48118 isolated_file:0
    Nov 28 10:39:24.653 kernel: [1128309.703408] unevictable:0 dirty:4097 writeback:15236 unstable:0
    Nov 28 10:39:24.653 kernel: [1128309.703408] free:11149 slab_reclaimable:5448 slab_unreclaimable:3459
    Nov 28 10:39:24.654 kernel: [1128309.703408] mapped:1849 shmem:46 pagetables:116 bounce:0
    Nov 28 10:39:24.655 kernel: [1128309.703408] free_cma:4361
    Nov 28 10:39:24.655 kernel: [1128309.738753] Normal free:44596kB min:2864kB low:3580kB high:4296kB active_anon:44736kB inactive_anon:112kB active_file:192332kB inactive_file:192472kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:523264kB m
    Nov 28 10:39:24.655 kernel: [1128309.784303] lowmem_reserve[]: 0 0 0
    Nov 28 10:39:24.660 kernel: [1128309.788180] Normal: 7849*4kB (EMRC) 1000*8kB (MRC) 263*16kB (C) 31*32kB (C) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB = 44596kB
    Nov 28 10:39:24.661 kernel: [1128309.802855] 96247 total pagecache pages
    Nov 28 10:39:24.661 kernel: [1128309.807083] 0 pages in swap cache
    Nov 28 10:39:24.661 kernel: [1128309.810757] Swap cache stats: add 0, delete 0, find 0/0
    Nov 28 10:39:24.662 kernel: [1128309.816433] Free swap = 0kB
    Nov 28 10:39:24.662 kernel: [1128309.819645] Total swap = 0kB
    Nov 28 10:39:24.663 kernel: [1128309.862004] 131072 pages of RAM
    Nov 28 10:39:24.663 kernel: [1128309.865501] 11453 free pages
    Nov 28 10:39:24.663 kernel: [1128309.868714] 2694 reserved pages
    Nov 28 10:39:24.663 kernel: [1128309.872201] 7870 slab pages
    Nov 28 10:39:24.664 kernel: [1128309.875321] 360954 pages shared
    Nov 28 10:39:24.664 kernel: [1128309.878808] 0 pages swap cached
    Nov 28 10:39:24.676 kernel: [1128309.882306] SLAB: Unable to allocate memory on node 0 (gfp=0x4020)
    Nov 28 10:39:24.676 kernel: [1128309.888994] cache: kmalloc-16384, object size: 16384, order: 2
    Nov 28 10:39:24.676 kernel: [1128309.895504] node 0: slabs: 9/9, objs: 9/9, free: 0
    Nov 28 10:39:24.677 kernel: [1128309.901177] edma-dma-engine edma-dma-engine.0: edma_prep_slave_sg: Failed to allocate a descriptor
    Nov 28 10:39:24.677 kernel: [1128309.910862] omap_hsmmc 481d8000.mmc: prep_slave_sg() failed
    Nov 28 10:39:24.677 kernel: [1128309.916949] omap_hsmmc 481d8000.mmc: MMC start dma failure
    Nov 28 10:39:24.677 kernel: [1128310.392320] mmcblk0: unknown error -1 sending read/write command, card status 0x900
    Nov 28 10:39:24.678 kernel: [1128310.401035] end_request: I/O error, dev mmcblk0, sector 824011
    Nov 28 10:39:24.678 kernel: [1128310.407440] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33351 (offset 0 size 0 starting block 412006)
    Nov 28 10:39:24.678 kernel: [1128310.421028] Buffer I/O error on device mmcblk0p3, logical block 146933
    Nov 28 10:39:24.678 kernel: [1128310.428549] end_request: I/O error, dev mmcblk0, sector 824013
    Nov 28 10:39:24.679 kernel: [1128310.434959] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33352 (offset 0 size 0 starting block 412007)
    Nov 28 10:39:24.679 kernel: [1128310.448541] Buffer I/O error on device mmcblk0p3, logical block 146934
    Nov 28 10:39:24.679 kernel: [1128310.455849] end_request: I/O error, dev mmcblk0, sector 824015
    Nov 28 10:39:24.679 kernel: [1128310.462192] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33353 (offset 0 size 0 starting block 412008)
    Nov 28 10:39:24.680 kernel: [1128310.475776] Buffer I/O error on device mmcblk0p3, logical block 146935
    Nov 28 10:39:24.680 kernel: [1128310.483017] end_request: I/O error, dev mmcblk0, sector 824017
    Nov 28 10:39:24.680 kernel: [1128310.489397] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33354 (offset 0 size 0 starting block 412009)
    Nov 28 10:39:24.680 kernel: [1128310.502969] Buffer I/O error on device mmcblk0p3, logical block 146936
    Nov 28 10:39:24.681 kernel: [1128310.510235] end_request: I/O error, dev mmcblk0, sector 824019
    Nov 28 10:39:24.681 kernel: [1128310.516610] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33355 (offset 0 size 0 starting block 412010)
    Nov 28 10:39:24.683 kernel: [1128310.530198] Buffer I/O error on device mmcblk0p3, logical block 146937
    Nov 28 10:39:24.683 kernel: [1128310.537457] end_request: I/O error, dev mmcblk0, sector 824021
    Nov 28 10:39:24.684 kernel: [1128310.543799] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33356 (offset 0 size 0 starting block 412011)
    Nov 28 10:39:24.684 kernel: [1128310.557378] Buffer I/O error on device mmcblk0p3, logical block 146938
    Nov 28 10:39:24.684 kernel: [1128310.564643] end_request: I/O error, dev mmcblk0, sector 824023
    Nov 28 10:39:24.684 kernel: [1128310.570983] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33357 (offset 0 size 0 starting block 412012)
    Nov 28 10:39:24.684 kernel: [1128310.584560] Buffer I/O error on device mmcblk0p3, logical block 146939
    Nov 28 10:39:24.685 kernel: [1128310.591784] end_request: I/O error, dev mmcblk0, sector 824025
    Nov 28 10:39:24.685 kernel: [1128310.598163] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33359 (offset 0 size 0 starting block 412013)
    Nov 28 10:39:24.685 kernel: [1128310.611733] Buffer I/O error on device mmcblk0p3, logical block 146940
    Nov 28 10:39:24.685 kernel: [1128310.618986] end_request: I/O error, dev mmcblk0, sector 824027
    Nov 28 10:39:24.685 kernel: [1128310.625356] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33358 (offset 0 size 0 starting block 412015)
    Nov 28 10:39:24.685 kernel: [1128310.638926] Buffer I/O error on device mmcblk0p3, logical block 146941
    Nov 28 10:39:24.685 kernel: [1128310.645983] Buffer I/O error on device mmcblk0p3, logical block 146942
    Nov 28 10:39:24.686 kernel: [1128310.653233] end_request: I/O error, dev mmcblk0, sector 824031
    Nov 28 10:39:24.686 kernel: [1128310.659608] EXT4-fs warning (device mmcblk0p3): ext4_end_bio:317: I/O error -5 writing to inode 33360 (offset 0 size 0 starting block 412016)


    How did you resolve this issue?
  • Hi Pavan,

    Unfortunately Issue is not yet resoved. :(
    what is the use case where you see this error ? What is the system configuration, processor, RAM size, eMMC/NAND size ?
    I am just trying to see if there is similarity in yours and my board.

    Thank you,

    Regards,
    Ankur
  • Thank you Ankur for your response,

    Linux OS: TI SDK8


    RAM size: 512Mb

    EMMC size: 4GB

    The issue occured when our box is softupgraded, In detail we have 3 partitions(1,2,3), if there is any new software available, it will mount next partition and downloads the software(rootfile system) from server which is tar'ed to the current partition and then extracts to the mounted next partition. Before reboots it sets the uboot variables to boot from next partition.

    So in our case kernel crash happened when the next partition is mounted, downloaded and installed. So after reboot it switched next partition but hung at kernel level as below.

    INIT: version 2.88 booting
    INIT: /etc/inittab[1]: missing action field
    INIT: /etc/inittab[2]: id field too long (max 4 characters)
    INIT: /etc/inittab[3]: id field too long (max 4 characters)
    INIT: /etc/inittab[4]: id field too long (max 4 characters)
    INIT: /etc/inittab[5]: id field too long (max 4 characters)
    INIT: /etc/inittab[6]: id field too long (max 4 characters)
    
    Enter runlevel: [   61.611509]  remoteproc0: failed to load am335x-pm-firmware.elf
    [   61.617799]  remoteproc0: powering up wkup_m3
    [   61.632409] wkup_m3 44d00000.wkup_m3: rproc_boot failed

    Then when i checked the varlogs found that kernel crash and suspected that issue could be of. the disk space not available or RAM memory not available due to any memleak at user space level.
    So to the next partition not fully copied the software or something corrupted thats why hanged at kernel level

    Any guess what could be the reason here?

  • Hi Pavan,

    Sorry for late reply, I was bit busy so couldn't look at this and reply.

    You are having 512MB of RAM and kernel version is also 3.14(higher than what I am using).

    I am working on couple of solutions and one of it looks promising, however just to see if issue you are seeing is same or not, can you run below test in your board?

    I want you to create tar containing 4k size files in it. And you can create tar of the size little less than backup rootfs.

    e.g. I have backup partition of 500MB so I can take 480MB of tar file and it would contain 122880 number of 4k size files.

    I am attaching script to create 4k size files from urandom device. Run the attached script and

    1. create 4k size files

    2. tar all those files

    3. mount a backup partition and

    4. start untaring that big tar on the mounted partition.

    5. See if you get any file system error when you run this use case

    6. If you get error please put logs here.

    Thank you,

    Regards,

    Ankur

  • Thats Great Ankur!! Thank You for your help!

    I will try with your script today and do post the logs as well.

    Just want to know one more thing, how frequent this error scenario is occuring in your production?

    In our case we have observed in only one board out of 100 plus boards. And moreover only one time it is observed in this board.

    -Pavan