
[FAQ] AM6x: Yocto build "hangs" for hours with no progress - What's going on?

(applicable to all TI Sitara and Jacinto devices such as AM62x, AM62Ax, AM64x, AM65x, and so on)

Symptoms

When running a Yocto build per the SDK instructions, for example building the SDK default image like this...

MACHINE=am62xx-evm bitbake -k tisdk-default-image

...the build starts, but after a while it no longer seems to make any progress at all. The progress display showing which tasks are running and their status does not update. When checking CPU loading/activity (for example, with the 'htop' utility), there is no significant system load. This seems to happen especially when building larger images (such as tisdk-default-image vs. tisdk-base-image), and especially with later/newer SDKs such as SDK v8.6 or SDK v9.0.

For example, the build console can look like this (with no changes for hours):

NOTE: recipe python3-tensorflow-lite-2.8.0-r0: task do_compile: Started
NOTE: recipe open62541-1.0.1+gitAUTOINC+e4309754fc-r0: task do_compile: Started
NOTE: recipe powervr-graphics-5.8-r1: task do_compile: Started
NOTE: recipe gtk+3-3.24.14-r0: task do_compile: Started
NOTE: recipe glmark2-20191226+AUTOINC+72dabc5d72-r0.arago0: task do_compile: Started

Issue Analysis, Discussion, and Solutions

Yocto builds, especially those of more complex/larger recipes, can use a significant amount of system resources. This is particularly the case when heavy build parallelism is enabled, which it is by default. If the Yocto build attempts to use more memory (RAM) than is available, the Linux kernel's out-of-memory (OOM) killer, which is responsible for preventing processes from exhausting the build machine's memory, will terminate critical Yocto build processes. This goes undetected by the Yocto build itself, leaving the overall build hanging. To see whether your build is affected by this issue, terminate your Yocto build (pressing CTRL+C a few times should do the trick) and then search the kernel log for signs that the OOM killer has become active. If your kernel log contains recent entries like the example below, then your build hang was likely caused by the OOM killer.

$ sudo dmesg | grep oom
[sudo] password for a0797059:
[3024499.564459] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1001.slice/session-2.scope,task=Cooker,pid=2364037,uid=1001
[3024499.564480] Out of memory: Killed process 2364037 (Cooker) total-vm:1302284kB, anon-rss:36840kB, file-rss:0kB, shmem-rss:24kB, UID:1001 pgtables:2284kB oom_score_adj:0
[3085219.883311] xz invoked oom-killer: gfp_mask=0x140dca(GFP_HIGHUSER_MOVABLE|__GFP_COMP|__GFP_ZERO), order=0, oom_score_adj=0
[3085219.883338]  oom_kill_process.cold+0xb/0x10
[3085219.883342]  __alloc_pages_may_oom+0x117/0x1e0
[3085219.883473] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[3085219.883492] [    668]   108   668     3706       32    65536      179          -900 systemd-oomd
[3085219.884409] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1001.slice/session-2.scope,task=cc1plus,pid=3120743,uid=1001
[3085219.884423] Out of memory: Killed process 3120743 (cc1plus) total-vm:4242032kB, anon-rss:4094828kB, file-rss:0kB, shmem-rss:0kB, UID:1001 pgtables:8232kB oom_score_adj:0
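
It is also possible to catch the issue while it is happening by monitoring memory usage and the kernel log live during the build. The commands below are a minimal sketch, assuming a standard Linux build host with the usual util-linux and procps tools installed; adjust them as needed for your distribution:

# Watch free RAM and swap, refreshing every 5 seconds, while the build runs
watch -n 5 free -h

# Follow the kernel log live; OOM killer messages will appear here as they happen
sudo dmesg -w | grep -i oom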

The issue can be addressed either by increasing the amount of RAM available for the build (adding physical RAM modules, increasing swap space, not loading the build machine with other tasks) or, since this may not always be practical, by re-configuring the Yocto build parallelism to build fewer things in parallel. Note that the earlier example came from a build machine with 64GB of RAM and 20 CPU cores building an AM64 SDK v9.0 tisdk-default-image, demonstrating that this can happen even on a rather powerful machine.
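
If adding physical RAM is not an option, increasing swap space can already help the build survive memory peaks, at the cost of build speed. The following is a minimal sketch for adding a swap file on a typical Ubuntu/Debian build host; the 16G size and /swapfile path are only examples, so pick values that fit your machine:

# Create and enable a 16 GB swap file (example values, adjust to your host)
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Verify the new swap space is active
free -h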

Yocto build parallelism is controlled through the BB_NUMBER_THREADS and PARALLEL_MAKE variables. These settings can be changed easily by updating (uncommenting and adjusting) the following section in the build's conf/local.conf file; an example of adjusted values is shown after the excerpt:

#
# Parallelism Options
#
# These two options control how much parallelism BitBake should use. The first
# option determines how many tasks bitbake should run in parallel:
#
# BB_NUMBER_THREADS ?= "1"
#
# The second option controls how many processes make should run in parallel when
# running compile tasks:
#
# PARALLEL_MAKE ?= "-j 1"
#
# For a quad-core machine, BB_NUMBER_THREADS = "4", PARALLEL_MAKE = "-j 4" would
# be appropriate for example
#
# NOTE: By default, bitbake will choose the number of processors on your host
# so you should not need to set this unless you are wanting to lower the number
# allowed.
#
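
For example, on a 64GB machine that hits OOM with the default settings, one might cap both variables at roughly half the core count. The values below are purely illustrative and should be tuned to your own RAM size and CPU count:

# conf/local.conf - example settings only, adjust for your build machine
BB_NUMBER_THREADS ?= "8"
PARALLEL_MAKE ?= "-j 8"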

In case of build issues, try setting BB_NUMBER_THREADS to "1" for debugging purposes and see if the build can now proceed. Note that before re-trying your build you may first need to manually clean up/kill any remaining Yocto processes that may still be running on your machine (such as bitbake-server). If this works, you can then try to find a more suitable (faster) setting for this variable that still works on your machine, using a value that is less than the number of CPUs available (but greater than 1).
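
A possible debug sequence, using the am62xx-evm machine and tisdk-default-image from the example above (verify what is actually still running before killing anything):

# Check for leftover BitBake processes from the aborted build
ps aux | grep -i bitbake

# Stop them so the next build starts from a clean state
pkill -f bitbake

# Set BB_NUMBER_THREADS ?= "1" in conf/local.conf, then retry the build
MACHINE=am62xx-evm bitbake -k tisdk-default-image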