As a kernel developer, I need to compile and install custom kernels every day, and any improvement to this workflow makes me more productive. While installing my freshly compiled modules, I noticed that the process would get stuck compressing amdgpu for some time:
XZ /usr/lib/modules/6.2.0-tonyk/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz
XZ format
My target machine is the Steam Deck, which uses .xz for compressing the modules. Given that we want gamers to be able to install as many games as possible, the OS shouldn’t waste much disk space. amdgpu, when compiled with debug symbols, can use a good chunk of space. Here’s a comparison of the module's size on disk uncompressed, and then with .zst and .xz compression:
360M amdgpu.ko
61M amdgpu.ko.zst
38M amdgpu.ko.xz
This more compact module comes at a cost: more CPU time for compression.
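A comparison like the one above can be reproduced for any module. The sketch below is a minimal example, assuming the xz and zstd command-line tools are installed and using their default settings; the kernel build may invoke them with different options, so the exact numbers can differ. report_sizes is a hypothetical helper name.

```shell
# Print the size in bytes of a file as-is and after each available compressor.
# Assumption: xz and zstd run with their default settings, which may not
# match the options used by the kernel build.
report_sizes() {
    file="$1"
    printf '%s\t%s\n' "$(wc -c < "$file" | tr -d ' ')" "uncompressed"
    for tool in zstd xz; do
        # Skip a compressor that is not installed instead of failing.
        if command -v "$tool" >/dev/null 2>&1; then
            # -c writes to stdout, so the input file is left untouched.
            printf '%s\t%s\n' "$("$tool" -q -c "$file" | wc -c | tr -d ' ')" "$tool"
        fi
    done
}
```

Calling report_sizes amdgpu.ko prints one line per format, with the size in bytes first.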
Multithread compression
When I opened htop, I saw that only one lonely thread was doing the hard work of compressing amdgpu, even though compression is an easily parallelizable task. I then hacked scripts/Makefile.modinst so XZ would use as many threads as possible, with the option -T0. On my main build machine, modules_install ran about 3.5 times faster!
# before the patch
$ time make modules_install -j16
Executed in 100.08 secs
# after the patch
$ time make modules_install -j16
Executed in 28.60 secs
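To put a number on the gain, the two timings above can be divided. A small sketch, assuming awk is available; speedup is a hypothetical helper name:

```shell
# Print how many times faster the second run is, rounded to one decimal place.
speedup() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

speedup 100.08 28.60   # prints 3.5
```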
Then, I submitted a patch to make this the default for everyone: [PATCH] kbuild: modinst: Enable multithread xz compression
However, as Masahiro Yamada pointed out, we shouldn’t be spawning numerous threads in the build system without the user requesting it. To this day, we manually specify how many threads to run with make -jX.
Fortunately, Nathan Chancellor suggested that the same results can be achieved using XZ_OPT=-T0, so we can still benefit from this without the patch. I experimented with different -TX and -jY values, but on my notebook the most efficient values were X = Y = nproc. You can check some results below:
$ make modules_install
174.83 secs
$ make modules_install -j8
100.55 secs
$ make modules_install XZ_OPT=-T0
81.51 secs
$ make modules_install -j8 XZ_OPT=-T0
53.22 secs
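The X = Y = nproc finding can be put into a reusable form. The sketch below only prints the command line rather than running it, so it can be inspected first. modinst_cmd is a hypothetical helper name, and nproc is assumed to be available (it ships with GNU coreutils). It relies on two behaviours: xz reads extra options from the XZ_OPT environment variable, and make exports variables given on its command line to the commands it runs.

```shell
# Print a modules_install command that uses N make jobs and N xz threads.
# N defaults to the processor count reported by nproc (assumed installed).
modinst_cmd() {
    n="${1:-$(nproc)}"
    printf 'make modules_install -j%s XZ_OPT=-T%s\n' "$n" "$n"
}

modinst_cmd 8   # prints: make modules_install -j8 XZ_OPT=-T8
```

Without an argument the helper falls back to the nproc count. Passing -T0 instead, as in the runs above, lets xz pick the thread count itself.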