05user: report ENOSPC in initrd when entering e-shell
When a user tries to boot a live system with an appended rootfs on a
machine without enough RAM, the kernel will unpack the initramfs and
rootfs until it runs out of memory, and then continue booting with a
partial OS image.
This causes random confusing boot errors, such as a service failing
because its binary is missing, or the rootfs squashfs failing to mount
with some error that looks like filesystem corruption.

Make this failure less confusing by having timeout.sh try to write a small
file, and if that fails, tell the user that they need more RAM.  Suppress
reporting of unit failures in this case, since the unit logs will just
muddy the issue.
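
The write probe described above can be sketched as a standalone shell
function. This is only an illustration under stated assumptions, not the
code from this commit: the function name can_write_probe and its directory
argument are hypothetical, the block size is spelled as 4096 bytes for
portability, and unlike the hook it cleans up the probe file afterward.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): return 0 if a 64 KiB file
# can be written in the given directory, nonzero otherwise (for example
# when a RAM-backed filesystem is full and the write fails with ENOSPC).
can_write_probe() {
    dir="$1"
    probe="${dir}/check-space.$$"
    # 16 blocks of 4096 bytes = 64 KiB
    if dd if=/dev/zero of="${probe}" bs=4096 count=16 2>/dev/null; then
        rm -f "${probe}"
        return 0
    fi
    # Remove any partially written file before reporting failure
    rm -f "${probe}"
    return 1
}
```

A caller would gate its diagnostics on the result, e.g. print the "need
more RAM" message when can_write_probe /tmp fails and show unit logs
otherwise.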

This works reliably when the rootfs image can't be unpacked, i.e. when
the system has a moderate amount of RAM.  It doesn't work reliably when
we're so low on RAM that the initramfs image can't be unpacked (because
the compressed initramfs and rootfs images have used up our RAM); in that
case we might be missing important binaries/libraries and boot can fail
at any point.  It would be possible to construct a more robust detection
mechanism, e.g. by running a statically-linked program that's unpacked
first.  But this can't be completely robust (/sbin/init might not have
been unpacked, so we might not boot at all) and doesn't seem worth it for
now.

This code will also trigger if we run out of RAM in the
coreos.live.rootfs_url case, which is nice but less important, since that
case produces a clear error during rootfs fetch.

Closes coreos/fedora-coreos-tracker#1055.
bgilbert committed Apr 21, 2022
1 parent 79b1221 commit 13106a6
Showing 2 changed files with 25 additions and 6 deletions.
@@ -15,7 +15,8 @@ install_unit_wants() {
 install() {
     inst_multiple \
         cut \
-        date
+        date \
+        dd
 
     inst_hook emergency 99 "${moddir}/timeout.sh"
 
@@ -53,11 +53,29 @@ downloaded from GitHub:
 EOF
     fi
-    echo "Displaying logs from failed units: ${failed}"
-    for unit in ${failed}; do
-        # 10 lines should be enough for everyone
-        journalctl -b --no-pager --no-hostname -u ${unit} -n 10
-    done
+
+    # If this is a live boot, check for ENOSPC in initramfs filesystem
+    # Try creating a 64 KiB file, in case a small file was deleted on
+    # service failure
+    # https://github.com/coreos/fedora-coreos-tracker/issues/1055
+    if [ -f /etc/coreos-live-initramfs ] && \
+        ! dd if=/dev/zero of=/tmp/check-space bs=4K count=16 2>/dev/null; then
+        cat <<EOF
+------
+Ran out of memory when unpacking initrd filesystem. Ensure your system has
+at least 2 GiB RAM if booting with coreos.live.rootfs_url, or 4 GiB otherwise.
+------
+EOF
+        # Don't show logs from failed units, since they'll just be
+        # random misleading errors.
+    else
+        echo "Displaying logs from failed units: ${failed}"
+        for unit in ${failed}; do
+            # 10 lines should be enough for everyone
+            journalctl -b --no-pager --no-hostname -u ${unit} -n 10
+        done
+    fi
 fi
 
 # Regularly prompt with time remaining. This ensures the prompt doesn't