kdump: Refactor devnode setting / derivation

Thanks to a suggestion from Emil (@xexaxo), we implement a less fragile
way of obtaining the "/home" mount point. Emil suggested that, instead of
using the device name directly, we could use the generic link, as in:
"/dev/disk/by-partsets/shared/home".

In principle the change would be simple, but it proved to be a bit tricky
due to the early boot stage at which kdump executes: at that point the
link is not available yet, so kdump collection needs to rely on the full
device name directly. We achieve that by saving this information in the
kdump initrd. This is not completely safe, see the CAVEAT below.
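
As a rough illustration of the save/restore idea above, here is a minimal
sketch. The helper names (save_devnode, load_devnode) and the file layout
are hypothetical and only meant to show the flow; the actual change lives
in the dracut module and the kdump collection script.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not the code added by this commit.

# Build time (regular boot): the stable link exists, so resolve it to the
# real device node and record the result in a file shipped in the initrd.
save_devnode() {
    link="$1"
    out="$2"
    devn="$(readlink -f "${link}")" || return 1
    [ -n "${devn}" ] || return 1
    printf '%s\n' "${devn}" > "${out}"
}

# Kdump time: the link is not available, so read the recorded node back.
load_devnode() {
    cat "$1"
}
```

Something like save_devnode "/dev/disk/by-partsets/shared/home" "<file>"
would run while building the initrd, and load_devnode "<file>" right
before the mount in the kdump environment.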

Also, we improved the kdump loading script by using "findmnt", a less
fragile / more elegant way of getting the "/home" mount point.
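
A minimal sketch of the "findmnt" lookup with basic error handling
follows. The wrapper name (mount_point_of) is hypothetical; only the
"findmnt -fno TARGET" invocation comes from this change.

```shell
#!/bin/sh
# Hypothetical wrapper for illustration; not the code added by this commit.

# Print the first mount point found for a device (or path), failing if
# findmnt finds nothing. Flags: -f first match only, -n no header line,
# -o TARGET print only the mount point column.
mount_point_of() {
    target="$(findmnt -fno TARGET "$1")" || return 1
    [ -n "${target}" ] || return 1
    printf '%s\n' "${target}"
}
```

Unlike the previous "mount | grep | head | cut" pipeline, this does not
depend on substring matches or on splitting the "mount" output by spaces.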

CAVEAT: NVMe multipathing introduced a level of "randomness" to device
naming on Linux, so "nvme0n1" could be "nvme1n1" in some boots if we
have more than one device. There is a kernel parameter to avoid that
("nvme_core.multipath=0"), see [0] for more information.
For this reason, we could in theory have different NVMe device
names between the regular kernel boot and the kdump one, hence causing a
failure in kdump collection.
In practice this is pretty much safe, since we don't have multiple NVMe
devices; we could also disable multipath in the kernel config
(CONFIG_NVME_MULTIPATH) or use the above cmdline.
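
For reference, a rough sketch of how one could check whether a system is
exposed to this caveat. The function name (nvme_naming_at_risk) and the
overridable sysfs root are assumptions made for illustration; the sketch
also assumes the usual sysfs layout for module parameters and NVMe
controllers. It is not part of this change.

```shell
#!/bin/sh
# Hypothetical check for illustration; not the code added by this commit.

# Succeeds (returns 0) when NVMe multipath is enabled AND more than one
# NVMe controller is present, i.e. when names might differ between boots.
# The sysfs root defaults to /sys and can be overridden for testing.
nvme_naming_at_risk() {
    sysroot="${1:-/sys}"
    param="${sysroot}/module/nvme_core/parameters/multipath"

    # Parameter missing or not "Y": multipath is compiled out or disabled.
    [ -r "${param}" ] || return 1
    [ "$(cat "${param}")" = "Y" ] || return 1

    # Count the NVMe controllers exposed by the kernel.
    count=0
    for ctrl in "${sysroot}"/class/nvme/nvme*; do
        [ -e "${ctrl}" ] && count=$((count + 1))
    done
    [ "${count}" -gt 1 ]
}
```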

[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1792660/

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Guilherme G. Piccoli
2022-02-22 19:02:08 -03:00
parent e77d43c08f
commit 725d6d7149
5 changed files with 37 additions and 13 deletions

@ -93,6 +93,16 @@
# https://lists.archlinux.org/pipermail/aur-general/2022-January/036767.html
# https://aur.archlinux.org/packages/makedumpfile/#comment-843853
#
# (e) NVMe multipathing introduced a level of "randomness" to device naming on
# Linux, so "nvme0n1" could be "nvme1n1" in some boots if we have more than
# one NVMe device. There's a kernel parameter to avoid that
# ("nvme_core.multipath=0"). So, since we rely on getting the NVMe device name
# to be used in kdump during the regular boot process, we could in theory have
# different names between the regular kernel boot and the kdump one, hence
# causing a failure in kdump collection. In practice this is pretty much safe
# now, since we don't have multiple NVMe devices; we could also disable
# multipath in the kernel config (CONFIG_NVME_MULTIPATH) or use the above cmdline.
#
#
# TODOs
# ###########################################################################
@ -103,10 +113,8 @@
# images - it happens due to our package installing files on directory
# "/usr/lib/dracut/modules.d" which triggers the unfortunate initramfs rebuild.
#
# * We have a "fragile" way of determining a mount point required for Kdump;
# this is something to improve maybe, in order to make the Kdump more reliable.
# Also in the list of fragile things, VDF parsing is...complicated. Something
# that would be nice to improve as well.
# * VDF parsing would benefit from some improvement; it's at least "fragile"
# for now, to be generous... but improving it seems a bit complicated.
#
# * Pstore ramoops back-end has some limitations that we're discussing with
# the kernel community - right now we can only collect ONE dmesg and its

@ -24,8 +24,10 @@ if [ ! -f $VMCORE ]; then
reboot -f
fi
DEVN="$(cat /usr/lib/kdump/kdump.devnode)"
mkdir -p "/kdump_path"
if ! mount "/dev/${MOUNT_DEVNODE}" /kdump_path; then
if ! mount "${DEVN}" /kdump_path; then
reboot -f
fi
@ -39,7 +41,7 @@ if [ "${FULL_COREDUMP}" -ne 0 ]; then
sync "${KDUMP_FOLDER}/vmcore.compressed"
fi
umount "/dev/${MOUNT_DEVNODE}"
umount "${DEVN}"
sync
reboot -f

@ -53,8 +53,8 @@ fi
. /etc/default/kdump
# Fragile way for finding the proper mount point for DEVNODE:
DEVN_MOUNTED=$(mount |grep "${MOUNT_DEVNODE}" | head -n1 | cut -f3 -d\ )
# Find the proper mount point for /home:
DEVN_MOUNTED="$(findmnt "${MOUNT_DEVNODE}" -fno TARGET)"
KDUMP_FOLDER="${DEVN_MOUNTED}/${KDUMP_FOLDER}"
echo "${KDUMP_FOLDER}" > "${KDUMP_MNT}"

@ -9,12 +9,13 @@
# /usr/lib/kdump/kdump-load.sh initrd
#
# Mount-related options - the DEVNODE must exist and be available during the
# kdump script execution. The KDUMP_FOLDER will be create if doesn't exist.
# The KDUMP_MNT is just a temporary file that will carry the mounted folder
# path across boot-time scripts.
# Mount-related options - the DEVNODE points to the /home directory link;
# this is used to derive the numerical devnode for kdump, since the link
# is not present so early in the system boot. The KDUMP_FOLDER will be
# created if it doesn't exist. The KDUMP_MNT is just a temporary file that
# carries the mounted folder path across boot-time scripts.
MOUNT_DEVNODE="nvme0n1p8"
MOUNT_DEVNODE="/dev/disk/by-partsets/shared/home"
KDUMP_FOLDER="/.steamos/offload/var/kdump"
KDUMP_MNT="/tmp/kdump.mnt"

@ -18,6 +18,13 @@ installkernel() {
}
install() {
# Having a valid /etc/default/kdump is essential for kdump.
if [ ! -f "/etc/default/kdump" ]; then
return 1
fi
. /etc/default/kdump
# First clear all unnecessary firmwares/drivers added by drm in order to
# reduce the size of this minimal initramfs being created. This should
# be already done via command-line arguments, but let's play safe and delete
@ -31,6 +38,12 @@ install() {
inst makedumpfile
mkdir -p $initdir/usr/lib/kdump
# Determine the numerical devnode for kdump, and save it in the initrd;
# note that the partset link is not available that early at boot time.
DEVN="$(readlink -f "${MOUNT_DEVNODE}")"
echo "${DEVN}" > $initdir/usr/lib/kdump/kdump.devnode
cp -LR --preserve=all /usr/lib/kdump/* $initdir/usr/lib/kdump/
cp -LR --preserve=all /etc/default/kdump $initdir/usr/lib/kdump/kdump.etc