kdump-load/save-logs: Log submission decoupling / major refactor

This is a fairly large refactor of the logic and goals of this kdump
implementation.

* WHY?

We want to completely decouple the log submission mechanism from the
kdump tooling, for two main reasons: to reuse this submission API/mechanism
in other log collection tools, and to allow upstreaming the kdump tooling
for Arch Linux generically, without embedding SteamOS particulars in it.

* HOW:

First of all, we dropped the log submission bits from this codebase.
We also deleted the SteamOS/Deck particulars from the log naming,
like collecting the serial of the device if the "Jupiter" model is found
in the DMI info, or getting the Steam user account via the VDF file.
All of that will happen in a later stage of the log processing, done by
*another tool* that shall rename the logs and transmit them to the
Valve servers.

While at it, we've made other small changes in the logic to make this
kdump tool more generic and reliable, like allowing the collection
of kdump *AND* pstore logs (instead of choosing only one of them).
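
As a rough illustration of the "kdump *AND* pstore" idea, here is a minimal
sketch in POSIX shell. The helper names (count_entries, count_all_logs) are
hypothetical and are not part of this package; the point is only that both
sources are always inspected and their counts are added, instead of the
second source being checked only when the first one is empty.

```sh
#!/bin/sh
# Hypothetical helper (not in the package): count the direct entries of a
# directory, printing 0 if the directory does not exist.
# Arg1: directory to inspect
count_entries() {
    [ -d "$1" ] || { echo 0; return 0; }
    find "$1" -mindepth 1 -maxdepth 1 2>/dev/null | wc -l
}

# Hypothetical helper: add up the logs from both sources, so that finding
# pstore logs does not prevent kdump logs from being counted as well.
# Arg1: pstore directory / Arg2: kdump crash directory
count_all_logs() {
    pstore_cnt=$(count_entries "$1")
    kdump_cnt=$(count_entries "$2")
    echo $((pstore_cnt + kdump_cnt))
}
```

For instance, `count_all_logs /sys/fs/pstore "${KDUMP_MAIN_FOLDER}/crash"`
would report the combined number of entries, in the same spirit as
LOGS_FOUND being accumulated across both code paths.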

* CAVEATS / TODO:

More to come on this front: we still need to remove more references
to SteamOS and clean the code a bit of its particulars.
It is also important to update the README to reflect the changes made
by the upstreaming effort.

Mea culpa: these changes are invasive and switch some logic and
expectations around the package, so making them fully bisectable
would be way harder than not. Hence, please take that into account:
this series should be tested/merged as a whole; it's not guaranteed
that individual patches work correctly in a standalone fashion.

Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Guilherme G. Piccoli
2022-12-19 14:24:52 -03:00
parent a6c4866e3a
commit 48fb326733
3 changed files with 75 additions and 270 deletions


@@ -50,7 +50,6 @@ grub_update() {
# This function is responsible for creating the kdump initrd, either
# via command-line call or in case initrd doesn't exist during kdump load.
create_initrd() {
mkdir -p "${KDUMP_FOLDER}"
rm -f "${KDUMP_FOLDER}/kdump-initrd-$(uname -r).img"
echo "Creating the kdump initramfs for kernel \"$(uname -r)\" ..."
@@ -87,7 +86,7 @@ fi
. /usr/share/kdump/kdump.conf
# Find the proper mount point for /home:
# Find the proper mount point expected for kdump collection:
DEVN_MOUNTED="$(findmnt "${MOUNT_DEVNODE}" -fno TARGET)"
# Create the kdump folder here, as soon as possible, given the
@@ -98,6 +97,8 @@ mkdir -p "${KDUMP_FOLDER}"
echo "${KDUMP_FOLDER}" > "${KDUMP_MNT}"
sync "${KDUMP_MNT}"
# Notice that at this point it's required to have the full
# KDUMP_FOLDER, so this must remain after the DEVNODE operations above.
if [ "$1" = "initrd" ]; then
create_initrd
exit 0


@@ -39,8 +39,3 @@ GRUB_AUTOSET=1
# relies in having an available RAM buffer on /proc/iomem with at least 5MiB
# in size.
USE_PSTORE_RAM=1
# By default, collected logs are submitted automatically to Valve servers.
# Setting LOG_SUBMISSION to '0' will disable this behavior; but notice that
# even with the log submission disabled, the logs are saved locally.
LOG_SUBMISSION=1


@@ -5,68 +5,11 @@
# Copyright (c) 2021 Valve.
# Maintainer: Guilherme G. Piccoli <gpiccoli@igalia.com>
#
# This is the SteamOS kdump/pstore log collector and submitter; this script
# prepares the pstore/kdump collected data and submit it to the services that
# handle support at Valve. It considers pstore as a first alternative, if no
# logs found (or if pstore is not mounted for some reason), tries to check
# if kdump logs are present.
# This is the SteamOS kdump/pstore log collector; this script prepares the
# pstore/kdump collected data and saves it to the local disk on the next
# successful boot.
#
# Function to bail-out in case logs can't/shouldn't be sent to Valve servers.
# Arg1: folder / Arg2: full filename
save_locally_and_bail() {
mkdir -p "$1"
mv "$2" "$1"
LOG_FNAME="$(basename "$2")"
logger "kdump-steamos: logs not submitted, only saved locally ($1/${LOG_FNAME})"
exit 0
}
# Next function is used to get Steam Account/ID from the VDF file for
# the current user; in case it fails, STEAM_ID and STEAM_ACCOUNT vars
# aren't updated.
# Arg1: devnode information to determine the /home mount point.
get_steam_account_id() {
# Step 1: get the /home mount point.
HOMEFLD="$(findmnt "$1" -fno TARGET)"
# Step 2: determine the username; notice that
# UID_MIN = 1000, UID_MAX = 60000 from "/etc/login.defs", but
# getent takes long time to check all of them, so we restrict
# to UID = 1000 only.
USERN="$(getent passwd 1000 | cut -f1 -d:)"
# Let's determine the VDF file location using the above info.
LOGINVDF="${HOMEFLD}/${USERN}/.local/share/Steam/config/loginusers.vdf"
if [ ! -s "${LOGINVDF}" ]; then # bail if no valid VDF is found.
return
fi
# Step 3: Parse the VDF file to obtain Account/ID; the following AWK
# command was borrowed from: https://unix.stackexchange.com/a/663959.
NUMREG=$(grep -c AccountName "${LOGINVDF}")
IDX=1
while [ ${IDX} -le "${NUMREG}" ]; do
MR=$(awk -v n=${IDX} -v RS='}' 'NR==n{gsub(/.*\{\n|\n$/,""); print}' "${LOGINVDF}" | grep "MostRecent" | cut -f4 -d\")
if [ "$MR" -ne 1 ]; then
IDX=$((IDX + 1))
continue
fi
STEAM_ACCOUNT=$(awk -v n=${IDX} -v RS='}' 'NR==n{gsub(/.*\{\n|\n$/,""); print}' "${LOGINVDF}" | grep "AccountName" | cut -f4 -d\")
# Get also the Steam ID, used in the POST request to Valve servers; this
# is a bit fragile, but there's no proper VDF parse tooling it seems...
LN=$(grep -n "AccountName.*${STEAM_ACCOUNT}\"" "${LOGINVDF}" | cut -f1 -d:)
LN=$((LN - 2))
STEAM_ID=$(sed -n "${LN}p" "${LOGINVDF}" | cut -f2 -d\")
break
done
}
# We do some validation to be sure KDUMP_MNT pointed path is valid...
# That and having a valid /usr/share/kdump/kdump.conf are essential conditions.
if [ ! -s "/usr/share/kdump/kdump.conf" ]; then
@@ -85,19 +28,17 @@ if [ ! -d "${KDUMP_MAIN_FOLDER}" ]; then
fi
LOGS_FOUND=0
KDUMP_LOGS_FOLDER="${KDUMP_MAIN_FOLDER}/logs"
KDUMP_TMP_FOLDER="${KDUMP_MAIN_FOLDER}/.tmp"
# Use UTC timezone to match kdump collection
CURRENT_TSTAMP=$(date -u +"%Y%m%d%H%M")
CURRENT_EPOCH=$(date +"%s")
# We assume pstore is mounted by default, in this location;
# if not, we get a 0 and don't loop.
# By default, pstore is mounted in this location; if it isn't, we bail-out.
# Notice we currently only support the logs generated by the ramoops backend.
PSTORE_CNT=$(find /sys/fs/pstore/* 2>/dev/null | grep -c ramoops)
if [ "${PSTORE_CNT}" -ne 0 ]; then
# Dump the pstore logs in the <...>/kdump/logs/pstore subfolder.
PSTORE_FOLDER="${KDUMP_LOGS_FOLDER}/pstore"
PSTORE_FOLDER="${KDUMP_TMP_FOLDER}/pstore"
mkdir -p "${PSTORE_FOLDER}"
LOOP_CNT=0
@@ -114,23 +55,25 @@ if [ "${PSTORE_CNT}" -ne 0 ]; then
done
LOGS_FOUND=${LOOP_CNT}
# Logs should live on logs/ folder (no subfolders), due to the zip file
mv "${PSTORE_FOLDER}"/* "${KDUMP_LOGS_FOLDER}/" 2>/dev/null
# Logs should live on <...>/.tmp folder, due to the zip compression.
mv "${PSTORE_FOLDER}"/* "${KDUMP_TMP_FOLDER}/" 2>/dev/null
rm -rf "${PSTORE_FOLDER}"
fi
# Enter the else block in case we don't have pstore logs - maybe we
# have kdump logs then.
else
# Now, we proceed the same way if there are kdump data.
KDUMP_CRASH_FOLDER="${KDUMP_MAIN_FOLDER}/crash"
KDUMP_CNT=$(find "${KDUMP_CRASH_FOLDER}"/* -type d 2>/dev/null | wc -l)
KDUMP_CNT=$(find "${KDUMP_CRASH_FOLDER}"/* -type d 2>/dev/null | wc -l)
if [ "${KDUMP_CNT}" -ne 0 ]; then
# Dump the kdump logs in the <...>/kdump/logs/kdump subfolder.
KD_FOLDER="${KDUMP_LOGS_FOLDER}/kdump"
KD_FOLDER="${KDUMP_TMP_FOLDER}/kdump"
mkdir -p "${KD_FOLDER}"
LOOP_CNT=0
while [ "${KDUMP_CNT}" -gt 0 ]; do
CRASH_CURRENT=$(find "${KDUMP_CRASH_FOLDER}"/* -type d 2>/dev/null | head -n1)
# When collecting the vmcore/dmesg during kdump, folder is
# saved with its name == the timestamp of the collection.
CRASH_TSTAMP=$(basename "${CRASH_CURRENT}")
if [ -s "${CRASH_CURRENT}/dmesg.txt" ]; then
@@ -140,7 +83,8 @@ else
fi
# We don't care about submitting a vmcore, but let's save it if such file exists.
# We won't pack vmcores in the zip blob, but let's save
# it in case it was collected as well.
if [ -s "${CRASH_CURRENT}/vmcore.compressed" ]; then
SAVED_FILE="${KDUMP_CRASH_FOLDER}/vmcore.${CRASH_TSTAMP}"
mv "${CRASH_CURRENT}/vmcore.compressed" "${SAVED_FILE}"
@@ -151,180 +95,45 @@ else
rm -rf "${CRASH_CURRENT}"
KDUMP_CNT=$((KDUMP_CNT - 1))
LOOP_CNT=$((LOOP_CNT + 1))
done
LOGS_FOUND=$((LOGS_FOUND + LOOP_CNT))
# Logs should live on logs/ folder (no subfolders), due to the zip file
mv "${KD_FOLDER}"/* "${KDUMP_LOGS_FOLDER}/" 2>/dev/null
# Logs should live on .tmp folder, due to the zip compression.
mv "${KD_FOLDER}"/* "${KDUMP_TMP_FOLDER}/" 2>/dev/null
rm -rf "${KD_FOLDER}"
fi
fi
# If we have pstore and/or kdump logs, let's process them in order to submit...
# If we have pstore and/or kdump logs, let's process them...
KDUMP_LOGS_FOLDER="${KDUMP_MAIN_FOLDER}/logs"
if [ ${LOGS_FOUND} -ne 0 ]; then
mkdir -p "${KDUMP_LOGS_FOLDER}"
PNAME="$(dmidecode -s system-product-name)"
if [ "${PNAME}" = "Jupiter" ]; then
SN="$(dmidecode -s system-serial-number)"
else
SN=0
fi
# First we collect some more info, like DMI data, os-release, etc;
DMI_FNAME="${KDUMP_TMP_FOLDER}/dmidecode.${CURRENT_TSTAMP}"
dmidecode > "${DMI_FNAME}"
STEAM_ACCOUNT=0
STEAM_ID=0
get_steam_account_id "${MOUNT_DEVNODE}"
# Here we collect some more info, like DMI data, os-release, etc;
# TODO: Add Steam application / Proton / Games logs collection...
dmidecode > "${KDUMP_LOGS_FOLDER}/dmidecode.${CURRENT_TSTAMP}"
BUILD_FNAME="${KDUMP_LOGS_FOLDER}/build.${CURRENT_TSTAMP}"
BUILD_FNAME="${KDUMP_TMP_FOLDER}/build.${CURRENT_TSTAMP}"
cp "/etc/os-release" "${BUILD_FNAME}"
VERSION_FNAME="${KDUMP_LOGS_FOLDER}/version.${CURRENT_TSTAMP}"
VERSION_FNAME="${KDUMP_TMP_FOLDER}/version.${CURRENT_TSTAMP}"
uname -r > "${VERSION_FNAME}"
# Before compressing the logs, save a crash summary
CRASH_SUMMARY="${KDUMP_LOGS_FOLDER}/crash_summary.${CURRENT_TSTAMP}"
SED_EXPR="/Kernel panic \-/,/Kernel Offset\:/p"
sed -n "${SED_EXPR}" "${KDUMP_LOGS_FOLDER}"/dmesg* > "${CRASH_SUMMARY}"
sync "${BUILD_FNAME}" "${VERSION_FNAME}" "${CRASH_SUMMARY}"
sync "${DMI_FNAME}" "${BUILD_FNAME}" "${VERSION_FNAME}"
# Create the dump compressed pack.
LOG_FNAME="steamos-${SN}-${STEAM_ACCOUNT}.${CURRENT_TSTAMP}.zip"
LOG_FNAME="${KDUMP_MAIN_FOLDER}/${LOG_FNAME}"
zip -9 -jq "${LOG_FNAME}" "${KDUMP_LOGS_FOLDER}"/* 1>/dev/null 2>&1
LOG_FNAME="kdump-${CURRENT_TSTAMP}.zip"
LOG_FNAME="${KDUMP_LOGS_FOLDER}/${LOG_FNAME}"
zip -9 -jq "${LOG_FNAME}" "${KDUMP_TMP_FOLDER}"/* 1>/dev/null
sync "${LOG_FNAME}" 2>/dev/null
if [ ! -s "${LOG_FNAME}" ]; then
logger "kdump-steamos: couldn't create the log archive, aborting..."
exit 0
else
logger "kdump-steamos: logs saved locally (check ${KDUMP_LOGS_FOLDER})"
fi
fi
##############################
# Log submission mechanism #
##############################
NOT_SENT_FLD="${KDUMP_MAIN_FOLDER}/not_sent_logs"
SENT_FLD="${KDUMP_MAIN_FOLDER}/sent_logs"
# The POST request requires a valid Steam ID.
if [ "${STEAM_ID}" -eq 0 ]; then
logger "kdump-steamos: invalid Steam ID, cannot submit logs"
LOG_SUBMISSION=0 # force to enter next conditional
fi
# If users don't want to submit the logs (or Steam ID is invalid),
# just save them locally and bail out.
if [ "${LOG_SUBMISSION}" -eq 0 ]; then
rm -rf "${KDUMP_LOGS_FOLDER}"
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
# Construct the POST request fields...
REQ_DUMP_SZ="$(stat --printf="%s" "${LOG_FNAME}")"
REQ_PRODUCT="holo"
REQ_BUILD="$(grep "BUILD_ID" "${BUILD_FNAME}" | cut -f2 -d=)"
REQ_VER="$(cat "${VERSION_FNAME}")"
REQ_PLATFORM="linux"
REQ_TIME="${CURRENT_EPOCH}"
STACK_SED_EXPR="/ Call Trace\:/,/ RIP\:/p"
REQ_STACK="$(sed -n "${STACK_SED_EXPR}" "${CRASH_SUMMARY}" | sed "1d")"
REQ_NOTE="$(cat "${CRASH_SUMMARY}")"
POST_REQ="steamid=${STEAM_ID}&have_dump_file=1&dump_file_size=${REQ_DUMP_SZ}&product=${REQ_PRODUCT}&build=${REQ_BUILD}"
POST_REQ="${POST_REQ}&version=${REQ_VER}&platform=${REQ_PLATFORM}&crash_time=${REQ_TIME}&stack=${REQ_STACK}&note=${REQ_NOTE}&format=json"
# Now we can safely delete this folder.
rm -rf "${KDUMP_LOGS_FOLDER}"
# Network validation before log submission
LOOP_CNT=0
MAX_LOOP=99
TEST_URL="steampowered.com"
while [ ${LOOP_CNT} -lt ${MAX_LOOP} ]; do
if ping -i 0.5 -w 2 -c 2 "${TEST_URL}" 1>/dev/null 2>&1; then
break
fi
LOOP_CNT=$((LOOP_CNT + 1))
sleep 1
done
# Bail out in case we have network issues
if [ ${LOOP_CNT} -ge ${MAX_LOOP} ]; then
logger "kdump-steamos: network issue - cannot send logs"
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
# These URLs are hardcoded based on Valve's server information.
START_URL="https://api.steampowered.com/ICrashReportService/StartCrashUpload/v1"
FINISH_URL="https://api.steampowered.com/ICrashReportService/FinishCrashUpload/v1"
CURL_ERR="${KDUMP_MAIN_FOLDER}/.curl_err"
RESPONSE_FILE="${KDUMP_MAIN_FOLDER}/.curl_response"
if ! curl -X POST -d "${POST_REQ}" "${START_URL}" 1>"${RESPONSE_FILE}" 2>"${CURL_ERR}"; then
logger "kdump-steamos: curl issues - failed in the log submission POST (err=$?)"
#rm -f "${RESPONSE_FILE}" # keep this for now, as debug information
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
RESPONSE_PUT_URL="$(jq -r '.response.url' "${RESPONSE_FILE}")"
RESPONSE_GID="$(jq -r '.response.gid' "${RESPONSE_FILE}")"
# Construct the PUT request based on the POST response
CURL_PUT_HEADERS="${KDUMP_MAIN_FOLDER}/.curl_put_headers"
PUT_HEADERS_LEN=$(jq '.response.headers.pairs | length' "${RESPONSE_FILE}")
# Validate the response headers; allow a maximum of 20 arguments for now...
if [ "${PUT_HEADERS_LEN}" -le 0 ] || [ "${PUT_HEADERS_LEN}" -gt 20 ]; then
logger "kdump-steamos: unsupported number of response headers (${PUT_HEADERS_LEN}), aborting..."
#rm -f "${RESPONSE_FILE}" # keep this for now, as debug information
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
LOOP_CNT=0
while [ ${LOOP_CNT} -lt "${PUT_HEADERS_LEN}" ]; do
NAME="$(jq -r ".response.headers.pairs[${LOOP_CNT}].name" "${RESPONSE_FILE}")"
VAL="$(jq -r ".response.headers.pairs[${LOOP_CNT}].value" "${RESPONSE_FILE}")"
echo "${NAME}: ${VAL}" >> "${CURL_PUT_HEADERS}"
LOOP_CNT=$((LOOP_CNT + 1))
done
rm -f "${RESPONSE_FILE}"
if ! curl -X PUT --data-binary "@${LOG_FNAME}" -H "@${CURL_PUT_HEADERS}" "${RESPONSE_PUT_URL}" 1>/dev/null 2>"${CURL_ERR}"; then
logger "kdump-steamos: curl issues - failed in the log submission PUT (err=$?)"
#rm -f "${CURL_PUT_HEADERS}" # keep this for now, as debug information
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
if ! curl -X POST -d "gid=${RESPONSE_GID}" "${FINISH_URL}" 1>/dev/null 2>"${CURL_ERR}"; then
logger "kdump-steamos: curl issues - failed in the log finish POST (err=$?)"
#rm -f "${CURL_PUT_HEADERS}" # keep this for now, as debug information
save_locally_and_bail "${NOT_SENT_FLD}" "${LOG_FNAME}"
fi
# If we reached this point, the zipped log should have been submitted
# successfully; save a local copy as well.
# TODO: implement a clean-up routine to just keep up to N logs...
rm -f "${CURL_PUT_HEADERS}"
rm -f "${CURL_ERR}"
mkdir -p "${SENT_FLD}"
logger "kdump-steamos: successfully submitted crash logs to Valve"
mv "${LOG_FNAME}" "${SENT_FLD}"
LOG_FNAME="$(basename "${LOG_FNAME}")"
logger "kdump-steamos: logs also saved locally (${SENT_FLD}/${LOG_FNAME})"
fi
rm -rf "${KDUMP_TMP_FOLDER}"
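
The staging flow introduced by this diff (collect files under a per-source
subfolder of .tmp, flatten them into .tmp, then pack everything into
logs/kdump-<timestamp>.zip) can be sketched as below. This is a minimal
illustration and not the script's actual code: flatten_into_staging and
archive_path are hypothetical helper names, and the zip step itself is left
out on purpose.

```sh
#!/bin/sh
# Hypothetical helper: move every file from a per-source subfolder up into
# the staging folder, then drop the subfolder, so that the archive created
# later holds a flat list of files.
# Arg1: staging folder / Arg2: per-source subfolder name (e.g. pstore, kdump)
flatten_into_staging() {
    staging="$1"
    sub="${staging}/$2"
    [ -d "${sub}" ] || return 0
    # An empty subfolder makes the glob fail to expand; ignore that case.
    mv "${sub}"/* "${staging}/" 2>/dev/null || true
    rm -rf "${sub}"
}

# Hypothetical helper: build the archive path using a UTC timestamp, in the
# form <main folder>/logs/kdump-<YYYYmmddHHMM>.zip.
# Arg1: main kdump folder
archive_path() {
    echo "$1/logs/kdump-$(date -u +"%Y%m%d%H%M").zip"
}
```

Keeping the staging area separate from logs/ means the final folder only
ever contains finished archives, and the trailing `rm -rf` of the staging
folder cannot remove them.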