Why did my remote device reboot?
Imagine a remote system hundreds of kilometers away from civilization. A customer reports that their system sometimes reboots without any interaction and is then not reachable for several minutes. What triggered the reboot? In this article, we look at an approach that uses reserved memory. It is quite generic and relies on the fact that RAM keeps its contents across a restart as long as it stays powered and no other hardware or software mechanism clears it.
For this article an Apalis iMX8QM from Toradex was used with Reference Multimedia Image 5.1.0.
Ramoops
In a previous article about ramoops we saw a mechanism to preserve kernel panics and other log messages across reboots by using a reserved memory region inside RAM. Here we use a similar concept but implement a separate kernel module which uses a reserved memory region to store the reason for the reboot.
Why did the device reboot?
The ramoops solution is a nice way to figure out why a system crashed and to read out the backtrace after a kernel oops. However, it doesn't tell us exactly why a system rebooted. In this article we look for a way to record the reason for each restart. We are interested in the following reboot reasons:
- Software reboot (reboot command)
- Power-off (command or unplugging)
- Kernel Panic
- Watchdog
- Power drop
Reset reason module
We implement a small platform driver which uses reserved memory to store the reason for a reboot. Let us have a look at the device tree:
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;
	reset_reason_mem: reset_reason@880100000 {
		#address-cells = <2>;
		#size-cells = <2>;
		no-map;
		reg = <0x00000008 0x80100000 0x0 0x1000>; // 4kB at 0x880100000
	};
};
reset-reason {
	compatible = "reset-reason";
	memory-region = <&reset_reason_mem>;
};
The kernel module matches the compatible string "reset-reason" and expects the memory-region property to hold a phandle to a reserved memory region.
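The reg property above uses two 32-bit cells for the address and two for the size, as set by #address-cells and #size-cells. The following is a minimal user-space sketch of how those four cells combine into the 64-bit base address and size. The names dt_cells_to_u64, dt_region and dt_decode_reg are hypothetical helpers for illustration and not part of any kernel API.

```c
#include <stdint.h>

/* Combine two 32-bit device tree cells (high cell first) into one 64-bit value. */
uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical decoded form of a reg property with 2 address and 2 size cells. */
struct dt_region {
    uint64_t base;
    uint64_t size;
};

/* Decode the four cells <addr_hi addr_lo size_hi size_lo> of one reg entry. */
struct dt_region dt_decode_reg(const uint32_t cells[4])
{
    struct dt_region r = {
        .base = dt_cells_to_u64(cells[0], cells[1]),
        .size = dt_cells_to_u64(cells[2], cells[3]),
    };
    return r;
}
```

Feeding in the cells from the device tree above, 0x00000008 and 0x80100000 form the base address 0x880100000, and 0x0 and 0x1000 form a size of 4096 bytes.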
Implementation
The functionality described above is implemented as a platform driver. The source code for this driver can be found on GitHub. In this section we discuss some parts of the kernel module in more detail.
In the probe function of the driver we need to look up the memory region and remap it from physical to virtual address space:
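node = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
if (!node) {
	dev_err(&pdev->dev, "Memory region missing\n");
	return -EINVAL;
}
/* Look up the reserved region the phandle points to */
rmem = of_reserved_mem_lookup(node);
of_node_put(node);
if (!rmem) {
	dev_err(&pdev->dev, "Reserved memory lookup failed\n");
	return -EINVAL;
}
/* Map the physical region into kernel virtual address space */
pdata->regs = ioremap(rmem->base, rmem->size);
if (pdata->regs == NULL) {
	dev_err(&pdev->dev, "Can not remap memory\n");
	return -ENOMEM;
}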
If an event like a reboot, panic or watchdog timeout happens, we need to be informed about it. We do this by registering with notifier chains. Note that watchdog_notifier_list belongs to the pretimeout governor we will discuss later. The following code registers the callbacks:
register_reboot_notifier(&reboot_notify_block);
atomic_notifier_chain_register(&panic_notifier_list, &panic_notify_block);
atomic_notifier_chain_register(&watchdog_notifier_list, &watchdog_notify_block);
A notify block contains a callback which will be called when the event happens.
static struct notifier_block panic_notify_block = {
	.notifier_call = panic_notify,
};
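To make the notifier idea more concrete, here is a simplified user-space model of a notifier chain together with a reboot callback. It is a sketch only: the kernel's chains additionally handle priorities and locking, and everything prefixed with model_ as well as the two PATTERN_ values are hypothetical. The three event codes mirror SYS_RESTART, SYS_HALT and SYS_POWER_OFF from the kernel's linux/notifier.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as SYS_RESTART, SYS_HALT and SYS_POWER_OFF in the kernel. */
#define MODEL_SYS_RESTART   0x0001UL
#define MODEL_SYS_HALT      0x0002UL
#define MODEL_SYS_POWER_OFF 0x0003UL

/* Hypothetical patterns, chosen only for this sketch. */
#define PATTERN_REBOOT   0x11111111u
#define PATTERN_POWEROFF 0x22222222u

/* One entry of the chain: a callback plus a link to the next entry. */
struct model_notifier {
    int (*call)(struct model_notifier *nb, unsigned long event, void *data);
    struct model_notifier *next;
};

struct model_chain {
    struct model_notifier *head;
};

/* Stands in for the word stored in the reserved memory region. */
uint32_t model_reset_reg;

/* Add a notifier to the front of the chain. */
void model_chain_register(struct model_chain *chain, struct model_notifier *nb)
{
    nb->next = chain->head;
    chain->head = nb;
}

/* Call every registered notifier; returns how many were called. */
int model_chain_call(struct model_chain *chain, unsigned long event, void *data)
{
    int called = 0;

    for (struct model_notifier *nb = chain->head; nb; nb = nb->next) {
        nb->call(nb, event, data);
        called++;
    }
    return called;
}

/* Reboot callback: record whether the system restarts or shuts down. */
int model_reboot_notify(struct model_notifier *nb, unsigned long event, void *data)
{
    (void)nb;
    (void)data;
    model_reset_reg = (event == MODEL_SYS_RESTART) ? PATTERN_REBOOT
                                                   : PATTERN_POWEROFF;
    return 0;
}
```

The event argument is what would let a reboot callback tell a restart apart from a halt or power-off, which matches the first two entries of the list of reboot reasons.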
The actual handler for the event is pretty simple. Here is the panic notifier:
static int panic_notify(struct notifier_block *this, unsigned long event, void *ptr)
{
	write_reset_reg(pdata->regs, OOPS_PATTERN);
	pdata->oops_pending = true;
	return NOTIFY_DONE;
}
write_reset_reg writes a pattern (a 32-bit value) to the reserved memory region. To make the scheme more robust we additionally store a 32-bit CRC so that we can verify after a reboot that the value is valid.
When the module is loaded, we read the reset reason back from the same address in the reserved memory and verify the CRC. If it is valid, we know why the system rebooted. If it is not valid, we assume the device went through a power cycle. We could get even more information by reading data from the PMIC, RTC, etc. However, this is more device specific. The approach shown here should work in most use cases.
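The store and verify steps can be sketched in user space as follows. This is not the driver's code: the record layout, the bitwise CRC-32 routine and the values of PATTERN_REBOOT and PATTERN_PANIC are assumptions made for illustration. Only 0x781f9ce2 is taken from this article, where it is the value stored on a watchdog pretimeout.

```c
#include <stddef.h>
#include <stdint.h>

#define PATTERN_REBOOT   0x5a5a0001u /* hypothetical */
#define PATTERN_PANIC    0x5a5a0002u /* hypothetical, stands in for OOPS_PATTERN */
#define PATTERN_WATCHDOG 0x781f9ce2u /* value named in the article */

enum reset_reason {
    REASON_POWER_CYCLE,
    REASON_REBOOT,
    REASON_PANIC,
    REASON_WATCHDOG,
    REASON_UNKNOWN,
};

/* Assumed layout in the reserved region: the pattern followed by its CRC. */
struct reset_record {
    uint32_t pattern;
    uint32_t crc;
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t crc32_calc(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Store a pattern together with its CRC. */
void store_reason(struct reset_record *rec, uint32_t pattern)
{
    rec->pattern = pattern;
    rec->crc = crc32_calc(&rec->pattern, sizeof(rec->pattern));
}

/* Read the record back; an invalid CRC is treated as a power cycle. */
enum reset_reason load_reason(const struct reset_record *rec)
{
    if (rec->crc != crc32_calc(&rec->pattern, sizeof(rec->pattern)))
        return REASON_POWER_CYCLE;

    switch (rec->pattern) {
    case PATTERN_REBOOT:   return REASON_REBOOT;
    case PATTERN_PANIC:    return REASON_PANIC;
    case PATTERN_WATCHDOG: return REASON_WATCHDOG;
    default:               return REASON_UNKNOWN;
    }
}

/* Human-readable name for a reason. */
const char *reason_name(enum reset_reason reason)
{
    switch (reason) {
    case REASON_POWER_CYCLE: return "power-cycle";
    case REASON_REBOOT:      return "reboot";
    case REASON_PANIC:       return "panic";
    case REASON_WATCHDOG:    return "watchdog";
    default:                 return "unknown";
    }
}
```

With this layout, a region that reads back as all zeros fails the CRC check and is reported as a power cycle. In a real driver the region is I/O-remapped memory, so the plain struct accesses shown here would be replaced by the kernel's I/O accessors.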
Watchdog pretimeout
One thing that needs some additional words is the watchdog pretimeout. Many Linux watchdog drivers support this feature. It lets us set a pretimeout which fires before the actual watchdog reset. If we, for example, set a pretimeout of 1 s and the watchdog timeout to 10 s, the pretimeout fires after 9 s. With that feature we can do one last action before the watchdog resets the system. In our example we store the value 0x781f9ce2 inside the reserved memory region.
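The timing relation from the example above can be written down as a small helper. This is an illustrative sketch, and both function names are hypothetical. It assumes that a pretimeout of 0 means "disabled" and that a pretimeout must be shorter than the timeout to be useful.

```c
#include <stdbool.h>

/* A pretimeout is only meaningful if it is non-zero and shorter than the timeout. */
bool pretimeout_is_valid(unsigned int timeout_s, unsigned int pretimeout_s)
{
    return pretimeout_s > 0 && pretimeout_s < timeout_s;
}

/*
 * Seconds after the last keepalive at which the pretimeout fires.
 * Returns 0 if the combination is not valid.
 */
unsigned int pretimeout_fires_after(unsigned int timeout_s, unsigned int pretimeout_s)
{
    if (!pretimeout_is_valid(timeout_s, pretimeout_s))
        return 0;
    return timeout_s - pretimeout_s;
}
```

For a timeout of 10 s and a pretimeout of 1 s this yields 9 s, which leaves one second for the notifier to write the pattern before the hardware resets the system.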
The pretimeout is implemented in a way that it expects a governor to be set. We could write our driver so that it also registers a watchdog pretimeout governor. However, here we implement a small governor which then calls an atomic notifier call chain. This governor is built in-tree so that we can choose it as the default governor. The patch to add this governor can be found on GitHub.
We register the governor as follows:
static int __init watchdog_gov_notifier_register(void)
{
	return watchdog_register_governor(&watchdog_gov_notifier);
}
When the pretimeout occurs the following function is called:
static void pretimeout_notifier(struct watchdog_device *wdd)
{
	atomic_notifier_call_chain(&watchdog_notifier_list, 0, wdd);
}
Compiling for Apalis iMX8 with Kernel toradex_5.4-2.1.x-imx
In this section we will build the kernel with the notifier governor, the reset reason module and a utility to test the watchdog pretimeout.
We first check out the reset reason module.
build@machine:~$ git clone git@github.com:embear-engineering/sample-kernel-modules.git
Then we check out the kernel and patch it with the pretimeout notifier governor.
build@machine:~$ git clone -b toradex_5.4-2.1.x-imx git://git.toradex.com/linux-toradex.git
build@machine:~$ cd linux-toradex
build@machine:~/linux-toradex$ git am ../sample-kernel-modules/reset-reason/0001-watchdog-pretimeout-add-an-atomic-notifier-governor.patch
An additional patch is required to fix the pretimeout on the iMX8. It is only needed because of a bug in this driver and shouldn't be required for other watchdog drivers. The patch should land in the upstream kernel at some point:
diff --git a/drivers/watchdog/imx_sc_wdt.c b/drivers/watchdog/imx_sc_wdt.c
index a84a29f72bd5..27b84ab5d6f0 100644
--- a/drivers/watchdog/imx_sc_wdt.c
+++ b/drivers/watchdog/imx_sc_wdt.c
@@ -196,16 +196,12 @@ static int imx_sc_wdt_probe(struct platform_device *pdev)
watchdog_stop_on_reboot(wdog);
watchdog_stop_on_unregister(wdog);
- ret = devm_watchdog_register_device(dev, wdog);
- if (ret)
- return ret;
-
ret = imx_scu_irq_group_enable(SC_IRQ_GROUP_WDOG,
SC_IRQ_WDOG,
true);
if (ret) {
dev_warn(dev, "Enable irq failed, pretimeout NOT supported\n");
- return 0;
+ goto register_device;
}
imx_sc_wdd->wdt_notifier.notifier_call = imx_sc_wdt_notify;
@@ -216,7 +212,7 @@ static int imx_sc_wdt_probe(struct platform_device *pdev)
false);
dev_warn(dev,
"Register irq notifier failed, pretimeout NOT supported\n");
- return 0;
+ goto register_device;
}
ret = devm_add_action_or_reset(dev, imx_sc_wdt_action,
@@ -226,7 +222,8 @@ static int imx_sc_wdt_probe(struct platform_device *pdev)
else
dev_warn(dev, "Add action failed, pretimeout NOT supported\n");
- return 0;
+register_device:
+ return devm_watchdog_register_device(dev, wdog);
}
static int __maybe_unused imx_sc_wdt_suspend(struct device *dev)
We store this patch in a file and apply it as well:
build@machine:~/linux-toradex$ patch -p1 < <path to patch file>
We modify the device tree file arch/arm64/boot/dts/freescale/imx8qm-apalis-v1.1-eval.dts in the following way:
// SPDX-License-Identifier: GPL-2.0+ OR X11
/*
* Copyright 2020 Toradex
*/
/dts-v1/;
#include "imx8qm-apalis-v1.1.dtsi"
#include "imx8-apalis-eval.dtsi"
/ {
	model = "Toradex Apalis iMX8QM V1.1 on Apalis Evaluation Board";
	compatible = "toradex,apalis-imx8-v1.1-eval",
		     "toradex,apalis-imx8-eval",
		     "toradex,apalis-imx8",
		     "fsl,imx8qm";
	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;
		reset_reason_mem: reset_reason@880100000 {
			#address-cells = <2>;
			#size-cells = <2>;
			no-map;
			reg = <0x00000008 0x80100000 0x0 0x1000>; // 4kB at 0x880100000
		};
	};
	reset-reason {
		compatible = "reset-reason";
		memory-region = <&reset_reason_mem>;
	};
};
This instructs the reset-reason module to use 4kB at 0x880100000 to store the reset reason. We need far less than 4kB, but reserved regions are typically handled at page granularity, and depending on the architecture and kernel configuration 4kB is the smallest page size.
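As a small worked example of why a few bytes of payload still end up as a 4kB reservation, the sketch below rounds a byte count up to whole pages. It assumes a 4 KiB page size, which is a configuration choice and not a given on every system. ASSUMED_PAGE_SIZE and round_up_to_page are hypothetical names.

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE ((size_t)0x1000)

/* Round a byte count up to the next multiple of the page size. */
size_t round_up_to_page(size_t bytes)
{
    return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE * ASSUMED_PAGE_SIZE;
}
```

A 32-bit pattern plus a 32-bit CRC is 8 bytes, which rounds up to one page of 4096 bytes, the size used in the reg property.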
Now we get the .config file from the running Apalis module:
build@machine:~/linux-toradex$ export ARCH=arm64
build@machine:~/linux-toradex$ export CROSS_COMPILE=aarch64-linux-gnu-
build@machine:~/linux-toradex$ export LOCALVERSION="+git.a2f08dfd79ae"
build@machine:~/linux-toradex$ scp root@<module IP>:/proc/config.gz .
build@machine:~/linux-toradex$ zcat config.gz > .config
By default the pretimeout feature is disabled. We enable it by appending the following lines to the .config file:
CONFIG_WATCHDOG_SYSFS=y
CONFIG_WATCHDOG_PRETIMEOUT_GOV=y
CONFIG_WATCHDOG_PRETIMEOUT_GOV_NOOP=y
CONFIG_WATCHDOG_PRETIMEOUT_GOV_PANIC=y
CONFIG_WATCHDOG_PRETIMEOUT_GOV_NOTIFIER=y
CONFIG_WATCHDOG_PRETIMEOUT_DEFAULT_GOV_NOTIFIER=y
CONFIG_LOCALVERSION_AUTO=n
Now we can compile the kernel and the device tree.
build@machine:~/linux-toradex$ make Image.gz
build@machine:~/linux-toradex$ make freescale/imx8qm-apalis-v1.1-eval.dtb
If we get asked about the default governor we choose notifier as proposed.
We can now deploy the new kernel and device tree file:
build@machine:~/linux-toradex$ scp arch/arm64/boot/Image.gz root@<module IP>:/boot
build@machine:~/linux-toradex$ scp arch/arm64/boot/dts/freescale/imx8qm-apalis-v1.1-eval.dtb root@<module IP>:/boot
The new kernel will have the notifier governor enabled by default:
root@apalis-imx8:~$ cat /sys/class/watchdog/watchdog0/pretimeout_governor
notifier
Please note that when I was testing the new kernel, U-Boot wasn't able to load the device tree anymore. I guess it's a problem with the addresses. To make it boot again I had to change fdt_addr_r and fdt_high in U-Boot:
Apalis iMX8 $ setenv fdt_addr_r 0x8a000000
Apalis iMX8 $ setenv fdt_high 0xFFFFFFFFFFFFFFFF
This moves the device tree load address a bit higher and also instructs U-Boot to not relocate the device tree file. This should normally not be a problem but seems to be an issue for this specific iMX8 U-Boot.
Building the reset reason module
Now that we have a modified kernel, we can build the kernel module. We already checked out the sample-kernel-modules repository in the step above. We change to this directory again and build the module:
build@machine:~$ cd
build@machine:~$ export ARCH=arm64
build@machine:~$ export CROSS_COMPILE=aarch64-linux-gnu-
build@machine:~$ export LOCALVERSION="+git.a2f08dfd79ae"
build@machine:~$ export KDIR=~/linux-toradex
build@machine:~$ cd sample-kernel-modules/reset-reason
build@machine:~/sample-kernel-modules/reset-reason$ make -j
build@machine:~/sample-kernel-modules/reset-reason$ scp reset-reason.ko root@<module IP>:~
This is all we need to do. We can now load the module and use it as shown in the next section. To test if the watchdog works properly we should also compile the watchdog test program in the same repository. This program starts the watchdog (by opening the watchdog device) and then configures a timeout and pretimeout. This is necessary because normally no pretimeout is configured and the only available interface to configure a pretimeout is via ioctl.
build@machine:~/sample-kernel-modules/reset-reason$ cd watchdog
build@machine:~/sample-kernel-modules/reset-reason/watchdog$ export CC=aarch64-linux-gnu-gcc
build@machine:~/sample-kernel-modules/reset-reason/watchdog$ make -j
build@machine:~/sample-kernel-modules/reset-reason/watchdog$ scp watchdog-test root@<module IP>:~
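For orientation, a test program of the kind described above could configure the watchdog roughly as follows. This is a minimal sketch and not the code from the repository, which may differ. watchdog_configure is a hypothetical function name, while WDIOC_SETTIMEOUT and WDIOC_SETPRETIMEOUT are the ioctls from linux/watchdog.h.

```c
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Open the watchdog device and set timeout and pretimeout (both in seconds).
 * Returns the open file descriptor, or -1 on error.
 * Opening the device starts the watchdog, so only call this on a test system.
 */
int watchdog_configure(const char *dev, int timeout_s, int pretimeout_s)
{
    int fd = open(dev, O_WRONLY);

    if (fd < 0)
        return -1;

    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_s) < 0 ||
        ioctl(fd, WDIOC_SETPRETIMEOUT, &pretimeout_s) < 0) {
        /* Depending on the driver, the watchdog may keep running after close. */
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would keep the returned descriptor open and simply stop sending keepalives, so that first the pretimeout and then the watchdog reset fires.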
Now we are ready to load the module and test the functionality.
Using the kernel module
We already deployed the module and the watchdog test application. We can now load the module:
root@apalis-imx8:~$ insmod reset-reason.ko
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
power-cycle
The module reports power-cycle the first time it is loaded, even after a plain reboot, because no valid reason had been stored before. We now make the module load automatically:
root@apalis-imx8:~$ cp reset-reason.ko /lib/modules/5.4.77-5.1.0+git.a2f08dfd79ae/extra/
root@apalis-imx8:~$ depmod -a
root@apalis-imx8:~$ echo reset-reason > /etc/modules-load.d/reset-reason.conf
We can do a reboot and see if the module correctly detects it:
root@apalis-imx8:~$ reboot
...
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
reboot
The same for a kernel panic:
root@apalis-imx8:~$ echo 1 > /proc/sys/kernel/panic
root@apalis-imx8:~$ echo c > /proc/sysrq-trigger
...
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
panic
If a watchdog triggers:
root@apalis-imx8:~$ ./watchdog-test -d /dev/watchdog
...
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
watchdog
If we press the reset button only for a short amount of time we can simulate a voltage drop:
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
unknown (e.g. voltage dip)
And again after a power off for more than 10s (so that the RAM gets cleared):
root@apalis-imx8:~$ cat /sys/devices/platform/reset-reason/reset_reason
power-cycle
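On a remote device we would typically not run cat by hand but let a small program collect the value after boot. The following sketch reads the attribute into a buffer. read_reset_reason is a hypothetical helper, and the path to pass in is the sysfs file used in the examples above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of the attribute file at path into buf,
 * without the trailing newline. Returns 0 on success, -1 on error.
 */
int read_reset_reason(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* Strip the newline that sysfs attributes usually end with. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Called with /sys/devices/platform/reset-reason/reset_reason, the buffer would then hold one of the strings shown above, which a monitoring service could log or send to a backend.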
Reset reason module compared to ramoops
In comparison to ramoops, the reset reason module isn't part of the kernel. However, the nice thing about it is that it directly tells us why Linux restarted. It can even differentiate between a kernel panic, a watchdog reset and a normal reboot. So for finding out why the system rebooted, this module has some advantages. A combination of ramoops and the reset reason module therefore seems to be a good idea if we want to monitor kernel issues on a remote device far away.