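About 1 time in 10, systemd hangs during reboot, and I can't figure out why. What should I look at, and where, to track the problem down? I'm using systemd v196 and cannot upgrade to version >= 198, because the latter needs a recent kernel (with cgroups support) and the customer's requirements do not allow a kernel update. I'd like to know whether there is a reasonable way to find the cause of this behavior and to make systemd reboot the system unconditionally.
Please note that this link does not help: http://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1
As you can read there:
Shutdown Never Finishes
If normal reboot or poweroff never finish even after waiting a few minutes, the above method to create the shutdown log will not help and the log must be obtained using other methods. Two options that are useful for debugging boot problems can be used also for shutdown problems:
use a serial console
use a debug shell - not only is it available from early boot, it also stays active until late shutdown.
I am using the serial console and, for some reason, I can even log in, because the eth interface is up or has been brought back up (after being disconnected during the reboot steps).
I can't see the reason.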
# cat /etc/systemd/system/
basic.target.wants/ getty.target.wants/ multi-user.target.wants/ sysinit.target.wants/
dbus-org.freedesktop.NetworkManager.service local-fs-pre.target.wants/ sockets.target.wants/ syslog.service
display-manager.service local-fs.target.wants/ swap.target
Note swap.target. It is there, but we don't use a swap partition at all. I tried masking swap, but the hang persists. The last line on the console is:
[OK] Stopped target shutdown.
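EDIT: As I said, I can log back in via ssh over eth.
Now I'll show you two logs. The first was captured when the reboot/shutdown hangs, the second when the reboot succeeds:
Hang case; the output always looks like this (full log):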
[ OK ] Stopped Network Time Service (one-shot ntpdate mode).
Stopping Modem and VPN connections autoconnect...
Stopping Login Service...
Stopping LSB: Avahi mDNS/DNS-SD Daemon...
[ OK ] Stopped Monitoring free system resources.
[ OK ] Stopped Monitoring dropbear socket.
[ OK ] Stopped Login Service.
[ OK ] Stopped Modem and VPN c[ OK ] Stopped Getty on tty1.
[ OK ] Stopped Serial Getty on ttyO0.
[ OK ] Unmounted /var/lib/opkg.
[ OK ] Stopped Network Manager.
[ OK ] Stopped LSB: Avahi mDNS/DNS-SD Daemon.
Stopping D-Bus System Message Bus...
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped Suspend manager.
Stopping X Server...
[ OK ] Stopped X Server.
Stopping System Logging Service...
[ OK ] Stopped System Logging Service.
[ 77.580000] g_ether gadget: using random self ethernet address
[ 77.580000] g_ether gadget: using random host ethernet address
[ 77.590000] usb0: MAC 6e:0d:de:b0:33:4f
[ 77.590000] usb0: HOST MAC 62:7a:81:02:f3:ff
[ 77.600000] g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
[ 77.600000] g_ether gadget: g_ether ready
[ 77.610000] musb-hdrc musb-hdrc.0: MUSB HDRC host driver
[ 77.610000] musb-hdrc musb-hdrc.0: new USB bus registered, assigned bus number 2
[ 77.620000] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
[ 77.630000] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 77.640000] usb usb2: Product: MUSB HDRC host driver
[ 77.640000] usb usb2: Manufacturer: Linux 2.6.37 musb-hcd
[ 77.650000] usb usb2: SerialNumber: musb-hdrc.0
[ 77.650000] hub 2-0:1.0: USB hub found
[ 77.660000] hub 2-0:1.0: 1 port detected
[ 77.690000] ADDRCONF(NETDEV_UP): usb0: link is not ready
[ OK ] Stopped target Reboot.
[ OK ] Stopped Reboot.
[ OK ] Stopped target Unmount All Filesystems.
[ OK ] Stopped target Shutdown.
[ 78.330000] <46>systemd-journald[328]: Received SIGUSR1
<hang>
Normal reboot:
Unmounting /var/lib/opkg...
[ OK ] Stopped target Network.
Stopping SSH Per-Connection Server...
[ OK ] Stopped target Graphical Interface.
[ OK ] Stopped target Multi-User.
Stopping Monitoring free system resources...
Stopping Monitoring dropbear socket...
Stopping Network Time Service (one-shot ntpdate mode)...
[ OK ] Stopped Network Time Service (one-shot ntpdate mode).
Stopping Modem and VPN connections autoconnect...
Stopping Login Service...
Stopping LSB: Avahi mDNS/DNS-SD Daemon...
[ OK ] Stopped Monitoring free system resources.
[ OK ] Stopped Monitoring dropbear socket.
[ OK ] Stopped Login Service.
[ OK ] Unmounted /var/lib/opkg.
Stopping Network Manager...
[ OK ] Stopped Getty on tty1.
[ OK ] Stopped Network Manager.
[ OK ] Stopped Serial Getty on ttyO0.
[ OK ] Stopped Suspend manager.
[ OK ] Stopped LSB: Avahi mDNS/DNS-SD Daemon.
Stopping D-Bus System Message Bus...
Stopping X Server...
Stopping Permit User Sessions...
[ OK ] Stopped Permit User Sessions.
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped X Server.
[ OK ] Stopped D-Bus System Message Bus.
Stopping System Logging Service...
[ OK ] Stopped System Logging Service.
[ OK ] Stopped target Basic System.
[ OK ] Stopped target Sockets.
[ OK ] Closed dropbear.socket.
[ OK ] Closed D-Bus System Message Bus Socket.
[ OK ] Stopped target System Initialization.
Stopping Import configuration from SD card...
[ OK ] Stopped Import configuration from SD card.
Stopping Load Kernel Modules...
Stopping Apply Kernel Variables...
[ OK ] Stopped Apply Kernel Variables.
[ OK ] Stopped target Local File Systems.
Unmounting /var...
Unmounting /tmp...
[ OK ] Closed Syslog Socket.
[ OK ] Failed unmounting /var.
[ OK ] Unmounted /tmp.
[ OK ] Stopped Load Kernel Modules.
[ OK ] Reached target Unmount All Filesystems.
[ OK ] Stopped target Local File Systems (Pre).
Stopping Remount Root and Kernel File Systems...
[ OK ] Stopped Remount Root and Kernel File Systems.
[ OK ] Reached target Shutdown.
[ 52.340000] omap_wdt: Unexpected close, not stopping!
Sending SIGTERM to remaining processes...
[ 52.490000] <46>systemd-journald[335]: Received SIGTERM
Sending SIGKILL to remaining processes...
Unmounting file systems.
Unmounting /sys/fs/fuse/connections.
Unmounting /var.
All filesystems unmounted.
Deactivating swaps.
All swaps deactivated.
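UPDATE:
After some investigation and debugging I found what interrupts the shutdown, although I still can't fix it. For some reason a custom service gets started before the shutdown has completed, and that makes the shutdown process hang. That is one kind of hang. The other kind is when the shutdown is not interrupted but simply stops at some point. So, until all the conflicts and other possible hangs have been resolved one at a time, I want to activate the hardware watchdog unconditionally. To do that through systemd I enabled and tested RuntimeWatchdogSec and ShutdownWatchdogSec, separately and together. Unfortunately, they did not help. Having looked through the source code,
I am stuck. What I'm asking for is a way to:
1. unconditionally enable the watchdog, at least from the point where the shutdown starts
2. detect and resolve all conflicts in a simple way
The first solution is preferred.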
I'll venture a solution: try adding
Before=basic.target
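to /usr/lib/systemd/system/dbus.service.
I was struck by an oddity in your logs, which reminded me of an incident I read about some time ago on the Arch Linux forums: that system would hang on reboot. The solution above was offered there, on the grounds that the hang was caused by some service trying to talk to d-bus after d-bus had already been stopped:
So by ordering it before basic.target, it is not only started before the basic target is reached, it is also guaranteed to stay around until after basic.target has been taken down during shutdown.
In your unhealthy log we do in fact see that Basic System is never stopped, whereas in the healthy log it is stopped properly.
If this does not work, and given that you cannot upgrade, have you considered downgrading?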
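The comparison between the two logs can be made mechanical. Here is a rough sketch, assuming both console captures have been saved as text and that the status lines keep the "[ OK ] Stopped ..." shape seen above; the helper names `stopped_units` and `missing_stops` are invented for this example and are not part of any systemd tooling.

```python
import re

# Matches status lines such as "[ OK ] Stopped target Basic System."
# Square brackets are excluded from the captured name so that two messages
# interleaved on one console line yield only the last, complete one.
_STOPPED = re.compile(r"\[\s*OK\s*\]\s+Stopped ([^\[\]]+?)\.\s*$")


def stopped_units(console_log):
    """Return the set of unit descriptions the log reports as stopped."""
    units = set()
    for line in console_log.splitlines():
        match = _STOPPED.search(line)
        if match:
            units.add(match.group(1))
    return units


def missing_stops(hang_log, healthy_log):
    """List what the healthy log stopped but the hanging log never did."""
    return sorted(stopped_units(healthy_log) - stopped_units(hang_log))


# Tiny excerpts taken from the two logs in the question.
hang = "[ OK ] Stopped System Logging Service.\n[ OK ] Stopped target Shutdown.\n"
healthy = (
    "[ OK ] Stopped System Logging Service.\n"
    "[ OK ] Stopped target Basic System.\n"
    "[ OK ] Stopped target Sockets.\n"
)
print(missing_stops(hang, healthy))
```

Run over the full captures, the result should show at a glance which units (Basic System among them) never got their stop job in the hanging case.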
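shutdown.target conflicts with all other units by default, so that they are stopped automatically when the shutdown process begins. The reverse also holds: if another unit is started, shutdown.target gets stopped. So the problem is that something causes a unit to be started during shutdown, and that overrides the shutdown process.
This should be fixed in systemd v198, which makes the shutdown jobs "irreversible".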
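Going by the two logs in the question, a cancelled shutdown seems to leave a recognisable trace on the console: the hanging run prints "Stopped target Shutdown." (and "Stopped target Reboot."), while the healthy run prints "Reached target Shutdown.". The sketch below checks a saved console capture for that pattern. Treating "Stopped target Shutdown" as the sign of a cancelled shutdown is an assumption drawn from those two logs, and `shutdown_outcome` is a made-up helper name.

```python
import re

# Status lines for the shutdown-related targets; case is ignored because
# the question shows both "Shutdown" and "shutdown".
_SHUTDOWN_TARGET = re.compile(
    r"\[\s*OK\s*\]\s+(Stopped|Reached) target (Shutdown|Reboot)\.",
    re.IGNORECASE,
)


def shutdown_outcome(console_log):
    """Classify a console log as 'cancelled', 'reached' or 'unknown'."""
    outcome = "unknown"
    for verb, _target in _SHUTDOWN_TARGET.findall(console_log):
        if verb.lower() == "stopped":
            # A shutdown target being stopped means something replaced it.
            return "cancelled"
        outcome = "reached"
    return outcome


print(shutdown_outcome("[ OK ] Stopped target Reboot.\n[ OK ] Stopped target Shutdown.\n"))
print(shutdown_outcome("[ OK ] Reached target Shutdown.\n"))
```

A check like this only tells you that the shutdown was overridden, not by what; finding the unit that was started still means looking at what the log shows just before the "Stopped target" lines.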