Unit enters failed state (status=143) when stopping the service

ser*_*gei 10 linux systemd systemctl

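Here is my problem. I have a CentOS machine with a Java process running on it. The Java process is managed by a start/stop script, which also creates a .pid file for the Java instance.

My unit file looks like this: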

[Unit]
After=syslog.target network.target
Description=Someservice

[Service]
User=xxxuser
Type=forking
WorkingDirectory=/srv/apps/someservice
ExecStart=/srv/apps/someservice/server.sh start
ExecStop=/srv/apps/someservice/server.sh stop
PIDFile=/srv/apps/someservice/application.pid
TimeoutStartSec=0

[Install]
WantedBy=multi-user.target

When I call the stop function, the script terminates the Java process with SIGTERM and returns exit code 0:

kill $OPT_FORCEKILL `cat $PID_FILE`
<...>
return 0

After that, if I check the status of my unit, I get something like this (status=143):

● someservice.service - Someservice
   Loaded: loaded (/usr/lib/systemd/system/someservice.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2017-08-30 09:17:40 EEST; 4s ago
  Process: 48365 ExecStop=/srv/apps/someservice/server.sh stop (code=exited, status=0/SUCCESS)
 Main PID: 46115 (code=exited, status=143)

Aug 29 17:10:02 whatever.domain.com systemd[1]: Starting Someservice...
Aug 29 17:10:02 whatever.domain.com systemd[1]: PID file /srv/apps/someservice/application.pid not readable (yet?) after start.
Aug 29 17:10:04 whatever.domain.com systemd[1]: Started Someservice.
Aug 30 09:17:39 whatever.domain.com systemd[1]: Stopping Someservice...
Aug 30 09:17:39 whatever.domain.com server.sh[48365]: Stopping someservice - PID [46115]
Aug 30 09:17:40 whatever.domain.com systemd[1]: someservice.service: main process exited, code=exited, status=143/n/a
Aug 30 09:17:40 whatever.domain.com systemd[1]: Stopped Someservice.
Aug 30 09:17:40 whatever.domain.com systemd[1]: Unit someservice.service entered failed state.
Aug 30 09:17:40 whatever.domain.com systemd[1]: someservice.service failed.

It behaves exactly the same when my start/stop script has no return value at all.
Adding something like this to the unit file:

[Service]
SuccessExitStatus=143

does not seem like a good idea to me. Why does systemctl behave this way instead of showing a normal service state?

When I modify my start/stop script and put return 10 in place of return 0, it behaves the same way, but I can see that exit code 10 is passed through.
Here is an example:

● someservice.service - Someservice
   Loaded: loaded (/usr/lib/systemd/system/someservice.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2017-08-30 09:36:22 EEST; 5s ago
  Process: 48460 ExecStop=/srv/apps/someservice/server.sh stop (code=exited, status=10)
  Process: 48424 ExecStart=/srv/apps/someservice/server.sh start (code=exited, status=0/SUCCESS)
 Main PID: 48430 (code=exited, status=143)

Aug 30 09:36:11 whatever.domain.com systemd[1]: Starting Someservice...
Aug 30 09:36:11 whatever.domain.com systemd[1]: PID file /srv/apps/someservice/application.pid not readable (yet?) after start.
Aug 30 09:36:13 whatever.domain.com systemd[1]: Started Someservice.
Aug 30 09:36:17 whatever.domain.com systemd[1]: Stopping Someservice...
Aug 30 09:36:17 whatever.domain.com server.sh[48460]: Stopping someservice - PID [48430]
Aug 30 09:36:21 whatever.domain.com systemd[1]: someservice.service: main process exited, code=exited, status=143/n/a
Aug 30 09:36:22 whatever.domain.com systemd[1]: someservice.service: control process exited, code=exited status=10
Aug 30 09:36:22 whatever.domain.com systemd[1]: Stopped Someservice.
Aug 30 09:36:22 whatever.domain.com systemd[1]: Unit someservice.service entered failed state.
Aug 30 09:36:22 whatever.domain.com systemd[1]: someservice.service failed.

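In the journalctl log I can see that systemctl first reports status=143 and only then my return value of 10. So I guess my mistake is somewhere in the start/stop script (since exit code 143 is reported before the function returns 0)?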

Moh*_*nnd 10

You should be able to suppress this by adding the exit code to the unit file as a "success" exit status:

[Service]
SuccessExitStatus=143
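For context on where 143 comes from: it is 128 + 15, and 15 is the signal number of SIGTERM. As far as I know, the JVM catches SIGTERM, runs its shutdown hooks, and then exits with that status itself, so systemd sees `code=exited, status=143` for the main process, and by default only exit code 0 counts as a clean exit. The sketch below only illustrates the 128 + signal convention with a plain `sleep` child; `sleep` is a stand-in here, not the Java process from the question, and the variable names are made up for the example.

```shell
#!/bin/sh
# Start a long-running child in the background (stand-in for the real service).
sleep 60 &
child=$!

# Send SIGTERM, which is what a plain `kill <pid>` does by default.
kill -TERM "$child"

# `wait` returns the child's status; for a child ended by a signal the shell
# reports it as 128 + signal number.
wait "$child"
status=$?

# Recover the signal number from the status (assumes status > 128).
sig=$((status - 128))
echo "exit status: $status (128 + signal $sig)"
```

That is why listing 143 in `SuccessExitStatus=` makes a stop via SIGTERM look clean to systemd.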

Source

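  • Could the downvoter explain? The answer to this question depends on Linux administration knowledge. Do you think every Java developer should also be a Linux admin? (3 upvotes)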