AWS Beanstalk Docker exception: "shim reaped"

alc*_*yon 7 shim amazon-web-services docker amazon-elastic-beanstalk

I am currently deploying an application (a Node.js WebSocket server) to production on AWS Beanstalk using the Docker environment.

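Periodically, the container "crashes" (in fact, the main process in the container restarts), and I don't know why. /var/log/docker contains the following logs (at the exact moment of the event):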

time="2018-12-07T00:48:46Z" level=info msg="shim reaped" id=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f 
time="2018-12-07T00:48:46.052832134Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2018-12-07T00:48:46Z" level=info msg="shim docker-containerd-shim started" address="/containerd-shim/moby/0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f/shim.sock" debug=false pid=9192

CPU and RAM usage looked fine at that time. Does anyone have a clue?

Edit: There are other logs, but I suspect they are consequences of the restart rather than its cause:

/var/log/nginx/error.log:

2018/12/07 00:48:45 [error] 4268#0: *10397 recv() failed (104: Connection reset by peer) while proxying upgraded connection, client: 172.31.43.209, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."
2018/12/07 00:48:45 [error] 4268#0: *1009 recv() failed (104: Connection reset by peer) while proxying upgraded connection, client: 172.31.43.209, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."
2018/12/07 00:48:46 [error] 4267#0: *11092 connect() failed (111: Connection refused) while connecting to upstream, client: 172.31.12.149, server: , request: "GET /stream?s=000 HTTP/1.1", upstream: "http://172.17.0.2:80/stream?s=000", host: "..."

/var/log/docker-events.log

2018-12-07T00:48:46.052880449Z container die 0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f (exitCode=1, image=2fc4abcada2b, name=inspiring_euler)
2018-12-07T00:48:46.176330610Z network disconnect 94c449d445a5a434af70517a1c8734c540c5c1f9ddbbc1a53a002f25dbc7f581 (container=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f, name=bridge, type=bridge)
2018-12-07T00:48:46.626514590Z network connect 94c449d445a5a434af70517a1c8734c540c5c1f9ddbbc1a53a002f25dbc7f581 (container=0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f, name=bridge, type=bridge)
2018-12-07T00:48:46.869988171Z container start 0af18fa159c07b167a29012b34c6c925c877f98d9a09dcd67078aa6c12f4ef2f (image=2fc4abcada2b, name=inspiring_euler)
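
The docker-events.log excerpt above records an exitCode for the container that died. To check whether every restart carries the same exit code, the "container die" lines can be pulled out of the log. This is only a minimal sketch, assuming the log keeps the line format shown above; the function names are hypothetical and the default path is the one quoted in the question.

```python
import re

# Matches lines such as:
# 2018-12-07T00:48:46.052880449Z container die <id> (exitCode=1, image=..., name=...)
DIE_EVENT = re.compile(
    r"^(?P<time>\S+) container die (?P<id>[0-9a-f]+) \((?P<attrs>.*)\)\s*$"
)


def parse_die_event(line):
    """Return a dict for a 'container die' line, or None for any other line."""
    match = DIE_EVENT.match(line)
    if match is None:
        return None
    # Attributes are a comma-separated list of key=value pairs.
    attrs = dict(
        pair.split("=", 1)
        for pair in match.group("attrs").split(", ")
        if "=" in pair
    )
    return {
        "time": match.group("time"),
        "id": match.group("id"),
        "exit_code": int(attrs["exitCode"]) if "exitCode" in attrs else None,
        "name": attrs.get("name"),
    }


def die_events(path="/var/log/docker-events.log"):
    """Collect every 'container die' event found in the log file."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return [event for event in map(parse_die_event, log) if event is not None]
```

Calling die_events() on the instance returns one dict per recorded container death, which makes it easy to compare timestamps and exit codes across restarts.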

小智 1

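This failure may be caused by containerd running on a system with THP (Transparent Huge Pages) enabled. That memory management scheme does not match the container's memory allocation pattern, which leads to the failure. A similar issue is reported at https://github.com/containerd/containerd/issues/2202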

Unfortunately, you cannot adjust the kernel settings of the Elastic Beanstalk host to fix this. The workaround is documented for MongoDB, which has a similar problem with THP.

https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
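
Before attributing the restarts to THP, it may be worth confirming which THP mode is actually active on the instance. On Linux the kernel exposes this in /sys/kernel/mm/transparent_hugepage/enabled, with the active mode shown in square brackets (for example "always [madvise] never"). The following is a minimal sketch that reads that file; the function names are hypothetical, and it only reports the setting, it does not change it.

```python
from pathlib import Path

# Standard sysfs location of the THP setting on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(contents):
    """Return the bracketed (active) mode from e.g. 'always [madvise] never'."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_thp_mode(path=THP_ENABLED):
    """Read the active THP mode from sysfs, or None if the file is unavailable."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", read_thp_mode())
```

If this reports "never", THP is already disabled and is unlikely to be the explanation.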