Linux network crashes: what are the best steps to find the cause?

Question


Aro*_*eel 8 networking linux centos

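One of our Linux (CentOS) servers became unreachable last night.

The server could not be reached by any means other than the remote console. After logging in through the remote console, I found that I could not ping any external host either.

A simple service network restart fixed the problem, but I would still like to know what caused it. My log files do not seem to show any errors at all (apart from various daemons that need a network connection and failed once the network went down).

Are there any other steps I can take to find the cause of this problem?

Edit: it happened again. The server was completely unresponsive until I restarted the network service. Any suggestions are welcome. Could this be caused by a faulty hardware component?

As requested by Madhatter, here are some excerpts from the logs around that time (the network went down at 20:13):

/var/log/messages: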

Dec  2 20:01:05 graviton kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=<stripped> SRC=<stripped> DST=<stripped> LEN=40 TOS=0x00 PREC=0x00 TTL=101 ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0
Dec  2 20:01:05 graviton kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=<stripped> SRC=<stripped> DST=<stripped> LEN=40 TOS=0x00 PREC=0x00 TTL=100 ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0
Dec  2 20:01:05 graviton kernel: Firewall: *TCP_IN Blocked* IN=eth0 OUT= MAC=<stripped> SRC=<stripped> DST=<stripped> LEN=40 TOS=0x00 PREC=0x00 TTL=101 ID=256 PROTO=TCP SPT=6000 DPT=3306 WINDOW=16384 RES=0x00 SYN URGP=0
Dec  2 20:13:34 graviton junglediskserver: Connection to gateway failed: xGatewayTransport - Connection to gateway failed.


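The first three messages are simply the result of the iptables rules I set up through the LFD firewall. The last message shows that JungleDisk, which I use for backups, could no longer connect to its gateway. Apart from that, there are no interesting messages from this period.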
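To confirm that firewall entries like these are only routine blocked probes, it can help to tally them per destination port. The following is a minimal sketch, assuming the log format shown above; the function name count_blocked_by_port is made up for illustration.

```shell
# Tally "Blocked" firewall log lines per destination port.
# Reads syslog text on stdin and prints one "DPT=<port> <count>" line per port.
# The function name is hypothetical; the DPT= field comes from the iptables LOG format.
count_blocked_by_port() {
    grep 'Blocked' \
        | grep -o 'DPT=[0-9]*' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}
```

It could be used as count_blocked_by_port < /var/log/messages. A single port dominating the list, such as 3306 here, suggests ordinary scanning rather than anything related to the outage.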

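Edit, December 4: as requested by Mattdm, here is the output of ethtool eth0:

(Note that these are the settings while everything is working. If the problem occurs again, I will post the output again if necessary.)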

Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes

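Since the settings above were captured while the link was healthy, the useful comparison is against the same fields during a failure. A rough sketch that reduces ethtool output to the three fields most likely to change, assuming the output layout shown above (summarize_ethtool is a hypothetical helper name):

```shell
# Condense `ethtool <iface>` output (on stdin) into a single line with
# the negotiated speed, duplex mode, and link state.
# Hypothetical helper; it relies on the "Field: value" layout shown above.
summarize_ethtool() {
    awk -F': ' '
        /^[ \t]*Speed:/         { speed = $2 }
        /^[ \t]*Duplex:/        { duplex = $2 }
        /^[ \t]*Link detected:/ { link = $2 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}
```

Running ethtool eth0 | summarize_ethtool once while the network works and once during an outage gives two lines that are easy to compare by eye or with diff.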

As requested by Joris, here is the output of route -n as well:

aron@graviton [~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
xx.xx.xx.58    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.42    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.43    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.41    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.46    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.47    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.44    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.45    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.50    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.51    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.48    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.49    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.54    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.52    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.53    0.0.0.0         255.255.255.255 UH    0      0        0 eth0
xx.xx.xx.0     0.0.0.0         255.255.255.192 U     0      0        0 eth0
xx.xx.xx.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
0.0.0.0         xx.xx.xx.62    0.0.0.0         UG    0      0        0 eth0


The xx.62 entry at the bottom is my gateway.
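For scripted checks, the gateway can be pulled out of the route -n output rather than read by eye. A small sketch, assuming the column layout shown above (default_gateway is a name invented for this example):

```shell
# Print the default gateway from `route -n` output on stdin.
# The default route has destination 0.0.0.0 and the G (gateway) flag set.
# Hypothetical helper name; columns are Destination, Gateway, Genmask, Flags.
default_gateway() {
    awk '$1 == "0.0.0.0" && $4 ~ /G/ { print $2; exit }'
}
```

For example, gw=$(route -n | default_gateway) stores the address so that later checks (ping, ARP lookups) can target it.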

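Edit, December 28: the problem occurred again, and I had the chance to compare the output of some of the tests above. I found that arp -an returned an incomplete MAC address for my gateway (which is not under my control; the server is in a shared rack):

During the failure: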

? (xx.xx.xx.62) at <incomplete> on eth0


After service network restart:

? (xx.xx.xx.62) at 00:00:0C:9F:F0:30 [ether] on eth0


Is this something I can fix myself, or is it time to contact the data center?
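An incomplete ARP entry means the server sent ARP requests for the gateway and received no reply, so the fault sits at or below layer 2 (network card, cable, switch port, or the gateway itself). To catch the moment it happens, the ARP state could be checked from a script. This is only a sketch, assuming the arp -an output format shown above; arp_state is a hypothetical helper name.

```shell
# Report the ARP cache state for one IP address, given `arp -an` output on stdin.
# Prints "incomplete", "resolved <mac>", or "missing" (no entry at all).
# Hypothetical helper; it expects lines like "? (ip) at <mac> [ether] on eth0".
arp_state() {
    local ip="$1"
    awk -v ip="($ip)" '
        $2 == ip {
            if ($4 == "<incomplete>") print "incomplete"
            else print "resolved", $4
            found = 1
            exit
        }
        END { if (!found) print "missing" }
    '
}
```

Called from cron as arp -an | arp_state xx.xx.xx.62 (with the real gateway address filled in) and appended to a log with a timestamp, it would show exactly when the entry turned incomplete, which is useful evidence to hand to the data center.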

Answer 1

Aro*_*eel 1

This problem was resolved a long time ago: it was evidently hardware-related.

A new network card fixed it.

Archived:	15 years, 3 months ago
Views:	16265
Last activity:	13 years, 10 months ago