TCP dies on a Linux laptop

Rom*_*aka 17 networking debian tcp

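Once every few days I run into the following problem. My laptop (Debian testing) suddenly becomes unable to make TCP connections to the Internet.

The following things continue to work fine:

  • UDP (DNS) and ICMP (ping): I get instant responses
  • TCP connections to other machines on the local network (e.g. I can ssh to a neighbouring laptop)
  • Everything works fine for the other machines on my LAN

But when I try TCP connections from my laptop, they time out (no response to the SYN packets). Here is typical curl output: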

% curl -v google.com     
* About to connect() to google.com port 80 (#0)
*   Trying 173.194.39.105...
* Connection timed out
*   Trying 173.194.39.110...
* Connection timed out
*   Trying 173.194.39.97...
* Connection timed out
*   Trying 173.194.39.102...
* Timeout
*   Trying 173.194.39.98...
* Timeout
*   Trying 173.194.39.96...
* Timeout
*   Trying 173.194.39.103...
* Timeout
*   Trying 173.194.39.99...
* Timeout
*   Trying 173.194.39.101...
* Timeout
*   Trying 173.194.39.104...
* Timeout
*   Trying 173.194.39.100...
* Timeout
*   Trying 2a00:1450:400d:803::1009...
* Failed to connect to 2a00:1450:400d:803::1009: Network is unreachable
* Success
* couldn't connect to host
* Closing connection #0
curl: (7) Failed to connect to 2a00:1450:400d:803::1009: Network is unreachable

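Restarting the connection and/or reloading the network card's kernel module does not help. The only thing that helps is a reboot.

Clearly something is wrong with my system (everything else works fine), but I have no idea what exactly.

My setup is a wireless router connected to the ISP via PPPoE.

Any advice?

Answers to comments

What NIC is it?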

12:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
  Subsystem: Dell Inspiron M5010 / XPS 8300
  Flags: bus master, fast devsel, latency 0, IRQ 17
  Memory at fbb00000 (64-bit, non-prefetchable) [size=16K]
  Capabilities: [40] Power Management version 3
  Capabilities: [58] Vendor Specific Information: Len=78 <?>
  Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
  Capabilities: [d0] Express Endpoint, MSI 00
  Capabilities: [100] Advanced Error Reporting
  Capabilities: [13c] Virtual Channel
  Capabilities: [160] Device Serial Number 00-00-9d-ff-ff-aa-1c-65
  Capabilities: [16c] Power Budgeting <?>
  Kernel driver in use: brcmsmac

What state is your NIC in when the problem occurs?

iptables-save prints nothing.

ip rule show

0:  from all lookup local 
32766:  from all lookup main 
32767:  from all lookup default 

ip route show table all

default via 192.168.1.1 dev wlan0 
192.168.1.0/24 dev wlan0  proto kernel  scope link  src 192.168.1.105 
broadcast 127.0.0.0 dev lo  table local  proto kernel  scope link  src 127.0.0.1 
local 127.0.0.0/8 dev lo  table local  proto kernel  scope host  src 127.0.0.1 
local 127.0.0.1 dev lo  table local  proto kernel  scope host  src 127.0.0.1 
broadcast 127.255.255.255 dev lo  table local  proto kernel  scope link  src 127.0.0.1 
broadcast 192.168.1.0 dev wlan0  table local  proto kernel  scope link  src 192.168.1.105 
local 192.168.1.105 dev wlan0  table local  proto kernel  scope host  src 192.168.1.105 
broadcast 192.168.1.255 dev wlan0  table local  proto kernel  scope link  src 192.168.1.105 
fe80::/64 dev wlan0  proto kernel  metric 256 
unreachable default dev lo  table unspec  proto kernel  metric 4294967295  error -101 hoplimit 255
local ::1 via :: dev lo  table local  proto none  metric 0 
local fe80::1e65:9dff:feaa:b1f1 via :: dev lo  table local  proto none  metric 0 
ff00::/8 dev wlan0  table local  metric 256 
unreachable default dev lo  table unspec  proto kernel  metric 4294967295  error -101 hoplimit 255

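All of the above is the same when the machine is working normally.

ifconfig: I ran it, but somehow forgot to save the output before rebooting. I will have to wait until the problem occurs again. Sorry about that.

Any QoS in place?

Probably not; at least I haven't done anything specific to enable it.

Have you tried sniffing the traffic actually sent on the interface?

I ran curl and tcpdump several times, and there are two patterns.

The first is just SYN packets with no answer.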

17:14:37.836917 IP (tos 0x0, ttl 64, id 4563, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.105.42030 > fra07s07-in-f102.1e100.net.http: Flags [S], cksum 0x27fc (incorrect -> 0xbea8), seq 3764607647, win 13600, options [mss 1360,sackOK,TS val 33770316 ecr 0,nop,wscale 4], length 0
17:14:38.836650 IP (tos 0x0, ttl 64, id 4564, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.105.42030 > fra07s07-in-f102.1e100.net.http: Flags [S], cksum 0x27fc (incorrect -> 0xbdae), seq 3764607647, win 13600, options [mss 1360,sackOK,TS val 33770566 ecr 0,nop,wscale 4], length 0
17:14:40.840649 IP (tos 0x0, ttl 64, id 4565, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.105.42030 > fra07s07-in-f102.1e100.net.http: Flags [S], cksum 0x27fc (incorrect -> 0xbbb9), seq 3764607647, win 13600, options [mss 1360,sackOK,TS val 33771067 ecr 0,nop,wscale 4], length 0

The second is this:

17:22:56.507827 IP (tos 0x0, ttl 64, id 41583, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.105.42036 > fra07s07-in-f102.1e100.net.http: Flags [S], cksum 0x27fc (incorrect -> 0x2244), seq 1564709704, win 13600, options [mss 1360,sackOK,TS val 33894944 ecr 0,nop,wscale 4], length 0
17:22:56.546763 IP (tos 0x58, ttl 54, id 65442, offset 0, flags [none], proto TCP (6), length 60)
    fra07s07-in-f102.1e100.net.http > 192.168.1.105.42036: Flags [S.], cksum 0x6b1e (correct), seq 1407776542, ack 1564709705, win 14180, options [mss 1430,sackOK,TS val 3721836586 ecr 33883552,nop,wscale 6], length 0
17:22:56.546799 IP (tos 0x58, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.1.105.42036 > fra07s07-in-f102.1e100.net.http: Flags [R], cksum 0xf301 (correct), seq 1564709705, win 0, length 0
17:22:58.511843 IP (tos 0x0, ttl 64, id 41584, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.1.105.42036 > fra07s07-in-f102.1e100.net.http: Flags [S], cksum 0x27fc (incorrect -> 0x204f), seq 1564709704, win 13600, options [mss 1360,sackOK,TS val 33895445 ecr 0,nop,wscale 4], length 0
17:22:58.555423 IP (tos 0x58, ttl 54, id 65443, offset 0, flags [none], proto TCP (6), length 60)
    fra07s07-in-f102.1e100.net.http > 192.168.1.105.42036: Flags [S.], cksum 0x3b03 (correct), seq 1439178112, ack 1564709705, win 14180, options [mss 1430,sackOK,TS val 3721838596 ecr 33883552,nop,wscale 6], length 0
17:22:58.555458 IP (tos 0x58, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    192.168.1.105.42036 > fra07s07-in-f102.1e100.net.http: Flags [R], cksum 0xf301 (correct), seq 1564709705, win 0, length 0

ethtool output

ethtool -k wlan0

Features for wlan0:
rx-checksumming: off [fixed]
tx-checksumming: off
  tx-checksum-ipv4: off [fixed]
  tx-checksum-unneeded: off [fixed]
  tx-checksum-ip-generic: off [fixed]
  tx-checksum-ipv6: off [fixed]
  tx-checksum-fcoe-crc: off [fixed]
  tx-checksum-sctp: off [fixed]
scatter-gather: off
  tx-scatter-gather: off [fixed]
  tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
  tx-tcp-segmentation: off [fixed]
  tx-tcp-ecn-segmentation: off [fixed]
  tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: off [requested on]
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: on [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]

iptables

# namei -l "$(command -v iptables)"
f: /sbin/iptables
drwxr-xr-x root root /
drwxr-xr-x root root sbin
lrwxrwxrwx root root iptables -> xtables-multi
-rwxr-xr-x root root   xtables-multi

# dpkg -S "$(command -v iptables)"
iptables: /sbin/iptables

# iptables -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
# iptables -t mangle -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
# iptables -t nat -nvL
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
# iptables -t security -nvL
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Module info

# ethtool -i wlan0                   
driver: brcmsmac
version: 3.2.0-3-686-pae
firmware-version: N/A
bus-info: 0000:12:00.0
supports-statistics: no
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

# modinfo brcmsmac
filename:       /lib/modules/3.2.0-3-686-pae/kernel/drivers/net/wireless/brcm80211/brcmsmac/brcmsmac.ko
license:        Dual BSD/GPL
description:    Broadcom 802.11n wireless LAN driver.
author:         Broadcom Corporation
alias:          pci:v000014E4d00000576sv*sd*bc*sc*i*
alias:          pci:v000014E4d00004727sv*sd*bc*sc*i*
alias:          pci:v000014E4d00004353sv*sd*bc*sc*i*
alias:          pci:v000014E4d00004357sv*sd*bc*sc*i*
depends:        mac80211,brcmutil,cfg80211,cordic,crc8
intree:         Y
vermagic:       3.2.0-3-686-pae SMP mod_unload modversions 686 

There is no /sys/module/brcmsmac/parameters. Here is what I have there:

# tree /sys/module/brcmsmac
/sys/module/brcmsmac
├── drivers
│   └── pci:brcmsmac -> ../../../bus/pci/drivers/brcmsmac
├── holders
├── initstate
├── notes
├── refcnt
├── sections
│   └── __bug_table
└── uevent

Some sites do work

As dr suggested, I tried some other sites, and to my surprise some of them did work. Here are some hosts that worked:

  • rambler.ru
  • google.ru
  • ya.ru
  • opennet.ru
  • tut.by
  • ro-che.info
  • yahoo.com
  • ebay.com

And here are some that didn't:

  • vk.com
  • meta.ua
  • ukr.net
  • tenet.ua
  • prom.ua
  • reddit.com
  • github.com
  • stackexchange.com

Network capture

I made a network capture and uploaded it here

Sté*_*las 5

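In the capture you provided, the timestamp echo reply (TSecr) in the SYN-ACK in the second packet does not match the TSval in the SYN in the first packet, and is a few seconds behind.

And see how all the TSecr values sent by 173.194.70.108 and 209.85.148.100 are identical and unrelated to the TSval you send.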
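
The same check can be done mechanically. The sketch below is only an illustration and not part of the original answer: it uses a few lines taken from the tcpdump output in the question (not the uploaded capture), with header and detail lines joined and trimmed by hand. It estimates the laptop's timestamp tick rate from two retransmitted SYNs, then compares the TSecr echoed in a SYN-ACK with the TSval of the SYN it answers. The regular expression and the helper names are assumptions made for this example.

```python
import re
from datetime import datetime

# Sample lines from the tcpdump output in the question, joined and trimmed by hand.
CAPTURE = """\
17:14:37.836917 192.168.1.105.42030 > fra07s07-in-f102.1e100.net.http: Flags [S], options [mss 1360,sackOK,TS val 33770316 ecr 0,nop,wscale 4]
17:14:38.836650 192.168.1.105.42030 > fra07s07-in-f102.1e100.net.http: Flags [S], options [mss 1360,sackOK,TS val 33770566 ecr 0,nop,wscale 4]
17:22:56.507827 192.168.1.105.42036 > fra07s07-in-f102.1e100.net.http: Flags [S], options [mss 1360,sackOK,TS val 33894944 ecr 0,nop,wscale 4]
17:22:56.546763 fra07s07-in-f102.1e100.net.http > 192.168.1.105.42036: Flags [S.], options [mss 1430,sackOK,TS val 3721836586 ecr 33883552,nop,wscale 6]
"""

# Hypothetical pattern for this trimmed format: time, TCP flags, TSval, TSecr.
LINE = re.compile(
    r"(?P<time>\d\d:\d\d:\d\d\.\d+) .*Flags \[(?P<flags>[^\]]+)\].*"
    r"TS val (?P<val>\d+) ecr (?P<ecr>\d+)"
)


def parse(capture):
    """Turn each matching line into a dict with time, flags, val and ecr."""
    packets = []
    for line in capture.splitlines():
        m = LINE.search(line)
        if m:
            packets.append({
                "time": datetime.strptime(m["time"], "%H:%M:%S.%f"),
                "flags": m["flags"],
                "val": int(m["val"]),
                "ecr": int(m["ecr"]),
            })
    return packets


def tick_rate(first, second):
    """Estimate TSval ticks per second from two packets of the same sender."""
    seconds = (second["time"] - first["time"]).total_seconds()
    return (second["val"] - first["val"]) / seconds


def check_handshake(syn, syn_ack, rate):
    """Return (matches, lag_seconds): does the SYN-ACK echo the SYN's TSval,
    and if not, how old is the echoed value in seconds?"""
    lag_ticks = syn["val"] - syn_ack["ecr"]
    return lag_ticks == 0, lag_ticks / rate


packets = parse(CAPTURE)
rate = tick_rate(packets[0], packets[1])          # two retransmitted SYNs
matches, lag = check_handshake(packets[2], packets[3], rate)
print(f"tick rate ~{rate:.0f} Hz, TSecr matches TSval: {matches}, lag ~{lag:.1f} s")
```

On these sample lines it reports a tick rate of about 250 Hz and an echoed timestamp roughly 45 seconds older than the one the laptop sent, rather than an exact match.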

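It looks like something is messing with the TCP timestamps. I don't know what could cause it, but it sounds like it is outside your machine. Does restarting the router help in that case?

I don't know whether this is what causes your machine to send a RST (in the third packet). But it definitely doesn't like that SYN-ACK, and that is the only thing wrong I can find. The only other explanation I can think of is that it is not your machine sending the RST, but given the time difference between the SYN-ACK and the RST, I doubt it. Just in case, though: are you using virtual machines, containers, or network namespaces on this machine?

You could try disabling TCP timestamps altogether to see if it helps: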

sudo sysctl -w net.ipv4.tcp_timestamps=0
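
Before and after running that command it can be useful to confirm what the kernel is actually using. The following is a minimal sketch, assuming a Linux system that exposes the setting under /proc/sys; the function name is made up for this example.

```python
from pathlib import Path


def tcp_timestamps_enabled(path="/proc/sys/net/ipv4/tcp_timestamps"):
    """Return False if the setting is 0, True for any other integer value,
    or None if the file cannot be read or parsed (e.g. not on Linux)."""
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None
    return value != 0


print("tcp_timestamps enabled:", tcp_timestamps_enabled())
```

Note that a value set with sysctl -w does not survive a reboot, so the check is worth repeating the next time the problem appears.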

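So either those sites send bogus TSecr values, or something on the way (any router or transparent proxy en route) mangles the outgoing TSval or the incoming TSecr, or there is a proxy with a bogus TCP stack. Why anyone would mangle TCP timestamps I can only speculate: a bug, intrusion-detection evasion, a too-smart or bogus traffic-shaping algorithm. It is not something I have heard of before (but I am no expert on the subject).

How to investigate further:

  • See whether the TPLink router is to blame: reset it to see if that helps, or, if possible, capture the traffic on its outside interface as well to see whether it really mangles the timestamps.
  • Check whether there is a transparent proxy on the way by playing with TTLs, by looking at the request headers as received by web servers, or by seeing how it behaves when requesting dead web sites.
  • Capture the traffic on a remote web server to see whether it is the TSval or the TSecr that gets mangled.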
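
One way to "play with TTLs" is sketched below, as an illustration only and under the assumption that IPv4 is used: open TCP connections with an increasing IP TTL and note the smallest TTL at which the handshake completes. If that number is clearly lower than the hop count a traceroute reports for the same server, something closer than the server is answering on its behalf. The function name and default values are made up for this example.

```python
import socket


def first_ttl_that_connects(host, port, max_ttl=30, timeout=3.0):
    """Try TCP connects with TTL 1, 2, ... and return the first TTL at which
    the handshake completes, or None if none does up to max_ttl."""
    for ttl in range(1, max_ttl + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            # Limit how many hops the packets of this connection may travel.
            sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
            sock.settimeout(timeout)
            try:
                sock.connect((host, port))
                return ttl
            except OSError:
                # Timeout, "host unreachable" after TTL expiry, refusal, etc.
                continue
    return None


# Example call (not run here, since it needs network access):
# print(first_ttl_that_connects("github.com", 80))
```

Each failed attempt waits for up to the timeout, so a full run can take a while; lowering max_ttl or timeout shortens it at the cost of missing distant or slow hosts.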