Sources of latency in sending and receiving TCP/UDP packets in Linux

Question

osg*_*sgx 3 performance networking latency linux-kernel low-latency

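What are the sources of latency in the process of sending and receiving TCP/UDP packets in Linux 2.6?

I want to know the latency sources in "ping-pong" latency tests.

There are some rather good papers on Ethernet latency, but they cover only the latency sources in the wire and the switch (and rather cursorily, only for a specific switch).

What processing steps does a packet go through after that?

Papers with a deep latency analysis of an ordinary ping (ICMP) would be useful too.

I rely on the community :)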

Answer 1

Tgi*_*gul 5

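Short answer: to get the exact latency inside the kernel, you should use perf probe and perf script.

More details: let's look at the following example. First we want to see which functions are used in a TCP ping-pong test (I'm using netperf).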

On the receive flow: [figure not preserved]

On the transmit flow: [figure not preserved]

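So let's trace some of the functions in the transmit flow (it is long, so I'll show only the main functions in the TCP flow). We can use perf probe to sample the entry and exit point of each function: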

perf probe --add sock_sendmsg='sock_sendmsg'
perf probe --add sock_sendmsg_exit='sock_sendmsg%return'
perf probe --add inet_sendmsg='inet_sendmsg'
perf probe --add inet_sendmsg_exit='inet_sendmsg%return'
perf probe --add tcp_sendmsg_exit='tcp_sendmsg%return'
perf probe --add tcp_sendmsg='tcp_sendmsg'
perf probe --add tcp_sendmsg_locked='tcp_sendmsg_locked'
perf probe --add tcp_sendmsg_locked_exit='tcp_sendmsg_locked%return'
perf probe --add sk_stream_alloc_skb='sk_stream_alloc_skb'
perf probe --add sk_stream_alloc_skb_exit='sk_stream_alloc_skb%return'
perf probe --add tcp_push_exit='tcp_push%return'
perf probe --add tcp_push='tcp_push'
perf probe --add tcp_send_mss='tcp_send_mss'
perf probe --add tcp_send_mss_exit='tcp_send_mss%return'
perf probe --add __tcp_push_pending_frames='__tcp_push_pending_frames'
perf probe --add __tcp_push_pending_frames_exit='__tcp_push_pending_frames%return'
perf probe --add tcp_write_xmit_exit='tcp_write_xmit%return'
perf probe --add tcp_transmit_skb_exit='tcp_transmit_skb%return'
perf probe --add tcp_transmit_skb='tcp_transmit_skb'
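
As a minimal sketch (not part of the original answer), the probe list above could be generated instead of typed by hand. The helper name probe_commands is hypothetical; it only builds strings in the same `perf probe --add` syntax used above and does not run perf itself.

```python
def probe_commands(functions):
    """Return perf probe commands for the entry and return of each function."""
    commands = []
    for name in functions:
        # Entry probe: fires when the function is called.
        commands.append(f"perf probe --add {name}='{name}'")
        # Return probe: fires when the function returns.
        commands.append(f"perf probe --add {name}_exit='{name}%return'")
    return commands


if __name__ == "__main__":
    # A few of the transmit-path functions probed above.
    for cmd in probe_commands(["sock_sendmsg", "inet_sendmsg", "tcp_sendmsg"]):
        print(cmd)
```

The printed lines can be pasted into a root shell. Whether a given function can be probed depends on the kernel build, so some names may be rejected by perf.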

Now we can record these:

perf record -e 'probe:*' -aR taskset -c 7 netperf -t TCP_RR -l 5 -T 7,7

And run perf script to get the latency report:

perf script -F time,event --ns

Output (1 iteration):

525987.403094082:                   probe:sock_sendmsg:
525987.403095586:                   probe:inet_sendmsg:
525987.403096192:                    probe:tcp_sendmsg:
525987.403098203:             probe:tcp_sendmsg_locked:
525987.403099296:                   probe:tcp_send_mss:
525987.403100002:              probe:tcp_send_mss_exit:
525987.403100740:            probe:sk_stream_alloc_skb:
525987.403101697:       probe:sk_stream_alloc_skb_exit:
525987.403103079:                       probe:tcp_push:
525987.403104284:      probe:__tcp_push_pending_frames:
525987.403105575:               probe:tcp_transmit_skb:
525987.403110178:               probe:tcp_transmit_skb:
525987.403111640:          probe:tcp_transmit_skb_exit:
525987.403112876:          probe:tcp_transmit_skb_exit:
525987.403114351:            probe:tcp_write_xmit_exit:
525987.403114768: probe:__tcp_push_pending_frames_exit:
525987.403115191:                  probe:tcp_push_exit:
525987.403115718:        probe:tcp_sendmsg_locked_exit:
525987.403117576:               probe:tcp_sendmsg_exit:
525987.403118082:              probe:inet_sendmsg_exit:
525987.403118568:              probe:sock_sendmsg_exit:

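Now it is easy to see where the latency is spent. For example, there are 1504 nanoseconds (ns), or 1.504 microseconds (us), between the sock_sendmsg() call and the inet_sendmsg() call. We can also see that sk_stream_alloc_skb takes 957 ns. In total, the whole process (sock_sendmsg entry to exit) takes about 24.5 us. Keep in mind that this is not the figure netperf reports, since the packet is physically transmitted somewhere in the middle of this flow.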
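
The arithmetic above can be reproduced with a short script. This is only a sketch under a few assumptions: the input has exactly the `perf script -F time,event --ns` layout shown above, all events come from one thread, an exit event is named `<function>_exit` as in the probes above, and repeated entries of the same function are treated as nested calls. The function names parse_events, gaps and function_durations are hypothetical.

```python
import re

# Matches lines such as "525987.403094082:   probe:sock_sendmsg:"
LINE_RE = re.compile(r"^\s*(\d+)\.(\d+):\s+probe:(\w+):")


def parse_events(text):
    """Return a list of (timestamp_ns, event_name) tuples."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        sec, frac, name = m.groups()
        # Integer arithmetic keeps full nanosecond precision.
        ts_ns = int(sec) * 1_000_000_000 + int(frac.ljust(9, "0")[:9])
        events.append((ts_ns, name))
    return events


def gaps(events):
    """Time in ns between each pair of consecutive events."""
    return [(a, b, tb - ta) for (ta, a), (tb, b) in zip(events, events[1:])]


def function_durations(events):
    """Pair each entry with its '<name>_exit' event; return (name, ns) pairs."""
    open_calls = {}  # function name -> stack of entry timestamps
    durations = []
    for ts, name in events:
        if name.endswith("_exit"):
            func = name[: -len("_exit")]
            stack = open_calls.get(func)
            if stack:  # exits without a recorded entry are skipped
                durations.append((func, ts - stack.pop()))
        else:
            open_calls.setdefault(name, []).append(ts)
    return durations


if __name__ == "__main__":
    sample = """\
525987.403094082:                   probe:sock_sendmsg:
525987.403095586:                   probe:inet_sendmsg:
525987.403100740:            probe:sk_stream_alloc_skb:
525987.403101697:       probe:sk_stream_alloc_skb_exit:
525987.403118568:              probe:sock_sendmsg_exit:
"""
    events = parse_events(sample)
    for a, b, ns in gaps(events):
        print(f"{a} -> {b}: {ns} ns")
    for func, ns in function_durations(events):
        print(f"{func}: {ns} ns")
```

To analyze a real trace, pass the saved perf script output to parse_events instead of the embedded sample. Exit probes that have no matching entry probe, such as tcp_write_xmit_exit in the list above, are simply ignored by function_durations.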

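You can use the same method to trace any piece of code in the kernel.

Hope this helps.

P.S. This was done on kernel-4.14, not 2.6. I'm not sure how mature perf was back then, so it might not work as well.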

Archived:	15 years, 6 months ago
Views:	3874
Last active:	7 years, 11 months ago