Long delay when sending UDP packets

rav*_*int 11 c++ sockets windows udp boost-asio

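I have an application that receives, processes, and transmits UDP packets.

Everything works fine if the port numbers for reception and transmission are different.

If the port numbers are the same and the IP addresses are different, it usually works fine EXCEPT when the destination IP address is on the same subnet as the machine running the application. In that last case the send_to function takes several seconds to complete, instead of the usual few milliseconds.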

Rx Port  Tx IP             Tx Port  Result

5001     Same subnet       5002     OK     Delay ~ 0.001 secs
5001     Different subnet  5001     OK     Delay ~ 0.001 secs
5001     Same subnet       5001     Fails  Delay > 2 secs

Here is a short program that demonstrates the problem.

#include <ctime>
#include <iostream>
#include <string>
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <windows.h> // QueryPerformanceCounter, QueryPerformanceFrequency

using boost::asio::ip::udp;
using std::cout;
using std::endl;

int test( const std::string& output_IP)
{
    try
    {
        unsigned short prev_seq_no;

        boost::asio::io_service io_service;

        // build the input socket

        /* This is connected to a UDP client that is running continuously
        sending messages that include an incrementing sequence number
        */

        const int input_port = 5001;
        udp::socket input_socket(io_service, udp::endpoint(udp::v4(), input_port ));

        // build the output socket

        const std::string output_Port = "5001";
        udp::resolver resolver(io_service);
        udp::resolver::query query(udp::v4(), output_IP, output_Port );
        udp::endpoint output_endpoint = *resolver.resolve(query);
        udp::socket output_socket( io_service );
        output_socket.open(udp::v4());

        // double output buffer size
        boost::asio::socket_base::send_buffer_size option( 8192 * 2 );
        output_socket.set_option(option);

        cout  << "TX to " << output_endpoint.address() << ":"  << output_endpoint.port() << endl;



        int count = 0;
        for (;;)
        {
            // receive packet
            unsigned short recv_buf[ 20000 ];
            udp::endpoint remote_endpoint;
            boost::system::error_code error;
            int bytes_received = input_socket.receive_from(boost::asio::buffer(recv_buf,20000),
                                 remote_endpoint, 0, error);

            if (error && error != boost::asio::error::message_size)
                throw boost::system::system_error(error);

            // start timer
            __int64 TimeStart;
            QueryPerformanceCounter( (LARGE_INTEGER *)&TimeStart );

            // send onwards
            boost::system::error_code ignored_error;
            output_socket.send_to(
                boost::asio::buffer(recv_buf,bytes_received),
                output_endpoint, 0, ignored_error);

            // stop time and display tx time
            __int64 TimeEnd;
            QueryPerformanceCounter( (LARGE_INTEGER *)&TimeEnd );
            __int64 f;
            QueryPerformanceFrequency( (LARGE_INTEGER *)&f );
            cout << "Send time secs " << (double) ( TimeEnd - TimeStart ) / (double) f << endl;

            // stop after loops
            if( count++ > 10 )
                break;
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

    return 0; // test() is declared to return int, so return a value on every path
}
int main(  )
{

    test( "193.168.1.200" );

    test( "192.168.1.200" );

    return 0;
}

Output from this program when run on a machine with address 192.168.1.101:

TX to 193.168.1.200:5001
Send time secs 0.0232749
Send time secs 0.00541566
Send time secs 0.00924535
Send time secs 0.00449014
Send time secs 0.00616714
Send time secs 0.0199299
Send time secs 0.00746081
Send time secs 0.000157972
Send time secs 0.000246775
Send time secs 0.00775578
Send time secs 0.00477618
Send time secs 0.0187321
TX to 192.168.1.200:5001
Send time secs 1.39485
Send time secs 3.00026
Send time secs 3.00104
Send time secs 0.00025927
Send time secs 3.00163
Send time secs 2.99895
Send time secs 6.64908e-005
Send time secs 2.99864
Send time secs 2.98798
Send time secs 3.00001
Send time secs 3.00124
Send time secs 9.86207e-005

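Why is this happening? Is there any way I can reduce the delay?

Notes:

  • Built using code::blocks, running under various flavours of Windows

  • Packets are 10000 bytes long

  • The problem goes away if I connect the computer running the application to a second network, for example a WWLAN (a cellular network "rocket stick")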

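As far as I can tell, this is our situation:

This works (different ports, same LAN):

[diagram]

This also works (same ports, different LANs):

[diagram]

This does NOT work (same ports, same LAN):

[diagram]

This seems to work (same ports, same LAN, dual-homed Module2 host):

[diagram]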

Tan*_*ury 6

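Given that this is being observed on Windows, for large datagrams, with a destination address of a non-existent peer within the same subnet as the sender, the problem is likely the result of send() blocking while waiting for an Address Resolution Protocol (ARP) response so that the layer 2 Ethernet frame can be populated:

  • When sending data, the layer 2 Ethernet frame is populated with the media access control (MAC) address of the next hop in the route. If the sender does not know the MAC address of the next hop, it broadcasts an ARP request and caches the responses. Using the sender's subnet mask and the destination address, the sender can determine whether the next hop is on the same subnet as the sender or whether the data must be routed through the default gateway. Based on the results in the question, when sending large datagrams:

    • Datagrams destined for a different subnet incur no delay, because the default gateway's MAC address is already in the sender's ARP cache
    • Datagrams destined for a non-existent peer on the sender's subnet incur a delay while waiting for ARP resolution
  • The socket's send buffer size (SO_SNDBUF) is being set to 16384 bytes, but the datagrams being sent are 10000 bytes each. The behavior of send() when the buffer is saturated is unspecified, but some systems will observe send() blocking. In this case, saturation would occur fairly quickly if any datagram incurs a delay, such as by waiting for an ARP response.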

    // Datagrams being sent are 10000 bytes, but the socket buffer is 16384.
    boost::asio::socket_base::send_buffer_size option(8192 * 2);
    output_socket.set_option(option);
    

    Consider letting the kernel manage the socket buffer size, or increasing it based on your expected throughput.

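  • When sending a datagram whose size exceeds the Windows registry FastSendDatagramThreshold parameter, the send() call can block until the datagram has been sent. For more details, see the Microsoft TCP/IP Implementation Details:

    Datagrams smaller than the value of this parameter go through the fast I/O path or are buffered on send. Larger ones are held until the datagram is actually sent. The default value was found by testing to be the best overall value for performance. Fast I/O means copying data and bypassing the I/O subsystem, instead of mapping memory and going through the I/O subsystem. This is advantageous for small amounts of data. Changing this value is not generally recommended.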
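The same-subnet decision described in the first bullet above can be sketched in a few lines of standard C++. This is only an illustration of the arithmetic, not what the Windows stack literally runs. The sender address 192.168.1.101 and the two destinations come from the question; the /24 mask and the gateway address 192.168.1.1 are assumptions made for the example, and the names Ipv4, same_subnet and arp_target are hypothetical.

```cpp
#include <array>
#include <cstdint>

using Ipv4 = std::array<std::uint8_t, 4>;

// Pack four octets into one 32-bit value so a mask can be applied with '&'.
inline std::uint32_t to_u32(const Ipv4& a) {
    return (static_cast<std::uint32_t>(a[0]) << 24) |
           (static_cast<std::uint32_t>(a[1]) << 16) |
           (static_cast<std::uint32_t>(a[2]) << 8) |
            static_cast<std::uint32_t>(a[3]);
}

// True when the destination shares the sender's network prefix.
inline bool same_subnet(const Ipv4& sender, const Ipv4& mask, const Ipv4& dest) {
    return (to_u32(sender) & to_u32(mask)) == (to_u32(dest) & to_u32(mask));
}

// The address whose MAC must be known before the frame can be built:
// the peer itself when it is on-link, otherwise the default gateway.
inline Ipv4 arp_target(const Ipv4& sender, const Ipv4& mask,
                       const Ipv4& gateway, const Ipv4& dest) {
    return same_subnet(sender, mask, dest) ? dest : gateway;
}
```

Under those assumptions, arp_target for 193.168.1.200 is the gateway (normally already cached), while for 192.168.1.200 it is the peer itself, which is the case where the sender has to wait for an ARP reply.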

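If delays are observed for every send() to an existing peer on the sender's subnet, then profile and analyze the network:

  • Use iperf to measure the network's potential throughput
  • Use wireshark to get a deeper understanding of what is happening on a given node. Look for ARP requests and responses.
  • From the sender's machine, ping the peer and then check the ARP cache. Verify that there is a cache entry for the peer and that it is correct.
  • Try a different port and/or TCP. This can help identify whether a network policy is throttling or shaping traffic for a particular port or protocol.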

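Also note that sending datagrams smaller than the FastSendDatagramThreshold value in quick succession while waiting for ARP to resolve may cause datagrams to be discarded:

ARP queues only one outbound IP datagram for a specified destination address while that IP address is being resolved to a media access control address. If a User Datagram Protocol (UDP)-based application sends multiple IP datagrams to a single destination address without any pauses between them, some of the datagrams may be dropped if there is no ARP cache entry already present. An application can compensate for this by calling the iphlpapi.dll routine SendArp() to establish an ARP cache entry before sending the stream of packets.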
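A minimal sketch of the pre-resolution step mentioned in the quote, assuming a Windows build linked against iphlpapi (the documented function name is SendARP). The helper names warm_arp_cache and format_mac are hypothetical. On other platforms the helper is stubbed out and returns false, so the snippet still compiles. Since SendARP itself waits for the reply, it is probably best called once before the time-critical loop rather than inside it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

#ifdef _WIN32
#include <winsock2.h>  // must be included before iphlpapi.h
#include <iphlpapi.h>  // SendARP; link with iphlpapi
#endif

using Ipv4 = std::array<std::uint8_t, 4>;
using Mac = std::array<std::uint8_t, 6>;

// Render a MAC address as lower-case, colon-separated hex for logging.
inline std::string format_mac(const Mac& mac) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < mac.size(); ++i) {
        if (i != 0) out += ':';
        out += digits[mac[i] >> 4];
        out += digits[mac[i] & 0x0F];
    }
    return out;
}

// Ask the stack to resolve `dest` now, so that an ARP cache entry exists
// before the datagram stream starts. Returns true and fills `mac` on
// success; returns false on failure or on non-Windows builds.
inline bool warm_arp_cache(const Ipv4& dest, Mac& mac) {
#ifdef _WIN32
    IPAddr dest_ip = 0;
    // The octets are already in network byte order, which SendARP expects.
    std::memcpy(&dest_ip, dest.data(), dest.size());
    ULONG mac_buf[2] = {0, 0};
    ULONG mac_len = 6;  // in: buffer size in bytes, out: bytes written
    if (SendARP(dest_ip, 0, mac_buf, &mac_len) != NO_ERROR || mac_len != 6)
        return false;
    std::memcpy(mac.data(), mac_buf, mac.size());
    return true;
#else
    (void)dest;
    (void)mac;
    return false;  // stub: SendARP is a Windows API
#endif
}
```

In the question's program this could be called once with the output address before entering the receive/send loop; if it returns false, the peer did not answer and the first sends are likely to stall or be dropped.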


Ser*_*eyA 2

OK, I put together some code (below). It is clear that send takes less than a millisecond most of the time. This proves the problem is with Boost.

#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdexcept>
#include <poll.h>
#include <memory.h>
#include <chrono>
#include <stdio.h>

void test( const std::string& remote, const std::string& hello_string, bool first)
{
    try
    {
        const short unsigned input_port = htons(5001);
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        if (sock == -1) {
            perror("Socket creation error: ");
            throw std::runtime_error("Could not create socket!");
        }

        sockaddr_in local_addr{}; // zero-initialize so no field is left indeterminate
        local_addr.sin_family = AF_INET;
        local_addr.sin_port = input_port;
        local_addr.sin_addr.s_addr = INADDR_ANY;
        if (bind(sock, (const sockaddr*)&local_addr, sizeof(local_addr))) {
            perror("Error: ");
            throw std::runtime_error("Can't bind to port!");
        }

        sockaddr_in remote_addr{}; // zero-initialize, then set the address family explicitly
        remote_addr.sin_family = AF_INET;
        remote_addr.sin_port = input_port;
        if (!inet_aton(remote.c_str(), &remote_addr.sin_addr))
            throw std::runtime_error("Can't parse remote IP address!");

        std::cout  << "TX to " << remote << "\n";

        unsigned char recv_buf[40000];

        if (first) {
            std::cout << "First launched, waiting for hello.\n";
            int bytes = recv(sock, &recv_buf, sizeof(recv_buf), 0);
            std::cout << "Seen hello from my friend here: " << recv_buf << ".\n";
        }

        int count = 0;
        for (;;)
        {

            std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();
            if (sendto(sock, hello_string.c_str(), hello_string.size() + 1, 0, (const sockaddr*)&remote_addr, sizeof(remote_addr)) != hello_string.size() + 1) {
                perror("Sendto error: ");
                throw std::runtime_error("Error sending data");
            }
            std::chrono::high_resolution_clock::time_point end = std::chrono::high_resolution_clock::now();

            std::cout << "Send time nanosecs " << std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count() << "\n";

            int bytes = recv(sock, &recv_buf, sizeof(recv_buf), 0);
            std::cout << "Seen hello from my friend here: " << recv_buf << ".\n";

            // stop after loops
            if (count++ > 10)
                break;
        }
    }
    catch (std::exception& e)
    {
        std::cerr << e.what() << std::endl;
    }

}
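int main(int argc, char* argv[])
{
    // Expect: remote IP, hello string, and a flag starting with 'f' to wait for a hello first
    if (argc < 4) {
        std::cerr << "Usage: " << argv[0] << " <remote-ip> <hello-string> <f|other>\n";
        return 1;
    }
    test(argv[1], argv[2], *argv[3] == 'f');

    return 0;
}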

As expected, there is no delay. Here is the output from one of the pair (I ran the code in pairs on two machines in the same network):

./socktest x.x.x.x 'ThingTwo' f
TX to x.x.x.x
First launched, waiting for hello.
Seen hello from my friend here: ThingOne.
Send time nanosecs 17726
Seen hello from my friend here: ThingOne.
Send time nanosecs 6479
Seen hello from my friend here: ThingOne.
Send time nanosecs 6362
Seen hello from my friend here: ThingOne.
Send time nanosecs 6048
Seen hello from my friend here: ThingOne.
Send time nanosecs 6246
Seen hello from my friend here: ThingOne.
Send time nanosecs 5691
Seen hello from my friend here: ThingOne.
Send time nanosecs 5665
Seen hello from my friend here: ThingOne.
Send time nanosecs 5930
Seen hello from my friend here: ThingOne.
Send time nanosecs 6082
Seen hello from my friend here: ThingOne.
Send time nanosecs 5493
Seen hello from my friend here: ThingOne.
Send time nanosecs 5893
Seen hello from my friend here: ThingOne.
Send time nanosecs 5597