How do I maximize throughput on a single TCP socket?

uri*_*ium 3 .net c# sockets performance

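I am trying to get the standard "echo" client/server sample application to run as fast as possible, and I am sure the network is the limiting factor. I have a 1 Gbit NIC, but when I watch Resource Monitor I only see 7 Mbit/s coming from the client.

I understand the basics of sockets: message framing and length indicators, receiving all the bytes the length indicator announces, keep-alive packets, half-open connections, and so on.

I started with the stock standard socket operations and then switched to async. (I did not change the send to async, because someone [who seemed to know what they were talking about] said it should not make any difference.) I get the same performance, and I can only assume that all the material presumes I have some other work that could be done on the same thread. But in my quick test I dedicated one thread to sending in a loop and an entirely separate thread to receiving.

I have tried everything and am completely lost as to where I could get more performance. I used IPerf, and it reported back speeds of 1 Gbit per second, and Resource Monitor also showed it using up the bandwidth.

Even pointing me to a more complete example would help. Most of the ones I have come across are trivial or incomplete.

Here is the general code.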

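using System;
using System.Collections.Concurrent;
using System.Configuration;
using System.Linq;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class Program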
{
private static Socket sock;
private static BlockingCollection<string> queue;
private static int bytesReceived;
private static byte[] dataBuffer;
private static readonly byte[] lengthBuffer = new byte[4];

private static byte[] PrependLengthIndicator(byte[] data)
{
    return BitConverter.GetBytes(data.Length).Concat(data).ToArray();
}

private static void Receive()
{
    if (dataBuffer == null)
    {
        sock.BeginReceive(lengthBuffer, 0, 4, SocketFlags.None, ReceiveCallback, null);
    }
    else
    {
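        // Resume filling the payload buffer where the previous read stopped
        sock.BeginReceive(dataBuffer, bytesReceived, dataBuffer.Length - bytesReceived, SocketFlags.None, ReceiveCallback, null);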
    }
}

private static void ReceiveCallback(IAsyncResult ar)
{
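    var read = sock.EndReceive(ar);
    if (read == 0)
    {
        // The remote side closed the connection; stop issuing reads
        return;
    }
    bytesReceived += read;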
    if (dataBuffer == null)
    {
        // Currently receiving length indicator
        if (bytesReceived >= 4)
        {
            var length = BitConverter.ToInt32(lengthBuffer, 0);
            dataBuffer = new byte[length];
            bytesReceived = 0;
        }
    }
    else
    {
        if (bytesReceived == dataBuffer.Length)
        {
            // Finished reading
            var request = Encoding.ASCII.GetString(dataBuffer);
            dataBuffer = null;
            bytesReceived = 0;
            queue.Add(request);
        }
    }
    ContinueReading();
}

private static void ContinueReading()
{
    // Read into the appropriate buffer: length or data
    if (dataBuffer != null)
    {
        sock.BeginReceive(dataBuffer, bytesReceived, dataBuffer.Length - bytesReceived, SocketFlags.None, ReceiveCallback, null);
    }
    else
    {
        sock.BeginReceive(lengthBuffer, bytesReceived, lengthBuffer.Length - bytesReceived, SocketFlags.None, ReceiveCallback, null);
    }
}

}

Here is the server part:

static void Main(string[] args)
{
    var listenSock = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    listenSock.Bind(new IPEndPoint(IPAddress.Parse(ConfigurationManager.AppSettings["LocalIp"]), 3333));
    listenSock.Listen(10);
    Console.WriteLine("Server started...");
    sock = listenSock.Accept();
    Console.WriteLine("Connection accepted.");

    queue = new BlockingCollection<string>();
    Receive();
    var count = 0;
    var sender = new Thread(() =>
    {
        while (true)
        {
            var bar = queue.Take() + "Resp";
            count++;
            var resp = Encoding.ASCII.GetBytes(bar);
            var toSend = PrependLengthIndicator(resp);
            if (count % 10000 == 0)
            {
                Console.WriteLine(bar);
            }
            sock.Send(toSend);
        }
    });
    sender.Start();
}

Here is the client part:

static void Main(string[] args)
{
    sock = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
    Console.WriteLine("Connecting...");
    sock.Connect(IPAddress.Parse(ConfigurationManager.AppSettings["EndPointIp"]), 3333);
    Console.WriteLine("Connected.");
    Receive();

    var count = 0;
    while (true)
    {
        count++;
        var foo = "Echo-" + count;
        var data = Encoding.ASCII.GetBytes(foo);
        var toSend = PrependLengthIndicator(data);
        sock.Send(toSend);
    }
}

usr*_*usr 5

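You are sending tiny messages. Think about how many millions of them you would need per second to saturate a 1 Gbit/sec link. Every call into the socket burns CPU.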
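To put a rough number on that, here is a small back-of-the-envelope sketch. The LinkMath class and its method name are made up for illustration, and the 14-byte frame size is an assumption (a 10-character message such as "Echo-12345" plus the 4-byte length prefix). It ignores TCP/IP header overhead.

```csharp
using System;

public static class LinkMath
{
    // Frames per second needed to fill a link of the given speed,
    // ignoring TCP/IP header overhead (an assumption, so this is a lower bound).
    public static double MessagesPerSecond(double bitsPerSecond, int frameBytes)
    {
        double bytesPerSecond = bitsPerSecond / 8.0;
        return bytesPerSecond / frameBytes;
    }

    public static void Main()
    {
        // Assumed frame: 10 ASCII characters + 4-byte length prefix = 14 bytes
        Console.WriteLine(MessagesPerSecond(1e9, 14));
        // For comparison: a 64 KB payload + 4-byte length prefix
        Console.WriteLine(MessagesPerSecond(1e9, 64 * 1024 + 4));
    }
}
```

Under those assumptions a 1 Gbit/sec link needs roughly 8.9 million tiny frames per second, versus about 1,900 frames per second at 64 KB each. The 7 Mbit/sec you observed would correspond to about 62,500 sends per second at 14 bytes per frame, which suggests the per-call cost is the limit, not the network.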

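Send bigger messages. Also, try not to allocate new buffers all the time. Do not use Enumerable.Concat to concatenate byte buffers, because it works byte by byte in an extremely inefficient way. Use Array.Copy with preallocated arrays instead.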
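One way to combine both suggestions is to pack many length-prefixed messages into a single preallocated buffer and hand the whole thing to the socket in one call. The following is only a sketch: BatchedFraming and AppendMessage are hypothetical names, and the prefix is written little-endian by hand, which matches BitConverter.GetBytes only on little-endian hardware.

```csharp
using System;
using System.Text;

public static class BatchedFraming
{
    // Appends one length-prefixed ASCII message to 'buffer' at 'offset'
    // without allocating any intermediate arrays.
    // Returns the new offset, or -1 if the message does not fit.
    public static int AppendMessage(string message, byte[] buffer, int offset)
    {
        int length = Encoding.ASCII.GetByteCount(message);
        if (buffer.Length - offset < 4 + length)
        {
            return -1;
        }

        // 4-byte little-endian length prefix (assumption: both peers are
        // little-endian, as BitConverter in the question implicitly assumes)
        buffer[offset] = (byte)length;
        buffer[offset + 1] = (byte)(length >> 8);
        buffer[offset + 2] = (byte)(length >> 16);
        buffer[offset + 3] = (byte)(length >> 24);

        // Encode the text straight into the send buffer after the prefix
        Encoding.ASCII.GetBytes(message, 0, message.Length, buffer, offset + 4);
        return offset + 4 + length;
    }
}
```

The sender would keep appending into, say, a 64 KB buffer until AppendMessage returns -1, call sock.Send(buffer, 0, offset, SocketFlags.None) once, and then start again at offset 0. The receiver in the question does not need to change, because TCP is a byte stream and the framing on the wire is identical.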

If you are using only a few threads, switch to synchronous IO, because it will be faster (really! It has less overhead).
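For reference, a synchronous version of the receive side might look roughly like this. It is a minimal sketch, assuming the same 4-byte length prefix as in the question; BlockingReceiver and its methods are hypothetical helper names, and the caller supplies reusable buffers so nothing is allocated per message.

```csharp
using System;
using System.Net.Sockets;

public static class BlockingReceiver
{
    // Blocks until exactly 'count' bytes have been read into buffer[0..count).
    // Returns false if the peer closed the connection first.
    public static bool ReceiveExactly(Socket socket, byte[] buffer, int count)
    {
        int received = 0;
        while (received < count)
        {
            int n = socket.Receive(buffer, received, count - received, SocketFlags.None);
            if (n == 0)
            {
                return false; // connection closed
            }
            received += n;
        }
        return true;
    }

    // Reads one length-prefixed message into the caller's reusable 'payload' buffer.
    // Returns the payload length, or -1 if the connection was closed.
    public static int ReceiveMessage(Socket socket, byte[] lengthBuffer, byte[] payload)
    {
        if (!ReceiveExactly(socket, lengthBuffer, 4))
        {
            return -1;
        }
        int length = BitConverter.ToInt32(lengthBuffer, 0);
        if (length < 0 || length > payload.Length)
        {
            throw new InvalidOperationException("Frame length out of range.");
        }
        return ReceiveExactly(socket, payload, length) ? length : -1;
    }
}
```

A dedicated receive thread would simply call ReceiveMessage in a loop until it returns -1.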

You can confirm that this answer is right by running a dumb infinite send loop that just keeps sending a 64 KB buffer synchronously. It will saturate the link.
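Such a test could be sketched as follows. Flood and SendFor are made-up names, and the loop is bounded by a duration instead of running forever so that it can report a number; otherwise it is just the dumb loop described above.

```csharp
using System;
using System.Diagnostics;
using System.Net.Sockets;

public static class Flood
{
    // Synchronously sends the same 64 KB buffer over and over for 'duration'
    // and returns the total number of bytes handed to the socket.
    public static long SendFor(Socket socket, TimeSpan duration)
    {
        var buffer = new byte[64 * 1024]; // allocated once, reused for every send
        var stopwatch = Stopwatch.StartNew();
        long total = 0;
        while (stopwatch.Elapsed < duration)
        {
            total += socket.Send(buffer);
        }
        return total;
    }

    // Converts a byte count and elapsed time into megabits per second.
    public static double MegabitsPerSecond(long bytes, TimeSpan elapsed)
    {
        return bytes * 8.0 / elapsed.TotalSeconds / 1e6;
    }
}
```

The other end has to drain the data, for example with a loop that calls Receive into its own 64 KB buffer until it returns 0; if nothing reads, Send blocks as soon as the TCP windows fill up. Compare the resulting Mbit/sec figure with what IPerf reports for the same pair of machines.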