Why is this System.IO.Pipelines code much slower than Stream-based code?

Question

Wil*_*ill 12 c# performance .net-core system.io.pipelines

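I wrote a small parsing program to compare the older System.IO.Stream with the newer System.IO.Pipelines in .NET Core. I expected the pipelines code to be as fast or faster. However, it is about 40% slower.

The program is simple: it searches a 100 MB text file for a keyword and returns the line number where the keyword occurs. Here is the Stream version: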

public static async Task<int> GetLineNumberUsingStreamAsync(
    string file,
    string searchWord)
{
    using var fileStream = File.OpenRead(file);
    using var lines = new StreamReader(fileStream, bufferSize: 4096);

    int lineNumber = 1;
    // ReadLineAsync returns null on stream end, exiting the loop
    while (await lines.ReadLineAsync() is string line)
    {
        if (line.Contains(searchWord))
            return lineNumber;

        lineNumber++;
    }
    return -1;
}

I expected the Stream code above to be slower than the pipelines code below, because the Stream code decodes the bytes into strings inside the StreamReader. The pipelines code avoids this by operating on bytes:

public static async Task<int> GetLineNumberUsingPipeAsync(string file, string searchWord)
{
    var searchBytes = Encoding.UTF8.GetBytes(searchWord);
    using var fileStream = File.OpenRead(file);
    var pipe = PipeReader.Create(fileStream, new StreamPipeReaderOptions(bufferSize: 4096));

    var lineNumber = 1;
    while (true)
    {
        var readResult = await pipe.ReadAsync().ConfigureAwait(false);
        var buffer = readResult.Buffer;

        if(TryFindBytesInBuffer(ref buffer, searchBytes, ref lineNumber))
        {
            return lineNumber;
        }

        pipe.AdvanceTo(buffer.End);

        if (readResult.IsCompleted) break;
    }

    await pipe.CompleteAsync();

    return -1;
}

Here are the associated helper methods:

/// <summary>
/// Look for `searchBytes` in `buffer`, incrementing the `lineNumber` every
/// time we find a new line.
/// </summary>
/// <returns>true if we found the searchBytes, false otherwise</returns>
static bool TryFindBytesInBuffer(
    ref ReadOnlySequence<byte> buffer,
    in ReadOnlySpan<byte> searchBytes,
    ref int lineNumber)
{
    var bufferReader = new SequenceReader<byte>(buffer);
    while (TryReadLine(ref bufferReader, out var line))
    {
        if (ContainsBytes(ref line, searchBytes))
            return true;

        lineNumber++;
    }
    return false;
}

static bool TryReadLine(
    ref SequenceReader<byte> bufferReader,
    out ReadOnlySequence<byte> line)
{
    var foundNewLine = bufferReader.TryReadTo(out line, (byte)'\n', advancePastDelimiter: true);
    if (!foundNewLine)
    {
        line = default;
        return false;
    }

    return true;
}

static bool ContainsBytes(
    ref ReadOnlySequence<byte> line,
    in ReadOnlySpan<byte> searchBytes)
{
    return new SequenceReader<byte>(line).TryReadTo(out var _, searchBytes);
}

I use SequenceReader<byte> above because my understanding is that it is smarter/faster than working with ReadOnlySequence<byte> directly; it has a fast path when it can operate on a single Span<byte>.

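Here are the benchmark results (.NET Core 3.1). The full code and BenchmarkDotNet results are available in this repo.

GetLineNumberWithStreamAsync - 435.6 ms while allocating 366.19 MB
GetLineNumberUsingPipeAsync - 619.8 ms while allocating 9.28 MB

Am I doing something wrong in the pipelines code?

Update: Evk has answered the question. After applying his fix, here are the new benchmark numbers:

GetLineNumberWithStreamAsync - 452.2 ms while allocating 366.19 MB
GetLineNumberWithPipeAsync - 203.8 ms while allocating 9.28 MB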

Answer 1

Evk*_*Evk 6

I believe the reason is the implementation of SequenceReader.TryReadTo. Here is the source code of this method. It uses a fairly straightforward algorithm: read to the next match of the first byte, then check whether all subsequent bytes match; if they do not, advance one byte and repeat. Note also how many methods in this implementation are named "slow" (IsNextSlow, TryReadToSlow and so on), so in at least some cases it falls back to a slow path. It also has to handle the fact that the sequence might contain multiple segments, and it has to maintain the position.

In your case you can avoid using SequenceReader specifically for searching for the match (but keep it for actually reading lines), for example with these minor changes (this overload of TryReadTo is also more efficient in this case):
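To make the described algorithm concrete, here is a minimal sketch of that kind of search over a single contiguous span. It is an illustration only, not the BCL source: `NaiveSearchSketch` and `NaiveIndexOf` are hypothetical names, and the real `SequenceReader` code additionally has to deal with multiple segments and position tracking.

```csharp
using System;

public static class NaiveSearchSketch
{
    // Hypothetical helper (not the BCL implementation): returns the index of
    // the first occurrence of `needle` in `haystack`, or -1 if there is none.
    public static int NaiveIndexOf(ReadOnlySpan<byte> haystack, ReadOnlySpan<byte> needle)
    {
        if (needle.IsEmpty) return 0;

        int start = 0;
        while (start <= haystack.Length - needle.Length)
        {
            // Step 1: read forward to the next occurrence of the first byte.
            int offset = haystack.Slice(start).IndexOf(needle[0]);
            if (offset < 0) return -1;

            int candidate = start + offset;
            if (candidate > haystack.Length - needle.Length) return -1;

            // Step 2: check whether all bytes of the needle match here.
            if (haystack.Slice(candidate, needle.Length).SequenceEqual(needle))
                return candidate;

            // Step 3: no match, so advance one byte and repeat.
            start = candidate + 1;
        }
        return -1;
    }
}
```

On a single span, the built-in `IndexOf` extension method does the same job and is the better choice for production code; the sketch only shows the shape of the loop.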

private static bool TryReadLine(ref SequenceReader<byte> bufferReader, out ReadOnlySpan<byte> line) {
    // note that both `match` and `line` are now `ReadOnlySpan` and not `ReadOnlySequence`
    var foundNewLine = bufferReader.TryReadTo(out ReadOnlySpan<byte> match, (byte) '\n', advancePastDelimiter: true);

    if (!foundNewLine) {
        line = default;
        return false;
    }

    line = match;
    return true;
}

Then:

private static bool ContainsBytes(ref ReadOnlySpan<byte> line, in ReadOnlySpan<byte> searchBytes) {
    // line is now `ReadOnlySpan` so we can use efficient `IndexOf` method
    return line.IndexOf(searchBytes) >= 0;
}

This should make your pipelines code run faster than the Stream-based version.
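For reference, the span-based approach can be tried in isolation on an in-memory buffer, without a file or a pipe. This is a sketch under assumptions: `SpanSearchSketch` and `FindLine` are hypothetical names that do not appear in the question, and the method folds the line reading and the search into one loop. Like the question's helpers, it only examines lines terminated by `'\n'`, so a trailing line without a newline is not checked.

```csharp
using System;
using System.Buffers;
using System.Text;

public static class SpanSearchSketch
{
    // Hypothetical helper: returns the 1-based number of the first
    // '\n'-terminated line that contains searchBytes, or -1 if none does.
    public static int FindLine(ReadOnlySequence<byte> buffer, ReadOnlySpan<byte> searchBytes)
    {
        var reader = new SequenceReader<byte>(buffer);
        int lineNumber = 1;

        // The span overload of TryReadTo hands back each line as a ReadOnlySpan<byte>.
        while (reader.TryReadTo(out ReadOnlySpan<byte> line, (byte)'\n', advancePastDelimiter: true))
        {
            // Span-based IndexOf replaces the second SequenceReader search.
            if (line.IndexOf(searchBytes) >= 0)
                return lineNumber;

            lineNumber++;
        }
        return -1;
    }
}
```

For example, `SpanSearchSketch.FindLine(new ReadOnlySequence<byte>(Encoding.UTF8.GetBytes("alpha\nbeta\ngamma\n")), Encoding.UTF8.GetBytes("gam"))` returns 3.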
