使用Json.NET序列化为NDJSON

Nat*_*lor 7 c# json json.net ndjson

是否可以使用Json.NET 序列化为NDJSON(Newline Delimited JSON)?Elasticsearch API使用NDJSON进行批量操作,我发现没有任何迹象表明任何 .NET库都支持这种格式.

这个答案为反序列化NDJSON提供了指导,并且注意到可以独立地序列化每一行并加入换行符,但我不一定称之为支持.

dbc*_*dbc 7

最简单的答案是TextWriter使用单独JsonTextWriter的每一行写入单个,为每个行设置CloseOutput = false:

public static partial class JsonExtensions
{
    public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
    {
        // Let caller dispose the underlying stream 
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
        {
            ToNewlineDelimitedJson(textWriter, items);
        }
    }

    public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
    {
        var serializer = JsonSerializer.CreateDefault();

        foreach (var item in items)
        {
            // Formatting.None is the default; I set it here for clarity.
            using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
            {
                serializer.Serialize(writer, item);
            }
            // https://web.archive.org/web/20180513150745/http://specs.okfnlabs.org/ndjson/
            // Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A). 
            // The newline charater MAY be preceeded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
            textWriter.Write("\n");
        }
    }
}
Run Code Online (Sandbox Code Playgroud)

样品小提琴.

由于单个NDJSON线可能很短但行数可能很大,因此这个答案提出了一种流式解决方案,以避免分配大于85kb的单个字符串.正如Newtonsoft Json.NET性能提示中所解释的那样,这样的大字符串最终会出现在大对象堆上,并可能随后降低应用程序性能.