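I often find myself reading a large JSON file (usually an array of objects), manipulating each object, and then writing the results back to a new file.
To do this in Node (at least the reading part), I usually use the stream-json module, something like this: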
const fs = require('fs');
const StreamArray = require('stream-json/streamers/StreamArray');
const pipeline = fs.createReadStream('sample.json')
  .pipe(StreamArray.withParser());
pipeline.on('data', data => {
    //do something with each object in file
});
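I recently discovered Deno and would love to be able to do this workflow with Deno.
It looks like the readJSON method in the standard library reads the entire contents of the file into memory, so I don't know whether it is a good fit for large files.
Is there a way to do this by streaming the data from the file using some of the lower-level methods built into Deno?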
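Circling back on this now that Deno 1.0 is out, in case anyone else is interested in doing something like this. I was able to piece together a small class that works for my use case. It is not nearly as robust as something like stream-json, but it handles large JSON arrays just fine.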
import { EventEmitter } from "https://deno.land/std/node/events.ts";

export class JSONStream extends EventEmitter {

    private openBraceCount = 0;
    private inString = false; // true while inside a JSON string literal
    private escaped = false;  // true right after a backslash inside a string
    private tempUint8Array: number[] = [];
    private decoder = new TextDecoder();

    constructor (private filepath: string) {
        super();
        this.stream();
    }

    async stream() {
        console.time("Run Time");
        const file = await Deno.open(this.filepath);
        // creates an iterator from the reader; the default buffer size is 32 KB
        for await (const buffer of Deno.iter(file)) {

            for (let i = 0, len = buffer.length; i < len; i++) {
                const uint8 = buffer[i];

                // inside a string: keep every byte, so spaces and braces
                // in values are neither dropped nor counted
                if (this.inString) {
                    this.tempUint8Array.push(uint8);
                    if (this.escaped) this.escaped = false;
                    else if (uint8 === 92) this.escaped = true;   // backslash
                    else if (uint8 === 34) this.inString = false; // closing quote
                    continue;
                }

                // remove whitespace between tokens (LF, CR, space)
                if (uint8 === 10 || uint8 === 13 || uint8 === 32) continue;

                // opening quote
                if (uint8 === 34) this.inString = true;

                // open brace
                if (uint8 === 123) {
                    if (this.openBraceCount === 0) this.tempUint8Array = [];
                    this.openBraceCount++;
                }

                this.tempUint8Array.push(uint8);

                // close brace
                if (uint8 === 125) {
                    this.openBraceCount--;
                    if (this.openBraceCount === 0) {
                        const uint8Ary = new Uint8Array(this.tempUint8Array);
                        const jsonString = this.decoder.decode(uint8Ary);
                        const object = JSON.parse(jsonString);
                        this.emit('object', object);
                    }
                }
            }
        }
        file.close();
        console.timeEnd("Run Time");
    }
}

Example usage

const stream = new JSONStream('test.json');

stream.on('object', (object: any) => {
    // do something with each object
});

Processing a ~4.8 MB JSON file with ~20,000 small objects in it

[
    {
      "id": 1,
      "title": "in voluptate sit officia non nesciunt quis",
      "urls": {
         "main": "https://www.placeholder.com/600/1b9d08",
         "thumbnail": "https://www.placeholder.com/150/1b9d08"
      }
    },
    {
      "id": 2,
      "title": "error quasi sunt cupiditate voluptate ea odit beatae",
      "urls": {
          "main": "https://www.placeholder.com/600/1b9d08",
          "thumbnail": "https://www.placeholder.com/150/1b9d08"
      }
    }
    ...
]

took 127 ms.
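
If you want to try the object-splitting idea without touching a file, the scanning loop can be pulled out into a plain function that you feed byte chunks by hand. This is only a rough sketch under my own assumptions: `createObjectSplitter` and its callback are hypothetical names that are not part of the class above or of any Deno API, and it only handles a top-level array of objects.

```typescript
// Hypothetical helper (not from the original answer): emits every top-level
// {...} object found in a sequence of byte chunks, in order.
function createObjectSplitter(onObject: (obj: unknown) => void) {
  const decoder = new TextDecoder();
  let bytes: number[] = []; // bytes of the object currently being collected
  let depth = 0;            // current {...} nesting depth
  let inString = false;     // inside a JSON string literal
  let escaped = false;      // previous byte was a backslash inside a string

  // State lives in the closure, so an object may span several chunks.
  return function push(chunk: Uint8Array): void {
    for (let i = 0; i < chunk.length; i++) {
      const b = chunk[i];
      if (inString) {
        if (depth > 0) bytes.push(b);
        if (escaped) escaped = false;
        else if (b === 0x5c) escaped = true;   // backslash
        else if (b === 0x22) inString = false; // closing quote
        continue;
      }
      if (b === 0x22) inString = true;         // opening quote
      else if (b === 0x7b) depth++;            // {
      if (depth === 0) continue;               // skip [ , ] and whitespace between objects
      bytes.push(b);
      if (b === 0x7d && --depth === 0) {       // } closing a top-level object
        onObject(JSON.parse(decoder.decode(new Uint8Array(bytes))));
        bytes = [];
      }
    }
  };
}

// Feed the same document in two chunks, cut in the middle of a string value.
const seen: unknown[] = [];
const push = createObjectSplitter((o) => { seen.push(o); });
const data = new TextEncoder().encode(
  '[{"id":1,"title":"a {b} c"},{"id":2,"urls":{"main":"x"}}]',
);
push(data.subarray(0, 20));
push(data.subarray(20));
```

Keeping the state outside the loop is what lets an object straddle a buffer boundary, which happens constantly when the file is read in fixed-size chunks.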