I'm having trouble importing a JSON file into a local MongoDB instance. The JSON was generated with mongoexport and looks like this. No arrays, no hardcore nesting:
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
If I import a 9MB file with ~300 lines, there is no problem:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails-small.json
2015-11-02T10:03:11.353+0100 connected to: localhost
2015-11-02T10:03:11.372+0100 imported 240 documents
But if I try to import a 32MB file with about 1300 lines, the import fails:
[stekhn latest]$ mongoimport -d mietscraping -c mails mails.json
2015-11-02T10:05:25.228+0100 connected to: localhost
2015-11-02T10:05:25.735+0100 error inserting documents: lost connection to server
2015-11-02T10:05:25.735+0100 Failed: lost connection to server
2015-11-02T10:05:25.735+0100 imported 0 documents
Here is the log:
2015-11-02T11:53:04.146+0100 I NETWORK [initandlisten] connection accepted from 127.0.0.1:45237 #21 (6 connections now open)
2015-11-02T11:53:04.532+0100 I - [conn21] Assertion: 10334:BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
2015-11-02T11:53:04.536+0100 I NETWORK [conn21] AssertionException handling request, closing client connection: 10334 BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
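I have heard of the 16MB limit for BSON documents before, but since no line in my JSON file is larger than 16MB, this shouldn't be a problem, right? When I run the exact same (32MB) import on my local computer, everything works fine.
Any ideas what could cause this strange behavior?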
Rod*_*igo 66
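I guess the problem is performance-related. You can work around it in either of the following ways:
You can use the mongoimport option -j. If 4 does not work, try increasing it step by step, e.g. 4, 8, 16, depending on the number of cores your CPU has.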
mongoimport --help
-j, --numInsertionWorkers=<number>    number of insert operations to run concurrently (defaults to 1)
mongoimport -d mietscraping -c mails -j 4 <mails.json
Or you can split the file and import all the parts.
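A minimal sketch of the splitting approach, assuming the export has one JSON document per line as in the question. The function name split_ndjson, the chunk size, and the output file naming are illustrative assumptions, not part of mongoimport:

```python
from pathlib import Path


def split_ndjson(src, lines_per_chunk, out_dir):
    """Split a one-document-per-line JSON file into smaller files.

    Hypothetical helper: writes <stem>-partNNNN<suffix> files into out_dir
    and returns their paths in order.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunk_paths = []
    out = None
    try:
        # Binary mode copies each line byte for byte, so the documents
        # are not re-encoded or re-serialized.
        with src.open("rb") as fh:
            for index, line in enumerate(fh):
                if index % lines_per_chunk == 0:
                    # Start a new part file every lines_per_chunk lines.
                    if out is not None:
                        out.close()
                    name = f"{src.stem}-part{len(chunk_paths):04d}{src.suffix}"
                    path = out_dir / name
                    out = path.open("wb")
                    chunk_paths.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

For example, split_ndjson("mails.json", 200, "parts") would write parts/mails-part0000.json and so on. Each part can then be imported with the same command as before, e.g. mongoimport -d mietscraping -c mails parts/mails-part0000.json.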
I hope this helps.
Looking into it a bit more, this is a bug in some versions: https://jira.mongodb.org/browse/TOOLS-939. Another workaround is to change batchSize, which defaults to 10000. Reduce the value and test:
mongoimport -d mietscraping -c mails <mails.json --batchSize 1
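If a batch size of 1 turns out to be too slow, a rough starting value can be estimated from the file itself. The sketch below is only a heuristic under stated assumptions: it takes the 16793600-byte limit from the server log in the question, uses the size of the JSON text as a proxy for the BSON size (the two are not identical), and the function names are hypothetical:

```python
# Upper bound quoted in the server log from the question.
MAX_MESSAGE_BYTES = 16793600


def largest_line_bytes(path):
    """Return the size in bytes of the longest line (document) in the file."""
    largest = 0
    with open(path, "rb") as fh:
        for line in fh:
            largest = max(largest, len(line.rstrip(b"\r\n")))
    return largest


def conservative_batch_size(path, limit=MAX_MESSAGE_BYTES):
    """Estimate how many worst-case documents fit under the limit.

    Assumption: JSON text size approximates BSON size, so treat the
    result as an upper bound to try, not a guarantee.
    """
    largest = largest_line_bytes(path)
    if largest == 0:
        raise ValueError("file contains no documents")
    # Never go below 1, even if a single line is larger than the limit.
    return max(1, limit // largest)
```

The result of conservative_batch_size("mails.json") could be passed to --batchSize. If the import still fails, lower the value further.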