Mongodb Mongoimport太大:解析错误失败

use*_*898 27 import json mongodb

我正在尝试导入有效的MongoDB 70 mb json文件.但是,我在一个循环中一遍又一遍地得到这个错误:

 01 11:42:20 exception:BSON representation of supplied JSON is too large: Failure parsing JSON string near: "name": "L
 01 11:42:20
 01 11:42:20 Assertion: 10340:Failure parsing JSON string near: "link": "h
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 kernel32.dll       BaseThreadInitThunk+0x12
 01 11:42:20 ntdll.dll          RtlInitializeExceptionChain+0xef
 01 11:42:20 exception:BSON representation of supplied JSON is too large: Failure parsing JSON string near: "link": "h
 01 11:42:20
 01 11:42:20 Assertion: 10340:Failure parsing JSON string near: }
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 kernel32.dll       BaseThreadInitThunk+0x12
 01 11:42:20 ntdll.dll          RtlInitializeExceptionChain+0xef
 01 11:42:20 exception:BSON representation of supplied JSON is too large: Failure parsing JSON string near: }
 01 11:42:20
 01 11:42:20 Assertion: 10340:Failure parsing JSON string near: ],
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 mongoimport.exe    ???
 01 11:42:20 kernel32.dll       BaseThreadInitThunk+0x12
 01 11:42:20 ntdll.dll          RtlInitializeExceptionChain+0xef
 01 11:42:20 exception:BSON representation of supplied JSON is too large: Failure parsing JSON string near: ],
Run Code Online (Sandbox Code Playgroud)

我的JSON(只有它的一个小例子)包含许多像这样的结构:

[ 
{
   "data": [
         "id": "xxxxxxxxxxxxxxxxxx",
         "from": {
            "name": "yyyyyyyyyyy",
            "id": "1111111111111"
         },
         "to": {
            "data": [
               {
                  "version": 1,
                  "name": "1111111111111",
                  "id": "1111111111111"
               }
            ]
         },
         "picture": "fffffffffffffffffffffff.jpg",
         "link": "http://www.youtube.com/watch?v=qqqqqqqqqqqqq",
         "source": "http://www.youtube.com/v/qqqqqqqqqqqqq?version=3&autohide=1&autoplay=1",
         "name": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
         "description": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx...",
         "icon": "http://static.ak.fbcdn.net/rsrc.php/v2/xxx/r/dddd",
         "actions": [
            {
               "name": "Comment",
               "link": "http://www.example.com/1111111111111/posts/1111111111111"
            },
            {
               "name": "Like",
               "link": "http://www.example.com/1111111111111/posts/1111111111111"
            }
         ],
         "privacy": {
            "value": ""
         },
         "type": "video",
         "created_time": 1356953890,
         "updated_time": 1356953890,
         "likes": {
            "data": [
               {
                  "name": "jjj ",
                  "id": "59xxx67"
               },
               {
                  "name": "xxxxx",
                  "id": "79xxx27"
               }
            ],
            "count": 2
         },
         "comments": {
            "count": 0
         }
      },

....
....
....
}
]
Run Code Online (Sandbox Code Playgroud)

这是json的一般模式":

[
{
   "data": [
      {

      }
    ],
    "paging": {
      "previous": "link",
      "next": "link"
   }
},
   "data": [
      {
      }
    ],
    "paging": {
      "previous": "link",
      "next": "link"
   }
},
"data": [
      {
      }
    ],
    "paging": {
      "previous": "link",
      "next": "link"
   }
}
]
Run Code Online (Sandbox Code Playgroud)

小智 76

而不是使用:

mongoimport -d DATABASE_NAME -c COLLECTION_NAME --file YOUR_JSON_FILE
Run Code Online (Sandbox Code Playgroud)

使用以下命令:

mongoimport -d DATABASE_NAME -c COLLECTION_NAME --file YOUR_JSON_FILE --jsonArray
Run Code Online (Sandbox Code Playgroud)

  • 我也是!--jsonArray参数完全不同.惭愧它不能只检测一个json数组. (3认同)

Hao*_* Li 15

在我的情况下,我的文件实际上不是太大,所以错误消息是误导性的.我不得不将每个文档放在一行或使用--jsonArray.

所以要么改变文件,如:

{ "_id" : "xxxxxxxx", "foo" : "yyy", "bar" : "zzz" }
Run Code Online (Sandbox Code Playgroud)

或将导入命令更改为

mongoimport -d [db_name] -c [col_name] --file [file_with_multi_lined_docs] --jsonArray
Run Code Online (Sandbox Code Playgroud)

如果我的文件保持每个文档的多行格式

{
    "_id" : "xxxxxxxx", 
    "foo" : "yyy", 
    "bar" : "zzz" 
}
Run Code Online (Sandbox Code Playgroud)


Chu*_*Lyu 5

您的json文件是否仅包含该data字段中的记录列表?在这种情况下,您需要将json文件重新格式化为一个记录列表,如下所示:

     {
     "id": "xxxxxxxxxxxxxxxxxx",
     "from": {
        "name": "yyyyyyyyyyy",
        "id": "1111111111111"
     },
     "to": {
        "data": [
           {
              "version": 1,
              "name": "1111111111111",
              "id": "1111111111111"
           }
        ]
     },
     ......
     }
     {
     "id": "xxxxxxxxxxxxxxxxxx",
     "from": {
        "name": "yyyyyyyyyyy",
        "id": "1111111111111"
     },
     "to": {
        "data": [
           {
              "version": 1,
              "name": "1111111111111",
              "id": "1111111111111"
           }
        ]
     },
     ......
     }
Run Code Online (Sandbox Code Playgroud)

如果您的json文件格式正确,只需编辑一些前导/结束行即可.

编辑:您可能需要--jsonArray从有效的json文件导入选项.尝试

mongoimport --db DATABASE_NAME --collection COLLECTION_NAME --file YOUR_JSON_FILE --jsonArray
Run Code Online (Sandbox Code Playgroud)

  • 在这种情况下,您可以使用`--jsonArray`选项.尝试`mongoimport --db DATABASE_NAME --collection COLLECTION_NAME --file YOUR_JSON_FILE --jsonArray`. (2认同)