Ami*_*t P 25 json elasticsearch
I am trying to bulk index a JSON file into a new Elasticsearch index, but I have not been able to. I have the following sample data in JSON:
[{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"},
{"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"},
{"Amount": "2115", "Quantity": "2", "Id": "975463798", "Client_Store_sk": "1109"},
{"Amount": "2116", "Quantity": "1", "Id": "975463827", "Client_Store_sk": "1109"},
{"Amount": "648", "Quantity": "3", "Id": "975464139", "Client_Store_sk": "1109"},
{"Amount": "2126", "Quantity": "2", "Id": "975464805", "Client_Store_sk": "1109"},
{"Amount": "2133", "Quantity": "1", "Id": "975464061", "Client_Store_sk": "1109"},
{"Amount": "1339", "Quantity": "4", "Id": "974919458", "Client_Store_sk": "1109"},
{"Amount": "1196", "Quantity": "5", "Id": "974920538", "Client_Store_sk": "1109"},
{"Amount": "1198", "Quantity": "4", "Id": "975463638", "Client_Store_sk": "1109"},
{"Amount": "1345", "Quantity": "4", "Id": "974919522", "Client_Store_sk": "1109"},
{"Amount": "1347", "Quantity": "2", "Id": "974919563", "Client_Store_sk": "1109"},
{"Amount": "673", "Quantity": "2", "Id": "975464359", "Client_Store_sk": "1109"},
{"Amount": "2153", "Quantity": "1", "Id": "975464511", "Client_Store_sk": "1109"},
{"Amount": "3896", "Quantity": "4", "Id": "977289342", "Client_Store_sk": "1109"},
{"Amount": "3897", "Quantity": "4", "Id": "974920602", "Client_Store_sk": "1109"}]
When I try to use Elasticsearch's standard bulk index API, I get this error: {"message":"ActionRequestValidationException[Validation Failed: 1: no requests added;]"}
Can anyone help with indexing this type of JSON?
Val*_*Val 46
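What you need to do is read that JSON file and then build a bulk request in the format the _bulk endpoint expects, i.e. one line for the command and one line for the document, separated by a newline character... rinse and repeat for each document: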
curl -XPOST localhost:9200/your_index/_bulk -d '
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463711"}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index": {"_index": "your_index", "_type": "your_type", "_id": "975463943"}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
... etc for all your documents
'
Just make sure to replace your_index and your_type with the actual index and type names you are using.
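The "read the JSON file, then build the bulk request" step could be sketched in Python roughly as follows. This is only an illustration under assumptions: the helper name `build_bulk_body` and its parameters are made up for this example, the `Id` field is assumed to be the document id (as in the curl example above), and `type_name` is optional because newer Elasticsearch releases dropped mapping types.

```python
import json


def build_bulk_body(docs, index_name, type_name=None, id_field="Id"):
    """Return an NDJSON bulk body: one action line, then one document line, per document.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    lines = []
    for doc in docs:
        meta = {"_index": index_name, "_id": doc[id_field]}
        if type_name is not None:
            meta["_type"] = type_name
        lines.append(json.dumps({"index": meta}))  # command line
        lines.append(json.dumps(doc))              # document line
    # The _bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Two documents taken from the question's sample data.
sample = json.loads("""[
  {"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},
  {"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
]""")

body = build_bulk_body(sample, "your_index", "your_type")
print(body, end="")
```

The resulting string can be written to a file and sent with curl, or posted directly from a script.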
UPDATE
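Note that the command line can be shortened by removing _index and _type if they are specified in the URL. It is also possible to remove _id if you specify the path to your id field in your mapping (note that this feature will be deprecated in ES 2.0, though). At the very least, your command line can look like {"index":{}} for all documents, but that line is always mandatory, since it specifies which kind of operation you want to perform (in this case, index the document).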
UPDATE 2
curl -XPOST localhost:9200/index_local/my_doc_type/_bulk --data-binary @/home/data1.json
/home/data1.json should look like this:
{"index":{}}
{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}
{"index":{}}
{"Amount": "2107", "Quantity": "3", "Id": "974920111", "Client_Store_sk": "1109"}
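A file in this shape can be produced from the original JSON array with a short script. The sketch below is one possible way to do it, not the answer author's method: `write_bulk_file` and the file name `data_array.json` are hypothetical, and the demo writes into a temporary directory so it does not touch real data.

```python
import json
import os
import tempfile


def write_bulk_file(array_path, bulk_path):
    """Convert a file holding one JSON array into the two-lines-per-document bulk format."""
    with open(array_path, encoding="utf-8") as src:
        docs = json.load(src)
    # newline="\n" keeps Unix line endings even on Windows.
    with open(bulk_path, "w", encoding="utf-8", newline="\n") as dst:
        for doc in docs:
            dst.write('{"index":{}}\n')        # minimal action line
            dst.write(json.dumps(doc) + "\n")  # the document itself, on one line
    return len(docs)


# Demo in a temporary directory with two documents from the question.
workdir = tempfile.mkdtemp()
array_path = os.path.join(workdir, "data_array.json")  # hypothetical input name
bulk_path = os.path.join(workdir, "data1.json")
with open(array_path, "w", encoding="utf-8") as f:
    f.write('[{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"},\n'
            '{"Amount": "2105", "Quantity": "2", "Id": "975463943", "Client_Store_sk": "1109"}]')

count = write_bulk_file(array_path, bulk_path)
with open(bulk_path, encoding="utf-8") as f:
    bulk_text = f.read()
print(bulk_text, end="")
```

Because the action lines carry no _index or _type, the index (and type, where applicable) must be given in the URL, as in the curl command above.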
As of now, 6.1.2 is the latest version of Elasticsearch, and the curl command that works for me on Windows (x64) is
curl -s -XPOST localhost:9200/my_index/my_index_type/_bulk -H "Content-Type: application/x-ndjson" --data-binary @D:\data\mydata.json
The format of the data in mydata.json stays the same as shown in @val's answer.
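For anyone who prefers a script over curl, the same request can be assembled with Python's standard library. This is a rough sketch under assumptions: the helper names `make_bulk_request` and `send` are invented for this example, the URL mirrors the curl command above, and the request is only built and inspected here, since sending it requires a running Elasticsearch node.

```python
import urllib.request


def make_bulk_request(url, ndjson_body):
    """Build a POST request carrying an NDJSON bulk body (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=ndjson_body.encode("utf-8"),
        method="POST",
        # Same header as the curl command: the body is newline-delimited JSON.
        headers={"Content-Type": "application/x-ndjson"},
    )


def send(request):
    """Send the request and return the response text; needs a reachable node."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


body = (
    '{"index":{}}\n'
    '{"Amount": "480", "Quantity": "2", "Id": "975463711", "Client_Store_sk": "1109"}\n'
)
req = make_bulk_request("http://localhost:9200/my_index/my_index_type/_bulk", body)
print(req.get_method(), req.full_url)
print(req.get_header("Content-type"))
# To actually index the data against a running node: print(send(req))
```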