Inserting a new document with the Python Elasticsearch client raises an illegal_argument_exception

Cur*_*tLH 1 python elasticsearch

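I have an Elasticsearch service set up on AWS with an existing index, and I am trying to add more documents to that index. I want to use the Python Elasticsearch client to interact with the service. I can connect to the service and query it as expected. However, when I add a new document to Elasticsearch, I get the following error: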

RequestError: RequestError(400, 'illegal_argument_exception', 'mapper [city] cannot be changed from type [keyword] to [text]')

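Do I need to somehow specify the mapping for every document I add to Elasticsearch? I have searched the documentation but have not seen any examples of this. I want the city field mapped as a keyword, but I don't know how to specify that when uploading a new document.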

Here is my current process:

import datetime

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# create auth for AWS Signature Version 4
# (access_key, secret_key and host are defined elsewhere)
awsauth = AWS4Auth(access_key, secret_key, "us-east-2", "es")

# instantiate the Elasticsearch client
es = Elasticsearch(
    hosts=[{'host': host, 'port': 443}],
    http_auth=awsauth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# create a document to upload (a one-element list, indexed as data[0] below)
data = [{
    'ad_id': 1053674,
    'city': 'Houston',
    'category': 'Cars',
    'date_posted': datetime.datetime(2021, 1, 29, 19, 33),
    'title': '2020 Chevrolet Silverado',
    'body': "This brand new vehicle is the perfect truck for you.",
    'phone': None
}]

# add document to index
res = es.index(index='ads', doc_type="doc", id=data[0]['ad_id'], body=data[0])
print(res['result'])
RequestError: RequestError(400, 'illegal_argument_exception', 'mapper [city] cannot be changed from type [keyword] to [text]')

Note: this is the output from es.info()

{'name': '123456789', 'cluster_name': '123456789:ads', 'cluster_uuid': '123456789', 'version': {'number': '7.9.1', 'build_flavor': 'oss', 'build_type': 'tar', 'build_hash': 'unknown', 'build_date': '2020-11-03T09:54:32.349659Z', 'build_snapshot': False, 'lucene_version': '8.6.2', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}

Joe*_*ook 5

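This error is raised when Elasticsearch has auto-generated a mapping for your index as you ingested documents, the documents were then modified in some way, and you are now trying to ingest documents that do not necessarily conform to the previously defined structure (the mapping).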


To inspect the current mapping, run:

current_mapping = es.indices.get_mapping(index='ads')
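To see at a glance which type each field currently has, the mapping response can be flattened into a small dictionary. This is only a sketch: `field_types` is a hypothetical helper, and `sample_response` is made-up data that assumes the typeless response shape returned by Elasticsearch 7.x, not output from the asker's cluster.

```python
def field_types(mapping_response, index):
    """Return {field_name: type} for the top-level fields of one index.

    Assumes the typeless 7.x response shape:
    {index: {"mappings": {"properties": {field: {"type": ...}}}}}
    """
    properties = mapping_response[index]["mappings"].get("properties", {})
    # Object fields carry no "type" key, so report them as "object".
    return {name: spec.get("type", "object") for name, spec in properties.items()}


# Hypothetical response, shaped like the result of es.indices.get_mapping(index="ads")
sample_response = {
    "ads": {
        "mappings": {
            "properties": {
                "ad_id": {"type": "long"},
                "city": {"type": "keyword"},
                "title": {"type": "text"},
            }
        }
    }
}

types = field_types(sample_response, "ads")
print(types["city"])
```

If `city` shows up as `keyword` here, any request that would remap it to `text` is rejected with the error from the question.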

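Now, to actually fix the original problem, drop the index and specify the mapping explicitly, so that you have full control over the structure of the ES index: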

# create a document to upload
data = [{
    'ad_id': 1053674,
    'city': 'Houston',
    'category': 'Cars',
    'date_posted': datetime.datetime(2021, 1, 29, 19, 33),
    'title': '2020 Chevrolet Silverado',
    'body': "This brand new vehicle is the perfect truck for you.",
    'phone': None
}]

mapping = '''
{
  "mappings" : {
    "properties" : {
      "ad_id" : {
        "type" : "long"
      },
      "body" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "category" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "city" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      },
      "date_posted" : {
        "type" : "date"
      },
      "title" : {
        "type" : "text",
        "fields" : {
          "keyword" : {
            "type" : "keyword",
            "ignore_above" : 256
          }
        }
      }
    }
  }
}
'''

# drop the index
# es.indices.delete(index='ads', ignore=[400, 404])

# create the index w/ the mapping
es.indices.create(index='ads', ignore=400, body=mapping)

# add document to index
res = es.index(index='ads', doc_type="_doc", id=data[0]['ad_id'], body=data[0])

print(res['result'])

FYI: if you intend to keep city mapped as a keyword, you will only be able to match it exactly at query time.
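If exact matching is what you want, a variant with `city` as a plain `keyword` could look roughly like the sketch below. `keyword_mapping` and `city_term_query` are names made up for illustration, and the field list is trimmed to a few of the fields from the question; the client calls in the trailing comments are not executed here.

```python
import json

# Sketch of an index body that maps "city" as keyword (exact match only).
keyword_mapping = {
    "mappings": {
        "properties": {
            "ad_id": {"type": "long"},
            "city": {"type": "keyword"},
            "date_posted": {"type": "date"},
            "title": {"type": "text"},
        }
    }
}


def city_term_query(city):
    """Build a term query: matches only documents whose city equals `city` exactly."""
    return {"query": {"term": {"city": city}}}


query = city_term_query("Houston")
print(json.dumps(query))

# With a client `es` set up as in the question (not run here):
# es.indices.create(index="ads", body=keyword_mapping)
# es.search(index="ads", body=query)
```

Keep in mind that a term query against a keyword field is case-sensitive, so searching for "houston" would not match a document stored with "Houston".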
