How to add a doc to Elasticsearch from Python without specifying an id

4 python elasticsearch

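I want to add a doc to Elasticsearch using python elasticsearch, but the example in the documentation has the code below, which specifies an id. I don't want to specify the id; I want Elasticsearch to generate one for me, for example AK3286826fds83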

from datetime import datetime

# es is assumed to be an already-constructed Elasticsearch client
def addBrandInES():
    doc = {
        'author': 'kimchy',
        'text': 'Elasticsearch: cool. bonsai cool.',
        'timestamp': datetime.now(),
    }

    # res = es.index(index="brands", doc_type='external', id=1, body=doc)
    res = es.index(index="brands", doc_type='external', body=doc)  # can I do that?
    print(res['created'])

小智 7

Yes, you can simply omit the id parameter. When it is missing, Elasticsearch generates an id for the document. The following snippet is from the elasticsearch-py index method:

def index(self, index, doc_type, body, id=None, params=None):
        """
        Adds or updates a typed JSON document in a specific index, making it searchable.
        `<http://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html>`_

        :arg index: The name of the index
        :arg doc_type: The type of the document
        :arg body: The document
        :arg id: Document ID
        :arg op_type: Explicit operation type, default 'index', valid choices
            are: 'index', 'create'
        :arg parent: ID of the parent document
        :arg pipeline: The pipeline id to preprocess incoming documents with
        :arg refresh: If `true` then refresh the affected shards to make this
            operation visible to search, if `wait_for` then wait for a refresh
            to make this operation visible to search, if `false` (the default)
            then do nothing with refreshes., valid choices are: u'true',
            u'false', u'wait_for'
        :arg routing: Specific routing value
        :arg timeout: Explicit operation timeout
        :arg timestamp: Explicit timestamp for the document
        :arg ttl: Expiration time for the document
        :arg version: Explicit version number for concurrency control
        :arg version_type: Specific version type, valid choices are: 'internal',
            'external', 'external_gte', 'force'
        :arg wait_for_active_shards: Sets the number of shard copies that must
            be active before proceeding with the index operation. Defaults to 1,
            meaning the primary shard only. Set to `all` for all shard copies,
            otherwise set to any non-negative value less than or equal to the
            total number of copies for the shard (number of replicas + 1)
        """
        for param in (index, doc_type, body):
            if param in SKIP_IN_PATH:
                raise ValueError("Empty value passed for a required argument.")
        return self.transport.perform_request('POST' if id in SKIP_IN_PATH else 'PUT',
            _make_path(index, doc_type, id), params=params, body=body)

Note the second-to-last line. SKIP_IN_PATH is defined as:

SKIP_IN_PATH = (None, '', b'', [], ())

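So if id is missing, HTTP 'POST' is used, which creates a new document with a generated id; otherwise 'PUT' is used, which writes the document at the given id, creating it or replacing an existing one.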
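The method selection described above can be tried in isolation. The following is only a simplified sketch: SKIP_IN_PATH is taken from the snippet above, while choose_method and sketch_path are hypothetical helper names introduced here for illustration (the library's real _make_path does more, such as escaping the path parts).

```python
# Simplified sketch of how the client picks the HTTP method and URL path.
# choose_method and sketch_path are hypothetical names, not library functions.
SKIP_IN_PATH = (None, '', b'', [], ())  # same tuple as in the library snippet


def choose_method(id=None):
    """Return 'POST' when no usable id is given, otherwise 'PUT'."""
    return 'POST' if id in SKIP_IN_PATH else 'PUT'


def sketch_path(*parts):
    """Rough stand-in for _make_path: join only the non-empty parts."""
    return '/' + '/'.join(str(p) for p in parts if p not in SKIP_IN_PATH)


# Show which method and path each kind of id would lead to.
for doc_id in (None, '', 1, 'AK3286826fds83'):
    print(repr(doc_id), choose_method(doc_id),
          sketch_path('brands', 'external', doc_id))
```

An empty string counts as "no id" just like None, so both end up as a POST to the path without an id segment.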

There is also another API called create(), which requires id to be set. That API is meant specifically for creating a document with a specified id.
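Applied to the code in the question, omitting id might look like the sketch below. It assumes an older elasticsearch-py client whose index() still accepts doc_type and body, as in the signature quoted above, and it assumes the response carries the generated id under '_id'. The helper name add_brand is hypothetical, and the client is passed in as an argument so the function does not depend on any particular connection setup.

```python
from datetime import datetime


def add_brand(es, doc):
    """Index doc into 'brands' without an id and return the generated id.

    es is assumed to be an elasticsearch-py client (or any object with a
    compatible index() method).
    """
    # No id argument: the client sends a POST and Elasticsearch picks the id.
    res = es.index(index="brands", doc_type='external', body=doc)
    return res['_id']


# Same example document as in the question.
doc = {
    'author': 'kimchy',
    'text': 'Elasticsearch: cool. bonsai cool.',
    'timestamp': datetime.now(),
}

# Usage, assuming a reachable cluster and an installed client:
#   from elasticsearch import Elasticsearch
#   print(add_brand(Elasticsearch(), doc))
```

Returning the id is a design choice for this sketch: since the caller no longer chooses the id, the response is the only place to learn it.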


小智 -2

res = es.index(index="brands", doc_type='external', body=doc , id=<your id>)