Tag: aggregation

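Elasticsearch: _score is null in aggregations. Why?

I'm using ES v1.7. ES returns _score only in the "hits" section, but I'm not interested in the hits; I need _score in the "aggregations" part of the response. Why does ES behave this way, and how can I fix it?

Request: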

{
    "size": 1,
    "query": {
        "bool": {
            "must": [
                { "match": {"_all": {"query": "test", "operator": "and", "fuzziness": "2"}}}
            ],
            "should": [
                { "multi_match": {
                    "query":    "test",
                    "type":     "best_fields",
                    "fields":   ["ObjectData.PRTNAME", "ObjectData.EXTERNALID", "ObjectData.contactList.VALUE", "*SERIES", "*NUMBER", "ObjectData.INN"],
                    "operator": "or",
                    "boost":    3
                }}
            ]
        }
    },
    "aggs": {
        "byObjectID": {
            "terms": {"field": "ObjectID"},
            "aggs": {
                "latestVer": {
                    "top_hits": {
                        "sort": [{"creationDate": { "order": "desc" }}],
                        "_source": { "include": ["ObjectData.BRIEFNAME", "creationDate", "ObjectID" ]},
                        "size": …
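A likely cause: when hits are sorted on a field other than _score, Elasticsearch does not compute scores unless asked to. The sketch below builds only the "aggs" part of the request as a Python dict. It rests on two assumptions that should be checked against your ES version: that top_hits accepts the same "track_scores" flag as the top-level search body, and that inline scripts are enabled for the "max" sub-aggregation. The name score_aware_aggs, the bucket name bestScore, and the size parameter are hypothetical (the original "size" value is cut off in the question).

```python
import json


def score_aware_aggs(size):
    """Build the aggs section with score tracking switched on inside top_hits."""
    return {
        "byObjectID": {
            "terms": {"field": "ObjectID"},
            "aggs": {
                "latestVer": {
                    "top_hits": {
                        "sort": [{"creationDate": {"order": "desc"}}],
                        # Assumption: top_hits honours track_scores like a normal search.
                        "track_scores": True,
                        "_source": {
                            "include": ["ObjectData.BRIEFNAME", "creationDate", "ObjectID"]
                        },
                        "size": size,
                    }
                },
                # Alternative: expose the best score of each bucket as a metric.
                "bestScore": {"max": {"script": "_score"}},
            },
        }
    }


# Serialise for the request body; size=1 is only an illustrative value.
aggs_json = json.dumps({"aggs": score_aware_aggs(size=1)}, indent=2)
```

If either option is rejected by the server, the other can be dropped independently; they do not depend on each other.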


full-text-search aggregation relevance elasticsearch

Ily*_*a P

3 votes

1 answer

2880 views

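pandas: group and aggregate consecutive rows that share the same value in a column

I have a pandas DataFrame built from a long list of datetime ranges pulled from a database, each range carrying a label. The rows are ordered so that each row's start date is the previous row's end date. A working example: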

import pandas as pd

bins = [{'start': '2020-01-12 00:00:00', 'end': '2020-01-13 00:00:00', 'label': 't3'},
        {'start': '2020-01-13 00:00:00', 'end': '2020-01-13 07:00:00', 'label': 't2'},
        {'start': '2020-01-13 07:00:00', 'end': '2020-01-13 15:30:00', 'label': 't1'},
        {'start': '2020-01-13 15:30:00', 'end': '2020-01-14 00:00:00', 'label': 't2'},
        {'start': '2020-01-14 00:00:00', 'end': '2020-01-14 07:00:00', 'label': 't2'},
        {'start': '2020-01-14 07:00:00', 'end': '2020-01-14 15:30:00', 'label': 't1'},
        {'start': '2020-01-14 15:30:00', 'end': '2020-01-15 00:00:00', 'label': 't2'},
        {'start': '2020-01-15 00:00:00', 'end': '2020-01-15 07:00:00', 'label': 't2'},
        {'start': '2020-01-15 07:00:00', 'end': '2020-01-15 15:30:00', 'label': …
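One common way to collapse consecutive rows with the same label is to compare each label with the previous one and take a cumulative sum, which yields a run id to group on. The sketch below assumes the goal is one row per run, keeping the first start and the last end; it uses only the first six rows shown in the question, and the names run_id and merged are illustrative.

```python
import pandas as pd

# First six rows from the question's example.
bins = [
    {"start": "2020-01-12 00:00:00", "end": "2020-01-13 00:00:00", "label": "t3"},
    {"start": "2020-01-13 00:00:00", "end": "2020-01-13 07:00:00", "label": "t2"},
    {"start": "2020-01-13 07:00:00", "end": "2020-01-13 15:30:00", "label": "t1"},
    {"start": "2020-01-13 15:30:00", "end": "2020-01-14 00:00:00", "label": "t2"},
    {"start": "2020-01-14 00:00:00", "end": "2020-01-14 07:00:00", "label": "t2"},
    {"start": "2020-01-14 07:00:00", "end": "2020-01-14 15:30:00", "label": "t1"},
]
df = pd.DataFrame(bins)
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# A new run begins wherever the label differs from the row above it.
run_id = df["label"].ne(df["label"].shift()).cumsum().rename("run")

# One output row per run: earliest start, latest end, and the shared label.
merged = (
    df.groupby(run_id)
      .agg(start=("start", "first"), end=("end", "last"), label=("label", "first"))
      .reset_index(drop=True)
)
```

Here the two adjacent t2 rows (15:30 to 00:00 and 00:00 to 07:00) become a single row, while the earlier, non-adjacent t2 row stays separate.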


django aggregation dataframe pandas pandas-groupby

Mar*_*rkD

3 votes

1 answer

1980 views

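PyMongo aggregation: "AttributeError: 'dict' object has no attribute '_txn_read_preference'"

I'm sure there is a mistake in my code, since I'm new to PyMongo, but I'll give it a try. The MongoDB collection holds 167k+ documents like this: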

{'overall': 5.0,
 'reviewText': {'ago': 1,
                'buy': 2,
                'daughter': 1,
                'holiday': 1,
                'love': 2,
                'niece': 1,
                'one': 2,
                'still': 1,
                'today': 1,
                'use': 1,
                'year': 1},
 'reviewerName': 'dcrm'}


I want a tally of the terms used in the reviewText field across all 5.0 ratings. I ran the following code and got the error below. Any insights?


As you can see, I added the "allowDiskUse" option because I seem to exceed the 100MB threshold. But the error I get is:

#1 Find the top 20 most common words found in 1-star reviews.

aggr = [{"$unwind": …
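This error typically appears when options are passed to Collection.aggregate() as a positional dict: the second positional parameter is the session, so PyMongo tries to treat the dict as a ClientSession. The asker's call is cut off above, so this is an assumption about the cause. The sketch below passes allowDiskUse as a keyword argument instead; the pipeline and the names top_terms_pipeline and top_terms are illustrative, not the asker's code.

```python
def top_terms_pipeline(rating, limit=20):
    """Build a pipeline that tallies the words stored under reviewText."""
    return [
        {"$match": {"overall": rating}},
        # reviewText is an object of {word: count}; turn it into [{k, v}, ...].
        {"$project": {"terms": {"$objectToArray": "$reviewText"}}},
        {"$unwind": "$terms"},
        {"$group": {"_id": "$terms.k", "count": {"$sum": "$terms.v"}}},
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ]


def top_terms(collection, rating=5.0, limit=20):
    """Run the tally on a PyMongo collection (or any object with aggregate())."""
    # allowDiskUse must be a keyword argument. A dict passed as the second
    # positional argument would be interpreted as the session.
    cursor = collection.aggregate(top_terms_pipeline(rating, limit), allowDiskUse=True)
    return list(cursor)
```

With a real collection this would be called as top_terms(db.reviews), where db.reviews is a placeholder for the actual collection.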


aggregation pymongo

Mat*_*vis

3 votes

1 answer

4429 views

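Add a field to every object in an array of objects in a Mongo aggregation

I have a field at the root level of my schema and want to add it to every object in an array that matches a condition.

Here is a sample document...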

{
    calls: [
      {
        "name": "sam",
        "status": "scheduled"
      },
      {
        "name": "tom",
        "status": "cancelled"
      },
      {
        "name": "bob",
        "status": "scheduled"
      },
      
    ],
    "time": 1620095400000.0,
    "call_id": "ABCABCABC"
}


The desired document is as follows:

[
  {
    "call_id": "ABCABCABC",
    "calls": [
      {
        "call_id": "ABCABCABC",
        "name": "sam",
        "status": "scheduled"
      },
      {
        "name": "tom",
        "status": "cancelled"
      },
      {
        "call_id": "ABCABCABC",
        "name": "bob",
        "status": "scheduled"
      }
    ],
    "time": 1.6200954e+12
  }
]


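The call_id should be added to every object in the array whose status is "scheduled". Is it possible to do this with a Mongo aggregation? I have tried $addFields but could not get the result above. Thanks in advance!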
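One possible approach is to rebuild the array inside $addFields with $map, merging the root call_id into each element only when its status matches. The sketch below writes that stage as a Python dict (as it would be passed to a driver) and adds a plain-Python equivalent so the logic can be checked without a server. The names ADD_CALL_ID and add_call_id are hypothetical, and $mergeObjects assumes MongoDB 3.6 or later.

```python
# Stage as it would appear inside collection.aggregate([...]).
ADD_CALL_ID = {
    "$addFields": {
        "calls": {
            "$map": {
                "input": "$calls",
                "as": "call",
                "in": {
                    "$cond": [
                        {"$eq": ["$$call.status", "scheduled"]},
                        # Copy the element and add the root-level call_id.
                        {"$mergeObjects": ["$$call", {"call_id": "$call_id"}]},
                        # Leave non-matching elements untouched.
                        "$$call",
                    ]
                },
            }
        }
    }
}


def add_call_id(doc):
    """Plain-Python equivalent of the stage above, for checking the logic."""
    calls = [
        {**call, "call_id": doc["call_id"]} if call.get("status") == "scheduled" else call
        for call in doc["calls"]
    ]
    return {**doc, "calls": calls}
```

Applied to the sample document, sam and bob gain call_id while tom is left as is.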

aggregation mongodb

CoC*_*oCo

2021-05-04

3 votes

1 answer

5591 views

Mongo aggregate: $group by day?

I have a history collection and want to create an export database based on it

[{
_id: "...",
value: 10,
at: ISODate("2021-24-06T00:01:02.023")
}, {
_id: ...,
value: 13,
at: ISODate("2021-24-06T00:04:11.211")
}, {
_id: ...,
value: 12,
at: ISODate("2021-24-06T09:11:31.182")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-07T01:33:31.723")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-15T09:32:44.983")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-16T10:43:22.083")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-16T14:43:22.083")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-17T04:25:12.021")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-18T20:13:22.083")
}, {
_id: ...,
value: 40,
at: ISODate("2021-24-19T18:41:22.083") …
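The rest of the question is cut off, so the following is only a sketch under the assumption that the goal is one bucket per calendar day. It formats the at timestamp with $dateToString and uses that string as the $group key; the accumulators (total, count) are illustrative choices. A plain-Python equivalent is included to check the grouping logic, and its sample dates in the usage are hypothetical valid dates, not the ones from the question.

```python
from collections import defaultdict
from datetime import datetime

# Pipeline as it would be passed to collection.aggregate(...).
GROUP_BY_DAY = [
    {
        "$group": {
            # Truncate the timestamp to its calendar day (UTC by default).
            "_id": {"$dateToString": {"format": "%Y-%m-%d", "date": "$at"}},
            "total": {"$sum": "$value"},
            "count": {"$sum": 1},
        }
    },
    {"$sort": {"_id": 1}},
]


def group_by_day(docs):
    """Plain-Python equivalent of GROUP_BY_DAY for documents with 'at' and 'value'."""
    buckets = defaultdict(lambda: {"total": 0, "count": 0})
    for doc in docs:
        day = doc["at"].strftime("%Y-%m-%d")
        buckets[day]["total"] += doc["value"]
        buckets[day]["count"] += 1
    return [{"_id": day, **buckets[day]} for day in sorted(buckets)]


# Hypothetical sample data with well-formed dates.
sample = [
    {"value": 10, "at": datetime(2021, 6, 24, 0, 1, 2)},
    {"value": 13, "at": datetime(2021, 6, 24, 0, 4, 11)},
    {"value": 40, "at": datetime(2021, 6, 25, 1, 33, 31)},
]
daily = group_by_day(sample)
```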


group-by aggregation mongodb node.js

Vũ *_*ũng

2021-06-25

3 votes

1 answer

2231 views

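Using $merge in a Document-based aggregation pipeline does not work

I have a collection on which I want to run an aggregation and put the results into a separate collection in the same database. While going through the documentation I came across $merge, which does exactly what I want. I came up with the following mongo shell pipeline, which works perfectly.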

db.getCollection('SOURCE_COLLECTION').aggregate([
  {
    "$match": {type: 'ABC'}
  },
  {
    "$merge": {
      "into": "OUTPUT_COLLECTION",
      "whenMatched": "replace"
    }
  }
])


Now I need to achieve the same thing in Spring Boot, for which I came up with the following, which in theory should be no different.

final ArrayList<Document> pipeline = new ArrayList<>();

pipeline.add(Document.parse("{$match: {type: 'ABC'}}"));
pipeline.add(Document.parse("{$merge: {into: 'OUTPUT_COLLECTION', whenMatched: 'replace'}}"));

mongoTemplate.getDb()
    .getCollection("SOURCE_COLLECTTION", Document.class)
    .aggregate(pipeline);


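Nevertheless, this does not work. As you can see, I am using the MongoCollection<T>.aggregate() method with a List<Document> as the pipeline. Each stage in the pipeline is produced by parsing a JSON string into a Document.

Interestingly, when I replace $merge with $out, it works without any problems.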

final ArrayList<Document> pipeline = new ArrayList<>();

pipeline.add(Document.parse("{$match: {type: 'ABC'}}"));
pipeline.add(Document.parse("{$out: 'OUTPUT_COLLECTION'}"));

mongoTemplate.getDb()
    .getCollection("SOURCE_COLLECTTION", Document.class)
    .aggregate(pipeline);


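But that is no good to me, because this aggregation will be executed multiple times (I am actually trying to populate a kind of materialized view of a collection here). I need $merge to work, but it doesn't. What am I missing? Can anyone see what I can't?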

java aggregation mongodb spring-boot

Rom*_*rra

2021-10-11

3 votes

1 answer

974 views

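How do I add "allowDiskUse" to an @Aggregation annotation in a repository in Spring Boot?

I have an aggregation written in the file MyRepository.kt, which is called from the file MongoDataRetriever.kt in the backend.

The MyRepository.kt file: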

import org.springframework.data.mongodb.repository.Aggregation
import org.springframework.data.mongodb.repository.MongoRepository

  @Aggregation(pipeline = [
    "{ \$match: { 'objName' : { \$exists: true } } }",
    "{ \$sort: { 'addedDate': -1 } }"
  ])
  fun getLatestObjectsWithLatestData(): List<MyDocument>


And the MongoDataRetriever.kt file:

  override fun getLatestObjects(): List<MyObj> {
    return myRepository.getLatestObjectsWithLatestData().map { it.toMyObj() }
  }


The above aggregation fails with the error:

message: "Command failed with error 16819 (Location16819): 'Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting. Aborting operation. …

spring aggregation mongodb mongorepository spring-boot

Lig*_*ami

2022-05-06

3 votes

1 answer

1139 views

MongoDB: how to merge all documents in an aggregation pipeline into a single document

My current aggregation output looks like this:

[
    {
        "courseCount": 14
    },
    {
        "registeredStudentsCount": 1
    }
]


The array has two documents. I want to merge all of the documents into a single document that contains all of the fields, in MongoDB.
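One way to do this is to append two stages to the existing pipeline: a $group with a constant _id that folds every document together with $mergeObjects, followed by $replaceRoot to promote the merged object. The sketch below shows those stages as Python dicts plus a plain-Python equivalent for checking the result. The names MERGE_ALL and merge_all are hypothetical; using $mergeObjects as an accumulator assumes MongoDB 3.6 or later, and on duplicate field names the later document wins.

```python
# Stages to append to the end of the existing pipeline.
MERGE_ALL = [
    # A constant _id puts every incoming document in the same group.
    {"$group": {"_id": None, "merged": {"$mergeObjects": "$$ROOT"}}},
    # Promote the merged object to be the output document.
    {"$replaceRoot": {"newRoot": "$merged"}},
]


def merge_all(docs):
    """Plain-Python equivalent: fold all documents into one, later keys winning."""
    merged = {}
    for doc in docs:
        merged.update(doc)
    # $group emits nothing for empty input, so mirror that here.
    return [merged] if docs else []


combined = merge_all([{"courseCount": 14}, {"registeredStudentsCount": 1}])
```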

aggregation mongodb

Uro*_*ooj

3 votes

1 answer

3362 views

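Aggregation and composition in Java code

Aggregation and composition are both forms of association; in other words, they describe the relationship between objects or other things.

I am posting this question because I was asked in an interview what composition and aggregation are.

So I gave my thoughts according to my understanding, as follows.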

http://www.coderanch.com/t/522414/java/java/Association-Aggregation-Composition

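Aggregation, association and composition

Association vs. aggregation vs. composition in Java

I also went through several more.

My basic explanation was that aggregation denotes a loose relationship while composition denotes a strong one, with a clear explanation along those lines.

But the interviewer insulted me and said that this is just theory: "I want actual Java code showing how they differ." He also asked how, if he gave me a small application, I would identify which part is aggregation and which is composition.

So now I want to understand the purely technical concept and the Java code: how do they differ, and what exactly are they?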
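In code, the difference usually shows up in who creates the part and who else can hold a reference to it. The sketch below uses Python for brevity, although the question asks about Java; the same structure maps onto Java constructors and fields. All class names (Engine, Car, Student, Course) are hypothetical examples.

```python
class Engine:
    def __init__(self, horsepower):
        self.horsepower = horsepower


class Car:
    """Composition: the Car builds its own Engine and never hands it out."""

    def __init__(self, horsepower):
        # The part is created inside the owner, so its lifetime is tied to the Car.
        self._engine = Engine(horsepower)

    def power(self):
        return self._engine.horsepower


class Student:
    def __init__(self, name):
        self.name = name


class Course:
    """Aggregation: the Course refers to Students that exist independently."""

    def __init__(self, students):
        # The parts are created elsewhere and only referenced here.
        self.students = list(students)


alice = Student("Alice")
course = Course([alice])
del course  # the Student outlives the Course that referred to it
car = Car(120)
```

When reading an unfamiliar codebase, a part that is instantiated inside the owner and never exposed suggests composition, while a part that is passed in from outside and shared with other objects suggests aggregation.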

java composition relationship aggregation

Nir*_*ani

2017-05-23

2 votes

1 answer

20k views

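How do I use pagination (size and from) in an Elasticsearch aggregation?

How do I use pagination (size and from) in an Elasticsearch aggregation? When I use size and from in the aggregation, an exception is thrown.
This is the query I want to run:

GET /index/nameorder/_search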

{
    "size": 0,
    "query": {
        "filtered": {
            "query": {
                "bool": {
                    "must": [
                        {
                            "match": {
                                "projectId": "10057"
                            }
                        }
                    ]
                }
            },
            "filter": {
                "range": {
                    "purchasedDate": {
                        "from": "2012-02-05T00:00:00",
                        "to": "2015-02-11T23:59:59"
                    }
                }
            }
        }
    },
    "aggs": {
        "group_by_a": {
            "terms": {
                "field": "promocode",
                "size": 40,
                "from": 40
            },
            "aggs": {
                "TotalPrice": {
                    "sum": {
                        "field": "subtotalPrice"
                    }
                }
            }
        }
    }
}
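The terms aggregation has no "from" parameter, which is the likely source of the exception. A common workaround is to request enough buckets to cover the wanted page (offset plus page size) and slice the result on the client. The sketch below illustrates that idea; the function names are hypothetical, and since bucket counts from a sharded terms aggregation are approximate, page boundaries may shift between requests.

```python
def paged_terms_agg(field, offset, size):
    """Build a terms aggregation that fetches every bucket up to the end of the page."""
    return {"terms": {"field": field, "size": offset + size}}


def slice_page(buckets, offset, size):
    """Keep only the buckets that belong to the requested page."""
    return buckets[offset:offset + size]


# Replaces "size": 40, "from": 40 in the group_by_a aggregation of the query above.
agg = paged_terms_agg("promocode", offset=40, size=40)
```

After the search returns, slice_page(response["aggregations"]["group_by_a"]["buckets"], 40, 40) would yield the second page, where response stands for the parsed JSON reply.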


pagination aggregation elasticsearch

Dha*_*hai

2015-02-12

2 votes

1 answer

5159 views