Courier Fetch: shards failed

Question

Car*_*ega 28 elasticsearch kibana kibana-4

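Why do I get these warnings after adding more data to Elasticsearch? The warning is different every time I browse the dashboard.

"Courier Fetch: 30 of 60 shards failed."

More details:

It is a single node running on CentOS 7.1.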

/etc/elasticsearch/elasticsearch.yml

index.number_of_shards: 3
index.number_of_replicas: 1

bootstrap.mlockall: true

threadpool.bulk.queue_size: 1000
indices.fielddata.cache.size: 50%
threadpool.index.queue_size: 400
index.refresh_interval: 30s

index.number_of_shards: 5
index.number_of_replicas: 1

/usr/share/elasticsearch/bin/elasticsearch.in.sh

ES_HEAP_SIZE=3G

#I use this Garbage Collector instead of the default one.

JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"

Cluster health

{
  "cluster_name" : "my_cluster",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 61,
  "active_shards" : 61,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 61
}

Cluster details

{
  "cluster_name" : "my_cluster",
  "nodes" : {
    "some weird number" : {
      "name" : "ES 1",
      "transport_address" : "inet[localhost/127.0.0.1:9300]",
      "host" : "some host",
      "ip" : "150.244.58.112",
      "version" : "1.4.4",
      "build" : "c88f77f",
      "http_address" : "inet[localhost/127.0.0.1:9200]",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 7854,
        "max_file_descriptors" : 65535,
        "mlockall" : false
      }
    }
  }
}

I am curious about "mlockall": false, because in the yml I set bootstrap.mlockall: true.

Logs

Many lines like:

org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.search.action.SearchServiceTransportAction$23@a9a34f5

Answer 1

小智 26

For me, tuning the search thread pool's queue_size solved this problem. I tried a lot of other things, and this is what fixed it.

I added this to my elasticsearch.yml:

threadpool.search.queue_size: 10000

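Then I restarted Elasticsearch.

Reasoning... (from the docs)

A node holds several thread pools in order to improve how thread memory consumption is managed within a node. Many of these pools also have queues associated with them, which allow pending requests to be held instead of discarded.

Specifically for search...

For count/search operations. Defaults to fixed with a size of int((# of available_processors * 3) / 2) + 1, queue_size of 1000.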
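
As a worked example of the default quoted above, this Python sketch evaluates int((# of available_processors * 3) / 2) + 1 for a few processor counts. The function name is hypothetical and only illustrates the arithmetic; it is not part of Elasticsearch.

```python
def default_search_pool_size(available_processors):
    """Evaluate int((available_processors * 3) / 2) + 1 from the quoted docs.

    Hypothetical helper for illustration only.
    """
    if available_processors < 1:
        raise ValueError("available_processors must be at least 1")
    # Integer division matches the int(...) truncation in the formula.
    return (available_processors * 3) // 2 + 1


if __name__ == "__main__":
    for cores in (2, 4, 6, 8):
        print(cores, "processors ->", default_search_pool_size(cores), "search threads")
```

A search is executed per shard, so a dashboard with many panels querying many shards can fill both the threads and the 1000 queue slots. Further requests are then rejected, which matches the "rejected execution (queue capacity 1000)" lines in the question's log.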

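For more information, see the Elasticsearch documentation here...

I had a hard time finding this information, so I hope it helps others!

Thanks, it worked for me. The config key is thread_pool.search.queue_size rather than threadpool.search.queue_size (3 upvotes)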

Answer 2

小智 7

With Elasticsearch 5.4, thread_pool is spelled with an underscore.

thread_pool.search.queue_size: 10000

See the Elasticsearch Thread Pool module documentation.
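
The two answers differ only in how the key is spelled. Below is a minimal Python sketch for choosing the spelling from a version string. It assumes the underscore form applies from major version 5 onward; the answers here only confirm `threadpool` on the asker's 1.4.4 setup and `thread_pool` on 5.4, so the exact cut-over is an assumption. The helper name is hypothetical.

```python
def search_queue_setting(es_version):
    """Return the search queue_size key for a given Elasticsearch version string.

    Assumption: versions with major >= 5 use "thread_pool", older ones "threadpool".
    """
    major = int(es_version.split(".")[0])
    prefix = "thread_pool" if major >= 5 else "threadpool"
    return prefix + ".search.queue_size"


if __name__ == "__main__":
    for version in ("1.4.4", "5.4.0"):
        print(version, "->", search_queue_setting(version))
```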

Answer 3

spi*_*ech 6

I got this error when a query was missing its closing quote:

field:"value

In my Elasticsearch log, I saw the following exception:

Caused by: org.elasticsearch.index.query.QueryShardException:
    Failed to parse query [field:"value]
...
Caused by: org.apache.lucene.queryparser.classic.ParseException: 
    Cannot parse 'field:"value': Lexical error at line 1, column 13.  
    Encountered: <EOF> after : "\"value"
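
This class of mistake can be caught before the query is sent. The sketch below is a hypothetical client-side pre-check in Python, not the Lucene parser: it only counts unescaped double quotes and flags an odd count, so it covers the missing-quote case above and nothing else.

```python
def has_unbalanced_quotes(query):
    """Return True if the query has an odd number of unescaped double quotes."""
    count = 0
    escaped = False
    for ch in query:
        if escaped:
            # The previous character was a backslash, so skip this one.
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == '"':
            count += 1
    return count % 2 == 1


if __name__ == "__main__":
    for q in ('field:"value', 'field:"value"'):
        print(repr(q), "unbalanced:", has_unbalanced_quotes(q))
```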

This is an answer; this error can be caused by a bad query, not only by queue_size and so on as the other answers suggest. (4 upvotes)

Answer 4

Alc*_*zar 5

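This most likely indicates a problem with the health of your cluster. Without knowing more about your cluster, there is not much more that can be said.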
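
As a rough illustration of reading the health output, the Python sketch below interprets the values posted in the question. The helper is hypothetical and checks only two things. A yellow status does not by itself explain failed shard searches, so treat the notes as hints.

```python
# Cluster health values copied from the question.
SAMPLE_HEALTH = {
    "status": "yellow",
    "number_of_nodes": 1,
    "number_of_data_nodes": 1,
    "active_primary_shards": 61,
    "active_shards": 61,
    "unassigned_shards": 61,
}


def explain_health(health):
    """Return plain-language notes about a cluster health dict (hypothetical helper)."""
    notes = []
    status = health.get("status")
    if status != "green":
        notes.append("status is %s, not green" % status)
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0 and health.get("number_of_data_nodes") == 1:
        # A replica is never allocated on the same node as its primary.
        notes.append(
            "%d shards are unassigned on a single data node; "
            "replica copies have no second node to be allocated to" % unassigned
        )
    return notes


if __name__ == "__main__":
    for note in explain_health(SAMPLE_HEALTH):
        print(note)
```

On a one-node cluster, the 61 unassigned shards are the replicas requested by index.number_of_replicas: 1. Setting it to 0 removes them.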

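Thanks, I solved the problem using the following:

# Do not use all processors
processors: 6
threadpool:
  get:
    type: fixed
    size: 30
    queue_size: 3000
  search:
    type: fixed
    size: 30
    queue_size: 3000
index.number_of_shards: 2
index.number_of_replicas: 0

(2 upvotes)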

Asked:	10 years, 9 months ago
Viewed:	37819 times
Last active:	6 years, 8 months ago