Setting the refresh interval in Elasticsearch to improve io-wait?

era*_*ran 1 elasticsearch

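My cluster is showing a lot of io-wait (around 50%).

I do a lot of indexing and reindexing.

I suspect that Lucene reindexing is what causes much of the I/O. I was thinking of increasing refresh_interval or tuning the index.translog options. Is this the right approach?

My main problem is that I don't know how to find out what my current settings are.

Many options are listed at http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings/, but none of them show up when I run: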

curl -XGET 'http://localhost:9200/my_index/_settings'

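It does not return values that are left at their defaults (according to kimchy's answer in this post).

I only get back the values that were set explicitly: the number of shards and replicas. The elasticsearch.yml file does not say what the defaults are. How can I tell that my changes took effect, and what the current values are?

Any help is much appreciated, as I could not find the relevant documentation.

Running hot_threads, I get: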

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=5'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

   50.6% (253.2ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#20]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   32.9% (164.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#12]'
     10/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   29.1% (145.5ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#8]'
     2/10 snapshots sharing following 20 elements
       org.apache.lucene.search.MultiTermQueryWrapperFilter.getDocIdSet(MultiTermQueryWrapperFilter.java:111)
       org.apache.lucene.search.ConstantScoreQuery$ConstantWeight.scorer(ConstantScoreQuery.java:131)
       org.apache.lucene.search.FilteredQuery$RandomAccessFilterStrategy.filteredScorer(FilteredQuery.java:533)
       org.apache.lucene.search.FilteredQuery$1.scorer(FilteredQuery.java:133)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:609)
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 2 elements
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

   26.5% (132.7ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#11]'
     2/10 snapshots sharing following 15 elements
       org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:161)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:572)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:524)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:501)
       org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:345)
       org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:127)
       org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:239)
       org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteQuery(SearchServiceTransportAction.java:141)
       org.elasticsearch.action.search.type.TransportSearchQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchQueryThenFetchAction.java:80)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:206)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:193)
       org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:179)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)
     8/10 snapshots sharing following 8 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
       java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

    4.2% (21.1ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][bulk][T#4]'
     10/10 snapshots sharing following 9 elements
       sun.misc.Unsafe.park(Native Method)
       java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.awaitMatch(LinkedTransferQueue.java:706)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.xfer(LinkedTransferQueue.java:615)
       org.elasticsearch.common.util.concurrent.jsr166y.LinkedTransferQueue.take(LinkedTransferQueue.java:1109)
       java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
       java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       java.lang.Thread.run(Thread.java:722)

Running it with the wait and block types:

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=wait'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) wait usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) wait usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) wait usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

> curl -XGET 'http://localhost:9200/_nodes/hot_threads?threads=3&type=block'
::: [Gardener][CR0qQbtBRyeU94hltnnE7A][inet[/10.154.148.151:9300]]{aws_availability_zone=us-east-1d}

    0.0% (0s out of 500ms) block usage by thread 'Reference Handler'
     10/10 snapshots sharing following 3 elements
       java.lang.Object.wait(Native Method)
       java.lang.Object.wait(Object.java:503)
       java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)

    0.0% (0s out of 500ms) block usage by thread 'Finalizer'
     10/10 snapshots sharing following 4 elements
       java.lang.Object.wait(Native Method)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
       java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
       java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

    0.0% (0s out of 500ms) block usage by thread 'Signal Dispatcher'
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot
     unique snapshot

imo*_*tov 15

By default, index.refresh_interval is set to 1 second. You can increase this interval, or disable automatic refresh altogether by setting it to -1:

curl -XPUT 'localhost:9200/my_index/_settings' -d '
{
    "index" : {
        "refresh_interval" : -1
    }
}
'
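
If the aim is to avoid refresh overhead only while a large reindex is running, one option is to disable refresh for the duration of the load and restore it afterwards. The following Python sketch illustrates that pattern and is not part of the original answer: the names `put_settings`, `refresh_body`, and `refresh_disabled` are hypothetical, the restore value of "1s" simply mirrors the default mentioned above, and `send` stands in for whatever HTTP client you actually use.

```python
import json
import urllib.request
from contextlib import contextmanager


def put_settings(base_url, index, body):
    """PUT a settings body to /<index>/_settings (the same call as the curl command above)."""
    req = urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def refresh_body(interval):
    """Build the settings body for a refresh interval such as -1 or "1s"."""
    return {"index": {"refresh_interval": interval}}


@contextmanager
def refresh_disabled(index, send, restore_to="1s"):
    """Disable automatic refresh inside the block, then restore it.

    `send(index, body)` is a hypothetical callable that performs the PUT;
    `restore_to` is an assumption and should match your previous value.
    """
    send(index, refresh_body(-1))
    try:
        yield
    finally:
        # Restore even if the bulk load raised, so new documents become searchable again.
        send(index, refresh_body(restore_to))
```

It could be used as `with refresh_disabled("my_index", lambda i, b: put_settings("http://localhost:9200", i, b)):` around the bulk indexing code. Passing `send` in as a parameter keeps the restore logic testable without a running cluster.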

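However, before you start messing with the settings, I would suggest finding the actual cause of this high I/O. Run a hot_threads request and check where the threads spend most of their time.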
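
To make hot_threads output easier to scan, a small script can report how many of each thread's snapshots were parked (an idle pool worker waiting for a task) versus running real code. This is a rough sketch under assumptions: it depends on the text layout shown in the question's output, and `summarize_hot_threads` and the `Unsafe.park` heuristic are illustrative choices of mine, not an Elasticsearch API.

```python
import re

# Header line, e.g.:
#   50.6% (253.2ms out of 500ms) cpu usage by thread 'elasticsearch[Gardener][search][T#20]'
HEADER = re.compile(r"^\s*([\d.]+)% \(.*?\) \w+ usage by thread '(.+)'\s*$")
# Snapshot group line, e.g.: "8/10 snapshots sharing following 8 elements"
GROUP = re.compile(r"^\s*(\d+)/(\d+) snapshots sharing following \d+ elements\s*$")


def summarize_hot_threads(text):
    """Return one dict per thread: name, percent, parked snapshots, total snapshots."""
    threads = []
    current = None
    pending = None  # size of the snapshot group whose top frame has not been read yet
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {"name": header.group(2), "percent": float(header.group(1)),
                       "parked": 0, "snapshots": 0}
            threads.append(current)
            pending = None
            continue
        group = GROUP.match(line)
        if group and current is not None:
            pending = int(group.group(1))
            current["snapshots"] = int(group.group(2))
            continue
        if pending is not None and line.strip():
            # Top stack frame of the group: a parked thread is waiting for work.
            if "Unsafe.park" in line:
                current["parked"] += pending
            pending = None
    return threads
```

Threads whose snapshots are mostly parked are idle workers; the ones whose top frames are in Lucene or Elasticsearch code (such as `MultiTermQueryWrapperFilter.getDocIdSet` in the question's output) show where time is actually being spent.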

  • If it is not striped EBS, make it striped, and if possible switch to Provisioned IOPS. (2 upvotes)