org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:

Question

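I am trying to index a large number of log files collected from a Tomcat server. I wrote code that opens each file, creates an index entry for each line, and stores each line using Apache Lucene. All of this is done with multiple threads.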

When I run this code, I get this exception:

org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:

Code

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.commit();
    indexWriter.close();

I thought that, since I commit the index every time, the commit might be what causes the write lock. So I removed indexWriter.commit();:

    if (indexWriter.getConfig().getOpenMode() == IndexWriterConfig.OpenMode.CREATE) {
        // New index, so we just add the document (no old document can be there):
        System.out.println("adding " + path);
        indexWriter.addDocument(doc);
    } else {
        // Existing index (an old copy of this document may have been indexed), so
        // we use updateDocument instead to replace the old one matching the exact
        // path, if present:
        System.out.println("updating " + path);
        indexWriter.updateDocument(new Term("path", path), doc);
    }
    indexWriter.close();

Now I no longer get the exception.

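Q: So my question is why indexWriter.commit(); causes the exception. Even after removing indexWriter.commit(); I have no problems when searching; that is, I get exactly the results I expect. So why use indexWriter.commit(); at all?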

Answer 1

Jay*_*dra 2

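In short, it is similar to a database commit: until you commit the transaction, documents added to Solr are held only in memory. Only on commit are the documents persisted to the index.
If Solr crashes while the documents are still in memory, you may lose those documents.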

Explanation:

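From day one, one of Lucene's principles has been a write-once policy: a file is never written twice. When you add a document through IndexWriter, it is indexed in memory, and once a threshold is reached (max buffered documents or RAM buffer size), all documents are written from main memory to disk. Writing documents to disk produces an entirely new index called a segment. When you index a batch of documents or run incremental indexing in production, you will see the number of segments change frequently. Once you call commit, however, Lucene flushes its entire RAM buffer into segments, syncs them, and writes pointers to all segments belonging to this commit into the SEGMENTS file (named segments_N on disk).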

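If the document already exists in Solr, it is overwritten (as determined by the unique ID).
So your search may still work, but the latest documents are not available for searching until you commit.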
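
To make the visibility point concrete, here is a minimal sketch. It assumes a recent Lucene release (8.x or later, for ByteBuffersDirectory and the single-argument IndexWriterConfig constructor) on the classpath; the class name CommitVisibilityDemo, the helpers run and visibleDocs, and the "line-N" IDs are hypothetical. It shares one IndexWriter across worker threads (IndexWriter is thread-safe) and counts the documents a freshly opened reader can see before and after close(). With the default configuration, close() commits pending changes, which would explain why searches kept working after the explicit commit() call was removed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CommitVisibilityDemo {

    /** Counts the documents visible to a reader opened on the latest commit. */
    static int visibleDocs(Directory dir) throws IOException {
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            return reader.numDocs();
        }
    }

    /** Indexes one document per line; returns {visible before close, visible after close}. */
    static int[] run(List<String> lines) throws Exception {
        // Assumption: an in-memory directory keeps the sketch self-contained.
        Directory dir = new ByteBuffersDirectory();
        // One writer for the whole job, shared by every worker thread.
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()));
        writer.commit(); // empty first commit, so a reader can be opened at all

        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Void>> pending = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            final String id = "line-" + i; // hypothetical unique key per log line
            final String text = lines.get(i);
            Callable<Void> task = () -> {
                Document doc = new Document();
                doc.add(new StringField("path", id, Field.Store.YES));
                doc.add(new TextField("contents", text, Field.Store.NO));
                writer.updateDocument(new Term("path", id), doc);
                return null;
            };
            pending.add(pool.submit(task));
        }
        for (Future<Void> f : pending) {
            f.get(); // wait for each task and surface any indexing failure
        }
        pool.shutdown();

        int before = visibleDocs(dir); // nothing committed since the empty commit
        writer.close();                // commits by default, then releases the write lock
        int after = visibleDocs(dir);
        return new int[] {before, after};
    }
}
```

Under these assumptions, a reader opened before close() sees none of the buffered documents, and a reader opened afterwards sees one document per input line.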

Also, once you open an index writer, it acquires a lock on the index; you should close the writer to release the lock.
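
The locking behavior can be illustrated with a toy model that uses only the Java standard library. This is a simplified stand-in, not Lucene's implementation: ToyWriter, canOpenWriter and demo are hypothetical names, and the lock is modeled as an atomically created write.lock file, whereas Lucene's default lock factory for file-system directories relies on OS-level file locks. The sketch shows why a second writer on the same directory fails while the first is still open, which is the likely situation when each thread opens its own writer.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ToyWriteLock {

    /** Hypothetical stand-in for an index writer: holds the lock from construction until close(). */
    static final class ToyWriter implements AutoCloseable {
        private final Path lockFile;

        ToyWriter(Path indexDir) throws IOException {
            lockFile = indexDir.resolve("write.lock");
            try {
                // Atomic create: fails if another writer already holds the lock.
                Files.createFile(lockFile);
            } catch (FileAlreadyExistsException e) {
                throw new IOException("Lock held by another writer: " + lockFile, e);
            }
        }

        @Override
        public void close() throws IOException {
            // Closing the writer releases the lock.
            Files.deleteIfExists(lockFile);
        }
    }

    /** Tries to open (and immediately close) a writer; returns true on success. */
    static boolean canOpenWriter(Path indexDir) {
        try (ToyWriter w = new ToyWriter(indexDir)) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns {second writer opened while first is open, writer opened after first closed}. */
    static boolean[] demo() throws IOException {
        Path dir = Files.createTempDirectory("toy-index");
        boolean whileOpen;
        try (ToyWriter first = new ToyWriter(dir)) {
            whileOpen = canOpenWriter(dir); // first writer still holds the lock
        }
        boolean afterClose = canOpenWriter(dir); // lock was released by close()
        return new boolean[] {whileOpen, afterClose};
    }
}
```

In Lucene itself, the usual way to avoid the conflict is to create a single IndexWriter, share it among all indexing threads (IndexWriter is thread-safe), and close it once when the job is finished.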
