Posts by Ana*_*tri

How to index .html files in Solr

The files I want to index are stored on the server under /path/to/files/ (I don't need to crawl them). A sample HTML file looks like this:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="product_id" content="11"/>
<meta name="assetid" content="10001"/>
<meta name="title" content="title of the article"/>
<meta name="type" content="0xyzb"/>
<meta name="category" content="article category"/>
<meta name="first" content="details of the article"/>

<h4>title of the article</h4>
<p class="link"><a href="#link">How cite the Article</a></p>
<p class="list">
  <span class="listterm">Length: </span>13 to 15 feet<br>
  <span class="listterm">Height to Top of Head: </span>up to 18 feet<br>
  <span class="listterm">Weight: </span>1,200 to 4,300 pounds<br>
  <span class="listterm">Diet: </span>leaves and branches of trees<br>
  <span class="listterm">Number of Young: </span>1<br>
  <span class="listterm">Home: …
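One possible way to approach this with the DataImportHandler, as a rough sketch rather than a confirmed solution: list the files with FileListEntityProcessor and parse each one with a nested TikaEntityProcessor. The data source name `bin`, the entity names, and the Solr field names (`id`, `text`, `product_id`, `assetid`, `title`, `category`) are assumptions and would need matching fields in schema.xml. The sketch also assumes that Tika exposes the `<meta name="…">` values as metadata under the same names, and that the solr-dataimporthandler-extras jar and the Tika jars are on the core's classpath.

```xml
<dataConfig>
  <!-- Binary data source: TikaEntityProcessor reads the raw bytes of each file -->
  <dataSource type="BinFileDataSource" name="bin"/>
  <document>
    <!-- Outer entity only lists files; rootEntity="false" lets the inner entity create the documents -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/files" fileName=".*\.html"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Assumption: the absolute file path is used as the unique key -->
      <field column="fileAbsolutePath" name="id"/>
      <entity name="tika" processor="TikaEntityProcessor"
              url="${f.fileAbsolutePath}" format="text" dataSource="bin">
        <!-- Extracted body text -->
        <field column="text" name="text"/>
        <!-- meta="true" reads the value from Tika metadata instead of the body -->
        <field column="product_id" name="product_id" meta="true"/>
        <field column="assetid" name="assetid" meta="true"/>
        <field column="title" name="title" meta="true"/>
        <field column="category" name="category" meta="true"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

With `format="text"` the markup is dropped and only the text content is indexed, which suits full-text search over the article body.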

solr data-import full-text-indexing dataimporthandler solr4

5
votes
2
answers
9952
views

DIH (DataImportHandler) for XML files not working in Solr 4

I have installed and configured Solr 4 and Tomcat 6 on my server. It works fine, but when I try to set up DIH (DataImportHandler) I get an error that I can't resolve.

I added the following to my solrconfig.xml file:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
  <str name="config">/path/to/data-config.xml</str>
</lst>
</requestHandler>

My data-config.xml file looks like this:

<dataConfig>
<dataSource type="FileDataSource" />
<document>
    <entity name="f" processor="FileListEntityProcessor" baseDir="/path/to/basedirectory/toxmlfiles/" fileName=".*xml" recursive="true" rootEntity="false" dataSource="null">
        <field column="plainText" name="text"/>
    </entity>
</document>
</dataConfig>
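A side note on the config above: FileListEntityProcessor only emits file attributes such as `fileAbsolutePath`, while a `plainText` column is produced by PlainTextEntityProcessor. A minimal sketch of one possible arrangement nests that processor inside the file listing. The data source name `fds`, the entity name `x`, and the UTF-8 encoding are assumptions for illustration, and the sketch indexes each file's raw content as a single `text` field.

```xml
<dataConfig>
  <!-- Character data source used by the inner entity to read each file -->
  <dataSource type="FileDataSource" name="fds" encoding="UTF-8"/>
  <document>
    <!-- Outer entity lists the files; it needs no data source of its own -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/basedirectory/toxmlfiles/" fileName=".*xml"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity reads the whole file into the plainText column -->
      <entity name="x" processor="PlainTextEntityProcessor"
              url="${f.fileAbsolutePath}" dataSource="fds">
        <field column="plainText" name="text"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

If individual XML elements should become separate fields, XPathEntityProcessor would be the processor to look at instead of PlainTextEntityProcessor.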

When I open localhost:8080/solr/ in the browser I get an error (screenshot: browser error). The error in my error log is:

       SEVERE: Unable to create core: collection1
       org.apache.solr.common.SolrException: RequestHandler init failure
       at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:168)
       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:657)
       at org.apache.solr.core.SolrCore.<init>(SolrCore.java:566)
       at org.apache.solr.core.CoreContainer.create(CoreContainer.java:850)
       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
       at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
       at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
       at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
       at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
       at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
       at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:115)
       at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
       at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
       at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
       at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
       at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
       at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
       at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
       at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498) …
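The trace above is cut off before any root cause, so the following is only a guess. One common reason for a "RequestHandler init failure" on a DIH handler is that the DataImportHandler jar is not on the core's classpath. A sketch of the relevant part of solrconfig.xml follows. The `../../../dist/` path is an assumption taken from the layout of the Solr 4 download, and it is resolved relative to the core's instance directory, so a Tomcat deployment will likely need a different path.

```xml
<!-- Assumption: the dist/ directory of the Solr 4 download is reachable at this relative path -->
<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/path/to/data-config.xml</str>
  </lst>
</requestHandler>
```

If the full log shows a ClassNotFoundException for the handler class, the missing jar is the likely culprit. Otherwise the first "Caused by" line in the untruncated trace is the place to look.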

solr dataimporthandler dih solr4

1
vote
1
answer
4442
views