Importing a multi-valued field from MySQL into Solr with the Solr Data Import Handler

Saq*_*Ali 3 mysql solr dataimporthandler solr4

We have the following two tables in MySQL:

mysql> describe comment;
+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| id           | int(11)      | YES  |     | NULL    |       |
| blogpost_id  | int(11)      | YES  |     | NULL    |       |
| comment_text | varchar(256) | YES  |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+

mysql> describe comment_tags;
+------------+-------------+------+-----+---------+-------+
| Field      | Type        | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+-------+
| comment_id | int(11)     | YES  |     | NULL    |       |
| tag        | varchar(80) | YES  |     | NULL    |       |
+------------+-------------+------+-----+---------+-------+

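Each comment can have multiple tags. We can already import the comments themselves into Solr with the Data Import Handler. However, I am not sure how to import each comment's tags into a multi-valued field that is defined in schema.xml for each comment document.

Please advise. Thanks.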

Mur*_*ala 13

You can also use GROUP_CONCAT with a separator (for example ",") and split the result again with RegexTransformer. Try something like this:

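<dataConfig>
  <!-- dataSource is just an example. Included just for completeness. -->
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db" user="root" password="root"/>
  <document>
    <!-- Join comment_tags and collapse each comment's tags into one comma-separated column. -->
    <entity name="comment" pk="id" transformer="RegexTransformer"
            query="SELECT c.id, c.blogpost_id, c.comment_text, GROUP_CONCAT(t.tag SEPARATOR ',') AS comment_tags
                   FROM comment c LEFT JOIN comment_tags t ON t.comment_id = c.id
                   GROUP BY c.id, c.blogpost_id, c.comment_text">
      <field column="blogpost_id" name="blogpost_id"/>
      <field column="comment_text" name="comment_text"/>
      <!-- splitBy turns the concatenated string back into multiple values. -->
      <field column="comment_tags" name="comment_tags" splitBy=","/>
    </entity>
  </document>
</dataConfig>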

This improves performance and removes the dependency on a second query.
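
For the split values to be stored separately, the target field has to be declared multi-valued in schema.xml. A minimal sketch, assuming a field type named `string` exists in your schema (the type and the indexed/stored settings here are assumptions, adjust them to your setup):

```xml
<!-- schema.xml (fragment). Assumption: a fieldType named "string" is already defined. -->
<field name="comment_tags" type="string" indexed="true" stored="true" multiValued="true"/>
```

Without `multiValued="true"`, Solr rejects documents that carry more than one value for the field.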


Kev*_*vin 7

Try something like this:

<dataConfig>
    <!-- dataSource is just an example. Included just for completeness. -->
    <dataSource batchSize="500" type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/my-database" user="root" password="somethinglong1283"/>
    <document>
        <entity name="comment" pk="id" query="SELECT * FROM comment">
            <field column="blogpost_id" name="blogpost_id"/>
            <field column="comment_text" name="comment_text"/>
            <!-- Runs once per comment; every matching row adds one value to the multi-valued "tag" field. -->
            <entity name="comment_tags" pk="comment_id" query="SELECT * FROM comment_tags WHERE comment_id='${comment.id}'">
                <field column="tag" name="tag"/>
            </entity>
        </entity>
    </document>
</dataConfig>
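
The nested entity issues one SQL query per comment. If that becomes slow, one option is to let DIH read comment_tags once and look rows up from an in-memory cache instead. A rough sketch of the entity part only, assuming your Solr version supports the `cacheImpl`, `cacheKey` and `cacheLookup` entity attributes and that the whole tag table fits in memory (untested against this schema):

```xml
<!-- Assumption: DIH entity caching (cacheImpl / cacheKey / cacheLookup) is available in your Solr version. -->
<entity name="comment" pk="id" query="SELECT * FROM comment">
    <field column="blogpost_id" name="blogpost_id"/>
    <field column="comment_text" name="comment_text"/>
    <!-- Loaded with a single query, then matched to each comment by comment_id. -->
    <entity name="comment_tags"
            processor="SqlEntityProcessor"
            cacheImpl="SortedMapBackedCache"
            cacheKey="comment_id"
            cacheLookup="comment.id"
            query="SELECT comment_id, tag FROM comment_tags">
        <field column="tag" name="tag"/>
    </entity>
</entity>
```

This trades memory for fewer round trips to MySQL, so it is mainly worth trying when there are many comments and a comparatively small tag table.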