Jac*_*ian 3 search sphinx plaintext
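I have gone through dozens of articles and forum threads and looked through the official documentation, but could not find an answer. This article sounded promising, since it says "The data to be indexed can generally come from very different sources: SQL databases, plain text files, HTML files", but unfortunately, like every other article and forum thread, it is devoted to MySQL.
It is odd to hear that Sphinx is so cool that it can do almost anything you want with almost any data source you like, and yet where are the examples for data sources other than MySQL? All I want is a minimal, trivial, step-by-step example of Sphinx scanning the simplest data source in the world: plain text files. Say I have installed Sphinx and want to scan my home directory (recursively) to find all plain text files containing "Hello world". What should I do to achieve that?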
Prerequisites:
Ubuntu, with Sphinx installed via sudo apt-get install sphinxsearch. Ideally, this is how I would do it.
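We will use Sphinx's sql_file_field to index a table that holds file paths. Here is a PHP script that creates a table with the paths of the files in a particular directory (using scandir).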
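<?php
// Connect to the same database that sphinx.conf reads from (sql_db = filetest).
$con = mysqli_connect("localhost", "root", "password", "filetest");
// Check the connection before running any queries.
if (mysqli_connect_errno()) {
    exit("Failed to connect to MySQL: " . mysqli_connect_error());
}
mysqli_query($con, "CREATE TABLE IF NOT EXISTS fileindex (id INT(6) UNSIGNED AUTO_INCREMENT PRIMARY KEY, text VARCHAR(100) NOT NULL)");
$base = '/absolute/path/to/your/dir/';
// A prepared statement keeps quotes in file names from breaking the INSERT.
$stmt = mysqli_prepare($con, "INSERT INTO fileindex (text) VALUES (?)");
foreach (scandir($base) as $entry) {
    $path = $base . $entry;
    // Test the full path; is_dir($entry) alone would resolve against the current working directory.
    if (is_file($path)) {
        mysqli_stmt_bind_param($stmt, "s", $path);
        mysqli_stmt_execute($stmt);
    }
}
mysqli_close($con);
?>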
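The script above lists a single directory only, while the question asks for a recursive scan. As a minimal sketch of one way to cover subdirectories, the shell function below walks a directory tree and prints one INSERT statement per file. The function name emit_inserts and the '*.txt' filter are assumptions made for illustration; the table and column names are the ones used above. It assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print one INSERT statement for every regular *.txt file
# found recursively under the given directory (fileindex/text as defined above).
emit_inserts() {
    dir=$1
    find "$dir" -type f -name '*.txt' | sort | while IFS= read -r path; do
        # Escape backslashes and double single quotes so the path is a
        # valid MySQL string literal.
        escaped=$(printf '%s' "$path" | sed -e 's/\\/\\\\/g' -e "s/'/''/g")
        printf "INSERT INTO fileindex (text) VALUES ('%s');\n" "$escaped"
    done
}
```

The output could then be piped into the MySQL client, for example emit_inserts "$HOME" | mysql -u root -p filetest. Paths in a deep tree may exceed the VARCHAR(100) column defined above, so a wider column is probably needed in that case.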
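The code below is the sphinx.conf file for indexing the table of file paths. Note that sql_file_field makes Sphinx index the contents of the files named in the text (file path) column.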
source src1
{
    type = mysql
    sql_host = localhost
    sql_user = root
    sql_pass = password
    sql_db = filetest
    sql_port = 3306 # optional, default is 3306
    sql_query_pre = SET CHARACTER_SET_RESULTS=utf8
    sql_query_pre = SET NAMES utf8
    sql_query = SELECT id, text FROM fileindex
    sql_file_field = text
}
index filename
{
    source = src1
    path = /var/lib/sphinxsearch/data/files
    docinfo = extern
}
indexer
{
    mem_limit = 128M
}
searchd
{
    log = /var/log/sphinxsearch/searchd.log
    pid_file = /var/log/sphinxsearch/searchd.pid
}
Once the table is created, save sphinx.conf as /etc/sphinxsearch/sphinx.conf and run sudo indexer filename --rotate; your index is then ready. Type search followed by your keyword to get results.
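To sanity-check what the index returns, one option is to compare it against a plain recursive grep over the same directory. The sketch below is only a rough baseline: expected_matches is a hypothetical helper name, the '*.txt' filter is an assumption, and grep performs a literal, case-insensitive substring match here, whereas Sphinx matches tokenized words, so the two result sets may not agree exactly.

```shell
#!/bin/sh
# Hypothetical helper: list *.txt files under a directory that contain a phrase.
# -r recurse, -l print file names only, -I skip binary files,
# -i ignore case, -F treat the phrase as a literal string.
expected_matches() {
    dir=$1
    phrase=$2
    grep -rlIiF --include='*.txt' -- "$phrase" "$dir" | sort
}
```

For example, expected_matches "$HOME" "Hello world" prints the paths a search for that phrase would be expected to cover.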