Poor performance with Neo4j compared to MySQL

Question

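I migrated a MySQL database to Neo4j and tested a simple request. I was surprised to find that the equivalent request in Neo4j takes 10 to 100 times longer than in MySQL. I am working with Neo4j 2.0.1.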

In the original MySQL schema, I have the following three tables:

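countries: contains a "code", a "continent_id", and a "selected" boolean,
cities: contains a "country_code", a "name", and a "status" boolean,
theaters: contains a "city_id" and a "public" boolean.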

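Each of these properties has an index. I want to display the number of theaters per city for a given continent, under several conditions. The request is: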

SELECT count(*) as nb, c.name 
FROM `cities` c LEFT JOIN theaters t ON c.id = t.city_id 
WHERE c.country_code IN 
  (SELECT code FROM countries WHERE selected is true AND continent_id = 4)
 AND c.status=1 AND t.public = 1 
GROUP BY c.name  ORDER BY nb DESC

The database schema in Neo4j is as follows:

(:Continent)-[:Include]->(:Country{selected: bool})-[:Include]->(:City{name: string, status: bool})-[:Include]->(:Theater{public: bool})

An index is also defined on each property. The Cypher request is:

MATCH (:Continent{code: 4})-[:Include]->(:Country{selected:true})-[:Include]->(city:City{status:true})-[:Include]->(:Theater{public: true})
RETURN city.name, count(*) AS nb ORDER BY nb DESC

Each database contains about 70,000 cities and 140,000 theaters.

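For the continent with id 4, the MySQL request takes about 0.02 s, whereas Neo4j takes 0.4 s. Moreover, if I introduce a variable-length relationship between Country and City in the Cypher request (...(:Country{selected:true})-[:Include*..3]->(city:City{status:true})...), because I want to be able to add intermediate levels such as Regions, the request takes more than 2 seconds.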
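
To illustrate why widening the pattern to `*..3` adds work, here is a minimal in-memory sketch in Java. It is not the Neo4j API and makes no claim about how Neo4j 2.0.1 traverses internally; the class `BoundedDepthSketch`, its methods, and the sample node names (including a Region-like "Lorraine" level) are all hypothetical. It only shows that a bounded variable-length match has to follow every `Include` edge within 1 to 3 hops before any label or property filter can discard the endpoints.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the graph; not the Neo4j API.
class BoundedDepthSketch {
    // Outgoing "Include" edges: parent name -> child names.
    final Map<String, List<String>> children = new HashMap<>();
    // Number of edges followed by the most recent call to reachable().
    int edgesFollowed = 0;

    void include(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // Collects every node reachable from start in 1..maxDepth hops,
    // roughly what a pattern like -[:Include*..3]-> has to enumerate.
    List<String> reachable(String start, int maxDepth) {
        edgesFollowed = 0;
        List<String> found = new ArrayList<>();
        walk(start, maxDepth, found);
        return found;
    }

    private void walk(String node, int remaining, List<String> found) {
        if (remaining == 0) return;
        for (String child : children.getOrDefault(node, Collections.<String>emptyList())) {
            edgesFollowed++;
            found.add(child);
            walk(child, remaining - 1, found);
        }
    }

    public static void main(String[] args) {
        BoundedDepthSketch g = new BoundedDepthSketch();
        g.include("France", "Lorraine");      // hypothetical Region level
        g.include("Lorraine", "Forbach");
        g.include("France", "Mirepoix");
        g.include("Forbach", "Theater A");
        g.include("Mirepoix", "Theater B");

        // One hop only touches the direct children.
        System.out.println(g.reachable("France", 1) + " edges=" + g.edgesFollowed);
        // Up to three hops also walks into theaters, which a City filter would then reject.
        System.out.println(g.reachable("France", 3) + " edges=" + g.edgesFollowed);
    }
}
```

In this toy graph, a depth of 1 follows 2 edges and a depth of 3 follows 5, because the traversal walks into theater nodes that a later `City` filter would throw away.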

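I know that in this particular case there is no benefit to using Neo4j instead of MySQL, but I expected roughly equivalent performance from the two technologies, and I want to take advantage of Neo4j's strengths for geographic hierarchies.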

Am I missing something, or is this a limitation of Neo4j?

Thanks for your answers.

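EDIT: First, you will find the database dump files here. The Neo4j server configuration is the out-of-the-box default. I work in a Ruby environment and use the neography gem. Also, since I am not on JRuby, I run the Neo4j server separately, so Cypher requests are sent through the REST API.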

The database contains 244 countries, 69,000 cities, and 138,000 theaters. For continent_id 4, there are 46,982 cities (37,210 with the status boolean set to true) and 74,420 theaters.

The request returns 2256 rows. On the third run, it took 338 ms. Here is the request output with profiling information:

profile MATCH (:Continent{code: 4})-[:Include]->(country:Country{selected:true})-[:Include*..1]->(city:City{status:true})-[:Include]->(theater:Theater{public: true}) RETURN city.name, count(*) AS nb ORDER BY nb DESC;

==> ColumnFilter(symKeys=["city.name", "  INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2"], returnItemNames=["city.name", "nb"], _rows=2256, _db_hits=0)
==> Sort(descr=["SortItem(Cached(  INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2 of type Integer),false)"], _rows=2256, _db_hits=0)
==>   EagerAggregation(keys=["Cached(city.name of type Any)"], aggregates=["(  INTERNAL_AGGREGATE85ca19f3-9421-4c18-a449-1097e3deede2,CountStar())"], _rows=2256, _db_hits=0)
==>     Extract(symKeys=["city", "  UNNAMED27", "  UNNAMED7", "country", "  UNNAMED113", "theater", "  UNNAMED72"], exprKeys=["city.name"], _rows=2257, _db_hits=2257)
==>       Filter(pred="(hasLabel(theater:Theater(3)) AND Property(theater,public(5)) == true)", _rows=2257, _db_hits=2257)
==>         SimplePatternMatcher(g="(city)-['  UNNAMED113']-(theater)", _rows=2257, _db_hits=4514)
==>           Filter(pred="(((hasLabel(city:City(2)) AND hasLabel(city:City(2))) AND Property(city,status(4)) == true) AND Property(city,status(4)) == true)", _rows=2257, _db_hits=74420)
==>             TraversalMatcher(start={"label": "Continent", "query": "Literal(4)", "identifiers": ["  UNNAMED7"], "property": "code", "producer": "SchemaIndex"}, trail="(  UNNAMED7)-[  UNNAMED27:Include WHERE (((hasLabel(NodeIdentifier():Country(1)) AND hasLabel(NodeIdentifier():Country(1))) AND Property(NodeIdentifier(),selected(3)) == true) AND Property(NodeIdentifier(),selected(3)) == true) AND true]->(country)-[:Include*1..1]->(city)", _rows=37210, _db_hits=37432)
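
As a reading aid for the profile above, the following Java sketch adds up the `_db_hits` counters per operator. The class `DbHitsSketch` and the method `totalDbHits` are hypothetical helpers, not part of Neo4j; the operator lines are shortened copies of the output above with their other arguments elided.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading a textual profile; not part of Neo4j.
class DbHitsSketch {
    private static final Pattern DB_HITS = Pattern.compile("_db_hits=(\\d+)");

    // Sums every _db_hits=N value found in the given profile text.
    static long totalDbHits(String profileText) {
        Matcher m = DB_HITS.matcher(profileText);
        long total = 0;
        while (m.find()) {
            total += Long.parseLong(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        // Operator names and counters copied from the profile; other arguments elided.
        String profile = String.join("\n",
            "ColumnFilter(_rows=2256, _db_hits=0)",
            "Sort(_rows=2256, _db_hits=0)",
            "EagerAggregation(_rows=2256, _db_hits=0)",
            "Extract(_rows=2257, _db_hits=2257)",
            "Filter(_rows=2257, _db_hits=2257)",
            "SimplePatternMatcher(_rows=2257, _db_hits=4514)",
            "Filter(_rows=2257, _db_hits=74420)",
            "TraversalMatcher(_rows=37210, _db_hits=37432)");
        System.out.println(totalDbHits(profile)); // prints 120880
    }
}
```

Of the 120,880 hits, 111,852 come from the `TraversalMatcher` and the `Filter` directly above it, so most of the store accesses reported by this profile happen before the theater filter and the aggregation.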

Answer 1

Mic*_*ger 5

You are right. I tried it myself and could only get the query time down to about 100 ms.

 MATCH (:Continent{code: 4})-[:Include]->
       (country:Country{selected:true})-[:Include]->
       (city:City{status:true})-[:Include]->
       (theater:Theater{public: true}) 
 RETURN city.name, count(*) AS nb 
 ORDER BY nb DESC;

| "Forbach"                       | 1  |
| "Stuttgart"                     | 1  |
| "Mirepoix"                      | 1  |
| "Bonnieux"                      | 1  |
| "Saint Cyprien Plage"           | 1  |
| "Crissay sur Manse"             | 1  |
+--------------------------------------+
2256 rows
**85 ms**

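Please note that Cypher as of version 2.0.x has not yet been optimized for performance. That work starts with Neo4j 2.1 and will continue through 2.3. More performance work is also planned in the kernel, which will speed things up as well.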

I also implemented the solution in Java and got it down to 19 ms. It is certainly not as pretty, but that is also what we are aiming for with Cypher:

class City {
    Node city;
    int count = 1;

    public City(Node city) {
        this.city = city;
    }

    public void inc() { count++; }

    @Override
    public String toString() {
        return String.format("City{city=%s, count=%d}", city.getProperty("name"), count);
    }
}

private List<?> queryJava3() {
    long start = System.currentTimeMillis();
    Node continent = IteratorUtil.single(db.findNodesByLabelAndProperty(CONTINENT, "code", 4));
    Map<Node,City> result = new HashMap<>();
    for (Relationship rel1 : continent.getRelationships(Direction.OUTGOING,Include)) {
        Node country = rel1.getEndNode();
        if (!(country.hasLabel(COUNTRY) && (Boolean) country.getProperty("selected", false))) continue;
        for (Relationship rel2 : country.getRelationships(Direction.OUTGOING, Include)) {
            Node city = rel2.getEndNode();
            if (!(city.hasLabel(CITY) && (Boolean) city.getProperty("status", false))) continue;
            for (Relationship rel3 : city.getRelationships(Direction.OUTGOING, Include)) {
                Node theater = rel3.getEndNode();
                if (!(theater.hasLabel(THEATER) && (Boolean) theater.getProperty("public", false))) continue;
                City city1 = result.get(city);
                if (city1==null) result.put(city,new City(city));
                else city1.inc();
            }
        }
    }
    List<City> list = new ArrayList<>(result.values());
    Collections.sort(list, new Comparator<City>() {
        @Override
        public int compare(City o1, City o2) {
            return Integer.compare(o2.count,o1.count);
        }
    });
    output("java", start, list.iterator());
    return list;
}


java time = 19ms
first = City{city=Val de Meuse, count=1} total-count 22561
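
The aggregation step of the code above (count theaters per city, then sort by count in descending order) can also be tried without a Neo4j database. The sketch below uses plain Java collections on a made-up input list; `CountPerCitySketch` and `countAndSort` are hypothetical names. Unlike the code above, which keys the map by `Node`, this sketch groups by city name, as the Cypher `RETURN city.name, count(*)` does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, database-free illustration of the count-and-sort step.
class CountPerCitySketch {
    // Takes one city name per matching theater and returns (name, count)
    // entries sorted by count, highest first. Ties keep first-seen order.
    static List<Map.Entry<String, Integer>> countAndSort(List<String> cityPerTheater) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String city : cityPerTheater) {
            counts.merge(city, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample: one entry per public theater found in an active city.
        List<String> sample = Arrays.asList("Stuttgart", "Forbach", "Stuttgart");
        System.out.println(countAndSort(sample)); // prints [Stuttgart=2, Forbach=1]
    }
}
```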
