Lio*_*ero 6 neo4j graph-databases
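I have a use case where I need to classify people's trajectories within a big room.
In terms of performance and Neo4j best practices, which option would be better if I want to classify this data so that I can later search/fetch using any combination of these classifications?
The different classifications are:
A trajectory contains a set of points (time, x, y, motion_type) that essentially describe where the person went. Each point gives the person's exact position in the room at a given time and whether they were dwelling, walking, or running (this is the motion type).
For example, get all trajectories of female customers aged between 21 and 30.
Option 1: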
// Here I get the set of trajectories that fall within a certain time range (:Trajectory(at) is indexed)
MATCH (trajectory:Trajectory)
WHERE datetime("2020-01-01T00:00:00.000000+13:00") <= trajectory.at < datetime("2020-01-11T00:00:00.000000+13:00")
// Once I have all the trajectories I start filtering by the different criteria
MATCH (trajectory)-[:GENDER]->(:Female)
MATCH (trajectory)-[:PERSON_TYPE]->(:Customer)
// AgeGroup could have quite a lot of groups depending on how accurate the data is. At this stage we have 8 groups.
// Knowing that we have 8 groups, should I filter by property or should I have 8 different labels, one per age group? Is there any other option?
MATCH (trajectory)-[:AGE]->(age:AgeGroup)
WHERE age.group = "21-30"
RETURN COUNT(trajectory)
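Option 2:
The Trajectory node would have as many sub-labels as there are categories available. For example, to get the same result as in Option 1, I would do the following: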
MATCH (trajectory:Trajectory:Female:Customer)
WHERE datetime("2020-01-01T00:00:00.000000+13:00") <= trajectory.at < datetime("2020-01-11T00:00:00.000000+13:00")
MATCH (trajectory)-[:AGE]->(age:AgeGroup)
WHERE age.group = "21-30"
RETURN COUNT(trajectory)
// Or, assuming I have one label per age group (a label containing a hyphen must be backtick-quoted):
MATCH (trajectory:Trajectory:Female:Customer:`Age21-30`)
WHERE datetime("2020-01-01T00:00:00.000000+13:00") <= trajectory.at < datetime("2020-01-11T00:00:00.000000+13:00")
RETURN COUNT(trajectory)
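So I would like to know:
Note that not every trajectory has every classification. For example, if our facial recognition system cannot detect whether the person is female or male, that classification will not exist for that particular trajectory.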
小智 1
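If you follow the concepts at https://neo4j.com/docs/getting-started/current/graphdb-concepts/, you have two basic types of nodes: Person and Location. Person nodes can have multiple labels.
Location nodes have a position property but no time property.
I would model the trajectory as relationships between Person and Location, each with a time property and a motion type. So between Person and Location nodes there can be relationships of type DWELLING, WALKING, and RUNNING.
Your query would then be: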
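To make this model concrete, here is a minimal Cypher sketch of one person observed at one location. The `id` property, the `x`/`y` property names on Location, the coordinate values, and the timestamp are illustrative assumptions, not part of the original data. The labels and relationship types follow the ones used in this answer.

```cypher
// Hypothetical sample data: one person, one location, one observed point.
// Classifications are labels on the Person node.
CREATE (p:Person:Female:Customer:Age_20_30 {id: 1})
// The location holds only the position, no time.
CREATE (l:Location {x: 3.5, y: 7.25})
// The motion type is the relationship type; the time is a relationship property.
CREATE (p)-[:WALKING {at: datetime("2020-01-05T10:15:00+13:00")}]->(l)
```

With this layout, a person whose gender could not be detected simply carries neither a Female nor a Male label.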
MATCH (n:Person)-[r]->(:Location)
WHERE n:Female and n:Customer and n:Age_20_30
AND datetime("2020-01-01T00:00:00.000000+13:00") <= r.at < datetime("2020-01-11T00:00:00.000000+13:00")
RETURN COUNT(r)
Counting the customers who were running would be:
MATCH (n:Person)-[r:RUNNING]->(:Location)
WHERE n:Female and n:Customer and n:Age_20_30
AND datetime("2020-01-01T00:00:00.000000+13:00") <= r.at < datetime("2020-01-11T00:00:00.000000+13:00")
RETURN COUNT(r)
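Because these queries filter on a relationship property (`r.at`), performance will likely depend on whether that property is indexed. As a sketch, assuming Neo4j 4.3 or later (where relationship property indexes are available) and hypothetical index names, one index per relationship type might look like this:

```cypher
// Index names (running_at, walking_at, dwelling_at) are hypothetical.
CREATE INDEX running_at IF NOT EXISTS FOR ()-[r:RUNNING]-() ON (r.at);
CREATE INDEX walking_at IF NOT EXISTS FOR ()-[r:WALKING]-() ON (r.at);
CREATE INDEX dwelling_at IF NOT EXISTS FOR ()-[r:DWELLING]-() ON (r.at);
```

A type-specific index can only help a query that names the relationship type, such as the RUNNING query. Whether the planner actually uses it is best checked with EXPLAIN or PROFILE.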