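I have created 3 kinds of nodes in a graph database: origin airport, destination airport, and carrier. They are connected by a relationship named `cancelled_by`.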
MATCH (origin:origin_airport {name: row.ORIGIN}),
(destination:dest_airport {name: row.DEST}),
(carrier:Carrier {name: row.UNIQUE_CARRIER})
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(carrier)
CREATE (origin)-[:cancelled_by {cancellation: row.count}]->(destination)
CREATE (origin)-[:operated_by {carrier: row.UNIQUE_CARRIER}]->(carrier)
The `cancelled_by` relationship stores the number of cancellations for a particular carrier. My input file is in the following format:
ORIGIN UNIQUE_CARRIER DEST Cancelled
ABE DL ATL 1
ABE EV ATL 1
ABE EV DTW 3
ABE EV ORD 3
ABQ DL DFW 2
ABQ B6 JFK 2
I need to calculate the percentage of cancellations for each carrier. The result I am expecting looks like this:
UNIQUE_CARRIER Percentage_Cancelled
DL 25%
EV 58.33%
B6 16.66%
Example: Total number of cancellation = 12
No of cancellation for DL = 3
Percentage of cancellation for DL = (3/12)*100 = 25%
The query below gives the sum of cancellations for each carrier:
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
SUM(toFloat(ca.cancellation)) As sum
ORDER BY sum DESC
LIMIT 10
I tried the following query to calculate the percentage:
MATCH ()-[ca:cancelled_by]->(c:Carrier)
WITH SUM(toFloat(ca.cancellation)) As total
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
(toFloat(ca.cancellation)/total)*100 AS percent
ORDER BY percent DESC
LIMIT 10
But instead of grouping by carrier, it calculates the percentage for each relationship individually:
Carrier sum
DL 0.36862408915559364
DL 0.34290612944706383
DL 0.3171881697385341
How can I calculate a percentage with a group-by in Neo4j using a Cypher query?
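You forgot to sum per carrier when grouping. You also do not need to cast to float everywhere: multiplying by a floating-point number in the final calculation is enough.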
MATCH ()-[ca:cancelled_by]->(:Carrier)
WITH SUM(ca.cancellation) As total
MATCH ()-[ca:cancelled_by]->(c:Carrier)
RETURN c.name AS Carrier,
100.0 * SUM(ca.cancellation) / total AS percent
ORDER BY percent DESC
LIMIT 10
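One caveat: values read with `LOAD CSV` arrive as strings, so if `cancellation` was stored without conversion (the `toFloat` calls in the question suggest this may be the case), `SUM(ca.cancellation)` would likely fail with a type error. Below is a minimal sketch of the same query under that assumption. It converts the value inside the aggregate and rounds the result to two decimals; the alias `cancelled` and the rounding step are illustrative additions, not part of the original query.

```cypher
// Step 1: total cancellations across all carriers, computed once.
// toFloat() is only needed if the property is stored as a string.
MATCH ()-[ca:cancelled_by]->(:Carrier)
WITH sum(toFloat(ca.cancellation)) AS total

// Step 2: per-carrier sum. The non-aggregated expressions
// (c.name and total) act as the implicit grouping keys.
MATCH ()-[ca:cancelled_by]->(c:Carrier)
WITH c.name AS Carrier, sum(toFloat(ca.cancellation)) AS cancelled, total

// Step 3: share of the total as a percentage, rounded to two decimals.
RETURN Carrier, round(10000.0 * cancelled / total) / 100 AS percent
ORDER BY percent DESC
LIMIT 10
```

The grouping works because Cypher has no explicit GROUP BY: any returned expression that is not wrapped in an aggregate function becomes a grouping key. In the question's query, `ca.cancellation` was returned without an aggregate, so every relationship produced its own row.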