ArangoDB AQL filter with NOT IN on a collection is very slow

rol*_*lls 4 arangodb aql

I want to find the set of users that do not have a profile.

ArangoDB 2.4.3

LENGTH(users) -> 130k
LENGTH(profiles) -> 110k

users.userId -> unique hash index
profiles.userId -> unique hash index

The AQL snippet I wrote below is slower than a snail crossing the Grand Canyon in midsummer.

LET usersWithProfiles = ( /* This part is ok */
FOR i IN users
    FOR j IN profiles
        FILTER i.userId == j.userId
        RETURN i
)

LET usersWithoutProfiles = ( /* This is not */
FOR i IN usersWithProfiles
    FILTER i NOT IN users
    RETURN i
)

RETURN LENGTH(usersWithoutProfiles)

I'm pretty sure there is a perfectly sane way of doing this, but I'm missing it. Any ideas?

Edit 1 (after @dothebart's reply):

Here is the new query, but it is still very slow:

LET userIds_usersWithProfile = (
FOR i IN users
    FOR j IN profiles
        FILTER i.userId == j.userId
        RETURN i.userId
)

LET usersWithoutProfiles = (
FOR i IN users
    FILTER i.userId NOT IN userIds_usersWithProfile
    RETURN i
)

RETURN LENGTH(usersWithoutProfiles)

stj*_*stj 6

Also note that this part of the original query is very expensive:

LET usersWithoutProfiles = (
  FOR i IN usersWithProfiles
    FILTER i NOT IN users
    RETURN i
)

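The reason is that the FILTER uses users, which at this point is an expression that builds an array of all documents in the collection, so every NOT IN check has to be compared against that whole array. Instead, I suggest the following query, which returns the _key attribute of every user that has no associated profile record: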

FOR user IN users 
  LET profile = (
    FOR profile IN profiles 
      FILTER profile.userId == user.userId 
      RETURN 1
  ) 
  FILTER LENGTH(profile) == 0 
  RETURN user._key
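
Since the question ultimately asks for a count rather than the keys, the same per-user lookup can be wrapped so that only the number is returned. This is a minimal sketch, assuming the collection and attribute names from the question. The variable names matches and usersWithoutProfile are illustrative only, and the query has not been run against 2.4.3:

```aql
/* Sketch only: counts users that have no matching profile document. */
LET usersWithoutProfile = (
  FOR user IN users
    /* One lookup per user; matches is empty when no profile exists. */
    LET matches = (
      FOR p IN profiles
        FILTER p.userId == user.userId
        RETURN 1
    )
    FILTER LENGTH(matches) == 0
    RETURN 1
)
RETURN LENGTH(usersWithoutProfile)
```

Because profiles.userId carries a unique hash index, each inner lookup should be able to use the index instead of scanning profiles, and no array of full documents needs to be built.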