Cassandra query making - Cannot execute this query as it might involve data filtering and thus may have unpredictable performance

Question


Pyt*_*ast 1 python cassandra cassandra-cli cqlengine

I am using the following Cassandra model:

class Automobile(Model):
    manufacturer = columns.Text(primary_key=True)
    year = columns.Integer(index=True)
    model = columns.Text(index=True)
    price = columns.Decimal(index=True)


I need the following queries:

q = Automobile.objects.filter(manufacturer='Tesla')
q = Automobile.objects.filter(year='something')
q = Automobile.objects.filter(model='something')
q = Automobile.objects.filter(price='something')


These all work fine until I want to filter on multiple columns, i.e. when I try

q = Automobile.objects.filter(manufacturer='Tesla',year='2013')


It throws an error saying: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance.

I rewrote the query with allow_filtering(), but that is not an optimal solution.

Then, after reading some more, I edited my model as follows:

class Automobile(Model):
    manufacturer = columns.Text(primary_key=True)
    year = columns.Integer(primary_key=True)
    model = columns.Text(primary_key=True)
    price = columns.Decimal()


With this, I was also able to filter on multiple columns without any warning.

When I run DESCRIBE TABLE automobile, it shows that this creates the composite key PRIMARY KEY ((manufacturer), year, model).

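So my question is: what if I declare every attribute as a primary key? Is there any problem with that, given that I would then be able to filter on multiple columns as well?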

This is just a small model. What if I had a model such as:

class UserProfile(Model):
    id = columns.UUID(primary_key=True, default=uuid.uuid4)
    model = columns.Text()
    msisdn = columns.Text(index=True)
    gender = columns.Text(index=True)
    imei1 = columns.Set(columns.Text)
    circle = columns.Text(index=True)
    epoch = columns.DateTime(index=True)
    cellid = columns.Text(index=True)
    lacid = columns.Text(index=True)
    mcc = columns.Text(index=True)
    mnc = columns.Text(index=True)
    installed_apps = columns.Set(columns.Text)
    otp = columns.Text(index=True)
    regtype = columns.Text(index=True)
    ctype = columns.Text(index=True)
    operator = columns.Text(index=True)
    dob = columns.DateTime(index=True)
    jsonver = columns.Text(index=True)


If I declare every attribute as part of the PK, is there any problem with that?

Answer 1

ash*_*hic 6

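To understand this, you need to understand how Cassandra stores data. The first key in the primary key is called the partition key. It defines the partition the row belongs to. All rows in a partition are stored together and replicated together. Inside a partition, rows are stored according to the clustering keys. These are the columns in the PK that are not the partition key. So, if your PK is (a, b, c, d), a defines the partition. In a particular partition (say, a = a1), the rows are stored sorted by b. For each b, the rows are stored sorted by c, and so on. When querying, you hit one partition (or a few), and then need to specify every successive clustering key up to the key you are looking for. These have to be exact equalities, except for the last clustering column specified in your query, which may be a range query.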
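The storage layout described above can be sketched in plain Python. This is only a toy model, not Cassandra or driver code: the `partitions` dict, the `insert` and `rows_with_b` helpers, and the sample values are all assumptions made for illustration, for a table whose PK is (a, b, c, d).

```python
from bisect import bisect_left, insort
from collections import defaultdict

# Toy model of PRIMARY KEY ((a), b, c, d): one sorted list of rows per partition.
partitions = defaultdict(list)


def insert(a, b, c, d):
    # Rows in the same partition stay ordered by the clustering tuple (b, c, d).
    insort(partitions[a], (b, c, d))


def rows_with_b(a, b):
    # With the partition fixed, all rows sharing one b value sit next to each other.
    rows = partitions.get(a, [])
    start = bisect_left(rows, (b,))  # (b,) sorts before every (b, c, d)
    end = start
    while end < len(rows) and rows[end][0] == b:
        end += 1
    return rows[start:end]


insert("a1", 2, "x", 10)
insert("a1", 1, "z", 30)
insert("a1", 1, "y", 20)
insert("a2", 5, "q", 1)

print(partitions["a1"])
print(rows_with_b("a1", 1))
```

Because the rows are kept sorted, fixing a and then b narrows the read to one contiguous slice, which is why restrictions on a prefix of the clustering columns are cheap.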

With the previous example, you can do:

where a = a1 and b > b1
where a = a1 and b = b1 and c > c1
where a = a1 and b = b1 and c = c1 and d > d1

but you cannot do this:

where a=a1 and c=c1

For that, you would need "allow filtering" (realistically, you should look at changing your model or denormalizing at that point).
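The rule behind these examples can be written down as a small check. This is a simplified sketch, not how Cassandra actually validates queries: `needs_filtering` is a hypothetical helper, it assumes a single-column partition key, and it ignores secondary indexes, IN restrictions, and anything newer Cassandra versions may relax.

```python
def needs_filtering(primary_key, restrictions):
    """Return True if the restrictions cannot be served from key order alone.

    primary_key: column names with the partition key first, e.g. ["a", "b", "c", "d"]
    restrictions: dict mapping column name to "eq" or "range"
    """
    partition_key, clustering = primary_key[0], primary_key[1:]
    # The partition key has to be pinned with an equality.
    if restrictions.get(partition_key) != "eq":
        return True
    # A restriction on a column outside the primary key cannot use the key order.
    if any(col not in primary_key for col in restrictions):
        return True
    closed = False  # becomes True after a skipped column or a range restriction
    for col in clustering:
        op = restrictions.get(col)
        if op is None:
            closed = True   # nothing may be restricted after a gap
        elif closed:
            return True     # restriction after a gap or after a range
        elif op == "range":
            closed = True   # a range has to be the last restriction
    return False


pk = ["a", "b", "c", "d"]
print(needs_filtering(pk, {"a": "eq", "b": "range"}))
print(needs_filtering(pk, {"a": "eq", "c": "eq"}))
```

The three allowed queries above pass this check, while `where a=a1 and c=c1` fails it because b is skipped.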

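Now, to your question about making every column part of the PK. You can do that, but remember that all writes in Cassandra are upserts. Rows are identified by their primary key. If you make every column part of the PK, you will not be able to edit a row, because you are not allowed to update the value of any column that is in the primary key.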
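The consequence of upserts can be illustrated with a dict keyed by the primary key values. This is a toy model under stated assumptions, not Cassandra code: `upsert` is a hypothetical helper, and the model name and prices are made-up sample values.

```python
def upsert(table, key_columns, row):
    # A row is identified only by the values of its primary key columns.
    key = tuple(row[col] for col in key_columns)
    table[key] = {**table.get(key, {}), **row}


old = {"manufacturer": "Tesla", "year": 2013, "model": "S", "price": 70000}
new = {"manufacturer": "Tesla", "year": 2013, "model": "S", "price": 65000}

# price is a regular column: writing the same key again overwrites the price.
keyed = {}
upsert(keyed, ("manufacturer", "year", "model"), old)
upsert(keyed, ("manufacturer", "year", "model"), new)

# price is part of the key: the second write lands on a different key,
# so it creates a second row instead of editing the first one.
all_key = {}
upsert(all_key, ("manufacturer", "year", "model", "price"), old)
upsert(all_key, ("manufacturer", "year", "model", "price"), new)

print(len(keyed), len(all_key))
```

In the second table, changing the price would mean deleting the old row and inserting a new one, since no non-key column is left to update in place.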
