Cassandra materialized view partition key update performance

Ank*_*ita 0 data-modeling materialized-views cassandra scylla cassandra-3.0

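I am trying to update a column in a base table that is the partition key in a materialized view, and I want to understand the performance implications in a production environment.

Base table: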

CREATE TABLE IF NOT EXISTS data.test (
    foreignid uuid,
    id        uuid,
    kind      text,
    version   text,
    createdon timestamp,
    certid    text,
    PRIMARY KEY (foreignid, createdon, id)
);

Materialized view:

CREATE MATERIALIZED VIEW IF NOT EXISTS data.test_by_certid AS
    SELECT * FROM data.test
    WHERE id IS NOT NULL AND foreignid IS NOT NULL
      AND createdon IS NOT NULL AND certid IS NOT NULL
    PRIMARY KEY (certid, foreignid, createdon, id);

So certid is the new partition key in the materialized view.

What happens:

1. When we first insert into the test table, the certid is usually empty, so it is replaced with the string "none" and inserted into the test base table.

2. The row gets inserted into the materialized view as well.

3. When the user provides us with a certid, the row is updated in the test base table with the new certid.

4. The update is mirrored to the materialized view, where the partition key certid changes from "none" to the new value.
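
In CQL terms, the steps above might look like the following sketch. The UUIDs, timestamp, kind, version, and the certificate value 'cert-123' are made-up placeholders, not values from my real data.

```sql
-- Steps 1 and 2: initial insert. The certid is not known yet, so the
-- placeholder string 'none' is stored; the view gets a row under 'none'.
INSERT INTO data.test (foreignid, createdon, id, kind, version, certid)
VALUES (11111111-1111-1111-1111-111111111111, '2018-01-01 00:00:00+0000',
        22222222-2222-2222-2222-222222222222, 'example', '1', 'none');

-- Step 3: certid is a regular (non-key) column in the base table,
-- so it can be changed with an ordinary UPDATE.
UPDATE data.test SET certid = 'cert-123'
WHERE foreignid = 11111111-1111-1111-1111-111111111111
  AND createdon = '2018-01-01 00:00:00+0000'
  AND id = 22222222-2222-2222-2222-222222222222;

-- Step 4: in the view the row should now be found under the new
-- partition key. View updates are propagated asynchronously, so this
-- may lag slightly behind the base table write.
SELECT * FROM data.test_by_certid WHERE certid = 'cert-123';
```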

Questions:

1. What is the performance implication of updating the partition key certid in the materialized view?

2. For my use case, is it better to create a new table with certid as the partition key (inserting only when certid is non-empty) and manually maintain all CRUD operations on the new table, or should I use the MV and let Cassandra do the bookkeeping?

Note that performance is an important criterion, since this will be used in a production environment.

Thanks.

Dua*_*nes 7

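Updating a table that has one or more views is always more expensive than updating a table without views, because of the read-before-write that has to be performed and the partition lock that is taken to ensure concurrent updates stay consistent with that read. You can read more about the internals of materialized views in Cassandra in ScyllaDB's wiki.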

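If changing the certid is a one-time operation, the performance impact should not be much of a concern. In any case, letting Cassandra handle the MV updates is always the better idea, because it deals with exceptional situations (for example, what happens when the node storing the view is partitioned away and the update cannot be propagated) and ultimately ensures consistency.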
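
To make the comparison concrete, here is a rough sketch of what the manual alternative from the question could involve. The table name data.test_by_certid_manual and all literal values are hypothetical, and the logged batch is just one possible way to keep the two writes together, not a recommendation.

```sql
-- Hypothetical hand-maintained lookup table (name is an assumption).
CREATE TABLE IF NOT EXISTS data.test_by_certid_manual (
    certid    text,
    foreignid uuid,
    createdon timestamp,
    id        uuid,
    kind      text,
    version   text,
    PRIMARY KEY (certid, foreignid, createdon, id)
);

-- When a certid arrives, the application has to write both tables itself.
-- A logged batch ensures both writes are eventually applied, but it does
-- not provide isolation between the two tables.
BEGIN BATCH
    UPDATE data.test SET certid = 'cert-123'
    WHERE foreignid = 11111111-1111-1111-1111-111111111111
      AND createdon = '2018-01-01 00:00:00+0000'
      AND id = 22222222-2222-2222-2222-222222222222;

    -- The application must already know (or first read) kind and version.
    INSERT INTO data.test_by_certid_manual
        (certid, foreignid, createdon, id, kind, version)
    VALUES ('cert-123', 11111111-1111-1111-1111-111111111111,
            '2018-01-01 00:00:00+0000',
            22222222-2222-2222-2222-222222222222, 'example', '1');
APPLY BATCH;
```

If a certid could ever change again, the application would also have to delete the row under the old certid in the same batch, which is the kind of bookkeeping the materialized view does on your behalf.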

If you are concerned about performance, you could consider replacing Cassandra with Scylla.