mva*_*ebr 7 python cql3 cassandra-2.0
I am using Cassandra 2.0 with the Python CQL driver.
I created a column family as follows:
    CREATE KEYSPACE IF NOT EXISTS Identification
      WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
                           'DC1' : 1 };
    USE Identification;
    CREATE TABLE IF NOT EXISTS entitylookup (
      name varchar,
      value varchar,
      entity_id uuid,
      PRIMARY KEY ((name, value), entity_id))
    WITH
      caching=all
    ;
I then try to count the number of records in this CF as follows:
    #!/usr/bin/env python
    import argparse

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement


    def count(host, cf):
        keyspace = "identification"
        cluster = Cluster([host], port=9042, control_connection_timeout=600000000)
        session = cluster.connect(keyspace)
        # Client-side timeout, in seconds
        session.default_timeout = 600000000
        st = SimpleStatement("SELECT count(*) FROM %s" % cf,
                             consistency_level=ConsistencyLevel.ALL)
        try:
            for row in session.execute(st, timeout=600000000):
                print("count for cf %s = %s" % (cf, row))
        finally:
            # The original called dump_pool.close() and dump_pool.join() here,
            # but dump_pool is never defined; shut the cluster down instead.
            cluster.shutdown()


    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("-cf", "--column-family", default="entitylookup",
                            help="Column Family to query")
        parser.add_argument("-H", "--host", default="localhost", help="Cassandra host")
        args = parser.parse_args()
        count(args.host, args.column_family)
        print("fim")
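The count itself is of no use to me; it is only a test of an operation that takes a long time to complete.
Although I set the timeout to 600000000 seconds, I get the following error after less than 30 seconds: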
    ./count_entity_lookup.py -H localhost -cf entitylookup
    Traceback (most recent call last):
      File "./count_entity_lookup.py", line 27, in <module>
        count(args.host, args.column_family)
      File "./count_entity_lookup.py", line 16, in count
        for row in session.execute(st, timeout=None):
      File "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 1026, in execute
        result = future.result(timeout)
      File "/home/mvalle/pyenv0/local/lib/python2.7/site-packages/cassandra/cluster.py", line 2300, in result
        raise self._final_exception
    cassandra.ReadTimeout: code=1200 [Timeout during read request] message="Operation timed out - received only 1 responses." info={'received_responses': 1, 'data_retrieved': True, 'required_responses': 2, 'consistency': 5}
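It seems the answer was found on only one replica, but that does not really make sense to me. Shouldn't Cassandra be able to serve the query anyway?
The graph below shows that the number of requests to the cluster is very low, and so is the latency. I don't know why this is happening.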

From the response:
    'received_responses': 1, 'data_retrieved': True, 'required_responses': 2
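The data was returned by only one node, while the query requires consistency == ALL. Cassandra could not satisfy the request, so it timed out.
If you need every node to have the data, you can change the write consistency to ALL.
Reads can then be satisfied without consistency == ALL, because the guarantee is already provided by the write itself. The cost is that writes may fail whenever a node is offline.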
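The trade-off between write and read consistency comes down to simple arithmetic: a read is guaranteed to reach at least one replica that holds the latest write whenever the number of replicas acknowledging the write plus the number answering the read exceeds the replication factor. Below is a minimal sketch in plain Python that does not use the driver. The function name `replicas_overlap` and the replication factor of 3 are assumptions chosen for illustration and are not taken from the question.

```python
def replicas_overlap(write_acks, read_acks, replication_factor):
    """Return True when any read set must share a replica with any write set."""
    return write_acks + read_acks > replication_factor


rf = 3  # assumed replication factor, for illustration only

# Write at ALL (rf acks), read at ONE: the read always hits an up-to-date replica.
write_all_read_one = replicas_overlap(write_acks=rf, read_acks=1, replication_factor=rf)

# Write at ONE, read at ONE: the read may land on a replica that missed the write.
write_one_read_one = replicas_overlap(write_acks=1, read_acks=1, replication_factor=rf)

print(write_all_read_one, write_one_read_one)
```

Writing at ALL moves the whole cost to the write path, which is why a single offline replica is enough to make such writes fail.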
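See the documentation for an explanation of what each consistency level means.
LOCAL_QUORUM is used to ensure that a majority of the replicas, in terms of replication factor, are contacted within a DC.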
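To make "a majority in terms of replication factor" concrete, a quorum is the replication factor divided by two, rounded down, plus one. The sketch below is plain Python with no driver; the helper name `quorum` is hypothetical, and the replication factors listed are examples rather than values from the question.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority for the given replication factor."""
    return replication_factor // 2 + 1


# Example replication factors for a single DC (illustrative values only).
quorum_sizes = {rf: quorum(rf) for rf in (1, 2, 3, 5)}
for rf, needed in sorted(quorum_sizes.items()):
    print("RF=%d -> quorum of %d" % (rf, needed))
```

With a replication factor of 3 in a DC, LOCAL_QUORUM needs 2 replicas there, so one replica in that DC can be down without failing the request.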