Cassandra cluster - how does the seed provider work?

Question

Bla*_*Hat 1 database rhel cassandra cassandra-2.0

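I have a question about the Cassandra seed_provider setting. In my environment, 3 Cassandra nodes need to be set up as a cluster. How should I define this in cassandra.yaml? I'm confused because most tutorials give different answers.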

Example:
Host A - 192.168.1.1
Host B - 192.168.1.2
Host C - 192.168.1.3

Below is my current setting for host A. Is it correct?

What should the configuration for host B and host C look like?

# any class that implements the SeedProvider interface and has a
# constructor that takes a Map<String, String> of parameters will do.
seed_provider:
    # Addresses of hosts that are deemed contact points. 
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "192.168.1.1,192.168.1.2,192.168.1.3"


Answer 1

Aar*_*ron 5

For starters, you don't need to change the class_name of the seed_provider. AFAIK, only one ships with Cassandra. It is defined as "pluggable" so that custom seed providers can be written.

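As for seeds, I don't recommend listing every node in the seed list. With only 3 nodes, just provide 1 or 2. Seed nodes do not bootstrap data, and they need a repair to become consistent when they are replaced. This can make adding nodes difficult.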

But as far as I can tell, your current configuration will work. I would just build the seed list with at most 2 nodes.
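A minimal sketch of that suggestion, using the addresses from the question. Choosing hosts A and B as the two seeds is an arbitrary assumption; any one or two of the three nodes would do. The same seeds value would go into cassandra.yaml on all three hosts, which also covers hosts B and C. The listen_address line is not part of the question's snippet; it is shown on the assumption that each host sets it to its own IP.

```yaml
# cassandra.yaml fragment (sketch): the seed_provider section is the same on hosts A, B, and C
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Two of the three nodes act as seeds; host C (192.168.1.3) is left out
          # of the list, so it joins as a regular, non-seed node.
          - seeds: "192.168.1.1,192.168.1.2"

# Assumption: this is the only value that differs per host.
# Host A: 192.168.1.1, host B: 192.168.1.2, host C: 192.168.1.3
listen_address: 192.168.1.2
```

With this layout, host A or host B would be started first, since the first node started must appear in the seed list.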

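Keep in mind that there are two main requirements for the seed list:

If you are starting the first node in the cluster, its IP must be in the seed list.
At least one node in the seed list must be running.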

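Would you mind explaining further what the impact would be if I go ahead and put all 3 nodes in the seed list? What is your reason for putting only 1 or 2 nodes in the seed list?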

Sure. It all goes back to this point:

Seed nodes do not bootstrap data

So listing all 3 nodes in the seed list on all 3 nodes leads to the following problems:

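If node A is started and data is written to it before node B or C joins the cluster, that data will not stream to node B or C.
If node A fails in the future and is replaced, data will not stream to the replacement node.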

In these cases, a nodetool repair needs to be run to get the initial data onto the newly added nodes.
