Tag: distributed

I am trying to use tf.distribute.MirroredStrategy(). The training loop runs perfectly on a single GPU, but when I try to use multiple GPUs it throws ValueError: 'handle' is not available outside the replica context or a 'tf.distribute.Strategy.update()' call. I am using TensorFlow 1.14 and Python 3.7.3.

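I have put together a minimal example below. The custom training loop runs without problems on a single GPU, but my attempt to use multiple GPUs with tf.distribute.MirroredStrategy() fails with the following error message (full output):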

ValueError                                Traceback (most recent call last)
<ipython-input-11-3fda5d330457> in <module>
      1 with mirrored_strategy.scope():
----> 2     model, train_op, X1_in, X2_in = create_model_and_train_op()
      3     with tf.Session() as sess:
      4         sess.run(tf.global_variables_initializer())
      5         for sample_ind in range(n_samples):

<ipython-input-7-8f5b3971bbe2> in create_model_and_train_op()
      6 
      7     model = Model(name='BNN',inputs=[X1_in,X2_in], outputs=[loss])
----> 8     train_op = tf.train.AdamOptimizer().minimize(loss) …


python distributed keras tensorflow

I. *_*ert


Votes: 5 | Answers: 1 | Views: 1719

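How do I deploy Kafka Connect in distributed mode?

I am building a Kafka Connect application with the JDBC sink connector in Kubernetes. I tried standalone mode and it works. Now I want to move to distributed mode. I can successfully create two pods (Kafka connectors) by applying the YAML file below: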

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: vtq
  name: kafka-sink-postgres-dis
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: kafka-sink-postgres-dis
    spec:
      containers:
      - name: kafka-sink-postgres-dis
        image: ***
        imagePullPolicy: Always


bin/connect-distributed.sh config/worker.properties

bootstrap.servers=***:9092
offset.flush.interval.ms=10000

rest.port=8083
rest.host.name=127.0.0.1


key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081


# Prevent the connector from pulling all historical messages
auto.offset.reset=latest

# options below may be required for distributed mode

# unique name for the cluster, used in forming the Connect cluster group. Note …


distributed apache-kafka-connect

Sop*_*hie

2019-09-05

Votes: 5 | Answers: 0 | Views: 2793

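Why is it recommended to create a cluster with an odd number of nodes?

Some resources on distributed systems, such as the MongoDB documentation, recommend an odd number of nodes in a cluster.

What is the benefit of having an odd number of nodes?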
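
One common argument can be checked with a little arithmetic. The sketch below assumes the cluster makes decisions (for example, electing a leader) by strict majority vote, as MongoDB replica sets do; the function names are illustrative and not taken from any library. It shows that growing a cluster from an odd size to the next even size raises the quorum without raising the number of failures the cluster can survive.

```python
def majority_quorum(n_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if n_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return n_nodes // 2 + 1


def tolerated_failures(n_nodes: int) -> int:
    """Number of nodes that can fail while a majority can still be formed."""
    return n_nodes - majority_quorum(n_nodes)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures side by side.
    for n in range(1, 8):
        print(n, majority_quorum(n), tolerated_failures(n))
```

Under this assumption, a 4-node cluster needs 3 votes and survives only one failure, the same as a 3-node cluster, so the extra node adds cost without adding fault tolerance. An even split during a network partition also leaves neither side with a majority.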

distributed cluster-computing leader-election

Pro*_*Cpp


Votes: 5 | Answers: 2 | Views: 2329

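Is Git distributed or decentralized?

I know that Git uses version control to track files, and that it is distributed, meaning that more than one computer stores the relevant files. But I am unsure whether Git is distributed or decentralized. If it is decentralized, why do we still need GitHub or GitLab? Does using GitHub or GitLab make it distributed (one master and several slave nodes)? After all, we have a master (such as GitHub) that the clients (collaborators) depend on. On the other hand, Git uses a blockchain-like technology (of sorts), which makes me think Git is decentralized, because all blockchain applications such as Bitcoin and Ethereum are decentralized. Yet unlike Bitcoin, Git has no peer-to-peer communication between nodes, which contradicts the decentralized nature of a blockchain. We need GitHub to communicate with other nodes, or whenever we want to collaborate with them.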

git distributed github decentralized-applications

hri*_*ham

2019-12-30

Votes: 5 | Answers: 2 | Views: 3983