"rpc error: code = Unavailable desc = error reading from server: EOF" when trying to create a new etcdv3 client

mag*_*nes 6 go etcd kubernetes grpc istio

I'm trying to access my ETCD database from a K8s controller, but I get an rpc error/EOF when trying to open the ETCD client.

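My setup:

  • The ETCD service is deployed in my K8s cluster and is included in my Istio service mesh (its DNS record: my-etcd-cluster.my-etcd-namespace.svc.cluster.local)
  • I have a custom K8s controller developed with the Kubebuilder framework and deployed in the same cluster, in a different namespace, but configured to be part of the same Istio service mesh
  • I'm trying to connect to the ETCD database from the controller, using the Go client SDK library for ETCD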

Here's the affected Go code:

cli, err := clientv3.New(clientv3.Config{
    Endpoints:   []string{"http://my-etcd-cluster.my-etcd-namespace.svc.cluster.local:2379"},
    DialTimeout: 5 * time.Second,
    Username:    username,
    Password:    password,
})

if err != nil {
    return nil, fmt.Errorf("opening ETCD client failed: %v", err)
}

This is the error I get when clientv3.New(...) is executed:

{"level":"warn","ts":"2022-03-16T23:37:42.174Z","logger":"etcd-client","caller":"v3@v3.5.0/retry_interceptor.go:62","msg":"retrying of unary invoker failed",
"target":"etcd-endpoints://0xc00057f500/#initially=[http://my-etcd-cluster.my-etcd-namespace.svc.cluster.local:2379]","attempt":0,
"error":"rpc error: code = Unavailable desc = error reading from server: EOF"}
...
1.647473862175209e+09   INFO    controller.etcdclient   Finish reconcile loop for some-service/test-svc-client  {"reconciler group": "my-controller.something.io", "reconciler kind": "ETCDClient", "name": "test-svc-client", "namespace": "some-service", "reconcile-etcd-client": "some-service/test-svc-client"}
1.6474738621752858e+09  ERROR   controller.etcdclient   Reconciler error        {"reconciler group": "my-controller.something.io", "reconciler kind": "ETCDClient", "name": "test-svc-client", "namespace": "some-service", "error": "opening ETCD client failed: rpc error: code = Unavailable desc = error reading from server: EOF"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.11.0/pkg/internal/controller/controller.go:227

The same error occurs when I pass some dummy, invalid credentials.

However, when I try to access the database through the HTTP API:

// Build the JSON request body for the authenticate endpoint.
postBody, _ := json.Marshal(map[string]string{
    "name":     username,
    "password": password,
})
requestBody := bytes.NewBuffer(postBody)

resp, err := http.Post("http://my-etcd-cluster.my-etcd-namespace.svc.cluster.local:2379/v3/auth/authenticate", "application/json", requestBody)
if err != nil {
    return ctrl.Result{}, fmt.Errorf("an error occurred: %w", err)
}
defer resp.Body.Close()
l.Info(fmt.Sprintf("code: %d", resp.StatusCode))

...I get a 200 OK and a proper token (which is expected), so I believe my Istio configuration is fine and my controller should be able to see the ETCD database service. I have no idea why this does not work when going through the client SDK.

When I port-forward the ETCD service and access it locally, clientv3.New() and the other client SDK methods work like a charm. What am I missing? I'd really appreciate any suggestions.

EDIT: I also added a simple pod to try to access my etcd database via etcdctl. After logging into its container via kubectl exec, I am able to access my database:

$ etcdctl --endpoints=my-etcd-cluster.my-etcd-namespace.svc.cluster.local:2379 --user="user" --password="password" put foo bob
OK

So I guess the problem is somewhere in the SDK?

mag*_*nes 1

It turned out to be a version mismatch: my ETCD database was v3.5.2 and the clientv3 library I was using was v3.5.0. The relevant change is listed in the ETCD changelog ( https://github.com/etcd-io/etcd/blob/main/CHANGELOG/CHANGELOG-3.5.md ).