Posts by jra*_*raj

Kubernetes dashboard - ServiceUnavailable (503 error)

I'm new to Kubernetes. I'm trying to set up a Kubernetes cluster on AWS using kops, and the cluster itself came up successfully. However, I can't access the dashboard UI ( https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/#accessing-the-dashboard-ui ).

When I access the master node, I see the following error:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "no endpoints available for service \"kubernetes-dashboard\"",
  "reason": "ServiceUnavailable",
  "code": 503
}
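For readers hitting the same response: the message "no endpoints available for service" generally means the Service object exists but no ready pod is currently backing it, which fits the CrashLoopBackOff reported in this post. Below is a minimal sketch of reading such a Status body programmatically. `summarize_status` is a hypothetical helper name, not part of any Kubernetes client, and the hint strings are illustrative only.

```python
import json

# The Status body quoted in the question, copied verbatim.
STATUS_BODY = r"""
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "no endpoints available for service \"kubernetes-dashboard\"",
  "reason": "ServiceUnavailable",
  "code": 503
}
"""


def summarize_status(body: str) -> str:
    """Hypothetical helper: map an API server Status object to a one-line hint."""
    status = json.loads(body)
    if status.get("kind") != "Status":
        return "not a Status object"
    message = status.get("message", "")
    # A 503 with this message points at the pods behind the service,
    # not at the service definition or the API server itself.
    if status.get("code") == 503 and "no endpoints available" in message:
        return "service has no ready pods behind it; check pod status and logs"
    return f"{status.get('reason')}: {message}"


print(summarize_status(STATUS_BODY))
```

In practice the next step is the one the post already takes: look at the pod's status and its logs to find out why the container keeps restarting.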

I can see that the dashboard pod's status is CrashLoopBackOff. (Note: I have removed the names of the other pods from the output below.)

~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                                    READY     STATUS             RESTARTS   AGE
kube-system   kubernetes-dashboard-4167803980-vnx3k                   0/1       CrashLoopBackOff   6          6m

$ kubectl logs kubernetes-dashboard-4167803980-vnx3k --namespace=kube-system
2017/09/25 17:50:37 Using in-cluster config to connect to apiserver
2017/09/25 17:50:37 Using service account token for csrf signing
2017/09/25 17:50:37 …

dashboard kubernetes

Score: 6 | Answers: 1 | Views: 8682

Training with the SageMaker PyTorch image

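I'm trying to containerize the training process for fine-tuning a BERT model and run it on SageMaker. I plan to use the prebuilt SageMaker PyTorch GPU container ( https://aws.amazon.com/releasenotes/available-deep-learning-containers-images/ ) as the starting point, but I'm having trouble pulling the image during my build process.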

My Dockerfile looks like this:

# SageMaker PyTorch image
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training:1.5.0-gpu-py36-cu101-ubuntu16.04


ENV PATH="/opt/ml/code:${PATH}"

# /opt/ml and all subdirectories are utilized by SageMaker, we use the /code subdirectory to store our user code.
COPY /bert /opt/ml/code

# this environment variable is used by the SageMaker PyTorch container to determine our user code directory.
ENV SAGEMAKER_SUBMIT_DIRECTORY=/opt/ml/code

# this environment variable is used by the SageMaker PyTorch container to determine our …
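The excerpt is cut off before the actual error, so the cause of the failed pull is not known here. One thing that may be worth checking, purely as an assumption: the base image in the FROM line lives in an Amazon ECR registry, and Docker has to be authenticated against that exact registry host, in the matching region, before the build can pull it. The sketch below only splits the image URI from the Dockerfile into its parts so the registry host and region are explicit. `parse_ecr_uri` is a hypothetical helper, not an AWS SDK function.

```python
import re

# Base image URI copied from the FROM line of the Dockerfile in the question.
IMAGE_URI = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "pytorch-training:1.5.0-gpu-py36-cu101-ubuntu16.04"
)

# ECR image URIs follow <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>.
ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_uri(uri: str) -> dict:
    """Hypothetical helper: split an ECR image URI into its components."""
    match = ECR_URI.match(uri)
    if match is None:
        raise ValueError(f"not an ECR image URI: {uri}")
    parts = match.groupdict()
    # The registry host is everything before the first slash.
    parts["registry"] = uri.split("/", 1)[0]
    return parts


parts = parse_ecr_uri(IMAGE_URI)
print("registry:", parts["registry"])
print("region:  ", parts["region"])
```

If authentication does turn out to be the problem, the usual approach is to pipe the output of `aws ecr get-login-password --region <region>` into `docker login --username AWS --password-stdin <registry>`, using the region and registry values printed above.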

amazon-web-services pytorch amazon-sagemaker

Score: 1 | Answers: 1 | Views: 1993