Traefik still uses the default certificate even though acme.json was generated correctly

Pau*_*bro 5 certificate traefik

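I've run into a strange problem with Traefik. I want to use ACME to generate TLS certificates. After validation via DNS, my acme.json file appears to be populated correctly, but when I check the certificate with OpenSSL, Traefik seems to be serving its default certificate.

These are my settings: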

[acme]
acmelogging= true
caServer = "https://acme-staging-v02.api.letsencrypt.org/directory"
delayBeforeCheck = 0
email = "<REDACTED>"
entryPoint = "https"
storage = "/etc/traefik/acme.json"
  [acme.dnsChallenge]
  delayBeforeCheck = 0
  provider = "route53"
  [[acme.domains]]
  main = "<REDACTED>"
[entryPoints]
  [entryPoints.http]
  address = ":80"
    [entryPoints.http.redirect]
    entryPoint = "https"
  [entryPoints.https]
    address = ":443"
    [entryPoints.https.tls]

This is the certificate's subject:

$ openssl s_client -connect localhost:443 -servername <REDACTED> 2>/dev/null | openssl x509 -noout -subject

subject= /CN=TRAEFIK DEFAULT CERT

Kir*_*rah 7

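I ran into the same problem yesterday afternoon. In my case this was running on a server, so I left it running to continue troubleshooting this morning.

When I tried again this morning, Traefik behaved as expected (it served the ACME certificate). I'll try to investigate further or open an issue on GitHub for clarification.

I'm adding this answer in case you want to test whether you're seeing the same behavior: start your environment and leave it running for a few hours.

By the way, this is the second time this has happened to me. The first time I saw the same behavior (it didn't work at first, then started working as expected after a few hours of troubleshooting).

Looking at the logs, I found the messages that should appear when it works correctly: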

{"level":"debug","msg":"Certificates obtained for domains [*.<REDACTED>]","time":"2019-03-21T18:59:44Z"}
{"level":"debug","msg":"Configuration received from provider ACME: {}","time":"2019-03-21T18:59:44Z"}
.....
{"level":"debug","msg":"Add certificate for domains *.<REDACTED>","time":"2019-03-21T18:59:45Z"}
{"level":"info","msg":"Server configuration reloaded on :443","time":"2019-03-21T18:59:45Z"}
{"level":"info","msg":"Server configuration reloaded on :8080","time":"2019-03-21T18:59:45Z"}
{"level":"info","msg":"Server configuration reloaded on :80","time":"2019-03-21T18:59:45Z"}

I had also backed up the acme.json file that I had assumed was valid, so I compared it with today's file.

Old (not working)

{
  "Account": {
    "Email": "REDACTED",
    "Registration": {
      "body": {
        "status": "valid",
        "contact": [
          "mailto:REDACTED"
        ]
      },
      "uri": "https://acme-staging-v02.api.letsencrypt.org/acme/acct/ACCOUNT_ID_1"
    },
    "PrivateKey": "REDACTED",
    "KeyType": "4096"
  },
  "Certificates": null,
  "HTTPChallenges": {},
  "TLSChallenges": {}
}

New (working)

{
  "Account": {
    "Email": "REDACTED",
    "Registration": {
      "body": {
        "status": "valid",
        "contact": [
          "mailto:REDACTED"
        ]
      },
      "uri": "https://acme-staging-v02.api.letsencrypt.org/acme/acct/ACCOUNT_ID_2"
    },
    "PrivateKey": "REDACTED",
    "KeyType": "4096"
  },
  "Certificates": [
    {
      "Domain": {
        "Main": "*.REDACTED",
        "SANs": null
      },
      "Certificate": "REDACTED",
      "Key": "REDACTED"
    }
  ],
  "HTTPChallenges": {},
  "TLSChallenges": {}
}

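So there are 2 main changes:

  • A new account ID was generated (not sure why)

  • The Certificates field was not populated in the old file. What I had in acme.json was probably just the private key of the Let's Encrypt account, with no certificate issued yet. The certificate was only issued after roughly 1 hour 30 minutes (hard to tell exactly, because I deleted the Pod several times to see whether that helped; the last time I killed it was 18:03 UTC, and it started working at 18:59 UTC).

So I'm now going to focus on the ACME part (until now I had assumed the certificate was generated correctly from the start).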
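
The difference between the two files can be checked mechanically. The following is a minimal sketch, assuming the acme.json layout shown in the two samples in this answer (a top-level "Certificates" key that is null until a certificate exists); the function name issued_domains is made up for illustration and is not part of Traefik.

```python
import json


def issued_domains(acme_json_text):
    """Return the main domain of every certificate stored in an acme.json document.

    Assumes the layout shown in this answer: a top-level "Certificates" key
    holding either null or a list of {"Domain": {"Main": ...}, ...} entries.
    """
    data = json.loads(acme_json_text)
    # "Certificates" is null (None in Python) while only the account is registered.
    certificates = data.get("Certificates") or []
    return [cert["Domain"]["Main"] for cert in certificates]
```

Reading the file named in the `storage` setting and passing its contents to issued_domains returns an empty list for the "old" file and ["*.REDACTED"] for the "new" one. An empty list means Traefik has nothing but its default certificate to serve.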

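Edit: final findings

In the end, I found that in my scenario (not sure whether it applies to you, but you can enable the ACME logs to find out) the problem was related to DNS validation.

Logs (these are shown if acmeLogging is set to true in the Traefik configuration; newest entries first):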

{"level":"info","msg":"legolog: [INFO] [*.REDACTED] Server responded with a certificate.","time":"2019-03-22T11:14:44Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Validations succeeded; requesting certificates","time":"2019-03-22T11:14:39Z"}
{"level":"info","msg":"legolog: [INFO] dreamhost: record_removed","time":"2019-03-22T11:14:39Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Cleaning DNS-01 challenge","time":"2019-03-22T11:14:39Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] The server validated our request","time":"2019-03-22T11:14:39Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Waiting for DNS record propagation.","time":"2019-03-22T11:13:34Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Waiting for DNS record propagation.","time":"2019-03-22T11:12:34Z"}
... (1 line per minute)
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Waiting for DNS record propagation.","time":"2019-03-22T10:58:32Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Waiting for DNS record propagation.","time":"2019-03-22T10:57:32Z"}
{"level":"info","msg":"legolog: [INFO] Wait for propagation [timeout: 1h0m0s, interval: 1m0s]","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Checking DNS record propagation using [10.96.0.10:53]","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Trying to solve DNS-01","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] dreamhost: record_added","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Preparing to solve DNS-01","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: use dns-01 solver","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz/REDACTED","time":"2019-03-22T10:57:31Z"}
{"level":"info","msg":"legolog: [INFO] [*.REDACTED] acme: Obtaining bundled SAN certificate","time":"2019-03-22T10:57:30Z"}
{"level":"info","msg":"legolog: [INFO] acme: Registering account for REDACTED","time":"2019-03-22T10:57:30Z"}

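Lego (and therefore Traefik, which uses Lego) waits until the authoritative DNS servers answer with the correct challenge record (a mechanism to keep Let's Encrypt from attempting the challenge before it is ready).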
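
To see how long that wait lasted in a given run, the timestamps of the JSON log lines can be compared. This is a rough sketch that assumes the log format shown above (one JSON object per line with "msg" and "time" fields) and matches on the two Lego message texts that appear in it; propagation_wait is a hypothetical helper name.

```python
import json
from datetime import datetime


def propagation_wait(log_lines):
    """Return the time between 'Trying to solve DNS-01' and
    'The server validated our request' as a timedelta, or None if either is missing."""
    started = validated = None
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip elided or otherwise non-JSON lines
        if not isinstance(entry, dict) or "time" not in entry:
            continue
        when = datetime.strptime(entry["time"], "%Y-%m-%dT%H:%M:%SZ")
        msg = entry.get("msg", "")
        # Keep the earliest timestamp of each kind, so line order does not matter.
        if "Trying to solve DNS-01" in msg:
            started = when if started is None else min(started, when)
        elif "The server validated our request" in msg:
            validated = when if validated is None else min(validated, when)
    if started is None or validated is None:
        return None
    return validated - started
```

A return value of None while "Waiting for DNS record propagation" lines keep appearing suggests the challenge is still pending, so no certificate has been stored yet.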

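In my case, Dreamhost takes a while to apply this update. Even though the change shows up immediately in the web portal (the TXT record has been created), the Dreamhost DNS servers still take a while to return the updated record.

In the logs above it took about 17 minutes (10:57 to 11:14), but in other iterations I saw delays of up to 30 minutes (possibly more, not sure). Maybe you have a similar problem with route53.

Interestingly, Cloudflare DNS (1.1.1.1) resolved the record earlier than the Dreamhost DNS servers did (even though Dreamhost is authoritative).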
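
One way to compare what different servers return is to query each of them directly for the challenge record. The sketch below only builds the `dig` command line. It relies on the DNS-01 convention that the TXT record lives at `_acme-challenge.<domain>`, with a wildcard sharing its base domain's record name. The helper names and any nameserver you pass in are placeholders, not values from this setup.

```python
def challenge_record(domain):
    """Return the DNS-01 TXT record name for a domain.

    A wildcard such as *.example.com is validated through the record of its
    base domain, so the leading "*." is dropped.
    """
    base = domain[2:] if domain.startswith("*.") else domain
    return "_acme-challenge." + base.rstrip(".")


def dig_command(domain, nameserver):
    """Build a dig invocation that asks one specific nameserver for the challenge TXT record."""
    return ["dig", "+short", "TXT", challenge_record(domain), "@" + nameserver]
```

Running the resulting command (for example with subprocess.run) once against an authoritative nameserver of the zone and once against 1.1.1.1 shows whether the two already agree. An empty answer from the authoritative server would match the long "Waiting for DNS record propagation" phase in the logs.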

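I think you can also bypass this logic by setting delayBeforeCheck to a value > 0, but that doesn't sound like good practice, since the Let's Encrypt challenge might fail (not sure whether Let's Encrypt queries the authoritative servers for this).

Hopefully this is your scenario too. By the way, another symptom of this situation is that the DNS record is still present, because it isn't removed until the DNS challenge succeeds (or, I assume, the timeout is reached).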