NGINX cache (same URL) returns a MISS first to each of Chrome, Curl and Wget

adr*_*TNT 2 http nginx web-server cache http-caching

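I have an nginx caching proxy that fetches content from an Apache origin server.

I issue requests from curl, wget and Chrome to verify the cached responses. The problem is that, for the same URL, the first request from each individual client always gets a MISS.

I expected that after one request from any client, the other clients would get a HIT, but they get a MISS.

I only get a HIT when the same client repeats the request.

It feels as if the key depends on the user agent, but it does not: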

proxy_cache_key $scheme://$host$request_uri;

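To rule out different HTTP versions and user agents, I specified them in the requests (wget uses HTTP/1.1 by default). All requests show up as GET in the logs, so they are not HEAD requests.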

wget --server-response --user-agent "foo" 'https://www.example.com/x.php?124'

HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: nginx/1.16.1
  Date: Tue, 03 Mar 2020 19:53:53 GMT
  Content-Type: text/html; charset=UTF-8
  Transfer-Encoding: chunked
  Connection: keep-alive
  X-Powered-By: PHP/5.4.16
  X-Accel-Expires: 3600
  Vary: Accept-Encoding
  X-Cache: MISS <<<<<<<<<<<<<<<<<<<<<<<<<< there

# repeating the request again with WGET will get a HIT

wget --server-response --user-agent "foo" 'https://www.example.com/x.php?124'

HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: nginx/1.16.1
  Date: Tue, 03 Mar 2020 19:55:21 GMT
  Content-Type: text/html; charset=UTF-8
  Transfer-Encoding: chunked
  Connection: keep-alive
  X-Powered-By: PHP/5.4.16
  X-Accel-Expires: 3600
  Vary: Accept-Encoding
  X-Cache: HIT <<<<<<<<<<<<<<<<<<<<<<<<<<< there

# after request should be cached, a CURL request to same URL gets MISS again

curl -L -i --http1.1 --user-agent "foo" 'https://www.example.com/x.php?124'
HTTP/1.1 200 OK
Server: nginx/1.16.1
Date: Tue, 03 Mar 2020 19:56:37 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.16
X-Accel-Expires: 3600
Vary: Accept-Encoding
X-Cache: MISS <<<<<<<<<<<<<<<<<<<<<<<<<<< there

My configuration

http {

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   65;
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;

    include /etc/nginx/conf.d/*.conf;

    # lower value might show error: "upstream sent too big header"
    proxy_buffer_size   128k;
    proxy_buffers   8 256k;
    proxy_busy_buffers_size   256k;

    # fixes error request entity too large when uploading files
    client_max_body_size 256M;

    # main cache for images and some of the html pages
    proxy_cache_path /nginx_cache levels=1:2 keys_zone=nginx_cache:512m max_size=50g
                     inactive=90d use_temp_path=off;

    # deliver a cached copy in case of error at source server
    proxy_cache_background_update on;
    proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504;
    proxy_cache_key $scheme://$host$request_uri;

    # set http version between nginx and origin server, you can check the version in origin server log
    proxy_http_version 1.1;

    # enable gzip after we forced plain-text between cache and origin with Accept "" in some vhosts
    gzip_types text/plain text/css text/xml text/javascript application/javascript application/x-javascript application/xml image/jpeg image/png image/webp image/gif image/x-icon image/svg;
    gzip on;

    # security headers, iframe block, etc
    add_header X-Frame-Options sameorigin;
    add_header X-Content-Type-Options nosniff;
    add_header Strict-Transport-Security max-age=2678400;

    # default server(s) that don't match any specified hosts
    server {
        server_name _;
        listen 80 default_server;
        listen 443 ssl http2 default_server;
        root /var/www/html;
    }

    # include all our custom vhosts
    include /etc/nginx/adr_vhosts/*.conf;

} # end of http

My vhost configuration

server {
    listen       443 ssl http2;
    server_name  www.example.com;
    root         /usr/share/nginx/html;

    location / {

            # using the alt port to bypass the other nginx cache at source server (and X-Real-IP overwrite)
            proxy_pass       http://xx.xx.xx.xx:81; 

            proxy_cache             nginx_cache;

            # ask directly for the right host (including www), to avoid mismatches, additional redirects
            proxy_set_header Host      $host;

            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for;

            # sub_filter only works on plain text, disable gzip communication with origin server
            proxy_set_header Accept-Encoding "";

            # was it a hit or a miss
            add_header X-Cache $upstream_cache_status;

            # keep the x-accel header for debugging purposes
            proxy_pass_header "X-Accel-Expires";

    }

}

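I disabled gzip compression between the cache server and the origin server with proxy_set_header Accept-Encoding ""; so that sub_filter can be used in some locations. I then re-enabled gzip towards clients with gzip_types and gzip on.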

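Inside /nginx_cache there is one cache file saved for each client. The two files compared here are for nginx and wget; switching back and forth between them shows that they are nearly identical, except for the binary (or gzip?!) data at the top.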
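To find the per-client copies on disk, the cache directory can be searched by cache key. nginx writes the key as a plain-text KEY: line near the top of each cache file, after a binary header. The sketch below relies on that; the function name cache_files_for_key is hypothetical, the -a and -r options assume GNU or BSD grep, and the path and key format come from the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: list cache files under directory $1 whose embedded
# "KEY: ..." line equals the cache key given in $2.
#   -a  treat the binary cache files as text
#   -r  recurse into the levels=1:2 subdirectories
#   -l  print only the names of matching files
#   -x  match the whole line, so ...?124 does not also match ...?1245
#   -F  take the key literally (it contains "?" and ".")
cache_files_for_key() {
    grep -arlxF "KEY: $2" "$1"
}
```

Usage, run as a user that can read the cache directory: cache_files_for_key /nginx_cache 'https://www.example.com/x.php?124'. More than one file for the same key means nginx is keeping several variants of that URL.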

Edit: I get a HIT from all clients if I specify Accept-Encoding: gzip in the request. I will look into it...

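Edit 2: wget sends the request header Accept-Encoding: identity, curl sends no Accept-Encoding header at all by default, and Chrome sends Accept-Encoding: gzip, deflate, br. If I force any value (as long as it is the same for every client), the cache correctly returns a HIT. Is this a misconfiguration on my side or normal behaviour? It acts as if Accept-Encoding were part of the cache key.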
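The behaviour can be reproduced from a single client by changing only the Accept-Encoding request header. This is a minimal sketch, assuming the X-Cache header added in the vhost configuration above; the helper names cache_status, probe and probe_all are hypothetical, and the three header values are the ones observed for Chrome, wget and curl.

```shell
#!/bin/sh
# Hypothetical helper: read raw response headers on stdin and print the
# value of the X-Cache header (header name matched case-insensitively).
cache_status() {
    tr -d '\r' | awk -F': ' 'tolower($1) == "x-cache" { print $2 }'
}

# Hypothetical helper: request URL $1 with the Accept-Encoding value in $2
# and print the resulting cache status. With an empty $2 curl sends no
# Accept-Encoding header, which matches curl's default behaviour.
probe() {
    curl -s -o /dev/null -D - --http1.1 -H "Accept-Encoding: $2" "$1" | cache_status
}

# Hypothetical helper: probe URL $1 twice for each Accept-Encoding value
# seen in the question and print both cache statuses side by side.
probe_all() {
    for enc in 'gzip, deflate, br' 'identity' ''; do
        first=$(probe "$1" "$enc")
        second=$(probe "$1" "$enc")
        printf '%-20s first=%s second=%s\n' "${enc:-(none)}" "$first" "$second"
    done
}
```

Usage: probe_all 'https://www.example.com/x.php?125', using a query string that has not been cached yet. If the results match the observations above, each distinct Accept-Encoding value should report MISS on its first request and HIT on its second.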

adr*_*TNT 7

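I am answering my own question to clarify the details in the question and to post a partial solution...

I found that the different clients (Curl, Wget, Chrome) each received a MISS, one after another, for exactly the same URL because of the Vary: Accept-Encoding response header: a separate cached body is created for each distinct Accept-Encoding value.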

Chrome: Accept-Encoding: gzip, deflate, br 

Wget: Accept-Encoding: identity

Curl: n/a

The Vary: Accept-Encoding header appears to come from my origin server. I confirmed that the cache always returns a HIT if I add the following:

proxy_ignore_headers Vary;

I am just not sure whether this is safe to do, so I will ask a separate question about it.