Jaf*_*son 26 command-line wget curl
I want to check a website's HTTP status from Ubuntu. I have been using the curl and wget commands for this purpose, but the problem is that these commands download the complete page, then extract the headers and display them on screen. For example:
$ curl -I trafficinviter.com
HTTP/1.1 200 OK
Date: Mon, 02 Jan 2017 14:13:14 GMT
Server: Apache
X-Pingback: http://trafficinviter.com/xmlrpc.php
Link: <http://trafficinviter.com/>; rel=shortlink
Set-Cookie: wpfront-notification-bar-landingpage=1
Content-Type: text/html; charset=UTF-8
The same thing happens with wget: it downloads the complete page, needlessly consuming my bandwidth.
What I am looking for is how to get the HTTP status code without actually downloading the page, so that I can save bandwidth. I have tried curl, but I am not sure whether it downloads the full page or only the headers to get the status code.
Ale*_*exP 49
curl -I fetches only the HTTP headers; it does not download the whole page. From man curl:
-I, --head
(HTTP/FTP/FILE) Fetch the HTTP-header only! HTTP-servers feature
the command HEAD which this uses to get nothing but the header
of a document. When used on an FTP or FILE file, curl displays
the file size and last modification time only.
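If all you need is the numeric status code, curl can also print it alone via its --write-out option. A minimal sketch — the throwaway local server here only keeps the example self-contained; replace the URL with the site you actually want to check:

```shell
# Start a throwaway local server so the example does not depend on the network.
python3 -m http.server 8080 --bind 127.0.0.1 >/dev/null 2>&1 &
server=$!
sleep 1

# -s silences progress output, -I sends a HEAD request, -o /dev/null
# discards anything received, and -w '%{http_code}' prints just the
# status code after the transfer finishes.
status=$(curl -s -o /dev/null -I -w '%{http_code}' http://127.0.0.1:8080/)

kill "$server" 2>/dev/null
echo "$status"
```

This form is convenient in scripts, since the code lands in a variable instead of having to be grepped out of the header dump.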
Another option is to install lynx and use lynx -head -dump.
The HEAD request is specified by the HTTP/1.1 protocol (RFC 2616):
9.4 HEAD
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.
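The "MUST NOT return a message-body" rule quoted above is easy to verify with curl's %{size_download} write-out variable, which reports how many body bytes were actually transferred. A sketch, again using a throwaway local server only to keep it self-contained:

```shell
# Throwaway local server; any URL you care about works the same way.
python3 -m http.server 8081 --bind 127.0.0.1 >/dev/null 2>&1 &
server=$!
sleep 1

# HEAD (-I) should transfer zero body bytes; a plain GET transfers the page.
head_bytes=$(curl -s -o /dev/null -I -w '%{size_download}' http://127.0.0.1:8081/)
get_bytes=$(curl -s -o /dev/null -w '%{size_download}' http://127.0.0.1:8081/)

kill "$server" 2>/dev/null
echo "HEAD body: $head_bytes bytes, GET body: $get_bytes bytes"
```

So curl -I really does cost you only the headers, which is exactly the bandwidth saving the question asks for.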
mur*_*uru 18
With wget, you need to use the --spider option to send a HEAD request like curl's:
$ wget -S --spider https://google.com
Spider mode enabled. Check if remote file exists.
--2017-01-03 00:08:38-- https://google.com/
Resolving google.com (google.com)... 216.58.197.174
Connecting to google.com (google.com)|216.58.197.174|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: https://www.google.co.jp/?gfe_rd=cr&ei=...
Content-Length: 262
Date: Mon, 02 Jan 2017 15:08:38 GMT
Alt-Svc: quic=":443"; ma=2592000; v="35,34"
Location: https://www.google.co.jp/?gfe_rd=cr&ei=... [following]
Spider mode enabled. Check if remote file exists.
--2017-01-03 00:08:38-- https://www.google.co.jp/?gfe_rd=cr&ei=...
Resolving www.google.co.jp (www.google.co.jp)... 210.139.253.109, 210.139.253.93, 210.139.253.123, ...
Connecting to www.google.co.jp (www.google.co.jp)|210.139.253.109|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Mon, 02 Jan 2017 15:08:38 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=Shift_JIS
P3P: CP="This is not a P3P policy! See https://www.google.com/support/accounts/answer/151657?hl=en for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=...; expires=Tue, 04-Jul-2017 15:08:38 GMT; path=/; domain=.google.co.jp; HttpOnly
Alt-Svc: quic=":443"; ma=2592000; v="35,34"
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
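In a script you rarely need all of the output above: wget --spider's exit status already tells you whether the remote resource exists (0 on success, non-zero on a server error response). A sketch, once more against a throwaway local server so it is self-contained:

```shell
# Throwaway local server standing in for the site being checked.
python3 -m http.server 8082 --bind 127.0.0.1 >/dev/null 2>&1 &
server=$!
sleep 1

# -q suppresses output; --spider sends a HEAD-style existence check.
# The exit status drives the branch, so no parsing is needed.
wget -q --spider http://127.0.0.1:8082/ && state=up || state=down

kill "$server" 2>/dev/null
echo "$state"
```

Add -S back in when you also want the headers printed, as in the transcript above.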