Fetching a web page with sockets

Bri*_*ian 4 java sockets

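I'm currently studying socket programming and have run into a problem I need help with. What I'm trying to do is write a Java class that connects to a web host, downloads the default page, and then disconnects from the host. I know it is simpler to do this with URLConnection, but I'm trying to learn the socket classes. I've managed to connect to the web server, but I'm having trouble fetching the page. Here is what I have working (and not working) so far: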

import java.io.*;
import java.net.*;
import java.lang.IllegalArgumentException;
public class SocketsFun{
    public static void main(String[] myArgs){
        // Set some variables
        String theServer = null;
        String theLine = null;
        int thePort = 0;
        Socket theSocket = null;
        boolean exit = false;
        boolean socketCheck = false;
        BufferedReader theInput = null;

        // Grab the server and port number
        try{
            theServer = myArgs[0];
            thePort = Integer.parseInt(myArgs[1]);
            System.out.println("Opening a connection to " + theServer + " on port " + thePort);
        } catch(ArrayIndexOutOfBoundsException aioobe){
            System.out.println("usage: SocketsFun host port");
            exit = true;
        } catch(NumberFormatException nfe) {
            System.out.println("usage: SocketsFun host port");
            exit = true;
        }

        if(!exit){
            // Open the socket
            try{
                theSocket = new Socket(theServer, thePort);
            } catch(UnknownHostException uhe){
                System.out.println("* " + theServer + " does not exist");
            } catch(IOException ioe){
                System.out.println("* " + "Connection Refused");
            } catch(IllegalArgumentException iae){
                System.out.println("* " + thePort + " Not A Valid TCP/UDP Port.");
            }

            // Print out some stuff
            try{
                System.out.println("Connected Socket: " + theSocket.toString());
            } catch(Exception e){
                System.out.println("* " + "No Open Socket");
            }

            try{
                theInput = new BufferedReader(new InputStreamReader(theSocket.getInputStream()));
                while ((theLine = theInput.readLine()) != null){
                    System.out.println(theLine);
                }
                theInput.close();
            } catch(IOException ioe){
                System.out.println("* " + "No Data To Read");
            } catch(NullPointerException npe){
                System.out.println("* " + "No Data To Read");
            }

            // Close the socket
            try{
                socketCheck = theSocket.isConnected();
            } catch(NullPointerException npe){
                System.out.println("* " + "No Socket To Close");
            }
        }
    }
}

All I want is for this class to spit out the kind of output you would get from "curl", "lynx -dump", or "wget". Any and all help would be greatly appreciated.

Rob*_*ert 6

You have the right idea, but you are not sending an HTTP request. Send:

GET / HTTP/1.1\r\nHost: <hostname>\r\n\r\n

This follows the format:

[METHOD] [PATH] HTTP/1.1 [CRLF]
Host: [HOSTNAME] [CRLF]
OTHER: HEADERS [CRLF]
[CRLF]

You should get back a response in a similar format: headers, an empty line, then the data. Read up on the HTTP protocol for more details.
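To make the format concrete, here is a minimal sketch of building such a request as a Java string and of splitting a raw response at the first empty line. The class and method names (HttpMessageSketch, buildGetRequest, splitResponse) are made up for illustration, and the Connection: close header is an assumption added on top of the two lines shown above so that the server closes the connection once it has finished responding.

```java
class HttpMessageSketch {
    // Builds a minimal GET request: request line, Host header, then an empty line.
    // Every line ends in CRLF ("\r\n"), as the format above requires.
    public static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"   // assumption: ask the server to close when done
             + "\r\n";
    }

    // Splits a raw response at the first empty line into { head, body }.
    // If no empty line is found, everything is treated as the head.
    public static String[] splitResponse(String raw) {
        int sep = raw.indexOf("\r\n\r\n");
        if (sep < 0) {
            return new String[] { raw, "" };
        }
        return new String[] { raw.substring(0, sep), raw.substring(sep + 4) };
    }

    public static void main(String[] args) {
        // "example.com" is only a placeholder host name.
        System.out.print(buildGetRequest("example.com", "/"));

        String[] parts = splitResponse("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi");
        System.out.println("head: " + parts[0].replace("\r\n", " | "));
        System.out.println("body: " + parts[1]);
    }
}
```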

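Edit: It may help to get comfortable with the HTTP request syntax first. It is simple, and generally a good thing to know. Open a terminal and use netcat (preferred) or telnet: netcat google.com 80 or telnet google.com 80. Then type: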

GET / HTTP/1.1[ENTER]
Host: google.com[ENTER]
[ENTER]

I received this response (on the second attempt):

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Thu, 09 Dec 2010 00:03:39 GMT
Expires: Sat, 08 Jan 2011 00:03:39 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
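A response like the one above can be picked apart in code once it has been read. The following is a rough sketch, not a full HTTP parser: it pulls the status code out of the first line and collects the "Name: value" header lines into a map. The class and method names (ResponseHeadSketch, statusCode, parseHeaders) are hypothetical, and the sample input is a shortened copy of the response shown above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class ResponseHeadSketch {
    // Extracts the numeric code from a status line such as
    // "HTTP/1.1 301 Moved Permanently".
    public static int statusCode(String statusLine) {
        String[] parts = statusLine.split(" ", 3);
        return Integer.parseInt(parts[1]);
    }

    // Parses "Name: value" lines into a map. Names are lower-cased because
    // HTTP header names are case-insensitive; lines without a colon are skipped.
    public static Map<String, String> parseHeaders(String[] headerLines) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : headerLines) {
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }

    public static void main(String[] args) {
        String head = "HTTP/1.1 301 Moved Permanently\r\n"
                    + "Location: http://www.google.com/\r\n"
                    + "Content-Length: 219";
        String[] lines = head.split("\r\n");
        int code = statusCode(lines[0]);
        Map<String, String> headers =
                parseHeaders(Arrays.copyOfRange(lines, 1, lines.length));
        // prints "301 -> http://www.google.com/"
        System.out.println(code + " -> " + headers.get("location"));
    }
}
```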

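Once you understand the request syntax, just write it to the socket and then read lines until the server closes the connection, just as you are already doing.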
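Putting that together, a minimal sketch of the whole exchange might look like the following. It is not a drop-in replacement for the class in the question: the name SocketFetchSketch and the fetch helper are made up here, and the Connection: close header is an assumption added because an HTTP/1.1 server may otherwise keep the connection open, which would leave the read loop waiting.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class SocketFetchSketch {
    // Connects, sends a GET request for the given path, and returns everything
    // the server sends back (status line, headers, and body) as one string.
    public static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            String request = "GET " + path + " HTTP/1.1\r\n"
                    + "Host: " + host + "\r\n"
                    + "Connection: close\r\n"   // assumption: have the server close when done
                    + "\r\n";

            // Write the request first; the server sends nothing until it receives one.
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Then read lines until the server closes the connection.
            StringBuilder response = new StringBuilder();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null) {
                response.append(line).append('\n');
            }
            return response.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: SocketFetchSketch host port");
            return;
        }
        System.out.print(fetch(args[0], Integer.parseInt(args[1]), "/"));
    }
}
```

Run it the same way as the original class, for example with google.com and 80 as arguments. The try-with-resources block closes the socket even if reading fails, which the version in the question never does.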