How do I get the HTML source of a page from a URL in Android?

Pra*_*een 26 html android android-emulator

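I'm developing an application that needs to fetch the source of a web page from a link and then parse the HTML from that page.

Could you give me some examples, or point me to where to start in writing such an application?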

Mar*_*k B 46

You can use HttpClient to perform an HTTP GET and retrieve the HTML response, like this:

// Perform an HTTP GET on the given URL (url is a String).
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);

// Read the response body line by line into a single string.
String html = "";
InputStream in = response.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder str = new StringBuilder();
String line = null;
while((line = reader.readLine()) != null)
{
    // readLine() strips the line terminator, so put a newline back.
    str.append(line).append('\n');
}
in.close();
html = str.toString();

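  • I got an UnknownHostException; for me it was a permissions problem. Add `<uses-permission android:name="android.permission.INTERNET"/>` to the manifest. (9 upvotes)
  • This is now deprecated. (4 upvotes)
  • Unfortunately, I get an UnknownHostException, yet I can open the same URL in the browser. (2 upvotes)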

Spi*_*pau 17

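I would suggest jsoup.

According to their website:

Fetch the Wikipedia homepage, parse it to a DOM, and select the headlines from the "In the news" section into a list of Elements (online sample):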

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("#mp-itn b a");

Getting started:

  1. Download the jsoup jar core library
  2. Read the cookbook introduction


Col*_*ite 14

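This question is a bit old, but I figured I should post my answer now that DefaultHttpClient, HttpGet, etc. have been deprecated. Given a URL, this function should fetch and return the HTML.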

public static String getHtml(String url) throws IOException {
    // Build and set timeout values for the request.
    URLConnection connection = (new URL(url)).openConnection();
    connection.setConnectTimeout(5000);
    connection.setReadTimeout(5000);
    connection.connect();

    // Read and store the result line by line then return the entire string.
    InputStream in = connection.getInputStream();
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    StringBuilder html = new StringBuilder();
    for (String line; (line = reader.readLine()) != null; ) {
        html.append(line);
    }
    in.close();

    return html.toString();
}
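As a variation on the function above, here is a minimal sketch that casts the connection to HttpURLConnection so the status code can be checked, decodes the body with an explicit charset, and keeps the line breaks. The class name HtmlFetcher, the helper readAll, and the UTF-8 default are my own assumptions and not part of the answer above, so treat this as a starting point, not a definitive implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class HtmlFetcher {

    // Reads the whole stream as text; line terminators are normalized to '\n'.
    // The stream is closed when reading finishes or fails.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            for (String line; (line = reader.readLine()) != null; ) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Fetches a page over HTTP(S). Assumption: the body is UTF-8; a more
    // careful client would take the charset from the Content-Type header.
    public static String getHtml(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        try {
            int status = connection.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            return readAll(connection.getInputStream(), StandardCharsets.UTF_8);
        } finally {
            connection.disconnect();
        }
    }
}
```

On Android this still has to be called off the main thread, and the app needs the INTERNET permission in its manifest.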


Jul*_*ian 6

public class RetrieveSiteData extends AsyncTask<String, Void, String> {
    @Override
    protected String doInBackground(String... urls) {
        StringBuilder builder = new StringBuilder(100000);

        for (String url : urls) {
            DefaultHttpClient client = new DefaultHttpClient();
            HttpGet httpGet = new HttpGet(url);
            try {
                HttpResponse execute = client.execute(httpGet);
                InputStream content = execute.getEntity().getContent();

                BufferedReader buffer = new BufferedReader(new InputStreamReader(content));
                String s = "";
                while ((s = buffer.readLine()) != null) {
                    builder.append(s);
                }

            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        return builder.toString();
    }

    @Override
    protected void onPostExecute(String result) {

    }
}
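AsyncTask has since been deprecated (API level 30), so the same "do the fetch in the background, then hand over the result" idea can be sketched with a plain ExecutorService. The names BackgroundFetch and Callback are hypothetical, made up for this example. Note that the callback runs on the worker thread, so on Android you would still have to post the result to the main thread yourself before touching any views.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundFetch {

    // Hypothetical callback: exactly one of html / error is non-null.
    public interface Callback {
        void onResult(String html, Exception error);
    }

    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Runs the task on a worker thread and reports its outcome to the callback.
    // The callback is invoked on the worker thread, not on the caller's thread.
    public Future<?> run(Callable<String> task, Callback callback) {
        return executor.submit(() -> {
            String html = null;
            Exception error = null;
            try {
                html = task.call();
            } catch (Exception e) {
                error = e;
            }
            callback.onResult(html, error);
        });
    }

    // Stops accepting new tasks; tasks already submitted still run.
    public void shutdown() {
        executor.shutdown();
    }
}
```

The task passed in would be whatever fetch function you already have, wrapped in a lambda that returns the page as a String.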