Exa*_*gon 5 java connection proxy http stream
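I need to download a file from the URL www.example.com/example.pdf through a proxy and save it to the file system in Java. Does anyone know how this works? Once I have the InputStream, I can simply save it to the file system: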
final ReadableByteChannel rbc = Channels.newChannel(httpUrlConnetion.getInputStream());
final FileOutputStream fos = new FileOutputStream(file);
fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
fos.close();
But how do I get the input stream for a URL through a proxy? If I do this:
SocketAddress addr = new InetSocketAddress("my.proxy.com", 8080);
Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
URL url = new URL("http://my.real.url.com/");
URLConnection conn = url.openConnection(proxy);
I get this exception:
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at app.model.mail.crawler.newimpl.FileLoader.getSourceOfSiteViaProxy(FileLoader.java:167)
at app.model.mail.crawler.newimpl.FileLoader.process(FileLoader.java:220)
at app.model.mail.crawler.newimpl.FileLoader.run(FileLoader.java:57)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Using this:
final HttpURLConnection httpUrlConnetion = (HttpURLConnection) website.openConnection(proxy);
httpUrlConnetion.setDoOutput(true);
httpUrlConnetion.setDoInput(true);
httpUrlConnetion.setRequestProperty("Content-type", "text/xml");
httpUrlConnetion.setRequestProperty("Accept", "text/xml, application/xml");
httpUrlConnetion.setRequestMethod("POST");
httpUrlConnetion.connect();
I am able to download the source of a website, which is HTML, but not a file. Maybe someone can help me with the properties I have to set in order to download a file.
Set the proxy programmatically:
SocketAddress addr = new InetSocketAddress("my.proxy.com", 8080);
Proxy proxy = new Proxy(Proxy.Type.HTTP, addr);
URL url = new URL("http://my.real.url.com/");
URLConnection conn = url.openConnection(proxy);
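You can then use your code above with the URLConnection returned in the last line. If you prefer, you can also use a SOCKS proxy, or force the connection to use no proxy at all.
This is taken (and slightly edited) from the Oracle documentation.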
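To connect this answer back to the question's file-saving code, here is a minimal sketch of the whole round trip: open the connection through the proxy, issue a plain GET, check the status, and copy the body to disk. The class name `ProxyDownload`, the helpers `save` and `download`, the timeout values, and the command-line arguments are assumptions made for illustration, not part of the original answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class ProxyDownload {

    /** Copies the stream to target, replacing an existing file; returns the bytes written. */
    public static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Downloads url through the given proxy with a GET request and stores it in target. */
    public static long download(URL url, Proxy proxy, Path target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection(proxy);
        conn.setRequestMethod("GET");   // a download needs no request body
        conn.setConnectTimeout(10_000); // assumed timeouts, tune as needed
        conn.setReadTimeout(30_000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            try (InputStream in = conn.getInputStream()) {
                return save(in, target);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 4) {
            System.err.println("usage: ProxyDownload <proxyHost> <proxyPort> <url> <targetFile>");
            return;
        }
        Proxy proxy = new Proxy(Proxy.Type.HTTP,
                new InetSocketAddress(args[0], Integer.parseInt(args[1])));
        URL url = URI.create(args[2]).toURL();
        long bytes = download(url, proxy, Paths.get(args[3]));
        System.out.println("Saved " + bytes + " bytes to " + args[3]);
    }
}
```

For example, `java ProxyDownload my.proxy.com 8080 http://www.example.com/example.pdf example.pdf`. Passing `new Proxy(Proxy.Type.SOCKS, addr)` or `Proxy.NO_PROXY` to `download` instead should cover the SOCKS and no-proxy cases mentioned above.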
You can use the Apache httpclient library to handle most proxy problems. To compile the code below, you can use the following Maven POM:
Maven:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>stackoverflow.test</groupId>
    <artifactId>proxyhttp</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>
    <name>proxy</name>
    <url>http://maven.apache.org</url>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>
    <dependencies>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>3.8.1</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.5.1</version>
        </dependency>
    </dependencies>
</project>
Java code:
import org.apache.http.HttpHost;
import org.apache.http.client.config.RequestConfig;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

/**
 * How to send a request via proxy.
 *
 * @since 4.0
 */
public class ClientExecuteProxy {

    public static void main(String[] args) throws Exception {
        CloseableHttpClient httpclient = HttpClients.createDefault();
        try {
            HttpHost target = new HttpHost("www.google.com", 80, "http");
            HttpHost proxy = new HttpHost("127.0.0.1", 8889, "http");

            // Route this request through the proxy
            RequestConfig config = RequestConfig.custom()
                    .setProxy(proxy)
                    .build();
            HttpGet request = new HttpGet("/");
            request.setConfig(config);

            System.out.println("Executing request " + request.getRequestLine() + " to " + target + " via " + proxy);

            CloseableHttpResponse response = httpclient.execute(target, request);
            try {
                System.out.println("----------------------------------------");
                System.out.println(response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
            } finally {
                response.close();
            }
        } finally {
            httpclient.close();
        }
    }
}
Unlike the other answers, the following worked for me. Set these properties before connecting:
System.setProperty("http.proxySet", "true"); // legacy flag; the JDK's documented settings are http.proxyHost and http.proxyPort
System.setProperty("http.proxyHost", "my.proxy.com");
System.setProperty("http.proxyPort", "8080"); // the port is a String, not an int
Then open the URLConnection and try to download the file.
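If you want to confirm that the properties took effect before opening the connection, one option is to ask the JVM's default `ProxySelector` which proxy it would use for the URL. The sketch below is only an illustration: the class name `SystemProxyCheck` and its helper methods are made up, and it assumes the default selector has not been replaced and that `java.net.useSystemProxies` is not enabled. Note that these `http.*` properties apply to `http://` URLs; `https://` URLs use the separate `https.proxyHost` and `https.proxyPort` properties.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class SystemProxyCheck {

    /** Sets the JVM-wide HTTP proxy properties; both values are stored as strings. */
    public static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    /** Returns the first proxy the default selector would choose for the given URI. */
    public static Proxy proxyFor(URI uri) {
        List<Proxy> proxies = ProxySelector.getDefault().select(uri);
        return proxies.get(0);
    }

    public static void main(String[] args) {
        useHttpProxy("my.proxy.com", 8080);
        Proxy proxy = proxyFor(URI.create("http://www.example.com/example.pdf"));
        // Prints the proxy type and address that url.openConnection() would use
        System.out.println(proxy.type() + " " + proxy.address());
    }
}
```

If the printed type is `DIRECT`, the properties were not picked up, for example because the host matches `http.nonProxyHosts`.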