Downloading a file with HtmlUnit

use*_*942 11 download htmlunit

I want to download an xls file from a website. When I click the link to download the file, I get a JavaScript confirm box. I handle it as follows:

    // Accept every JavaScript confirm() dialog automatically.
    ConfirmHandler okHandler = new ConfirmHandler() {
        public boolean handleConfirm(Page page, String message) {
            return true;
        }
    };
    webClient.setConfirmHandler(okHandler);

This is the link that downloads the file:

<a href="./my_file.php?mode=xls&amp;w=d2hlcmUgc2VsbElkPSd3b3JsZGNvbScgYW5kIHN0YXR1cz0nV0FJVERFTEknIGFuZCBkYXRlIDw9IC0xMzQ4MTUzMjAwICBhbmQgZGF0ZSA%2BPSAtMTM1MDgzMTU5OSA%3D" target="actionFrame" onclick="return confirm('Do you want do download XLS file?')"><u>Download</u></a>

I click the link with:

HtmlPage x = webClient.getPage("http://working.com/download");
HtmlAnchor anchor = (HtmlAnchor) x.getFirstByXPath("//a[@target='actionFrame']");
anchor.click();

The handleConfirm() method is executed, but I don't know how to save the file stream from the server. I tried the code below to look at the stream.

anchor.click().getWebResponse().getContentAsString();

However, the result is the same as page x. Does anyone know how to capture the stream from the server? Thanks.

use*_*942 9

I found a way to get the InputStream using a WebWindowListener. Inside webWindowContentChanged(WebWindowEvent event), I put the code below.

InputStream xls = event.getWebWindow().getEnclosedPage().getWebResponse().getContentAsStream();

Once I have xls, I can save the file to my hard disk.
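The saving step is not shown in the answer. A minimal sketch of it, using only the JDK, might look like the following. The class name `StreamSaver`, the method `save`, and the temporary target path are hypothetical names chosen for illustration, and a `ByteArrayInputStream` stands in for the stream that would come from `getContentAsStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamSaver {

    // Copies the whole stream into target, overwriting any existing file,
    // and returns the number of bytes written. The stream is closed afterwards.
    public static long save(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the InputStream taken from the WebResponse (assumption).
        InputStream xls = new ByteArrayInputStream(new byte[] {1, 2, 3});
        Path target = Files.createTempFile("download", ".xls");
        System.out.println("Wrote " + save(xls, target) + " bytes to " + target);
    }
}
```

In the listener, the same call would presumably receive the response stream and a path of your choosing in place of the stand-ins used here.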


Edu*_*cio 9

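I built this based on your post. Note: you can change the content-type condition so that only specific kinds of files are downloaded, for example application/octet-stream, application/pdf, etc.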

package net.s4bdigital.export.main;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;

import com.gargoylesoftware.htmlunit.ConfirmHandler;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebResponse;
import com.gargoylesoftware.htmlunit.WebWindowEvent;
import com.gargoylesoftware.htmlunit.WebWindowListener;
import com.gargoylesoftware.htmlunit.util.NameValuePair;

public class HtmlUnitDownloadFile {

    protected String baseUrl;
    protected static WebDriver driver;

    @Before
    public void openBrowser() {
        baseUrl = "http://localhost/teste.html";
        driver = new CustomHtmlUnitDriver();
        ((HtmlUnitDriver) driver).setJavascriptEnabled(true);

    }


    @Test
    public void downloadAFile() throws Exception {

        driver.get(baseUrl);
        driver.findElement(By.linkText("click to Downloadfile")).click();

    }

    public class CustomHtmlUnitDriver extends HtmlUnitDriver {

        // Hook that lets subclasses customise the WebClient the driver uses;
        // the confirm handler and window listener are registered here.
        @Override
        protected WebClient modifyWebClient(WebClient client) {


             ConfirmHandler okHandler = new ConfirmHandler(){
                    public boolean handleConfirm(Page page, String message) {
                        return true;
                    }
             };
             client.setConfirmHandler(okHandler);

             client.addWebWindowListener(new WebWindowListener() {

                public void webWindowOpened(WebWindowEvent event) {
                    // Nothing to do when a window opens.

                }

                public void webWindowContentChanged(WebWindowEvent event) {

                    WebResponse response = event.getWebWindow().getEnclosedPage().getWebResponse();
                    System.out.println(response.getLoadTime());
                    System.out.println(response.getStatusCode());
                    System.out.println(response.getContentType());

                    List<NameValuePair> headers = response.getResponseHeaders();
                    for(NameValuePair header: headers){
                        System.out.println(header.getName() + " : " + header.getValue());
                    }

                    // Change or add conditions for the content types that you
                    // would like to receive as a file.
                    if(response.getContentType().equals("text/plain")){
                        getFileResponse(response, "target/testDownload.war");
                    }



                }

                public void webWindowClosed(WebWindowEvent event) {



                }
            });          

             return client; 
           } 


    } 

    public static void getFileResponse(WebResponse response, String fileName){

        // try-with-resources closes both streams even if the copy fails
        try (InputStream inputStream = response.getContentAsStream();
             OutputStream outputStream = new FileOutputStream(new File(fileName))) {

            int read;
            byte[] bytes = new byte[1024];

            // copy the response body to the file in 1 KB chunks
            while ((read = inputStream.read(bytes)) != -1) {
                outputStream.write(bytes, 0, read);
            }

            System.out.println("Done!");

        } catch (IOException e) {
            e.printStackTrace();
        }

    }

}
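The content-type condition mentioned in the note above could be pulled into a small helper so that several types are accepted at once. The following is only a sketch: the class name `DownloadTypes`, the method `isDownload`, and the particular set of types (including `application/vnd.ms-excel` for xls files) are assumptions for illustration, and the server in question may report a different type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class DownloadTypes {

    // Media types treated as file downloads (illustrative list, adjust as needed).
    private static final Set<String> DOWNLOADABLE = new HashSet<>(Arrays.asList(
            "application/octet-stream",
            "application/pdf",
            "application/vnd.ms-excel"));

    // Returns true if the content type, ignoring case and any parameters
    // such as "; charset=...", is one of the downloadable media types.
    public static boolean isDownload(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return DOWNLOADABLE.contains(mediaType.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isDownload("application/pdf"));
        System.out.println(isDownload("text/html; charset=UTF-8"));
    }
}
```

Inside webWindowContentChanged, the `equals("text/plain")` check could then be swapped for a call such as `DownloadTypes.isDownload(response.getContentType())`.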

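  • https://selenium.googlecode.com/svn/trunk/docs/api/java/org/openqa/selenium/htmlunit/HtmlUnitDriver.html#modifyWebClient(com.gargoylesoftware.htmlunit.WebClient) Anudeep Samaiya, this is a superclass method. We can override it to add a handler that confirms the file-download dialog. You will need to change the content type it waits for to suit your case. (2 upvotes)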