Selenium: giving a downloaded file a specific name

Question

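I am working with a Selenium script and trying to download an Excel file and give it a specific name. Here is my code:

Is there any way I can give the downloaded file a specific name?

Code: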

#!/usr/bin/python
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

profile = FirefoxProfile()
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
profile.set_preference("browser.download.dir", "C:\\Downloads" )
browser = webdriver.Firefox(firefox_profile=profile)

browser.get('https://test.com/')
browser.find_element_by_partial_link_text("Excel").click() # Download file

Answer 1

par*_*dak 11

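You cannot specify the name of a downloaded file through Selenium. However, you can download the file, find the most recent file in the download folder, and rename it as needed.

Note: the approach is borrowed from a Google search and may contain errors, but you get the idea.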

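import os
import shutil

# Initial_path is the download directory, e.g. r"C:\Downloads".
# Build full paths so that os.path.getctime can find each file.
filename = max([os.path.join(Initial_path, f) for f in os.listdir(Initial_path)], key=os.path.getctime)
shutil.move(filename, os.path.join(Initial_path, r"newfilename.ext"))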

This gives me `File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/genericpath.py", line 72, in getctime return os.stat(filename).st_ctime OSError: [Errno 2] No such file or directory: '.localized'` (3 upvotes)

Answer 2

dmb*_*dmb 8

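Hopefully this snippet is not too confusing. It took me a while to put together, and it is really useful, because there was no clear answer to this problem using only this library.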

import os
import time

def tiny_file_rename(newname, folder_of_download):
    # Newest entry in the download folder, by creation time
    filename = max(os.listdir(folder_of_download),
                   key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))
    if '.part' in filename:
        # Firefox is still writing the file; give it a moment before renaming
        time.sleep(1)
    os.rename(os.path.join(folder_of_download, filename), os.path.join(folder_of_download, newname))

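Hope this saves someone's day, cheers.

Edit: thanks to @Om Prakash for editing my code; it reminded me that I had not explained the code in detail.

Using max([]) on its own can lead to a race condition that leaves you with an empty or corrupted file (I know this from experience). You first want to check that the file has been completely downloaded. Selenium does not wait for the download to finish, so when you look for the most recently created file, an incomplete file shows up in the listing and the code tries to move it. Even then, you are better off waiting a moment so that Firefox releases the file.

Edit 2: more code

I was asked whether 1 second is enough. Most of the time it is, but if you need to wait longer you can change the code above to: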

import os
import time

def tiny_file_rename(newname, folder_of_download, time_to_wait=60):
    def newest_file():
        # Newest entry in the download folder, by creation time
        return max(os.listdir(folder_of_download),
                   key=lambda xa: os.path.getctime(os.path.join(folder_of_download, xa)))

    time_counter = 0
    filename = newest_file()
    while '.part' in filename:
        time.sleep(1)
        time_counter += 1
        if time_counter > time_to_wait:
            raise Exception('Waited too long for file to download')
        # Look again, otherwise the loop could never see the finished file
        filename = newest_file()
    os.rename(os.path.join(folder_of_download, filename), os.path.join(folder_of_download, newname))

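The advice about waiting a moment so that the browser releases the file can also be sketched as a size check: poll the file until its size stops changing. This is only an illustration under my own assumptions. The helper name `wait_until_stable` and its parameters are hypothetical, and a size of zero is treated as "not finished" on the assumption that the browser may create an empty placeholder first.

```python
import os
import time

def wait_until_stable(path, poll=1.0, time_to_wait=60):
    """Return the file size once it is non-zero and unchanged between two polls.

    Hypothetical helper: a size of 0 is treated as "not finished yet",
    so a download that is genuinely empty would run into the timeout.
    """
    deadline = time.time() + time_to_wait
    last_size = -1
    while time.time() <= deadline:
        size = os.path.getsize(path)
        if size > 0 and size == last_size:
            return size
        last_size = size
        time.sleep(poll)
    raise Exception('Waited too long for file to download')
```

It could be called on the full path of the newest file just before `os.rename`.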

Answer 3

tos*_*o92 6

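I would like to make a few corrections to @parishodak's answer:

The filename there is only a relative path (just the file name), not an absolute path.

That is why @FreshRamen got the following error afterwards: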

File "/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/genericpath.py", line 72, in getctime
    return os.stat(filename).st_ctime
OSError: [Errno 2] No such file or directory: '.localized'

Here is the corrected code:

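import os
import shutil

filepath = r'c:\downloads'
# Placeholder target; replace 'newfilename.ext' with the name you want
newfilename = os.path.join(filepath, 'newfilename.ext')
# Join the directory onto each name so getctime receives a full path
filename = max([os.path.join(filepath, f) for f in os.listdir(filepath)], key=os.path.getctime)
shutil.move(filename, newfilename)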

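To make the relative-path point concrete, here is a small sketch that runs without a browser. It uses a throwaway temporary directory as a stand-in for the download folder, and the file name `report.xls` is made up for the illustration.

```python
import os
import tempfile

# Throwaway directory with one file, standing in for the download folder.
download_dir = tempfile.mkdtemp()
open(os.path.join(download_dir, 'report.xls'), 'w').close()

names = os.listdir(download_dir)  # bare names only, e.g. 'report.xls'
full_paths = [os.path.join(download_dir, n) for n in names]

# A bare name is resolved against the current working directory, so
# getctime only works if that happens to be the download folder.
try:
    os.path.getctime(names[0])
    bare_name_worked = True
except OSError:
    bare_name_worked = False

# Joining the directory onto each name makes the lookup reliable.
newest = max(full_paths, key=os.path.getctime)
print(bare_name_worked, newest)
```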

Answer 4

sup*_*uri 5

Here is another simple solution: wait until the download has finished, then read the downloaded file name from Chrome's downloads page.

Chrome:

# method to get the downloaded file name
def getDownLoadedFileName(waitTime):
    driver.execute_script("window.open()")
    # switch to new tab
    driver.switch_to.window(driver.window_handles[-1])
    # navigate to chrome downloads
    driver.get('chrome://downloads')
    # define the endTime
    endTime = time.time()+waitTime
    while True:
        try:
            # get downloaded percentage
            downloadPercentage = driver.execute_script(
                "return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value")
            # check if downloadPercentage is 100 (otherwise the script will keep waiting)
            if downloadPercentage == 100:
                # return the file name once the download is completed
                return driver.execute_script("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content  #file-link').text")
        except:
            pass
        time.sleep(1)
        if time.time() > endTime:
            break

Firefox:

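def getDownLoadedFileName(waitTime):
    # remember the windows that are already open so the wait can detect the new one
    handles_before = driver.window_handles
    driver.execute_script("window.open()")
    WebDriverWait(driver,10).until(EC.new_window_is_opened(handles_before))
    driver.switch_to.window(driver.window_handles[-1])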
    driver.get("about:downloads")

    endTime = time.time()+waitTime
    while True:
        try:
            fileName = driver.execute_script("return document.querySelector('#contentAreaDownloadsView .downloadMainArea .downloadContainer description:nth-of-type(1)').value")
            if fileName:
                return fileName
        except:
            pass
        time.sleep(1)
        if time.time() > endTime:
            break

After clicking the download link or button, just call the method above.

 # click on download link
 browser.find_element_by_partial_link_text("Excel").click()
 # get the downloaded file name
 latestDownloadedFileName = getDownLoadedFileName(180) #waiting 3 minutes to complete the download
 print(latestDownloadedFileName)

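Once the file name is known, the rename itself needs only the standard library. The following is a minimal sketch, assuming the download directory is known (for example the `browser.download.dir` value from the question) and that the name shown on the downloads page matches the name on disk. The helper `rename_download` is hypothetical and not part of Selenium.

```python
import os
import shutil

def rename_download(download_dir, downloaded_name, new_name):
    """Move download_dir/downloaded_name to download_dir/new_name; return the new path."""
    source = os.path.join(download_dir, downloaded_name)
    target = os.path.join(download_dir, new_name)
    shutil.move(source, target)
    return target
```

With the names from the example above, the call would look like `rename_download("C:\\Downloads", latestDownloadedFileName, "report.xls")`, where `report.xls` is a placeholder.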

Java + Chrome:

Here is the method in Java.

public String waitUntilDonwloadCompleted(WebDriver driver) throws InterruptedException {
      // Store the current window handle
      String mainWindow = driver.getWindowHandle();

      // open a new tab
      JavascriptExecutor js = (JavascriptExecutor)driver;
      js.executeScript("window.open()");
     // switch to new tab
    // Switch to new window opened
      for(String winHandle : driver.getWindowHandles()){
          driver.switchTo().window(winHandle);
      }
     // navigate to chrome downloads
      driver.get("chrome://downloads");

      JavascriptExecutor js1 = (JavascriptExecutor)driver;
      // wait until the file is downloaded
      Long percentage = (long) 0;
      while ( percentage!= 100) {
          try {
              percentage = (Long) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('#progress').value");
              //System.out.println(percentage);
          }catch (Exception e) {
            // Nothing to do just wait
        }
          Thread.sleep(1000);
      }
     // get the latest downloaded file name
      String fileName = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').text");
     // get the latest downloaded file url
      String sourceURL = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div#content #file-link').href");
      // file downloaded location
      String donwloadedAt = (String) js1.executeScript("return document.querySelector('downloads-manager').shadowRoot.querySelector('#downloadsList downloads-item').shadowRoot.querySelector('div.is-active.focus-row-active #file-icon-wrapper img').src");
      System.out.println("Download deatils");
      System.out.println("File Name :-" + fileName);
      System.out.println("Donwloaded path :- " + donwloadedAt);
      System.out.println("Downloaded from url :- " + sourceURL);
     // print the details
      System.out.println(fileName);
      System.out.println(sourceURL);
     // close the downloads tab2
      driver.close();
     // switch back to main window
      driver.switchTo().window(mainWindow);
      return fileName;
  }

Here is how to call this method in your Java code.

// download triggering step 
downloadExe.click();
// now waituntil download finish and then get file name
System.out.println(waitUntilDonwloadCompleted(driver));

Output:

Download deatils

File Name :-RubyMine-2019.1.2 (7).exe

Donwloaded path :- chrome://fileicon/C%3A%5CUsers%5Csupputuri%5CDownloads%5CRubyMine-2019.1.2%20(7).exe?scale=1.25x

Downloaded from url :- https://download-cf.jetbrains.com/ruby/RubyMine-2019.1.2.exe

RubyMine-2019.1.2 (7).exe

You need the following 3 imports: `from selenium.webdriver.support.ui import WebDriverWait`, `from selenium.webdriver.common.by import By` and `from selenium.webdriver.support import expected_conditions as EC` (2 upvotes)
Note that Chrome's downloads page `chrome://downloads` is not available in headless mode. (2 upvotes)

Answer 5

And*_*hai 5

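I came up with a different solution. Since you only care about the last downloaded file, why not download it into a dummy_dir? That way it will be the only file in that directory. Once it has downloaded, you can move it to your destination_dir and change its name.

Here is an example that works with Firefox: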

import os
import shutil
import time

def rename_last_downloaded_file(dummy_dir, destination_dir, new_file_name):
    def get_last_downloaded_file_path(dummy_dir):
        """ Return the last modified -in this case last downloaded- file path.

            This function is going to loop as long as the directory is empty.
        """
        while not os.listdir(dummy_dir):
            time.sleep(1)
        return max([os.path.join(dummy_dir, f) for f in os.listdir(dummy_dir)], key=os.path.getctime)

    while '.part' in get_last_downloaded_file_path(dummy_dir):
        time.sleep(1)
    shutil.move(get_last_downloaded_file_path(dummy_dir), os.path.join(destination_dir, new_file_name))

You can tweak the sleep time and add a TimeoutException as needed.
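A variant with a deadline might look like the sketch below. It is one possible reading of the suggestion, not the author's code. The `timeout` and `poll` parameters are my additions, and it raises Python's built-in `TimeoutError` rather than Selenium's `TimeoutException` so that the snippet depends only on the standard library.

```python
import os
import shutil
import time

def rename_last_downloaded_file(dummy_dir, destination_dir, new_file_name,
                                timeout=60, poll=1.0):
    """Move the finished download from dummy_dir to destination_dir/new_file_name."""
    deadline = time.time() + timeout
    while True:
        names = os.listdir(dummy_dir)
        # Done once the directory is non-empty and holds no Firefox '.part' file.
        if names and not any('.part' in n for n in names):
            break
        if time.time() > deadline:
            raise TimeoutError('download did not finish within %s seconds' % timeout)
        time.sleep(poll)
    newest = max((os.path.join(dummy_dir, n) for n in names), key=os.path.getctime)
    target = os.path.join(destination_dir, new_file_name)
    shutil.move(newest, target)
    return target
```

Using a fresh directory per download (for example one created with `tempfile.mkdtemp()`) keeps the "only file in the directory" assumption true.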
