64-bit Python on Windows x64: file copy performance evaluation and questions

luc*_*x7B 5 python windows performance copy win32com

While writing a backup application, I evaluated file copy performance on Windows.

I have a few questions and would like to hear your opinions.

Thanks!

Lucas.

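Questions:

  1. Why is performance so much slower when copying a 10 GiB file than when copying a 1 GiB file?

  2. Why is shutil.copyfile so slow?

  3. Why is win32file.CopyFileEx so slow? Could it be because of the win32file.COPY_FILE_RESTARTABLE flag? However, it does not accept the int 1000 as a flag (COPY_FILE_NO_BUFFERING), which is recommended for large files: http://msdn.microsoft.com/en-us/library/aa363852%28VS.85%29.aspx

  4. Using an empty ProgressRoutine seems to make no difference compared with using no ProgressRoutine at all.

  5. Is there an alternative, better way to copy files that still provides progress updates?

Results for the 1 GiB and 10 GiB files: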

test_file_size             1082.1 MiB    10216.7 MiB

METHOD                      SPEED           SPEED
robocopy.exe                111.0 MiB/s     75.4 MiB/s
cmd.exe /c copy              95.5 MiB/s     60.5 MiB/s
shutil.copyfile              51.0 MiB/s     29.4 MiB/s
win32api.CopyFile           104.8 MiB/s     74.2 MiB/s
win32file.CopyFile          108.2 MiB/s     73.4 MiB/s
win32file.CopyFileEx A       14.0 MiB/s     13.8 MiB/s
win32file.CopyFileEx B       14.6 MiB/s     14.9 MiB/s

Test environment:

Python:
ActivePython 2.7.0.2 (ActiveState Software Inc.) based on
Python 2.7 (r27:82500, Aug 23 2010, 17:17:51) [MSC v.1500 64 bit (AMD64)] on win32

source = mounted network drive
source_os = Windows Server 2008 x64

destination = local drive
destination_os = Windows Server 2008 R2 x64

Notes:

'robocopy.exe' and 'cmd.exe /c copy' were run using subprocess.call()
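
The question does not show how the MiB/s figures were obtained. A minimal sketch of one way to time a copy function is given below; `measure_copy` is a hypothetical helper, not the asker's actual harness, and it is shown with `shutil.copyfile` only because that runs on any platform.

```python
import os
import shutil
import tempfile
from timeit import default_timer  # high-resolution wall-clock timer


def measure_copy(copy_func, src, dst):
    """Hypothetical helper: run copy_func(src, dst) and return throughput in MiB/s."""
    size_mib = os.path.getsize(src) / (1024.0 * 1024.0)
    start = default_timer()
    copy_func(src, dst)
    elapsed = default_timer() - start
    # Guard against a zero interval on very small files.
    return size_mib / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Small self-contained demo with a 4 MiB temporary file.
    tmp_dir = tempfile.mkdtemp()
    src = os.path.join(tmp_dir, "src.bin")
    dst = os.path.join(tmp_dir, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"\0" * (4 * 1024 * 1024))
    print("shutil.copyfile: %.1f MiB/s" % measure_copy(shutil.copyfile, src, dst))
    shutil.rmtree(tmp_dir)
```

With files this small the operating system cache dominates the result, so meaningful comparisons need large files such as the 1 GiB and 10 GiB ones used above.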

win32file.CopyFileEx A (without a ProgressRoutine):

import win32file  # from the pywin32 package

def Win32_CopyFileEx_NoProgress(ExistingFileName, NewFileName):
    win32file.CopyFileEx(
        ExistingFileName,                             # PyUNICODE           | File to be copied
        NewFileName,                                  # PyUNICODE           | Place to which it will be copied
        None,                                         # CopyProgressRoutine | A python function that receives progress updates, can be None
        Data = None,                                  # object              | An arbitrary object to be passed to the callback function
        Cancel = False,                               # boolean             | Pass True to cancel a restartable copy that was previously interrupted
        CopyFlags = win32file.COPY_FILE_RESTARTABLE,  # int                 | Combination of COPY_FILE_* flags
        Transaction = None                            # PyHANDLE            | Handle to a transaction as returned by win32transaction::CreateTransaction
        )

win32file.CopyFileEx B (with an empty ProgressRoutine):

def Win32_CopyFileEx( ExistingFileName, NewFileName):
    win32file.CopyFileEx(
        ExistingFileName,                             # PyUNICODE           | File to be copied
        NewFileName,                                  # PyUNICODE           | Place to which it will be copied
        Win32_CopyFileEx_ProgressRoutine,             # CopyProgressRoutine | A python function that receives progress updates, can be None
        Data = None,                                  # object              | An arbitrary object to be passed to the callback function
        Cancel = False,                               # boolean             | Pass True to cancel a restartable copy that was previously interrupted
        CopyFlags = win32file.COPY_FILE_RESTARTABLE,  # int                 | Combination of COPY_FILE_* flags
        Transaction = None                            # PyHANDLE            | Handle to a transaction as returned by win32transaction::CreateTransaction
        )

def Win32_CopyFileEx_ProgressRoutine(
    TotalFileSize,
    TotalBytesTransferred,
    StreamSize,
    StreamBytesTransferred,
    StreamNumber,
    CallbackReason,                         # CALLBACK_CHUNK_FINISHED or CALLBACK_STREAM_SWITCH
    SourceFile,
    DestinationFile,
    Data):                                  # Description
    return win32file.PROGRESS_CONTINUE      # return of any win32file.PROGRESS_* constant

小智 3

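Question 3:

You have misread the COPY_FILE_NO_BUFFERING flag in the Microsoft API documentation. Its value is not the integer 1000 but hexadecimal 1000 (0x1000, which is 4096 in decimal). With CopyFlags = 4096 you get what is (as far as I know) the fastest copy routine available in a Windows environment. I use the same routine in my data backup code; it is very fast and transfers terabytes of data every day.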
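
As a minimal sketch of the corrected call, the snippet below passes the hexadecimal flag value to `win32file.CopyFileEx` using the same keyword arguments as the question's code. The helper name `copy_unbuffered` is hypothetical, and the constant is defined locally on the assumption that the installed pywin32 may not export it; the call itself requires pywin32 on Windows.

```python
# Flag value from the CopyFileEx documentation: hexadecimal 0x1000, not decimal 1000.
COPY_FILE_NO_BUFFERING = 0x1000  # == 4096


def copy_unbuffered(src, dst, progress_routine=None):
    """Hypothetical helper: copy src to dst with unbuffered I/O via CopyFileEx."""
    import win32file  # pywin32, Windows only; imported lazily so the module loads elsewhere

    win32file.CopyFileEx(
        src,
        dst,
        progress_routine,                  # may be None, as in variant A of the question
        Data=None,
        Cancel=False,
        CopyFlags=COPY_FILE_NO_BUFFERING,  # replaces COPY_FILE_RESTARTABLE
        Transaction=None,
    )


if __name__ == "__main__":
    print(COPY_FILE_NO_BUFFERING)
```

Whether this closes the gap to robocopy.exe on the asker's setup would need to be measured again with the same test files.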

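Question 4:

It doesn't matter much, because it is only a callback. In general, though, you should not put much code in it; keep it clean and simple.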
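
One way to keep the callback light while still reporting progress is to do real work only when the whole-number percentage changes. The sketch below is an illustration under assumptions: `make_progress_routine` and `report` are hypothetical names, and `PROGRESS_CONTINUE` is defined locally with the Windows value 0 (pywin32 exposes it as `win32file.PROGRESS_CONTINUE`, as used in the question). The parameter list matches the ProgressRoutine shown in the question.

```python
PROGRESS_CONTINUE = 0  # Windows constant; same meaning as win32file.PROGRESS_CONTINUE


def make_progress_routine(report):
    """Hypothetical factory: build a ProgressRoutine that calls report(percent)
    only when the integer percentage changes."""
    last_percent = [-1]  # one-element list so the closure can update it (Python 2 compatible)

    def routine(TotalFileSize, TotalBytesTransferred, StreamSize,
                StreamBytesTransferred, StreamNumber, CallbackReason,
                SourceFile, DestinationFile, Data):
        if TotalFileSize:
            percent = int(TotalBytesTransferred * 100 // TotalFileSize)
        else:
            percent = 100  # an empty file is complete immediately
        if percent != last_percent[0]:
            last_percent[0] = percent
            report(percent)
        return PROGRESS_CONTINUE

    return routine


if __name__ == "__main__":
    seen = []
    routine = make_progress_routine(seen.append)
    # Simulate callbacks for a 1000-byte file.
    for done in (0, 5, 10, 500, 1000):
        routine(1000, done, 1000, done, 1, 0, None, None, None)
    print(seen)
```

The returned function can be passed as the third argument of `win32file.CopyFileEx` in place of `Win32_CopyFileEx_ProgressRoutine`.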

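Question 5:

In my experience, this is the fastest copy routine in a standard Windows environment. There may be faster custom copy routines, but with the plain Windows API you will not find anything better.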