Resuming rsync partial (-P/--partial) on an interrupted transfer

Gli*_*hes 21 linux backup rsync partial remote-backup

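I am trying to back up my file server to a remote file server using rsync. Rsync does not resume successfully when a transfer is interrupted. I used the partial option, but rsync doesn't find the file it already started, because it renamed it to a temporary file; when resumed, it creates a new file and starts from the beginning.

Here is my command: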

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

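When this command runs, a backup file named OldDisk.dmg is created on the remote machine under a temporary name such as .OldDisk.dmg.SjDndj23.

Now when the internet connection is interrupted and I have to resume the transfer, I have to find where rsync left off by locating the temp file (like .OldDisk.dmg.SjDndj23) and renaming it to OldDisk.dmg, so that rsync sees there is already a file it can resume.

How do I fix this so that I don't have to intervene manually each time?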

Ric*_*ael 26

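TL;DR: Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace.

The issue is that the rsync server processes (there are two of them; look for rsync --server ... in the ps output on the receiver) keep running, waiting for the rsync client to send data.

If the rsync server processes do not receive data for long enough, they will indeed time out, terminate themselves, and clean up by moving the temporary file to its "proper" name (i.e., without the temporary suffix). You will then be able to resume.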

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).
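The difference between a polite SIGTERM and a SIGKILL can be seen with a stand-in process. The sketch below does not run rsync at all: `fake_receiver` is a hypothetical function that writes to a temporary name and renames it when it receives TERM, mimicking the cleanup described above. The file names are borrowed from the question.

```shell
#!/bin/sh
# Stand-in for an rsync receiver (hypothetical; this is NOT rsync itself).
workdir=$(mktemp -d)
tmp="$workdir/.OldDisk.dmg.SjDndj23"
final="$workdir/OldDisk.dmg"

fake_receiver() {
    # On SIGTERM, move the temp file into place and exit cleanly.
    trap 'mv "$tmp" "$final"; exit 0' TERM
    echo "partial data" > "$tmp"
    # Idle until signalled; waiting on a background sleep lets the
    # trap fire as soon as the signal arrives.
    while :; do sleep 1 & wait $!; done
}

fake_receiver &
pid=$!

# Wait until the stand-in has created its temp file.
while [ ! -e "$tmp" ]; do sleep 1; done

kill -TERM "$pid"   # polite request: the trap runs, the file is renamed
wait "$pid"
ls -A "$workdir"
```

With `kill -KILL` in place of `kill -TERM`, the trap in this stand-in would never run, because SIGKILL cannot be caught, and no cleanup would happen.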

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ..., both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
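One way to put this to work is a small retry wrapper, sketched below. It assumes that a non-zero exit status means the transfer should be retried. The names `rsync_until_done`, `RSYNC`, `MAX_TRIES` and `RETRY_DELAY` are illustrative and not part of rsync; only `-avzP` and `--timeout` are actual rsync options.

```shell
#!/bin/sh
# Retry wrapper: rerun rsync with --partial (via -P) and --timeout until it
# succeeds. RSYNC, MAX_TRIES and RETRY_DELAY are illustrative knobs only.
RSYNC=${RSYNC:-rsync}
MAX_TRIES=${MAX_TRIES:-10}
RETRY_DELAY=${RETRY_DELAY:-30}   # seconds; kept longer than --timeout

rsync_until_done() {
    tries=0
    while [ "$tries" -lt "$MAX_TRIES" ]; do
        tries=$((tries + 1))
        # --timeout=15: both ends exit after 15 s without I/O, so the
        # receiver can move the partial file into place before a retry.
        if "$RSYNC" -avzP --timeout=15 "$@"; then
            return 0
        fi
        echo "attempt $tries failed; retrying in ${RETRY_DELAY}s" >&2
        sleep "$RETRY_DELAY"
    done
    return 1
}

# Example call (host and paths taken from the question):
# rsync_until_done -e "ssh -p 2222" /volume1/ \
#     myaccont@backup-server-1:/home/myaccount/backup/
```

Keeping the delay between attempts longer than the timeout is a design choice here: it gives the old server processes a chance to exit and rename their temporary file before a new client connects.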

I'm not sure how long the various rsync processes keep trying to send or receive data by default before they give up (it might vary with the operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. This means you could try to "ride out" a bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new, properly named file. So, imagine a long-running partial copy which dies (and you think you've "lost" all the copied data), followed by a short-running re-launched rsync (oops!). You can stop the second client and SIGTERM the first servers; rsync will merge the data, and you can resume.

Finally, a few short remarks:

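  • Don't use --inplace to work around this. You will undoubtedly have other problems as a result; see man rsync for the details.
  • It's trivial, but -t in your rsync options is redundant; it is implied by -a.
  • Sending an already compressed disk image over rsync without compression might result in a shorter transfer time (by avoiding double compression). However, I'm unsure of the compression techniques used in both cases. I'd test it.
  • As far as I understand --checksum / -c, it won't help you in this case. It affects how rsync decides whether it should transfer a file. That said, after a first rsync completes, you could run a second rsync with -c to insist on checksums, to guard against the odd case where file size and modtime are the same on both sides but bad data was written.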


gao*_*the 8

Sorry, but the other answers here are too complicated :-7. A simpler answer that works for me (using rsync over -e ssh):

# optionally move rsync temp file, then resume using rsync 
dst$ mv .<filename>.6FuChr <filename>
src$ rsync -avhzP --bwlimit=1000 -e ssh <fromfiles> <user@somewhere>:<destdir>/

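This also works when resuming from an interrupted scp.

Rsync creates a temp file. The temp file quickly grows to the size of the partially transferred file, and then the transfer resumes.

Scp writes to the actual destination file. If the transfer is interrupted, this is a truncated file.

Explanation of the args:

-avhz: h = human-readable, v = verbose, a = archive, z = compression. Archive mode tells rsync to preserve the time_t values, so even if the clocks are out of sync, rsync knows the true date of each file.

-P is shorthand for --partial --progress. --partial tells rsync to keep partially transferred files (and upon resume, rsync will always use the partially transferred files, after checksumming them safely).

From the man pages: http://ss64.com/bash/rsync_options.html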

--partial
By default, rsync will delete any partially transferred file if the transfer
is interrupted. In some circumstances it is more desirable to keep partially
transferred files. Using the --partial option tells rsync to keep the partial
file which should make a subsequent transfer of the rest of the file much faster.

--progress
This option tells rsync to print information showing the progress of the transfer.
This gives a bored user something to watch.
This option is normally combined with -v. Using this option without the -v option
will produce weird results on your display.

-P
The -P option is equivalent to --partial --progress.
I found myself typing that combination quite often so I created an option to make
it easier.

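NOTE: For connections that are interrupted multiple times: if you need to resume after an rsync (after the connection was interrupted), it is best to rename the temp file on the destination. scp creates a file on the destination with the same name as the final file. If scp is interrupted, this file is a truncated version of the file. An rsync (-avzhP) will resume from that file, but will start writing to a temp file name like .<filename>.Yhg7al.

Procedure when starting with scp: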

scp; *interrupt*; rsync; [REPEAT_as_needed: *interrupt*; mv .destfile.tmpzhX destfile; rsync;]. 

Procedure when starting with rsync:

rsync; [REPEAT_as_needed: *interrupt*; mv .destfile.tmpzhX destfile; rsync;].
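The `mv` step in the procedures above could be scripted on the destination host roughly as follows. This is only a sketch: `promote_partial` is a hypothetical helper, it assumes the leftover follows the `.<name>.<suffix>` pattern seen in the examples, that no rsync is still writing to it, and that file names contain no newlines.

```shell
#!/bin/sh
# promote_partial DIR NAME  (hypothetical helper, run on the destination)
# Renames the newest leftover DIR/.NAME.<suffix> to DIR/NAME, like the
# manual "mv .destfile.tmpzhX destfile" step. Returns 1 if none is found.
# Assumption: no rsync process is still writing to the temp file.
promote_partial() {
    dir=$1
    name=$2
    # Newest match first; an unmatched glob makes ls fail, leaving this empty.
    newest=$(ls -t "$dir"/."$name".* 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    mv -- "$newest" "$dir/$name"
}

# Example (directory and file name are placeholders):
# promote_partial /home/myaccount/backup OldDisk.dmg
```

Like the manual `mv`, this overwrites an existing destination file of the same name, which is what the scp-then-rsync procedure relies on.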

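  • `--partial` keeps the partial files, but to resume from them you should use `--append` or `--append-verify`, and the destination should be smaller than the source, [even though the source has a more recent timestamp.](http://unix.stackexchange.com/questions/48298/can-rsync-resume-after-being-interrupted/165417?noredirect=1#comment405796_165417) (2 upvotes)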