rka*_*kam 27 python download urllib2 google-drive-api pydrive
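I am trying to download a file from Google Drive, and all I have is the Drive URL.
I have read about the Google API, which talks about some drive_service and MediaIO and also requires some credentials (mainly a JSON file/OAuth), but I cannot work out how it is supposed to work.
I also tried urllib2 urlretrieve, but my case is to get files from Drive. Trying wget did not help either.
I tried the pydrive library. It has good upload functions for Drive, but no download option.
Any help will be appreciated. Thanks.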
tur*_*ula 37
If by "the drive's URL" you mean the shareable link of a file on Google Drive, then the following might help:
import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

if __name__ == "__main__":
    file_id = 'TAKE ID FROM SHAREABLE LINK'
    destination = 'DESTINATION FILE ON YOUR DISK'
    download_file_from_google_drive(file_id, destination)
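The snippet uses neither pydrive nor the Google Drive SDK. It uses the requests module (which is, in a way, an alternative to urllib2).
When downloading large files from Google Drive, a single GET request is not sufficient. A second one is needed; see wget/curl large file from google drive.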
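One way to notice that a single GET returned a warning page instead of the file is to inspect what came back before saving it. The sketch below assumes the warning arrives as an HTML page; `looks_like_html` is a hypothetical helper, not part of requests or of any Google API, and it will also flag a shared file that genuinely is HTML.

```python
def looks_like_html(content_type, first_bytes):
    """Heuristic: True if a response appears to be an HTML page.

    content_type: value of the Content-Type header, or None.
    first_bytes:  the first bytes of the response body.
    """
    # The header is the cheapest signal when it is present.
    if content_type and content_type.lower().startswith("text/html"):
        return True
    # Otherwise fall back to sniffing the start of the body.
    head = first_bytes.lstrip()[:15].lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")
```

With requests, the two arguments could be taken from `response.headers.get("Content-Type")` and from the first chunk yielded by `response.iter_content(...)`.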
Pad*_*ddy 34
I recommend the gdown package.
pip install gdown
Take your share link
https://drive.google.com/file/d/0B9P1L--7Wd2vNm9zMTJWOGxobkU/view?usp=sharing
and grab the id (e.g. 1TLNdIufzwesDbyr_nVTR7Zrx9oRHLM_N), then swap it in after id= below.
import gdown
url = 'https://drive.google.com/uc?id=0B9P1L--7Wd2vNm9zMTJWOGxobkU'
output = '20150428_collected_images.tgz'
gdown.download(url, output, quiet=False)
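The manual step of copying the id out of the share link and placing it after id= could be scripted. This is a minimal sketch assuming the link has the /file/d/<id>/ shape shown above; `share_link_to_uc_url` is a hypothetical helper name, not something gdown provides.

```python
import re

def share_link_to_uc_url(share_link):
    """Turn a .../file/d/<id>/view share link into a uc?id=<id> URL."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_link)
    if match is None:
        raise ValueError("no file id found in: " + share_link)
    return "https://drive.google.com/uc?id=" + match.group(1)

url = share_link_to_uc_url(
    "https://drive.google.com/file/d/0B9P1L--7Wd2vNm9zMTJWOGxobkU/view?usp=sharing"
)
print(url)  # https://drive.google.com/uc?id=0B9P1L--7Wd2vNm9zMTJWOGxobkU
```

The resulting string can be passed as the url argument of gdown.download in the snippet above.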
ndr*_*plz 17
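Having had similar needs many times, I made an extra-simple class, GoogleDriveDownloader, starting from the snippet by @user115202 above. You can find the source code here.
You can also install it through pip: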
pip install googledrivedownloader
Then usage is as simple as:
from google_drive_downloader import GoogleDriveDownloader as gdd

gdd.download_file_from_google_drive(file_id='1iytA1n2z4go3uVCwE__vIKouTKyIDjEq',
                                    dest_path='./data/mnist.zip',
                                    unzip=True)
This snippet will download an archive shared on Google Drive. In this case, 1iytA1n2z4go3uVCwE__vIKouTKyIDjEq is the id of the shareable link obtained from Google Drive.
Ray*_*ayB 15
Here is a simple way to do it with a service account and no third-party libraries.
pip install google-api-core and google-api-python-client
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from google.oauth2 import service_account
import io

credz = {} # put JSON credentials here, from a service account or the like
# More info: https://cloud.google.com/docs/authentication

credentials = service_account.Credentials.from_service_account_info(credz)
drive_service = build('drive', 'v3', credentials=credentials)

file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
#fh = io.BytesIO() # this can be used to keep in memory
fh = io.FileIO('file.tar.gz', 'wb') # this can be used to write to disk
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
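The commented-out io.BytesIO() line hints at an in-memory variant. A minimal sketch of that variant follows; `download_to_bytes` and its `make_downloader` argument are hypothetical names introduced here so the chunk loop can be shown without credentials.

```python
import io

def download_to_bytes(make_downloader):
    """Run the next_chunk() loop against an in-memory buffer.

    make_downloader: callable taking the buffer and returning an object
    with a next_chunk() method that returns (status, done), for example
    lambda fh: MediaIoBaseDownload(fh, request).
    """
    fh = io.BytesIO()
    downloader = make_downloader(fh)
    done = False
    while not done:
        _status, done = downloader.next_chunk()
    # Everything written by the downloader is now held in memory.
    return fh.getvalue()
```

With the objects from the snippet above, the call would look like `data = download_to_bytes(lambda fh: MediaIoBaseDownload(fh, request))`. This only makes sense for files that comfortably fit in memory.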
The documentation has a function that downloads a file when we supply the ID of the file to download:
from __future__ import print_function

import io

import google.auth
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from googleapiclient.http import MediaIoBaseDownload


def download_file(real_file_id):
    """Downloads a file
    Args:
        real_file_id: ID of the file to download
    Returns : the file content as bytes, or None if the request failed.

    Load pre-authorized user credentials from the environment.
    TODO(developer) - See https://developers.google.com/identity
    for guides on implementing OAuth2 for the application.
    """
    creds, _ = google.auth.default()

    try:
        # create drive api client
        service = build('drive', 'v3', credentials=creds)

        file_id = real_file_id

        # pylint: disable=maybe-no-member
        request = service.files().get_media(fileId=file_id)
        file = io.BytesIO()
        downloader = MediaIoBaseDownload(file, request)
        done = False
        while done is False:
            status, done = downloader.next_chunk()
            print(F'Download {int(status.progress() * 100)}.')

    except HttpError as error:
        print(F'An error occurred: {error}')
        file = None

    # Guard against the error path, where file has been set to None.
    return file.getvalue() if file is not None else None


if __name__ == '__main__':
    download_file(real_file_id='1KuPmvGq8yoYgbfW74OENMCB5H0n_2Jm9')
This raises the question:
How do we get the file ID needed to download the file?
Generally speaking, the URL of a file shared from Google Drive looks like this
https://drive.google.com/file/d/1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh/view?usp=sharing
where 1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh corresponds to the file ID.
You can simply copy it from the URL or, if you prefer, write a function that extracts the file ID from the URL.
For example, given url = https://drive.google.com/file/d/1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh/view?usp=sharing,
def url_to_id(url):
    x = url.split("/")
    return x[5]
Printing x gives
['https:', '', 'drive.google.com', 'file', 'd', '1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh', 'view?usp=sharing']
So, since we want to return the 6th element of the list, we use x[5].
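Indexing with x[5] depends on the URL having exactly that shape (scheme included). A slightly more tolerant sketch is shown here; it assumes the ID either follows a /d/ path segment or appears as an id query parameter, and `url_to_id_tolerant` is a hypothetical name used to keep it apart from the function above.

```python
from urllib.parse import urlparse, parse_qs

def url_to_id_tolerant(url):
    """Extract a Drive file ID from a /d/<id>/ path or an ?id=<id> query."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Links of the form .../file/d/<id>/view
    if "d" in parts:
        idx = parts.index("d")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    # Links of the form ...?id=<id>
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    raise ValueError("could not find a file ID in: " + url)

print(url_to_id_tolerant(
    "https://drive.google.com/file/d/1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh/view?usp=sharing"
))  # 1HV6vf8pB-EYnjcJcH65eGZVMa2v2tcMh
```

Unlike the positional version, this raises a ValueError on a URL it does not recognise rather than returning an unrelated segment.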
PyDrive lets you download a file with the GetContentFile() function. You can find the function's documentation here.
See the example below:
# Initialize GoogleDriveFile instance with file id.
file_obj = drive.CreateFile({'id': '<your file ID here>'})
file_obj.GetContentFile('cats.png') # Download file as 'cats.png'.
This code assumes that you have an authenticated drive object; the docs on this can be found here and here.
In the general case, this is done like so:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive  # needed for GoogleDrive below

gauth = GoogleAuth()
# Create local webserver which automatically handles authentication.
gauth.LocalWebserverAuth()

# Create GoogleDrive instance with authenticated GoogleAuth instance.
drive = GoogleDrive(gauth)
Info on silent authentication on a server can be found here; it involves writing a settings.yaml (example: here) in which you save the authentication details.
小智 5
import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id , 'confirm': 1 }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

if __name__ == "__main__":
    file_id = 'TAKE ID FROM SHAREABLE LINK'
    destination = 'DESTINATION FILE ON YOUR DISK'
    download_file_from_google_drive(file_id, destination)
This just repeats the accepted answer but adds the confirm=1 parameter, so that it always downloads even if the file is too big.
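To see what that extra parameter changes, it can help to look at the URL the first request ends up targeting. The sketch below rebuilds that URL with the standard library only; `build_download_url` is a hypothetical helper, and the exact string requests sends is assumed to be equivalent rather than verified here.

```python
from urllib.parse import urlencode

BASE_URL = "https://docs.google.com/uc?export=download"

def build_download_url(file_id, confirm=1):
    """Append id and confirm to the base URL's existing query string."""
    query = urlencode({"id": file_id, "confirm": confirm})
    return BASE_URL + "&" + query

print(build_download_url("FILE_ID"))
# https://docs.google.com/uc?export=download&id=FILE_ID&confirm=1
```

Whether Google Drive keeps honouring confirm=1 is outside the control of this code, so the token-based second request in the snippet above remains a useful fallback.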