Mat*_*tel 7 python google-api google-drive-api
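I'm building a Python application that uses the Google Drive API. Development is going well, but I have a problem retrieving the entire Google Drive file tree, which I need for two purposes:
Right now I have a function that gets the Gdrive root, and I can build the tree by recursively calling a function that lists the contents of a single folder. That is very slow, though, and could send thousands of requests to Google, which is not acceptable.
Here is the function that gets the root: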
# Imports assumed from Google's Drive v2 sample code; driveHelper is my own module.
from apiclient import errors
from apiclient.discovery import build

import driveHelper


def drive_get_root():
    """Retrieve a root list of File resources.

    Returns:
        List of dictionaries.
    """
    # Build the service; the driveHelper module takes care of
    # authentication and credential storage.
    drive_service = build('drive', 'v2', driveHelper.buildHttp())
    # The result will be a list.
    result = []
    page_token = None
    while True:
        try:
            param = {}
            if page_token:
                param['pageToken'] = page_token
            files = drive_service.files().list(**param).execute()
            # Add this page of files to the list.
            result.extend(files['items'])
            page_token = files.get('nextPageToken')
            if not page_token:
                break
        except errors.HttpError as _error:
            print('An error occurred: %s' % _error)
            break
    return result
And here is the one that gets the files in a folder:
def drive_files_in_folder(folder_id):
    """Return the files belonging to a folder.

    Args:
        folder_id: ID of the folder to get files from.
    """
    # Build the service; the driveHelper module takes care of
    # authentication and credential storage.
    drive_service = build('drive', 'v2', driveHelper.buildHttp())
    # The result will be a list.
    result = []
    # Code from Google; it works, so I didn't touch it.
    page_token = None
    while True:
        try:
            param = {}
            if page_token:
                param['pageToken'] = page_token
            children = drive_service.children().list(
                folderId=folder_id, **param).execute()
            for child in children.get('items', []):
                # One extra request per child to fetch its full metadata.
                result.append(drive_get_file(child['id']))
            page_token = children.get('nextPageToken')
            if not page_token:
                break
        except errors.HttpError as _error:
            print('An error occurred: %s' % _error)
            break
    return result
For example, to check whether a file exists I'm currently using this:
def drive_path_exist(file_path, items=None):
    """Recursively check whether the given path exists.

    Returns the matching file dictionary, or False if the path is not found.
    """
    # If no list is passed in, start from the root of Gdrive.
    if items is None:
        items = drive_get_root()
    # Split the path so the first component can be looked up in the list.
    parts = file_path.split("/")
    # If there is only one component left we are at the actual file name,
    # so if it is in this folder we can return it.
    if len(parts) == 1:
        exist = False
        for elem in items:
            if elem["title"] == parts[0]:
                # elem is a dictionary with all the file info.
                exist = elem
        return exist
    # Otherwise we are not at the last component and have to keep searching.
    exist = False
    for elem in items:
        # Check whether the current component is in this folder.
        if elem["title"] == parts[0]:
            exist = True
            folder_id = elem["id"]
    if not exist:
        return False
    # Drop the first component and recurse, rejoining the rest of the path
    # as a string and passing the folder contents as the list to search.
    parts.pop(0)
    return drive_path_exist("/".join(parts), drive_files_in_folder(folder_id))
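Any ideas on how to solve my problem? I've seen a few discussions about this here on Stack Overflow, and in some answers people wrote that it is possible, but of course nobody said how!
Thanks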
pin*_*yid 10
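Don't think of Drive as a tree structure. It isn't. "Folders" are just labels; for example, a file can have multiple parents.
In order to build a representation of a tree in your app, you need to do this ...
If you only want to check whether file-A exists in folder-B, the approach depends on whether the name "folder-B" is guaranteed to be unique.
If it is unique, just run a Files.list query for title = 'file-A', then do a Files.get for each of its parents and see whether any of them is called 'folder-B'.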
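The unique-name check could be sketched roughly as follows. This is a minimal illustration, not the answerer's code: `file_in_named_folder` is a hypothetical helper, `drive_service` is assumed to be a Drive v2 service object like the one built in the question, and only the first page of matches is inspected.

```python
def file_in_named_folder(drive_service, file_title, folder_title):
    """Return True if a file titled file_title has a parent titled folder_title.

    drive_service is assumed to be a Drive v2 service object, such as the one
    returned by build('drive', 'v2', ...) in the question.
    """
    # Escape backslashes and single quotes so the title can sit inside the query.
    safe_title = file_title.replace("\\", "\\\\").replace("'", "\\'")
    query = "title = '%s'" % safe_title
    # Only the first page of matches is inspected in this sketch.
    response = drive_service.files().list(q=query).execute()
    for item in response.get('items', []):
        for parent in item.get('parents', []):
            # One files.get per parent reference to learn the folder's title.
            folder = drive_service.files().get(fileId=parent['id']).execute()
            if folder.get('title') == folder_title:
                return True
    return False
```

The cost is one list request plus one get per parent of each match, rather than a request per folder on the way down from the root.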
If 'folder-B' can exist under both 'folder-C' and 'folder-D', then it is more complicated, and you need to build the in-memory hierarchy from steps 1 and 2 above.
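One way an in-memory hierarchy might be built is sketched below. The step list is not reproduced in this copy, so this is only one plausible reading. It assumes every item has already been fetched in one paginated listing, and that each item is a Drive v2 file resource with `id`, `title` and a `parents` list of `{'id': ...}` references. Both function names are hypothetical.

```python
from collections import defaultdict


def build_children_index(items):
    """Map each parent ID to the list of items that name it as a parent."""
    children = defaultdict(list)
    for item in items:
        # An item with several parents is indexed under each of them.
        for parent in item.get('parents', []):
            children[parent['id']].append(item)
    return children


def find_by_path(children, root_id, path):
    """Return every item reachable from root_id by following the path titles."""
    current = [{'id': root_id}]
    for name in [part for part in path.split('/') if part]:
        matches = {}
        for node in current:
            for child in children.get(node['id'], []):
                if child['title'] == name:
                    # Key by ID so an item reached twice is reported once.
                    matches[child['id']] = child
        current = list(matches.values())
    return current
```

With the index built once, a lookup such as `find_by_path(index, root_id, 'folder-C/folder-B/file-A')` needs no further requests. It returns a list because titles are not unique and a file may sit under more than one parent.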
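You don't say whether these files and folders are created by your app or by the user with the Google Drive web app. If your app is the creator of these files/folders, there is a trick you can use to restrict your searches to a single root. Say you have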
MyDrive/app_root/folder-C/folder-B/file-A
You can make folder-C, folder-B and file-A all children of app_root.
That way you can constrain all of your queries to include
and 'app_root_id' in parents
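A query constrained this way might look like the sketch below. The helper name `list_under_app_root` and its parameters are hypothetical, and `drive_service` is again assumed to be a Drive v2 service object. It relies on every item having been given app_root as an additional parent when it was created.

```python
def list_under_app_root(drive_service, app_root_id, title=None):
    """List items that have app_root_id as a parent, optionally filtered by title."""
    clauses = ["'%s' in parents" % app_root_id]
    if title is not None:
        # Escape backslashes and single quotes before embedding the title.
        safe_title = title.replace("\\", "\\\\").replace("'", "\\'")
        clauses.insert(0, "title = '%s'" % safe_title)
    query = ' and '.join(clauses)

    result = []
    page_token = None
    while True:
        params = {'q': query}
        if page_token:
            params['pageToken'] = page_token
        response = drive_service.files().list(**params).execute()
        result.extend(response.get('items', []))
        page_token = response.get('nextPageToken')
        if not page_token:
            return result
```

Because every item is a direct child of app_root, a single paginated listing covers the whole app subtree, whatever its nesting depth.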
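This will never work like this except for very small trees. You have to rethink your whole algorithm for a cloud app (you have written it like a desktop app where you own the machine), because it will easily time out. You need to mirror the tree beforehand (task queues and datastore), not only to avoid timeouts but also to avoid Drive rate limits, and keep it in sync somehow (register for push, etc.). Not easy at all. I have written a Drive tree viewer before.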
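As a rough illustration of the "keep it in sync" part, the sketch below applies a batch of changes to a local mirror. It is an assumption-laden example, not the commenter's design. The mirror is a plain dict standing in for a datastore, `apply_changes` is a hypothetical name, and each change is assumed to have the shape of a Drive v2 Change resource (`fileId`, `deleted`, and `file` for non-deleted entries).

```python
def apply_changes(mirror, changes):
    """Update mirror (a dict of file ID -> file resource) from a list of changes."""
    for change in changes:
        file_id = change['fileId']
        if change.get('deleted'):
            # The file is gone; drop it from the mirror if it was there.
            mirror.pop(file_id, None)
        elif 'file' in change:
            # New or modified file: store the latest metadata.
            mirror[file_id] = change['file']
    return mirror
```

Fetching the changes, storing the last processed change ID, and rebuilding any parent index after each batch are left out here.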