Hob*_*use 9 python multithreading cloudfiles
I am uploading files to Rackspace Cloud Files using the cloudfiles module, with pseudocode like this:
import cloudfiles

username = '---'
api_key = '---'

conn = cloudfiles.get_connection(username, api_key)
testcontainer = conn.create_container('test')

for f in get_filenames():
    obj = testcontainer.create_object(f)
    obj.load_from_filename(f)
My problem is that I have a lot of small files to upload, and done this way it takes far too long.
Buried in the documentation, I saw that there is a ConnectionPool class, which supposedly can be used to upload files in parallel.
Could someone show how I can make this code upload more than one file at a time?
The ConnectionPool class is meant for a multithreaded application that occasionally has to send things to Rackspace.
That way you can reuse your connections, but you don't have to keep 100 connections open if you have 100 threads.
You are simply looking for a multithreading/multiprocessing uploader. Here's an example using the multiprocessing library:
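The borrow/return pattern that a connection pool enables can be sketched with the standard library alone. This is a minimal illustration, not the cloudfiles API: the dict "connections" and the filenames are stand-ins for real connections and real uploads.

```python
import queue
import threading

# Stand-in connection pool: a small Queue of reusable "connections"
# (plain dicts here) that worker threads borrow and return.
def make_pool(size):
    pool = queue.Queue()
    for i in range(size):
        pool.put({'id': i, 'uploads': 0})
    return pool

def worker(pool, jobs, done):
    conn = pool.get()                  # borrow a connection from the pool
    for name in iter(jobs.get, None):  # None is the stop sentinel
        conn['uploads'] += 1           # pretend-upload the file
        done.append(name)
    pool.put(conn)                     # return it so other threads can reuse it

pool = make_pool(2)                    # 2 shared connections...
jobs = queue.Queue()
done = []
threads = [threading.Thread(target=worker, args=(pool, jobs, done))
           for _ in range(2)]          # ...for 2 worker threads
for t in threads:
    t.start()
for name in ['a.txt', 'b.txt', 'c.txt']:
    jobs.put(name)
for _ in threads:
    jobs.put(None)                     # one sentinel per worker
for t in threads:
    t.join()
```

The point is that the number of open connections is bounded by the pool size, not by the number of threads: each worker holds a connection only while it is running.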
import cloudfiles
import multiprocessing

USERNAME = '---'
API_KEY = '---'

def get_container():
    conn = cloudfiles.get_connection(USERNAME, API_KEY)
    testcontainer = conn.create_container('test')
    return testcontainer

def uploader(filenames):
    '''Worker process to upload the given files'''
    container = get_container()

    # Keep going till you reach STOP
    for filename in iter(filenames.get, 'STOP'):
        # Create the object and upload
        obj = container.create_object(filename)
        obj.load_from_filename(filename)

def main():
    NUMBER_OF_PROCESSES = 16

    # Add your filenames to this queue
    filenames = multiprocessing.Queue()

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        multiprocessing.Process(target=uploader, args=(filenames,)).start()

    # You can keep adding tasks until you add STOP
    filenames.put('some filename')

    # Stop all child processes
    for i in range(NUMBER_OF_PROCESSES):
        filenames.put('STOP')

if __name__ == '__main__':
    multiprocessing.freeze_support()
    main()