Building the policy for direct uploads to S3 with Python/Boto/Django

MrO*_*les 5 python django amazon-s3 amazon-web-services plupload

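I have been through several iterations of this problem so far, searched through many different examples, and read all of the documentation.

I'm trying to combine Plupload (http://www.plupload.com/) with the AWS S3 direct POST method (http://aws.amazon.com/articles/1434). However, I think something is wrong with the way I'm building my policy and signature for transmission. When I submit the form, I get no response from the server; instead, my connection to the server is reset.

I tried using the Python code from the example: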

import base64
import hmac, sha

policy = base64.b64encode(policy_document)

signature = base64.b64encode(
    hmac.new(aws_secret_key, policy, sha).digest())

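I also tried using the newer hashlib library in Python. Whatever method I use to build my policy and signature, I always get values different from the ones generated here: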

http://s3.amazonaws.com/doc/s3-example-code/post/post_sample.html

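I have read through this question:

How do I make Plupload upload directly to Amazon S3?

But I found the examples provided there too complicated to implement accurately.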

My most recent attempt was to use part of the boto library:

http://boto.cloudhackers.com/ref/s3.html#module-boto.s3.connection

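But using the S3Connection.build_post_form_args method did not work for me either.

If anyone could provide a proper example of how to create the POST form with Python, I would be very grateful. Even some simple insight into why the connection is always reset would be nice.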

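A few caveats:

I'd like to use hashlib if possible. I'd like to get an XML response from Amazon (presumably "success_action_status = '201'" does this). I need to be able to upload large files, up to ~2GB in size.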
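For reference, the caveats above could be expressed in a policy roughly as follows. This is a minimal sketch rather than the Amazon sample code: the helper names `build_policy` and `sign_policy`, the bucket name, and the secret key are placeholders, and it assumes the legacy HMAC-SHA1 POST signing scheme described in the linked article.

```python
import base64
import hashlib
import hmac
import json

MAX_UPLOAD_BYTES = 2 * 1024**3  # ~2 GB, the size limit mentioned above


def build_policy(bucket, expiration, max_bytes=MAX_UPLOAD_BYTES):
    """Return the policy document as a JSON string (hypothetical helper)."""
    return json.dumps({
        "expiration": expiration,  # ISO 8601 UTC, e.g. "2030-01-01T00:00:00Z"
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", ""],
            {"acl": "private"},
            # Ask S3 to answer with 201 and an XML body instead of redirecting.
            {"success_action_status": "201"},
            ["content-length-range", 0, max_bytes],
        ],
    })


def sign_policy(policy_json, secret_key):
    """Base64-encode the policy and sign it with HMAC-SHA1 via hashlib."""
    policy_b64 = base64.b64encode(policy_json.encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    return policy_b64.decode("ascii"), base64.b64encode(digest).decode("ascii")


policy_b64, signature = sign_policy(
    build_policy("example-bucket", "2030-01-01T00:00:00Z"), "PLACEHOLDER_SECRET"
)
```

The form would then need hidden `acl` and `success_action_status` fields whose values match the policy ("private" and "201" here), since S3 checks the submitted fields against the policy conditions.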

One last note: when I run this in Chrome, it shows upload progress, and the upload usually fails at around 37%.

小智 5

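Nathan's answer helped me get started. I've included two solutions that currently work for me.

The first solution uses plain Python. The second uses boto.

I tried to get boto working first but kept getting errors. So I went back to the Amazon Ruby documentation and got S3 to accept files using Python without boto. (Browser uploads to S3 using HTML POST.)

Once I understood what was going on, I was able to fix my errors and use boto, which is the simpler solution.

I'm including Solution 1 because it shows explicitly how to set up the policy document and signature using Python.

My goal was to generate the HTML upload page dynamically, along with the "success" page the user sees after a successful upload. Solution 1 shows dynamic creation of the upload form page, while Solution 2 shows creation of both the upload form page and the success page.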

Solution 1:

import base64
import hmac, hashlib

###### EDIT ONLY THE FOLLOWING ITEMS ######

DEBUG = 1
AWS_SECRET_KEY = "MySecretKey"
AWS_ACCESS_KEY = "MyAccessKey"
HTML_NAME = "S3PostForm.html"
EXPIRE_DATE = "2015-01-01T00:00:00Z" # Jan 1, 2015 gmt
FILE_TO_UPLOAD = "${filename}"
BUCKET = "media.mysite.com"
KEY = ""
ACL = "public-read" # or "private"
SUCCESS = "http://media.mysite.com/success.html"
CONTENT_TYPE = ""
CONTENT_LENGTH = 1024**3 # One gigabyte
HTTP_OR_HTTPS = "http" # Or "https" for better security
PAGE_TITLE = "My Html Upload to S3 Form"
ACTION = "%s://%s.s3.amazonaws.com/" % (HTTP_OR_HTTPS, BUCKET)

###### DON'T EDIT FROM HERE ON DOWN ######

policy_document_data = {
    "expire": EXPIRE_DATE,
    "bucket_name": BUCKET,
    "key_name": KEY,
    "acl_name": ACL,
    "success_redirect": SUCCESS,
    "content_name": CONTENT_TYPE,
    "content_length": CONTENT_LENGTH,
}

policy_document = """
{"expiration": "%(expire)s",
  "conditions": [ 
    {"bucket": "%(bucket_name)s"}, 
    ["starts-with", "$key", "%(key_name)s"],
    {"acl": "%(acl_name)s"},
    {"success_action_redirect": "%(success_redirect)s"},
    ["starts-with", "$Content-Type", "%(content_name)s"],
    ["content-length-range", 0, %(content_length)d]
  ]
}
""" % policy_document_data

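# hmac and base64 operate on bytes in Python 3, so encode before hashing
# and decode the results back to text for use in the HTML template.
policy = base64.b64encode(policy_document.encode("utf-8")).decode("ascii")
signature = base64.b64encode(
    hmac.new(AWS_SECRET_KEY.encode("utf-8"), policy.encode("ascii"), hashlib.sha1).digest()
).decode("ascii")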

html_page_data = {
    "page_title": PAGE_TITLE,
    "action_name": ACTION,
    "filename": FILE_TO_UPLOAD,
    "access_name": AWS_ACCESS_KEY,
    "acl_name": ACL,
    "redirect_name": SUCCESS,
    "policy_name": policy,
    "sig_name": signature,
    "content_name": CONTENT_TYPE,
}

html_page = """
<html> 
 <head>
  <title>%(page_title)s</title> 
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
<body>
 <form action="%(action_name)s" method="post" enctype="multipart/form-data">
  <input type="hidden" name="key" value="%(filename)s">
  <input type="hidden" name="AWSAccessKeyId" value="%(access_name)s">
  <input type="hidden" name="acl" value="%(acl_name)s">
  <input type="hidden" name="success_action_redirect" value="%(redirect_name)s">
  <input type="hidden" name="policy" value="%(policy_name)s">
  <input type="hidden" name="signature" value="%(sig_name)s">
  <input type="hidden" name="Content-Type" value="%(content_name)s">

  <!-- Include any additional input fields here -->

  Browse to locate the file to upload:<br /> <br />

  <input name="file" type="file"><br /> <br />
  <input type="submit" value="Upload File to S3"> 
 </form> 
</body>
</html>
""" % html_page_data

# html_page is text, so open the file in text mode.
with open(HTML_NAME, "w") as f:
    f.write(html_page)

###### Dump output if testing ######
if DEBUG:

    if 1: # Set true if not using the LEO editor
        class G:
            def es(self, data):print(data)
        g = G()

    items = [
    "",
    "",
    "policy_document: %s" % policy_document,
    "policy: %s" % policy,
    "signature: %s" % signature,
    "",
    "",
    ]
    for item in items:
        g.es(item)
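Solution 1 hard-codes `EXPIRE_DATE`, and S3 rejects a POST once the policy's expiration has passed. A minimal sketch of computing the expiration relative to the current time instead; the helper name `expiration_in` and the one-day lifetime are illustrative choices, not part of the solution above:

```python
from datetime import datetime, timedelta, timezone


def expiration_in(seconds=3600, now=None):
    """Return an ISO 8601 UTC timestamp `seconds` from now, in the
    "YYYY-MM-DDTHH:MM:SSZ" form used by EXPIRE_DATE above."""
    now = now or datetime.now(timezone.utc)
    return (now + timedelta(seconds=seconds)).strftime("%Y-%m-%dT%H:%M:%SZ")


EXPIRE_DATE = expiration_in(24 * 3600)  # policy valid for one day
```

A short lifetime limits how long a leaked form can be reused, at the cost of regenerating the page more often.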

Solution 2:

from boto.s3 import connection

###### EDIT ONLY THE FOLLOWING ITEMS ######

DEBUG = 1
AWS_SECRET_KEY = "MySecretKey"
AWS_ACCESS_KEY = "MyAccessKey"
HTML_NAME = "S3PostForm.html"
SUCCESS_NAME = "success.html"
EXPIRES = 60*60*24*365 # seconds = 1 year
BUCKET = "media.mysite.com"
KEY = "${filename}" # will match file entered by user
ACL = "public-read" # or "private"
SUCCESS = "http://media.mysite.com/success.html"
CONTENT_TYPE = "" # seems to work this way
CONTENT_LENGTH = 1024**3 # One gigabyte
HTTP_OR_HTTPS = "http" # Or https for better security
PAGE_TITLE = "My Html Upload to S3 Form"

###### DON'T EDIT FROM HERE ON DOWN ######

conn = connection.S3Connection(AWS_ACCESS_KEY,AWS_SECRET_KEY)
args = conn.build_post_form_args(
    BUCKET,
    KEY,
    expires_in=EXPIRES,
    acl=ACL,
    success_action_redirect=SUCCESS,
    max_content_length=CONTENT_LENGTH,
    http_method=HTTP_OR_HTTPS,
    fields=None,
    conditions=None,
    storage_class='STANDARD',
    server_side_encryption=None,
    )

form_fields = ""
line = '  <input type="hidden" name="%s" value="%s" >\n'
for item in args['fields']:
    new_line = line % (item["name"], item["value"])
    form_fields += new_line

html_page_data = {
    "page_title": PAGE_TITLE,
    "action": args["action"],
    "input_fields": form_fields,
}

html_page = """
<html> 
 <head>
  <title>%(page_title)s</title> 
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
 </head>
<body>
 <form action="%(action)s" method="post" enctype="multipart/form-data" >
%(input_fields)s
  <!-- Include any additional input fields here -->

  Browse to locate the file to upload:<br /> <br />

  <input name="file" type="file"><br /> <br />
  <input type="submit" value="Upload File to S3"> 
 </form> 
</body>
</html>
""" % html_page_data

# html_page is text, so open the file in text mode.
with open(HTML_NAME, "w") as f:
    f.write(html_page)

success_page = """
<html>
  <head>
    <title>S3 POST Success Page</title>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
      <script src="jquery.js"></script>
      <script src="purl.js"></script>
<!--

    Amazon S3 passes three data items in the url of this page if
        the upload was successful:
        bucket = bucket name
        key = file name upload to the bucket
        etag = hash of file

    The following script parses these values and puts them in
    the page to be displayed.

-->

<script type="text/javascript">
var pname,url,val,params=["bucket","key","etag"];
$(document).ready(function()
{
  url = $.url();
  for (param in params)
  {
    pname = params[param];
    val = url.param(pname);
    if(typeof val != 'undefined')
      document.getElementById(pname).value = val;
  }
});
</script>

  </head>
  <body>
      <div style="margin:0 auto;text-align:center;">
      <p>Congratulations!</p>
      <p>You have successfully uploaded the file.</p>
        <form action="#" method="get"
          >Location:
        <br />
          <input type="text" name="bucket" id="bucket" />
        <br />File Name:
        <br />
          <input type="text" name="key" id="key" />
        <br />Hash:
        <br />
          <input type="text" name="etag" id="etag" />
      </form>
    </div>
  </body>
</html>
"""

# success_page is text, so open the file in text mode.
with open(SUCCESS_NAME, "w") as f:
    f.write(success_page)

###### Dump output if testing ######
if DEBUG:

    if 1: # Set true if not using the LEO editor
        class G:
            def es(self, data):print(data)
        g = G()

    g.es("conn = %s" % conn)
    for key in args.keys():
        if key != "fields":  # compare by value, not identity
            g.es("%s: %s" % (key, args[key]))
            continue
        for item in args['fields']:
            g.es(item)
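The success page above reads `bucket`, `key`, and `etag` from its own URL with jQuery and purl.js. If the redirect target is handled on the server instead, the same values could be pulled out in Python. A minimal sketch using only the standard library; the function name `parse_s3_redirect` and the example URL are illustrative:

```python
from urllib.parse import parse_qs, urlsplit


def parse_s3_redirect(url):
    """Extract the bucket, key and etag query parameters that S3 appends
    to the success_action_redirect URL. Missing parameters become None."""
    query = parse_qs(urlsplit(url).query)
    return {name: query.get(name, [None])[0] for name in ("bucket", "key", "etag")}


info = parse_s3_redirect(
    "http://media.mysite.com/success.html"
    "?bucket=media.mysite.com&key=photo.jpg&etag=%22abc123%22"
)
print(info)
```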


小智 3

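I tried using boto, but found that it would not let me include all the headers I wanted. Below you can see how I generate the policy, the signature, and a dictionary of POST form values.

Note that all of the x-amz-meta-* fields are custom header properties; you don't need them. Also note that almost everything in the form needs to be included in the policy that gets encoded and signed.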
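That last rule is easy to get wrong. As a rough sketch, the check below lists form fields that no policy condition mentions. It assumes the exemptions documented for S3 POST (`AWSAccessKeyId`, `signature`, `policy`, `file`, and names starting with `x-ignore-`); the function name `uncovered_fields` and the sample policy are made up for this example.

```python
import json

EXEMPT = {"awsaccesskeyid", "signature", "policy", "file"}


def uncovered_fields(policy_json, form_fields):
    """Return form field names that no policy condition refers to.
    Names are compared case-insensitively as a simplification."""
    covered = set()
    for cond in json.loads(policy_json)["conditions"]:
        if isinstance(cond, dict):
            # Exact-match form: {"acl": "private"}
            covered.update(name.lower() for name in cond)
        elif len(cond) == 3 and isinstance(cond[1], str) and cond[1].startswith("$"):
            # List form: ["eq", "$key", "..."] or ["starts-with", "$key", "..."]
            covered.add(cond[1][1:].lower())
    return sorted(
        name for name in form_fields
        if name.lower() not in covered
        and name.lower() not in EXEMPT
        and not name.lower().startswith("x-ignore-")
    )


policy = (
    '{"expiration": "2030-01-01T00:00:00Z", "conditions": '
    '[{"bucket": "b"}, ["eq", "$key", "k"], {"acl": "private"}]}'
)
missing = uncovered_fields(
    policy, ["key", "acl", "policy", "signature", "x-amz-meta-file_id"]
)
print(missing)
```

Running a check like this before rendering the form can catch a field that was added to the form but not to the policy.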

def generate_post_form(bucket_name, key, post_key, file_id, file_name, content_type):
  import base64
  import hmac
  from datetime import datetime
  from hashlib import sha1
  from django.conf import settings
  policy = """{"expiration": "%(expires)s","conditions": [{"bucket":"%(bucket)s"},["eq","$key","%(key)s"],{"acl":"private"},{"x-amz-meta-content_type":"%(content_type)s"},{"x-amz-meta-file_name":"%(file_name)s"},{"x-amz-meta-post_key":"%(post_key)s"},{"x-amz-meta-file_id":"%(file_id)s"},{"success_action_status":"200"}]}"""
  policy = policy%{
    "expires":(datetime.utcnow()+settings.TIMEOUT).strftime("%Y-%m-%dT%H:%M:%SZ"), # This has to be formatted this way
    "bucket": bucket_name, # the name of your bucket
    "key": key, # this is the S3 key where the posted file will be stored
    "post_key": post_key, # custom properties begin here
    "file_id":file_id,
    "file_name": file_name,
    "content_type": content_type,
  }
  # Here we base64 encode a UTF-8 version of our policy. b64encode emits no new lines; Amazon doesn't like them.
  encoded = base64.b64encode(policy.encode("utf-8"))
  # Generate the policy signature using our Amazon Secret Key (HMAC-SHA1 over the encoded policy).
  signature = base64.b64encode(hmac.new(settings.AWS_SECRET_KEY.encode("utf-8"), encoded, sha1).digest())
  # Use the bucket_name argument; this is a plain function, so there is no self.
  return ("%s://%s.s3.amazonaws.com/"%(settings.HTTP_CONNECTION_TYPE, bucket_name),
          {"policy":encoded.decode("ascii"),
           "signature":signature.decode("ascii"),
           "key": key,
           "AWSAccessKeyId": settings.AWS_ACCESS_KEY, # Obviously the Amazon Access Key
           "acl":"private",
           "x-amz-meta-post_key":post_key,
           "x-amz-meta-file_id":file_id,
           "x-amz-meta-file_name": file_name,
           "x-amz-meta-content_type": content_type,
           "success_action_status":"200",
          })

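The returned tuple can then be used to generate a form that posts to the generated S3 URL, with every key/value pair from the dictionary as a hidden field, plus the actual file input field, whose name/id should be "file".

Hope this helps as an example.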
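As a rough illustration of that last step, the sketch below turns a URL and a field dictionary into form markup. The function name `render_upload_form` and the sample values are hypothetical; in a Django project a template would normally do this instead.

```python
from html import escape


def render_upload_form(action_url, fields):
    """Render an HTML form that posts `fields` as hidden inputs to
    `action_url`. The file input comes last, because S3 ignores any
    form fields that appear after the file."""
    hidden = "\n".join(
        '  <input type="hidden" name="%s" value="%s">'
        % (escape(name, quote=True), escape(value, quote=True))
        for name, value in fields.items()
    )
    return (
        '<form action="%s" method="post" enctype="multipart/form-data">\n'
        "%s\n"
        '  <input type="file" name="file" id="file">\n'
        '  <input type="submit" value="Upload">\n'
        "</form>"
    ) % (escape(action_url, quote=True), hidden)


html_form = render_upload_form(
    "https://example-bucket.s3.amazonaws.com/",
    {"key": "uploads/a.txt", "acl": "private", "policy": "cG9saWN5", "signature": "c2ln"},
)
print(html_form)
```

Escaping the values matters here because keys and metadata such as file names may contain quotes or angle brackets.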