Am I parsing this HTTP POST request properly?

Question

Car*_*ers 3 python parsing file-upload http twisted.web

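Let me start off by saying that I'm using the twisted.web framework. Twisted.web's file uploading didn't work the way I wanted (it only included the file data, not any other information). cgi.parse_multipart doesn't work the way I want either (same problem; twisted.web uses this function). cgi.FieldStorage didn't work, because I'm getting the POST data through Twisted rather than a CGI interface, and as far as I can tell FieldStorage tries to read the request from stdin. And twisted.web2 didn't work for me because its use of Deferred confused and infuriated me (too complicated for what I want).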

That being said, I decided to try parsing the HTTP request myself.

Using Chrome, the HTTP request is formed like this:

------WebKitFormBoundary7fouZ8mEjlCe92pq
Content-Disposition: form-data; name="upload_file_nonce"

11b03b61-9252-11df-a357-00266c608adb
------WebKitFormBoundary7fouZ8mEjlCe92pq
Content-Disposition: form-data; name="file"; filename="login.html"
Content-Type: text/html

<!DOCTYPE html>
<html>
  <head> 

...

------WebKitFormBoundary7fouZ8mEjlCe92pq
Content-Disposition: form-data; name="file"; filename=""


------WebKitFormBoundary7fouZ8mEjlCe92pq--

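For reference, the delimiter lines in a body like the one above are derived from the boundary parameter of the request's Content-Type header: each part starts with "--" followed by the boundary, and the body ends with the same line plus a trailing "--". The following is a minimal sketch of that relationship. The header value is an assumption reconstructed from the body shown (the question does not include the request headers), and delimiter_lines is a hypothetical helper name.

```python
from email.message import Message


def delimiter_lines(content_type):
    """Return (part_delimiter, closing_delimiter) for a multipart Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_boundary()  # value of the boundary=... parameter
    if boundary is None:
        raise ValueError("no boundary parameter in Content-Type")
    return "--" + boundary, "--" + boundary + "--"


# Assumed header value; only the body appears in the question.
header = "multipart/form-data; boundary=----WebKitFormBoundary7fouZ8mEjlCe92pq"
part_delim, closing_delim = delimiter_lines(header)
```

Under that assumption, the boundary itself has four leading dashes and the delimiter lines in the body have six, which is why the boundary should be read from the header rather than hard-coded.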

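Will it always be formed like this? I'm parsing it with regular expressions, like so (pardon the wall of code):

(Note: I snipped out most of the code and left only what I thought was relevant, namely the regular expressions (yes, nested parentheses). This is from the __init__ method, currently the only method, of an Uploads class I'm building. The full code can be seen in the revision history. I hope I didn't mismatch any parentheses.)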

if line == "--{0}--".format(boundary):
    finished = True

# A blank line ends the part's headers. A part without a Content-Type
# is a plain form field rather than a file upload, so it is skipped.
if in_header and not line:
    in_header = False
    if 'type' not in current_file:
        ignore_current_file = True

if in_header:
    m = re.match(
        "Content-Disposition: form-data; name=\"(.*?)\"; filename=\"(.*?)\"$", line)
    if m:
        input_name, current_file['filename'] = m.group(1), m.group(2)

    m = re.match("Content-Type: (.*)$", line)
    if m:
        current_file['type'] = m.group(1)

else:
    # Outside the headers, every line is accumulated as file data.
    if 'data' not in current_file:
        current_file['data'] = line
    else:
        current_file['data'] += line


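You can see that I start a new "file" dict whenever I reach a boundary. I set in_header to True to indicate that I'm parsing headers. When I reach a blank line, I switch it to False, but not before checking whether a Content-Type was set for that form value. If it wasn't, I set ignore_current_file, since I'm only looking for file uploads.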
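The loop described above can be sketched as a self-contained function. This is an illustration only, not the code from the revision history. The names parse_uploads, DISPOSITION and CONTENT_TYPE are hypothetical, the filename group is made optional so that plain fields also match, header names are matched case-insensitively, and the body is treated as text with CRLF line endings (binary uploads would need the same logic on bytes).

```python
import re

DISPOSITION = re.compile(
    r'Content-Disposition: form-data; name="(.*?)"(?:; filename="(.*?)")?$',
    re.IGNORECASE)
CONTENT_TYPE = re.compile(r'Content-Type: (.*)$', re.IGNORECASE)


def parse_uploads(body, boundary):
    """Return a list of dicts for the parts that declare a Content-Type."""
    delimiter = "--" + boundary
    closing = delimiter + "--"
    uploads = []
    current = None       # dict for the part being read, None before the first boundary
    in_header = False
    for line in body.split("\r\n"):
        if line in (delimiter, closing):
            # A boundary finishes the previous part; keep it only if it
            # had a Content-Type, i.e. it looks like a file upload.
            if current is not None and "type" in current:
                current["data"] = "\r\n".join(current.pop("lines"))
                uploads.append(current)
            if line == closing:
                break
            current = {"lines": []}
            in_header = True
        elif current is None:
            continue     # ignore anything before the first boundary
        elif in_header:
            if not line:
                in_header = False   # blank line: headers are over
                continue
            m = DISPOSITION.match(line)
            if m:
                current["name"] = m.group(1)
                current["filename"] = m.group(2)
            m = CONTENT_TYPE.match(line)
            if m:
                current["type"] = m.group(1)
        else:
            current["lines"].append(line)
    return uploads


# Hypothetical body with a short boundary, shaped like the request above.
body = "\r\n".join([
    "--XYZ",
    'Content-Disposition: form-data; name="upload_file_nonce"',
    "",
    "11b03b61",
    "--XYZ",
    'Content-Disposition: form-data; name="file"; filename="login.html"',
    "Content-Type: text/html",
    "",
    "<!DOCTYPE html>",
    "<html>",
    "--XYZ",
    'Content-Disposition: form-data; name="file"; filename=""',
    "",
    "",
    "--XYZ--",
    "",
])
uploads = parse_uploads(body, "XYZ")
```

In this sketch the nonce field and the empty file input are dropped because neither carries a Content-Type line, which mirrors the ignore_current_file rule.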

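I know I should be using a library, but I'm sick of reading documentation, trying to get different solutions to work in my project, and still keeping the code reasonable. I just want to get past this part. If parsing an HTTP POST with file uploads is this simple, then I'll stick with it.

Note: this code works perfectly for now. I'm just wondering whether it will choke on, or spit out, requests from certain browsers.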

Answer 1

lai*_*ack 7

My solution to this problem was to parse the content with cgi.FieldStorage, like this:

import cgi

from twisted.web.resource import Resource


class Root(Resource):

    def render_POST(self, request):
        self.headers = request.getAllHeaders()
        # For the parsing part, see PyMOTW by Doug Hellmann.
        img = cgi.FieldStorage(
            fp=request.content,
            headers=self.headers,
            environ={'REQUEST_METHOD': 'POST',
                     'CONTENT_TYPE': self.headers['content-type'],
                     }
        )

        upload = img["upl_file"]  # "upl_file" is the name of the form's file input
        print upload.name, upload.filename, upload.type
        # upload.filename is supplied by the client; sanitize it before
        # using it as a path in real code.
        out = open(upload.filename, 'wb')
        out.write(upload.value)
        out.close()
        request.redirect('/tests')
        return ''

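The cgi module was deprecated in Python 3.11 and removed in 3.13, so the snippet above only applies to older interpreters. As a hedged alternative, here is a minimal sketch that parses the same kind of body with the standard library's email package instead. It is not part of the answer: parse_form_data is a hypothetical helper, and the sample body and its XYZ boundary are made up for illustration.

```python
from email import message_from_bytes
from email.policy import HTTP


def parse_form_data(content_type, body):
    """Return a list of (name, filename, content_type, payload) tuples."""
    # The email parser wants headers, a blank line, then the body, so the
    # request's Content-Type header is prepended to the raw POST body.
    prefix = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n"
    message = message_from_bytes(prefix + body, policy=HTTP)
    fields = []
    for part in message.iter_parts():
        name = part.get_param("name", header="content-disposition")
        fields.append((name, part.get_filename(), part.get_content_type(),
                       part.get_payload(decode=True)))
    return fields


# Hypothetical request data, shaped like the body shown in the question.
body = b"\r\n".join([
    b"--XYZ",
    b'Content-Disposition: form-data; name="upload_file_nonce"',
    b"",
    b"11b03b61",
    b"--XYZ",
    b'Content-Disposition: form-data; name="file"; filename="login.html"',
    b"Content-Type: text/html",
    b"",
    b"<html></html>",
    b"--XYZ--",
    b"",
])
fields = parse_form_data("multipart/form-data; boundary=XYZ", body)
```

In a Twisted resource, the two arguments would presumably come from the request's Content-Type header and from request.content.read(). Parts without a filename come back with None in that position, so ordinary form fields and uploads can be told apart without relying on the Content-Type line.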
