Django: Validate the file type of an uploaded file

Pla*_*sma 31 django django-models django-uploads

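I have an application that lets people upload files, represented as UploadedFiles. However, I want to make sure that users only upload XML files. I know I can do this with magic, but I don't know where to put the check. As far as I can tell, I can't put it in the clean function, because the file hasn't been uploaded yet when clean runs.

Here is the UploadedFile model: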

class UploadedFile(models.Model):
    """This represents a file that has been uploaded to the server."""
    STATE_UPLOADED = 0
    STATE_ANNOTATED = 1
    STATE_PROCESSING = 2
    STATE_PROCESSED = 4
    STATES = (
        (STATE_UPLOADED, "Uploaded"),
        (STATE_ANNOTATED, "Annotated"),
        (STATE_PROCESSING, "Processing"),
        (STATE_PROCESSED, "Processed"),
    )

    status = models.SmallIntegerField(choices=STATES,
        default=0, blank=True, null=True) 
    file = models.FileField(upload_to=settings.XML_ROOT)
    project = models.ForeignKey(Project)

    def __unicode__(self):
        return self.file.name

    def name(self):
        return os.path.basename(self.file.name)

    def save(self, *args, **kwargs):
        if not self.status:
            self.status = self.STATE_UPLOADED
        super(UploadedFile, self).save(*args, **kwargs)

    def delete(self, *args, **kwargs):
        os.remove(self.file.path)
        self.file.delete(False)
        super(UploadedFile, self).delete(*args, **kwargs)

    def get_absolute_url(self):
        return u'/upload/projects/%d' % self.id

    def clean(self):
        if not "XML" in magic.from_file(self.file.url):
            raise ValidationError(u'Not an xml file.')

class UploadedFileForm(forms.ModelForm):
    class Meta:                
        model = UploadedFile
        exclude = ('project',)

Sul*_*ibi 24

Validating files is a common challenge, so I would suggest using a validator:

import magic

from django.core.exceptions import ValidationError
from django.utils.deconstruct import deconstructible
from django.template.defaultfilters import filesizeformat


@deconstructible
class FileValidator(object):
    error_messages = {
     'max_size': ("Ensure this file size is not greater than %(max_size)s."
                  " Your file size is %(size)s."),
     'min_size': ("Ensure this file size is not less than %(min_size)s. "
                  "Your file size is %(size)s."),
     'content_type': "Files of type %(content_type)s are not supported.",
    }

    def __init__(self, max_size=None, min_size=None, content_types=()):
        self.max_size = max_size
        self.min_size = min_size
        self.content_types = content_types

    def __call__(self, data):
        if self.max_size is not None and data.size > self.max_size:
            params = {
                'max_size': filesizeformat(self.max_size), 
                'size': filesizeformat(data.size),
            }
            raise ValidationError(self.error_messages['max_size'],
                                   'max_size', params)

        if self.min_size is not None and data.size < self.min_size:
            params = {
                'min_size': filesizeformat(self.min_size),
                'size': filesizeformat(data.size)
            }
            raise ValidationError(self.error_messages['min_size'], 
                                   'min_size', params)

        if self.content_types:
            content_type = magic.from_buffer(data.read(), mime=True)
            data.seek(0)

            if content_type not in self.content_types:
                params = { 'content_type': content_type }
                raise ValidationError(self.error_messages['content_type'],
                                   'content_type', params)

    def __eq__(self, other):
        return (
            isinstance(other, FileValidator) and
            self.max_size == other.max_size and
            self.min_size == other.min_size and
            self.content_types == other.content_types
        )

You can then use FileValidator in your models.FileField or forms.FileField as follows:

validate_file = FileValidator(max_size=1024 * 100, 
                             content_types=('application/xml',))
file = models.FileField(upload_to=settings.XML_ROOT, 
                        validators=[validate_file])

  • You should put `data.seek(0)` after `content_type = magic.from_buffer(data.read(), mime=True)`, so that the validated field can be read again in a view or file handler without explicitly seeking to 0. (2 upvotes)

Pla*_*sma 16

For posterity: the solution is to call the file's read method and pass the result to magic.from_buffer.

class UploadedFileForm(ModelForm):
    def clean_file(self):
        file = self.cleaned_data.get("file", False)
        filetype = magic.from_buffer(file.read())
        if not "XML" in filetype:
            raise ValidationError("File is not XML.")
        return file

    class Meta:
        model = models.UploadedFile
        exclude = ('project',)
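The read-then-sniff idea can be sketched without Django as follows. This is only a sketch under assumptions: `sniff_head` and `looks_like_xml` are hypothetical helper names, `io.BytesIO` stands in for the uploaded file, and the prefix test is a crude stand-in for `magic.from_buffer`. It reads a bounded prefix instead of the whole upload and restores the file position afterwards, so later code can still read the file from where it was.

```python
import codecs
import io


def looks_like_xml(head):
    """Crude stand-in detector: True if the bytes begin with an XML declaration."""
    # Skip an optional UTF-8 byte order mark, then leading whitespace.
    if head.startswith(codecs.BOM_UTF8):
        head = head[len(codecs.BOM_UTF8):]
    # Note: XML documents without a declaration are rejected by this check.
    return head.lstrip().startswith(b"<?xml")


def sniff_head(fileobj, detect, head_bytes=2048):
    """Run detect() on at most head_bytes of fileobj, then restore its position."""
    position = fileobj.tell()
    try:
        head = fileobj.read(head_bytes)
    finally:
        # Rewind so that whoever reads the upload next sees it unchanged.
        fileobj.seek(position)
    return detect(head)


upload = io.BytesIO(b'<?xml version="1.0"?><root/>')
is_xml = sniff_head(upload, looks_like_xml)
not_xml = sniff_head(io.BytesIO(b"%PDF-1.4"), looks_like_xml)
```

In a form's `clean_file`, `detect` could instead be something like `lambda head: "XML" in magic.from_buffer(head)`, which keeps the libmagic check from the answer while avoiding reading a large upload fully into memory.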


rbe*_*ell 14

Since Django 1.11, you can also use FileExtensionValidator.

from django.core.validators import FileExtensionValidator
class UploadedFile(models.Model):
    file = models.FileField(upload_to=settings.XML_ROOT, 
        validators=[FileExtensionValidator(allowed_extensions=['xml'])])

Note that this must be used on a FileField and will not work on, for example, a CharField, because the validator checks value.name.

ref:https://docs.djangoproject.com/en/dev/ref/validators/#fileextensionvalidator

  • Validating only the file extension is not enough. Use a validation method that inspects the file contents with libmagic. See section 3: http://opensourcehacker.com/2013/07/31/secure-user-uploads-and-exploiting-served-user-content/ (3 upvotes)
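To illustrate why a name-based check is weak on its own, here is a dependency-free sketch of what an extension check amounts to. `extension_allowed` is a hypothetical helper written for illustration, not Django's implementation; it looks only at the file name, which the uploader fully controls.

```python
from pathlib import Path


def extension_allowed(filename, allowed_extensions):
    """Return True if the final extension of filename is in allowed_extensions."""
    # Only the name is inspected; the file's bytes are never read.
    extension = Path(filename).suffix[1:].lower()
    return extension in {ext.lower() for ext in allowed_extensions}


accepted = extension_allowed("report.XML", ["xml"])
renamed_html = extension_allowed("payload.html.xml", ["xml"])
rejected = extension_allowed("report.txt", ["xml"])
```

Both `accepted` and `renamed_html` are true here: an HTML file renamed to end in `.xml` passes, which is why the comment above recommends also checking the content.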

Mik*_*maa 5

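I think what you want to do is clean the uploaded file in Django's Form.clean_your_field_name_here() method. If the data was submitted as a normal HTTP POST request, it is available on your system by that point.

Also, if you consider this inefficient, explore the options offered by the different Django file upload backends and how to do stream processing.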

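If you need to consider the security of your system when handling uploads:

  • Make sure the uploaded file has the correct extension

  • Make sure the mimetype matches the file extension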
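The second check in the list above can be sketched as a lookup from extension to acceptable MIME types. Everything here is an assumption for illustration: the helper name `mime_matches_extension`, the contents of `EXPECTED_MIME_TYPES`, and the idea that the detected type comes from something like `magic.from_buffer(..., mime=True)`. XML is listed under two types because detectors do not all report it the same way.

```python
from pathlib import Path

# Assumed mapping for illustration; extend it to the types you accept.
EXPECTED_MIME_TYPES = {
    ".xml": {"application/xml", "text/xml"},
    ".png": {"image/png"},
    ".jpg": {"image/jpeg"},
    ".jpeg": {"image/jpeg"},
}


def mime_matches_extension(filename, detected_mime):
    """Return True if detected_mime is acceptable for the extension of filename."""
    suffix = Path(filename).suffix.lower()
    # Unknown or missing extensions map to an empty set, so they never match.
    return detected_mime in EXPECTED_MIME_TYPES.get(suffix, set())


xml_ok = mime_matches_extension("data.xml", "text/xml")
html_as_xml = mime_matches_extension("data.xml", "text/html")
```

Rejecting unknown extensions by default is a deliberate choice in this sketch: an allow-list fails closed when a new file type shows up.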

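If you are worried about users uploading exploit files (to attack your site):

  • Rewrite all file contents on save to strip out any extra (exploit) payload (so that nobody can embed HTML in the XML, which a browser would interpret as an HTML file from your site's origin when it is downloaded)

  • Make sure you use the Content-Disposition header on download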

More information here: http://opensourcehacker.com/2013/07/31/secure-user-uploads-and-exploiting-served-user-content/
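For the Content-Disposition point, a rough sketch of building a header value that forces a download rather than inline rendering might look like this. `attachment_disposition` is a hypothetical helper, and the conservative ASCII allow-list is an assumption of this sketch, not a requirement of the header.

```python
import re


def attachment_disposition(filename):
    """Build a Content-Disposition value that asks the browser to download."""
    # Keep only the final path component, whichever separator was used.
    basename = filename.replace("\\", "/").rsplit("/", 1)[-1]
    # Replace anything outside a conservative ASCII set, including quotes.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", basename) or "download"
    return f'attachment; filename="{safe}"'


header_value = attachment_disposition('../../etc/pa"ss.xml')
```

In a Django view, the resulting string would be assigned to `response["Content-Disposition"]` before returning the response.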

Here is an example of how I sanitize uploaded images:

from io import BytesIO

import magic
from django import forms
from django.core.exceptions import ValidationError
from django.core.files.uploadedfile import InMemoryUploadedFile
from django.db import models
from PIL import Image


class Example(models.Model):
    image = models.ImageField(upload_to=filename_gen("participant-images/"), blank=True, null=True)


class Example(forms.ModelForm):
    def clean_image(self):
        """Clean the uploaded image attachment."""
        image = self.cleaned_data.get('image', False)
        utils.ensure_safe_user_image(image)
        return image


def ensure_safe_user_image(image):
    """Perform various checks to sanitize user-uploaded image data.

    Checks that the image has a valid header, then rewrites its contents
    in memory.

    :param image: InMemoryUploadedFile instance (Django form field value)

    :raise: ValidationError if the image content has issues
    """

    if not image:
        return

    assert isinstance(image, InMemoryUploadedFile), "Image rewrite has only been tested on the in-memory upload backend"

    # Make sure the image is not too big, so that PIL does not thrash the server
    if image.size > 4*1024*1024:
        raise ValidationError("Image file too large - the limit is 4 megabytes")

    # Then peek at the header to see what the image claims to be
    image.file.seek(0)
    mime = magic.from_buffer(image.file.getvalue(), mime=True)
    if mime not in ("image/png", "image/jpeg"):
        raise ValidationError("Image is not valid. Please upload a JPEG or PNG image.")

    doc_type = mime.split("/")[-1].upper()

    # Read data from the in-memory file object
    image.file.seek(0)
    pil_image = Image.open(image.file)

    # Rewrite the image contents in memory
    # (bails out with an exception on bad data)
    buf = BytesIO()  # image data is binary, so use BytesIO rather than StringIO
    pil_image.thumbnail((2048, 2048), Image.LANCZOS)  # ANTIALIAS was removed in Pillow 10
    pil_image.save(buf, doc_type)
    image.file = buf

    # Make sure the image has a valid extension (can't upload a .htm image)
    extension = doc_type.lower()
    if not image.name.endswith(".%s" % extension):
        image.name = image.name + "." + extension