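I'm writing a personal wiki-style program in Python that stores text files in a user-configurable directory.
The program should be able to take a string from the user (e.g. foo) and create a file named foo.txt. The user should only be able to create files inside the wiki directory, and a slash creates a subdirectory (e.g. foo/bar becomes (path-to-wiki)/foo/bar.txt).
What is the best way to check that the input is as safe as possible? What do I need to watch out for? I know some common pitfalls are: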
../
\0
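I realize that taking user input for filenames is never 100% safe, but the program will only run locally and I just want to guard against common mistakes and glitches.
You can force the user to create files/directories inside the wiki by normalizing the path with os.path.normpath and then checking that the result starts with '(path-to-wiki)':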
os.path.normpath('(path-to-wiki)/foo/bar.txt').startswith('(path-to-wiki)')
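One caveat with a plain startswith test: it compares strings, not path components, so a sibling directory such as (path-to-wiki)-backup would also pass. The following is a minimal sketch of a stricter containment check, assuming Python 3; is_inside is a hypothetical helper name, not something from the answer above.

```python
import os


def is_inside(base_dir, candidate):
    """Return True if candidate resolves to base_dir or something below it.

    is_inside is a hypothetical helper name used only for illustration.
    """
    # realpath normalizes the path and also resolves symlinks.
    base = os.path.realpath(base_dir)
    target = os.path.realpath(candidate)
    try:
        # commonpath compares whole path components, so "/wiki-backup" is
        # not mistaken for a child of "/wiki" as a startswith test would do.
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Because realpath follows symlinks, a link inside the wiki that points elsewhere is treated as being outside it, which is probably what you want for this use case.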
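To make sure the path/filename the user enters contains nothing nasty, you can restrict the input to lowercase/uppercase letters, digits, hyphens, and underscores.
You can then check the normalized filename with a regular expression like this: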
userpath=os.path.normpath('(path-to-wiki)/foo/bar.txt')
re.findall(r'[^A-Za-z0-9_\-\\]',userpath)
To summarize: if
userpath = os.path.normpath('(path-to-wiki)/foo/bar.txt')
then
if (not userpath.startswith('(path-to-wiki)')
        or re.search(r'[^A-Za-z0-9_\-\\]', userpath)):
    ...  # do whatever you want with an invalid path
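Note that the character class above is applied to the full normalized path, so the /, the . in .txt, and the parentheses in the example string would themselves be flagged. One way around this is to validate only the user-supplied name, one component at a time, before joining it to the wiki directory. The sketch below assumes that approach; wiki_path and its parameters are hypothetical names.

```python
import os
import re

# Letters, digits, underscore and hyphen only; everything else is rejected.
_COMPONENT_RE = re.compile(r'[A-Za-z0-9_-]+')


def wiki_path(wiki_root, name):
    """Map a user-supplied page name such as 'foo/bar' to a .txt path.

    Raises ValueError if any component falls outside the whitelist.
    wiki_path is a hypothetical helper name used only for illustration.
    """
    parts = name.split('/')
    for part in parts:
        # fullmatch rejects '', '.', '..' and anything containing '\0',
        # which covers leading slashes, doubled slashes and '../'.
        if not _COMPONENT_RE.fullmatch(part):
            raise ValueError('invalid page name: {!r}'.format(name))
    return os.path.join(wiki_root, *parts) + '.txt'
```

Since dots are not in the whitelist, the .txt extension is always added by the program and never supplied by the user.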
There is now a complete library for validating strings like these, pathvalidate. Check it out:
from pathvalidate import sanitize_filepath
fpath = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))
fpath = "\0_a*b:c<d>e%f/(g)h+i_0.txt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))
Output:
fi:l*e/p"a?t>h|.t<xt -> file/path.txt
_a*b:c<d>e%f/(g)h+i_0.txt -> _abcde%f/(g)h+i_0.txt
Armin Ronacher has a blog post on this topic (and others): http://lucumr.pocoo.org/2010/12/24/common-mistakes-as-web-developer/
These ideas are implemented in Flask as the safe_join() function:
def safe_join(directory, filename):
    """Safely join `directory` and `filename`.

    Example usage::

        @app.route('/wiki/<path:filename>')
        def wiki_page(filename):
            filename = safe_join(app.config['WIKI_FOLDER'], filename)
            with open(filename, 'rb') as fd:
                content = fd.read()  # Read and process the file content...

    :param directory: the base directory.
    :param filename: the untrusted filename relative to that directory.
    :raises: :class:`~werkzeug.exceptions.NotFound` if the resulting path
             would fall out of `directory`.
    """
    filename = posixpath.normpath(filename)
    for sep in _os_alt_seps:
        if sep in filename:
            raise NotFound()
    if os.path.isabs(filename) or filename.startswith('../'):
        raise NotFound()
    return os.path.join(directory, filename)
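The function above relies on names from Flask's own module (_os_alt_seps, NotFound), so it cannot be pasted into a standalone script as is. Below is a rough adaptation for a local program, under a few assumptions: it raises ValueError instead of NotFound, the _alt_seps list is my reconstruction of what _os_alt_seps holds, and safe_join_standalone is a hypothetical name. It also rejects a bare '..' and NUL bytes, which the version quoted above does not appear to check for.

```python
import os
import posixpath

# Separators other than '/' that the current platform understands
# (for example '\\' on Windows). Assumed to mirror Flask's _os_alt_seps.
_alt_seps = [sep for sep in (os.sep, os.altsep) if sep not in (None, '/')]


def safe_join_standalone(directory, filename):
    """Join directory and an untrusted relative filename.

    Raises ValueError if the result would fall outside directory.
    Hypothetical adaptation; not the Flask implementation itself.
    """
    if '\0' in filename:
        raise ValueError('NUL byte in filename')
    # Collapse 'a/../b' and similar, treating '/' as the only separator.
    filename = posixpath.normpath(filename)
    for sep in _alt_seps:
        if sep in filename:
            raise ValueError('platform-specific separator in filename')
    if (os.path.isabs(filename) or filename.startswith('/')
            or filename == '..' or filename.startswith('../')):
        raise ValueError('filename escapes the base directory')
    return os.path.join(directory, filename)
```

Unlike a whitelist, this accepts any character that is legal in a filename and only rejects inputs that would leave the base directory.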