Law*_*one 5 php security encoding utf-8
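There is a UTF-8 character (hex bytes E2 80 AE, i.e. U+202E RIGHT-TO-LEFT OVERRIDE) that, when handled correctly by a UTF-8-aware system, causes the characters following it to be displayed in reverse order. It is commonly abused to hide or disguise file extensions.
Here are examples of such file name strings: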
an .EXE called: EvilFile?.EXE
an .scr called: yo.na?.scr
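To make the character concrete, here is a minimal sketch that builds such a name from the raw bytes quoted above and inspects it. The variable names `$rlo` and `$name` are only illustrative, and the `mbstring` extension is assumed to be available.

```php
<?php
// U+202E (RIGHT-TO-LEFT OVERRIDE) written as its three UTF-8 bytes.
$rlo  = "\xE2\x80\xAE";
$name = "EvilFile" . $rlo . ".EXE";

echo bin2hex($rlo), "\n";              // e280ae
echo strlen($name), "\n";              // 15 (bytes: 8 + 3 + 4)
echo mb_strlen($name, 'UTF-8'), "\n";  // 13 (characters)
```

The character is invisible, so the byte and character counts are the easiest way to confirm it is present in a string.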
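Validating the file extension is not the problem. The problem is displaying such a string: htmlentities() turns it into: EvilFileâ?®.EXE
So, what is the best way to fix the file name back to EvilFile.EXE?
The tests I did with iconv produced the same kind of encoding problems in the output.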
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title></title>
</head>
<body>
<?php
$evilString = "EvilFile\xE2\x80\xAE.EXE"; // U+202E written as its UTF-8 bytes
$ret = null;
$ret .= '<h1>htmlentities/ENT_QUOTES | ENT_IGNORE</h1>';
$ret .= htmlentities($evilString, ENT_QUOTES | ENT_IGNORE, "UTF-8").'<br>';
//enc options
$enc = array(
"UTF-8",
"ASCII",
"Windows-1252",
"ISO-8859-15",
"ISO-8859-1",
"ISO-8859-6",
"CP1256",
"US-ASCII//TRANSLIT",
"UTF-8//IGNORE",
"UTF-8//TRANSLIT"
);
//iconv
foreach ($enc as $i) {
$ret .= '<h1>iconv/'.$i.'</h1>';
foreach ($enc as $j) {
$ret .= " $i - $j: ".@iconv($i, $j, $evilString).'<br>';
}
}
//mb_convert_encoding
$ret .= '<h1>mb_convert_encoding</h1>';
foreach (mb_list_encodings() as $chr) {
$ret .= $chr.' - '.mb_convert_encoding($evilString, 'UTF-8', $chr)."<br>";
}
echo $ret;
?>
</body>
</html>
Results
iconv/US-ASCII//TRANSLIT
------------------------
US-ASCII//TRANSLIT - UTF-8: EvilFile
US-ASCII//TRANSLIT - ASCII: EvilFile
US-ASCII//TRANSLIT - Windows-1252: EvilFile
US-ASCII//TRANSLIT - ISO-8859-15: EvilFile
US-ASCII//TRANSLIT - ISO-8859-1: EvilFile
US-ASCII//TRANSLIT - ISO-8859-6: EvilFile
US-ASCII//TRANSLIT - CP1256: EvilFile
US-ASCII//TRANSLIT - US-ASCII//TRANSLIT: EvilFile
US-ASCII//TRANSLIT - UTF-8//IGNORE: EvilFile.EXE <<< - See answer below
US-ASCII//TRANSLIT - UTF-8//TRANSLIT: EvilFile
iconv/UTF-8//IGNORE
-------------------
UTF-8//IGNORE - UTF-8: EvilFile?.EXE
UTF-8//IGNORE - ASCII: EvilFile
UTF-8//IGNORE - Windows-1252: EvilFile
UTF-8//IGNORE - ISO-8859-15: EvilFile
UTF-8//IGNORE - ISO-8859-1: EvilFile
UTF-8//IGNORE - ISO-8859-6: EvilFile
UTF-8//IGNORE - CP1256: EvilFile
UTF-8//IGNORE - US-ASCII//TRANSLIT: EvilFile
UTF-8//IGNORE - UTF-8//IGNORE: EvilFile?.EXE
UTF-8//IGNORE - UTF-8//TRANSLIT: EvilFile?.EXE
iconv/UTF-8//TRANSLIT
---------------------
UTF-8//TRANSLIT - UTF-8: EvilFile?.EXE
UTF-8//TRANSLIT - ASCII: EvilFile
UTF-8//TRANSLIT - Windows-1252: EvilFile
UTF-8//TRANSLIT - ISO-8859-15: EvilFile
UTF-8//TRANSLIT - ISO-8859-1: EvilFile
UTF-8//TRANSLIT - ISO-8859-6: EvilFile
UTF-8//TRANSLIT - CP1256: EvilFile
UTF-8//TRANSLIT - US-ASCII//TRANSLIT: EvilFile
UTF-8//TRANSLIT - UTF-8//IGNORE: EvilFile?.EXE
UTF-8//TRANSLIT - UTF-8//TRANSLIT: EvilFile?.EXE
mb_convert_encoding
-------------------
pass - EvilFileâ®.EXE
auto - EvilFile?.EXE
wchar - EvilFileâ®.EXE
byte2be - ???????
byte2le - ???????
byte4be - ?????????????
byte4le - ??????????????????
BASE64 - ??)^q
UUENCODE -
HTML-ENTITIES - EvilFileâ®.EXE
Quoted-Printable - EvilFile?.EXE
7bit - EvilFileâ®.EXE
8bit - EvilFileâ®.EXE
UCS-4 - ?????????????
UCS-4BE - ?????????????
UCS-4LE - ??????????????????
UCS-2 - ???????
UCS-2BE - ???????
UCS-2LE - ???????
UTF-32 - ?
UTF-32BE - ?
UTF-32LE -
UTF-16 - ???????
UTF-16BE - ???????
UTF-16LE - ???????
UTF-8 - EvilFile?.EXE
UTF-7 - EvilFile???.EXE
UTF7-IMAP - EvilFile???.EXE
ASCII - EvilFileâ®.EXE
EUC-JP - EvilFile??EXE
SJIS - EvilFile??.EXE
eucJP-win - EvilFile??EXE
SJIS-win - EvilFile??.EXE
CP932 - EvilFile??.EXE
CP51932 - EvilFile??EXE
JIS - EvilFile???.EXE
ISO-2022-JP - EvilFile???.EXE
ISO-2022-JP-MS - EvilFile???.EXE
Windows-1252 - EvilFile‮.EXE
Windows-1254 - EvilFile‮.EXE
ISO-8859-1 - EvilFileâ®.EXE
ISO-8859-2 - EvilFileâŽ.EXE
ISO-8859-3 - EvilFileâ?.EXE
ISO-8859-4 - EvilFileâŽ.EXE
ISO-8859-5 - EvilFile??.EXE
ISO-8859-6 - EvilFile??.EXE
ISO-8859-7 - EvilFile??.EXE
ISO-8859-8 - EvilFile?®.EXE
ISO-8859-9 - EvilFileâ®.EXE
ISO-8859-10 - EvilFileâ?.EXE
ISO-8859-13 - EvilFile?®.EXE
ISO-8859-14 - EvilFileâ®.EXE
ISO-8859-15 - EvilFileâ®.EXE
ISO-8859-16 - EvilFileâ®.EXE
EUC-CN - EvilFile??EXE
CP936 - EvilFile??EXE
HZ - EvilFile???.EXE
EUC-TW - EvilFile??EXE
BIG-5 - EvilFile??EXE
EUC-KR - EvilFile??EXE
UHC - EvilFile??EXE
ISO-2022-KR - EvilFile???.EXE
Windows-1251 - EvilFile??®.EXE
CP866 - EvilFile???.EXE
KOI8-R - EvilFile???.EXE
KOI8-U - EvilFile???.EXE
ArmSCII-8 - EvilFile?….EXE
CP850 - EvilFileÔÇ«.EXE
JIS-ms - EvilFile???.EXE
CP50220 - EvilFile???.EXE
CP50220raw - EvilFile???.EXE
CP50221 - EvilFile???.EXE
CP50222 - EvilFile???.EXE
I can think of one way (which I don't like): pass the string through utf8_encode() and then use preg_replace() to strip the troublesome characters. But there must be a better/cleaner way.
echo preg_replace('/[^a-z0-9_ \[\]\.\(\)#%&-]/si', '', utf8_encode($evilString)).'<br>';
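One narrower variant of the same idea, offered only as a sketch: instead of whitelisting ASCII, remove just the Unicode bidirectional control characters with a `/u` pattern. This assumes PHP 7 or later and valid UTF-8 input. The helper name `stripBidiControls` is hypothetical, and the choice to return an empty string on invalid input is an assumption, not an established convention.

```php
<?php
// Hypothetical helper (not from the original post): remove Unicode
// bidirectional control characters from a UTF-8 string.
function stripBidiControls(string $name): string
{
    // U+061C Arabic letter mark, U+200E/U+200F directional marks,
    // U+202A-U+202E embeddings and overrides, U+2066-U+2069 isolates.
    $pattern = '/[\x{061C}\x{200E}\x{200F}\x{202A}-\x{202E}\x{2066}-\x{2069}]/u';
    $clean = preg_replace($pattern, '', $name);

    // With the /u modifier, preg_replace() returns null when the
    // subject is not valid UTF-8; treat that as "nothing usable".
    return $clean === null ? '' : $clean;
}

$evilString = "EvilFile\xE2\x80\xAE.EXE";
echo htmlentities(stripBidiControls($evilString), ENT_QUOTES, 'UTF-8'); // EvilFile.EXE
```

Unlike the ASCII whitelist above, this keeps legitimate non-ASCII file names intact and drops only the direction-control characters.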