JavaScript regex to extract the filename from a Content-Disposition header

adn*_*ili 11 javascript regex

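The Content-Disposition header contains a filename that is usually easy to extract, but sometimes the value is wrapped in double quotes, sometimes it is unquoted, and there may be other variants as well. Could someone write a regex that works in all of these cases?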

Content-Disposition: attachment; filename=content.txt

Here are some possible target strings:

attachment; filename=content.txt
attachment; filename*=UTF-8''filename.txt
attachment; filename="EURO rates"; filename*=utf-8''%e2%82%ac%20rates
attachment; filename="omáèka.jpg"
and some other combinations might also be there

Rob*_*bin 26

You could try something in this spirit:

filename[^;=\n]*=((['"]).*?\2|[^;\n]*)

filename      # match filename, followed by
[^;=\n]*      # anything but a ;, a = or a newline
=
(             # first capturing group
    (['"])    # either single or double quote, put it in capturing group 2
    .*?       # anything up until the first...
    \2        # matching quote (single if we found single, double if we find double)
|             # OR
    [^;\n]*   # anything but a ; or a newline
)

Your filename is in the first capturing group (including the surrounding quotes, if the value was quoted): http://regex101.com/r/hJ7tS6

  • /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/.exec(contentDisposition)[1] (4 upvotes)
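
The one-liner in the comment above throws if the header has no filename parameter, and group 1 keeps the quotes when the value was quoted. A minimal sketch that wraps Robin's regex with a null check and quote stripping might look like this (the helper name `extractFilename` is made up for illustration):

```javascript
// Robin's regex, unchanged.
const FILENAME_RE = /filename[^;=\n]*=((['"]).*?\2|[^;\n]*)/;

// Hypothetical helper: returns the filename, or null when none is found.
function extractFilename(contentDisposition) {
  const match = FILENAME_RE.exec(contentDisposition);
  if (!match) return null; // no filename parameter in the header
  // Group 1 still contains the surrounding quotes for quoted values,
  // so strip one matching pair if present.
  return match[1].replace(/^(['"])(.*)\1$/, "$2");
}

console.log(extractFilename("attachment; filename=content.txt"));
console.log(extractFilename('attachment; filename="EURO rates"; filename*=utf-8\'\'%e2%82%ac%20rates'));
```

Note that for a header carrying only `filename*=UTF-8''filename.txt`, this regex leaves the `UTF-8''` prefix in the captured value; it would have to be removed in a separate step.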

小智 7

Slightly modified to match my use case (strips all quotes and the UTF tag):

filename\*?=['"]?(?:UTF-\d['"]*)?([^;\r\n"']*)['"]?;?

https://regex101.com/r/UhCzyI/3

  • Fails if the filename contains a ' (2 upvotes)
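
One way around the apostrophe problem is to stop treating `'` as a quote character and to handle the two parameters separately: read `filename*` as `charset'language'value` and percent-decode it, otherwise fall back to the plain `filename`. The sketch below is an assumption about how that could be done, not part of the answer above; the helper name `filenameFromHeader` is hypothetical, only UTF-8 is decoded, and backslash escapes inside quoted values are not handled.

```javascript
// Hypothetical helper: prefer filename*, fall back to filename.
function filenameFromHeader(header) {
  // Extended form: filename*=charset'language'percent-encoded-value
  const ext = /filename\*\s*=\s*([^']*)'[^']*'([^;\r\n]*)/i.exec(header);
  if (ext && /^utf-?8$/i.test(ext[1])) {
    try {
      return decodeURIComponent(ext[2].trim());
    } catch (e) {
      // Malformed percent-encoding: fall through to the plain parameter.
    }
  }
  // Plain form: double-quoted value, or an unquoted run up to the next ';'.
  const plain = /filename\s*=\s*(?:"([^"]*)"|([^;\r\n]*))/i.exec(header);
  if (!plain) return null;
  return plain[1] !== undefined ? plain[1] : plain[2].trim();
}

console.log(filenameFromHeader('attachment; filename="it\'s.txt"'));
console.log(filenameFromHeader('attachment; filename="EURO rates"; filename*=utf-8\'\'%e2%82%ac%20rates'));
```

Because only double quotes delimit the plain value here, a filename such as `it's.txt` passes through intact.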

小智 6

/filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i

https://regex101.com/r/hJ7tS6/51
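
With this pattern the filename ends up in group 2 (quoted value) or group 3 (unquoted value, after an optional `charset'language'` prefix). A short sketch of how the groups might be read; the function name `pickFilename` is an illustrative assumption:

```javascript
// The regex from this answer, unchanged.
const RE = /filename[^;=\n]*=(?:(\\?['"])(.*?)\1|(?:[^\s]+'.*?')?([^;\n]*))/i;

// Hypothetical helper: returns whichever group participated in the match.
function pickFilename(header) {
  const m = RE.exec(header);
  if (!m) return null;
  // m[2] is set for quoted values, m[3] for unquoted ones.
  return m[2] !== undefined ? m[2] : m[3];
}

console.log(pickFilename("attachment; filename=content.txt"));
console.log(pickFilename("attachment; filename*=UTF-8''filename.txt"));
console.log(pickFilename('attachment; filename="EURO rates"'));
```

The value from a `filename*` parameter is returned still percent-encoded, so it may need a `decodeURIComponent` call afterwards.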

Edit: you can also use this parser: https://github.com/Rob--W/open-in-browser/blob/master/extension/content-disposition.js


kir*_*ipk 5

filename[^;\n]*=(UTF-\d['"]*)?((['"]).*?[.]$\2|[^;\n]*)?

I have upgraded Robin's solution to do two more things:

  1. Capture the filename even if it contains escaped double quotes.
  2. Capture the UTF-8'' part as a separate group.

This is an ECMAScript solution.


https://regex101.com/r/7Csdp4/3/
