What are the undocumented features and limitations of the Windows FINDSTR command?

dbe*_*ham 184 cmd batch-file findstr

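The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a bit more documentation online at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.

There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.

So the question is - What are the undocumented FINDSTR features and limitations?

The purpose of this question is to provide a one stop repository of the many undocumented features so that:

A) Developers can take full advantage of the features that are there.

B) Developers don't waste their time wondering why something doesn't work when it seems like it should.

Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.

Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.

Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.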

dbe*_*ham 272

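Preface
Much of the information in this answer has been gathered from experiments run on a Vista machine. Unless explicitly stated otherwise, I have not confirmed whether the information applies to other Windows versions.

FINDSTR output
The documentation never bothers to explain the output of FINDSTR. It alludes to the fact that matching lines are printed, but nothing more.

The format of matching line output is as follows:

fileName:lineNumber:lineOffset:text

where

fileName: = The name of the file containing the matching line. The file name is not printed if the request was explicitly for a single file, or if searching piped input or redirected input. When printed, the fileName will always include any path information provided. Additional path information will be added if the /S option is used. The printed path is always relative to the provided path, or relative to the current directory if none was provided.

Note - The filename prefix can be avoided when searching multiple files by using the non-standard (and poorly documented) wildcards < and >. The exact rules for how these wildcards work can be found here. Finally, you can look at this example of how the non-standard wildcards work with FINDSTR.

lineNumber: = The line number of the matching line, represented as a decimal value with 1 representing the 1st line of the input. Only printed if the /N option is specified.

lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if the /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.

text = The binary representation of the matching line, including any <CR> and/or <LF>. Nothing is left out of the binary output, so this example that matches all lines will produce an exact binary copy of the original file.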

FINDSTR "^" FILE >FILE_COPY
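To make the lineNumber and lineOffset definitions concrete, here is a minimal Python sketch of how those two prefixes could be computed for a block of bytes. It is only a model of the description above, not FINDSTR's actual code. The function name is made up for illustration, and it assumes a line ends only at <LF>, so a lone <CR> does not start a new line.

```python
def findstr_line_prefixes(data: bytes) -> list[tuple[int, int]]:
    """Return (lineNumber, lineOffset) pairs as /N and /O are described to report them.

    Assumption: a new line starts only after <LF>; a lone <CR> does not break a line.
    """
    prefixes = []
    offset = 0   # byte offset of the current line start, 0 for the first line
    number = 1   # line numbers start at 1
    while offset < len(data):
        prefixes.append((number, offset))
        lf = data.find(b"\n", offset)
        if lf == -1:          # last line has no terminating <LF>
            break
        offset = lf + 1       # next line starts right after the <LF>
        number += 1
    return prefixes


sample = b"first\r\nsecond\r\nthird"
for number, offset in findstr_line_prefixes(sample):
    print(f"{number}:{offset}")   # prints 1:0, 2:7, 3:15
```

Note that the offsets count the <CR><LF> bytes of the preceding lines, which is why the second line starts at 7 rather than 5.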

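Most control characters and many extended ASCII characters display as dots on XP
FINDSTR on XP displays most non-printable control characters from matching lines as dots (periods) on the screen. The following control characters are exceptions; they display as themselves: 0x09 Tab, 0x0A LineFeed, 0x0B Vertical Tab, 0x0C Form Feed, 0x0D Carriage Return.

XP FINDSTR also converts a number of extended ASCII characters to dots. The extended ASCII characters that display as dots on XP are the same as those that are transformed when supplied on the command line. See the "Character limits for command line parameters - Extended ASCII transformation" section, later in this post.

Control characters and extended ASCII are not converted to dots on XP if the output is piped, redirected to a file, or within a FOR IN() clause.

Vista and Windows 7 always display all characters as themselves, never as dots.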

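Return Codes (ERRORLEVEL)

  • 0 (success)
    • A match was found in at least one line of at least one file.
  • 1 (failure)
    • No match was found in any line of any file.
    • Invalid color specified by the /A:xx option
  • 2 (error)
    • Incompatible options /L and /R both specified
    • Missing argument after /A:, /F:, /C:, /D:, or /G:
    • File specified by /F:file or /G:file not found
  • 255 (error)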

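Source of data to search (Updated based on tests with Windows 7)
Findstr can search data from only one of the following sources:

  • filenames specified as arguments and/or using the /F:file option.

  • stdin via redirection findstr "searchString" <file

  • data stream from a pipe type file | findstr "searchString"

Arguments/options take precedence over redirection, which takes precedence over piped data.

File name arguments and /F:file may be combined. Multiple file name arguments may be used. If multiple /F:file options are specified, then only the last one is used. Wildcards are allowed in filename arguments, but not within the file pointed to by /F:file.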

Source of search strings (Updated based on tests with Windows 7)
The /G:file and /C:string options may be combined. Multiple /C:string options may be specified. If multiple /G:file options are specified, then only the last one is used. If either /G:file or /C:string is used, then all non-option arguments are assumed to be files to search. If neither /G:file nor /C:string is used, then the first non-option argument is treated as a space delimited list of search terms.

File names must not be quoted within the file when using the /F:FILE option.
File names may contain spaces and other special characters. Most commands require that such file names are quoted. But the FINDSTR /F:files.txt option requires that filenames within files.txt must NOT be quoted. The file will not be found if the name is quoted.

BUG - Short 8.3 filenames can break the /D and /S options
As with all Windows commands, FINDSTR will attempt to match both the long name and the short 8.3 name when looking for files to search. Assume the current folder contains the following non-empty files:

b1.txt
b.txt2
c.txt

The following command will successfully find all 3 files:

findstr /m "^" *.txt

b.txt2 matches because the corresponding short name B9F64~1.TXT matches. This is consistent with the behavior of all other Windows commands.

But a bug with the /D and /S options causes the following commands to only find b1.txt

findstr /m /d:. "^" *.txt
findstr /m /s "^" *.txt

The bug prevents b.txt2 from being found, as well as all file names that sort after b.txt2 within the same directory. Additional files that sort before, like a.txt, are found. Additional files that sort later, like d.txt, are missed once the bug has been triggered.

Each directory searched is treated independently. For example, the /S option would successfully begin searching in a child folder after failing to find files in the parent, but once the bug causes a short file name to be missed in the child, then all subsequent files in that child folder would also be missed.

The commands work bug free if the same file names are created on a machine that has NTFS 8.3 name generation disabled. Of course b.txt2 would not be found, but c.txt would be found properly.

Not all short names trigger the bug. All instances of bugged behavior I have seen involve an extension that is longer than 3 characters with a short 8.3 name that begins the same as a normal name that does not require an 8.3 name.

The bug has been confirmed on XP, Vista, and Windows 7.

Non-Printable characters and the /P option
The /P option causes FINDSTR to skip any file that contains any of the following decimal byte codes:
0-7, 14-25, 27-31.

Put another way, the /P option will only skip files that contain non-printable control characters. Control characters are codes less than or equal to 31 (0x1F). FINDSTR treats the following control characters as printable:

 8  0x08  backspace
 9  0x09  horizontal tab
10  0x0A  line feed
11  0x0B  vertical tab
12  0x0C  form feed
13  0x0D  carriage return
26  0x1A  substitute (end of text)

All other control characters are treated as non-printable, the presence of which causes the /P option to skip the file.
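The byte list above is easy to turn into a pre-check. The following Python sketch (a hypothetical helper, not part of FINDSTR) reports whether a file's content would cause /P to skip it, using only the decimal codes listed in this section.

```python
# Byte values that make /P skip a file, per the list above: 0-7, 14-25, 27-31.
NON_PRINTABLE = frozenset([*range(0, 8), *range(14, 26), *range(27, 32)])


def skipped_by_p_option(data: bytes) -> bool:
    """Return True if the content contains any byte that /P treats as non-printable."""
    return any(byte in NON_PRINTABLE for byte in data)


print(skipped_by_p_option(b"plain text\r\n"))   # False
print(skipped_by_p_option(b"has a NUL\x00"))    # True
```

Backspace (8), the whitespace controls (9-13), and substitute (26) are deliberately absent from the set, matching the table of characters FINDSTR treats as printable.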

Piped and Redirected input may have <CR><LF> appended
If the input is piped in and the last character of the stream is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. This has been confirmed on XP, Vista and Windows 7. (I used to think that the Windows pipe was responsible for modifying the input, but I have since discovered that FINDSTR is actually doing the modification.)

The same is true for redirected input on Vista. If the last character of a file used as redirected input is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. However, XP and Windows 7 do not alter redirected input.

FINDSTR hangs on XP and Windows 7 if redirected input does not end with <LF>
This is a nasty "feature" on XP and Windows 7. If the last character of a file used as redirected input does not end with <LF>, then FINDSTR will hang indefinitely once it reaches the end of the redirected file.
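One way to stay clear of the hang is to make sure the data ends with a line terminator before it is ever used as redirected input. Here is a minimal Python sketch of that idea. The function name is made up, and appending <CR><LF> (rather than a bare <LF>) is my choice for Windows text files, not something FINDSTR requires.

```python
def append_eol_if_missing(data: bytes) -> bytes:
    """Return data that ends with <LF>, appending <CR><LF> when it does not.

    Empty input is returned unchanged; the text above does not say
    how FINDSTR treats an empty redirected file.
    """
    if data and not data.endswith(b"\n"):
        return data + b"\r\n"
    return data


print(append_eol_if_missing(b"last line"))      # b'last line\r\n'
print(append_eol_if_missing(b"last line\r\n"))  # b'last line\r\n'
```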

Last line of Piped data may be ignored if it consists of a single character
If the input is piped in and the last line consists of a single character that is not followed by <LF>, then FINDSTR completely ignores the last line.

Example - The first command with a single character and no <LF> fails to match, but the second command with 2 characters works fine, as does the third command that has one character with terminating newline.

> set /p "=x" <nul | findstr "^"

> set /p "=xx" <nul | findstr "^"
xx

> echo x| findstr "^"
x

Reported by DosTips user Sponge Belly in the thread "new findstr bug". Confirmed on XP, Windows 7 and Windows 8. Haven't heard about Vista yet. (I no longer have Vista to test).

Option syntax
Options can be prefixed with either / or -. Options may be concatenated after a single / or -. However, the concatenated option list may contain at most one multi-character option such as OFF or F:, and the multi-character option must be the last option in the list.

The following are all equivalent ways of expressing a case insensitive regex search for any line that contains both "hello" and "goodbye" in any order

  • /i /r /c:"hello.*goodbye" /c:"goodbye.*hello"

  • -i -r -c:"hello.*goodbye" /c:"goodbye.*hello"

  • /irc:"hello.*goodbye" /c:"goodbye.*hello"

Search String length limits
On Vista the maximum allowed length for a single search string is 511 bytes. If any search string exceeds 511 then the result is a FINDSTR: Search string too long. error with ERRORLEVEL 2.

When doing a regular expression search, the maximum search string length is 254. A regular expression with length between 255 and 511 will result in a FINDSTR: Out of memory error with ERRORLEVEL 2. A regular expression length >511 results in the FINDSTR: Search string too long. error.

On Windows XP the search string length limit is apparently shorter. See the question Findstr error: "Search string too long": How to extract and match substring in "for" loop? The XP limit is 127 bytes for both literal and regex searches.
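The Vista limits in this section boil down to two thresholds. The Python sketch below simply encodes them; the function name and the "ok" return value are made up for illustration, and the XP limit of 127 bytes is not modeled because the text does not say which error XP reports.

```python
def vista_search_string_result(length: int, regex: bool) -> str:
    """Predict the Vista outcome for one search string of the given byte length."""
    if length > 511:
        return "FINDSTR: Search string too long."
    if regex and length > 254:
        # Regex strings of 255 to 511 bytes fail differently from longer ones.
        return "FINDSTR: Out of memory"
    return "ok"


for length in (254, 255, 511, 512):
    print(length, vista_search_string_result(length, regex=True))
```

Both error cases are described above as setting ERRORLEVEL 2.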

Line Length limits
Files specified as a command line argument or via the /F:FILE option have no known line length limit. Searches were successfully run against a 128MB file that did not contain a single <LF>.

Piped data and Redirected input is limited to 8191 bytes per line. This limit is a "feature" of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >= 8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.
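Before piping or redirecting a file into FINDSTR, it can be worth checking for lines that exceed the limit. Here is a minimal Python sketch, with a made-up function name. It assumes the length is counted up to, but not including, the <LF>; the text above does not say whether a trailing <CR> counts, so this version counts it and may flag a line that is exactly at the boundary.

```python
PIPE_LINE_LIMIT = 8191  # bytes per line for piped or redirected input, from the text above


def overlong_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines longer than PIPE_LINE_LIMIT bytes.

    Assumption: everything before the <LF> counts toward the length.
    """
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()   # the piece after a final <LF> is not a line
    return [n for n, line in enumerate(lines, 1) if len(line) > PIPE_LINE_LIMIT]


data = b"a" * 8191 + b"\n" + b"b" * 8192 + b"\n"
print(overlong_lines(data))   # [2]
```

Files passed as arguments or through /F:FILE do not need this check, since no line length limit is known for them.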

Default type of search: Literal vs Regular Expression
/C:"string" - The default is /L literal. Explicitly combining the /L option with /C:"string" certainly works but is redundant.

"string argument" - The default depends on the content of the very first search string. (Remember that <space> is used to delimit search strings.) If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals. For example, "51.4 200" will be treated as two regular expressions because the first string contains an un-escaped dot, whereas "200 51.4" will be treated as two literals because the first string does not contain any meta-characters.

/G:file - The default depends on the content of the first non-empty line in the file. If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals.

Recommendation - Always explicitly specify /L literal option or /R regular expression option when using "string argument" or /G:file.
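The rule for "string argument" can be approximated in a few lines of Python. This is only a rough sketch under stated assumptions: the function name is invented, the set of metacharacters checked (. * ^ $ [ plus \< and \>) is my reading of the HELP text, and it does not check whether the first string is a valid regex, which the real rule also requires.

```python
def first_string_looks_like_regex(search_argument: str) -> bool:
    """Guess whether FINDSTR would treat a space delimited argument as regexes.

    Only the first search string is examined, as described above.
    """
    parts = search_argument.split()
    first = parts[0] if parts else ""
    i = 0
    while i < len(first):
        ch = first[i]
        if ch == "\\":
            if first[i + 1 : i + 2] in ("<", ">"):
                return True    # \< and \> are word boundary metacharacters
            i += 2             # any other \x escapes the next character
            continue
        if ch in ".*^$[":
            return True        # un-escaped metacharacter found
        i += 1
    return False


print(first_string_looks_like_regex("51.4 200"))   # True
print(first_string_looks_like_regex("200 51.4"))   # False
```

The two calls reproduce the "51.4 200" versus "200 51.4" example from the text, which is exactly why an explicit /L or /R is the safer habit.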

BUG - Specifying multiple literal search strings can give unreliable results

The following simple FINDSTR example fails to find a match, even though it should.

echo ffffaaa|findstr /l "ffffaaa faffaffddd"

This bug has been confirmed on Windows Server 2003, Windows XP, Vista, and Windows 7.

Based on experiments, FINDSTR may fail if all of the following conditions are met:

  • The search is using multiple literal search strings
  • The search strings are of different lengths
  • A short search string has some amount of overlap with a longer search string
  • The search is case sensitive (no /I option)

In every failure I have seen, it is always one of the shorter search strings that fails.

For more info see the question "Why doesn't this FINDSTR example with multiple literal search strings find a match?"

Quotes and backslashes within command line arguments
Note - User MC ND's comments reflect the actual horrifically complicated rules for this section. There are 3 distinct parsing phases involved:

  • First cmd.exe may require some quotes to be escaped as ^" (really nothing to do with FINDSTR)
  • Next FINDSTR uses the pre 2008 MS C/C++ argument parser, which has special rules for " and \
  • After the argument parser finishes, FINDSTR additionally treats \ followed by an alpha-numeric character as literal, but \ followed by a non-alpha-numeric character as an escape character

The remainder of this highlighted section is not 100% correct. It can serve as a guide for many situations, but the above rules are required for total understanding.

Escaping Quote within command line search strings
Quotes within command line search strings must be escaped with backslash like \". This is true for both literal and regex search strings. This information has been confirmed on XP, Vista, and Windows 7.

Note: The quote may also need to be escaped for the CMD.EXE parser, but this has nothing to do with FINDSTR. For example, to search for a single quote you could use:

FINDSTR \^" file && echo found || echo not found

Escaping Backslash within command line literal search strings
Backslash in a literal search string can normally be represented as \ or as \\. They are typically equivalent. (There may be unusual cases in Vista where the backslash must always be escaped, but I no longer have a Vista machine to test).

But there are some special cases:

When searching for consecutive backslashes, all but the last must be escaped. The last backslash may optionally be escaped.

  • \\ can be coded as \\\ or \\\\
  • \\\ can be coded as \\\\\ or \\\\\\

Searching for one or more backslashes before a quote is bizarre. Logic would suggest that the quote must be escaped, and each of the leading backslashes would need to be escaped, but this does not work! Instead, each of the leading backslashes must be double escaped, and the quote is escaped normally:

  • \" must be coded as \\\\\"
  • \\" must be coded as \\\\\\\\\"

As previously noted, one or more escaped quotes may also require escaping with ^ for the CMD parser

The info in this section has been confirmed on XP and Windows 7.
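The simplified rules for literal search strings can be written down as a small encoder. The Python sketch below follows only the rules stated in this section (escape every backslash, double escape backslashes that precede a quote, escape the quote itself). It is a hypothetical helper: it does not add the ^ escaping that CMD.EXE may need, and it ignores the extra parsing phases noted at the top of this section, so treat the output as a starting point to test rather than a guarantee.

```python
import re


def encode_literal_for_cmdline(text: str) -> str:
    """Encode quotes and backslashes in a literal search string per the rules above."""

    def encode(match: re.Match) -> str:
        if match.group(2) is not None:
            # Backslash run not followed by a quote: escape each backslash once.
            return "\\" * (2 * len(match.group(2)))
        # Backslash run (possibly empty) followed by a quote:
        # double escape each backslash, then escape the quote.
        return "\\" * (4 * len(match.group(1))) + '\\"'

    return re.sub(r'(\\*)"|(\\+)', encode, text)


print(encode_literal_for_cmdline(r'\"'))    # \\\\\"
print(encode_literal_for_cmdline(r'a\\b'))  # a\\\\b
```

The first call reproduces the \" example from the list above: one backslash becomes four, followed by the escaped quote.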

Escaping Backslash within command line regex search strings

  • Vista only: Backslash in a regex must be either double escaped like \\\\, or else single escaped within a character class set like [\\]

  • XP and Windows 7: Backslash in a regex can always be represented as [\\]. It can normally be represented as \\. But this never works if the backslash precedes an escaped quote.

    One or more backslashes before an escaped quote must either be double escaped, or else coded as [\\]

    • \" may be coded as \\\\\" or [\\]\"
    • \\" may be coded as \\\\\\\\\" or [\\][\\]\" or \\[\\]\"

Escaping Quote and Backslash within /G:FILE literal search strings
Standalone quotes and backslashes within a literal search string file specified by /G:file need not be escaped, but they can be.

" and \" are equivalent.

\ and \\ are equivalent.

If the intent is to find \\, then at least the leading backslash must be escaped. Both \\\ and \\\\ work.

If the intent is to find \", then at least the leading backslash must be escaped. Both \\" and \\\" work.

Escaping Quote and Backslash within /G:FILE regex search strings
This is the one case where the escape sequences work as expected based on the documentation. Quote is not a regex metacharacter, so it need not be escaped (but can be). Backslash is a regex metacharacter, so it must be escaped.

Character limits for command line parameters - Extended ASCII transformation
The null character (0x00) cannot appear in any string on the command line. Any other single byte character can appear in the string (0x01 - 0xFF). However, FINDSTR converts many extended ASCII characters it finds within command line parameters into other characters. This has a major impact in two ways:

1) Many extended ASCII characters will not match themselves if used as a search string on the command line. This limitation is the same for literal and regex searches. If a search string must contain extended ASCII, then the /G:FILE option should be used instead.

2) FINDSTR may fail to find a file if the name contains extended ASCII characters and the file name is specified on the command line. If a file to be searched contains extended ASCII in the name, then the /F:FILE option should be used instead.

Here is a complete list of extended ASCII character transformations that FINDSTR performs on command line strings. Each character is represented as the decimal byte code value. The first code represents the character as supplied on the command line, and the second code represents the character it is transformed into. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.

158 treated as 080     199 treated as 221     226 treated as 071
169 treated as 170     200 treated as 043     227 treated as 112
176 treated as 221     201 treated as 043     228 treated as 083
177 treated as 221     202 treated as 045     229 treated as 115
178 treated as 221     203 treated as 045     231 treated as 116
179 treated as 221     204 treated as 221     232 treated as 070
180 treated as 221     205 treated as 045     233 treated as 084
181 treated as 221     206 treated as 043     234 treated as 079
182 treated as 221     207 treated as 045     235 treated as 100
183 treated as 043     208 treated as 045     236 treated as 056
184 treated as 043     209 treated as 045     237 treated as 102
185 treated as 221     210 treated as 045     238 treated as 101
186 treated as 221     211 treated as 043     239 treated as 110
187 treated as 043     212 treated as 043     240 treated as 061
188 treated as 043     213 treated as 043     242 treated as 061
189 treated as 043     214 treated as 043     243 treated as 061
190 treated as 043     215 treated as 043     244 treated as 040
191 treated as 043     216 treated as 043     245 treated as 041
192 treated as 043     217 treated as 043     247 treated as 126
193 treated as 045     218 treated as 043     249 treated as 250
194 treated as 045     219 treated as 221     251 treated as 118
195 treated as 043     220 treated as 095     252 treated as 110
196 treated as 045     222 treated as 221     254 treated as 221
197 treated as 043     223 treated as 095
198 treated as 221     224 treated as 097

Any character >0 not in the list above is treated as itself, including <CR> and <LF>. The easiest way to include odd characters like <CR> and <LF> is to get them into an environment variable and use delayed expansion within the command line argument.

Chara

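  • Outstanding completeness. If only all answers on the internet were like this. (42 upvotes)
  • +1, I love this kind of in-depth analysis (8 upvotes)
  • Edit - Described the display of control characters as dots on XP. Also documented the bugged `/S` and `/D` options stemming from short 8.3 file names. (2 upvotes)
  • FYI (I don't know whether you already know this, but I didn't see it mentioned in your answer). The reason for most of the *"weird"* backslash+quote rules is that `findstr` is an `exe` file, and [some rules](http://www.daviddeley.com/autohotkey/parameters/parameters.htm#WIN) control how the argument tokenizer handles backslash+quote, but once the arguments are parsed, the `findstr` code has a *string* that needs to be compiled into a *regular expression* instance. So some backslashes are interpreted twice. (2 upvotes)
  • A literal backslash doesn't need any escaping (`findstr /l \ *.cmd`), but a double quoted literal backslash needs it (`findstr /l "\\" *.cmd`) to avoid escaping the quote. But the `findstr` string parser will handle a literal backslash followed by a *non alphanumeric character* (`[a-zA-Z0-9]`) as an escape character: `findstr /l /c:"\ o" *.cmd` searches for a space followed by an `o` character, because the backslash escapes the space, but `findstr /l /c:"\w" *.cmd` searches for a backslash followed by a `w` character (it is alphanumeric, so it is not escaped) (2 upvotes)
  • @dbenham, I wonder whether the `/A:` option should be covered briefly? The `FINDSTR` help doesn't specify that it only color codes the file names when searching multiple files. From a first reading of the help, one could infer that it changes the color of the found strings in the output. I suppose that technically it isn't an undocumented feature or limitation, but it seems odd that Microsoft doesn't specifically point this out. The documentation on **SS64** does. (2 upvotes)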

dbe*_*ham 62

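Answer continued from part 1 above - I've run into the 30,000 character answer limit :-(

Limited Regular Expressions (regex) Support
FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.

Beyond that, the regex expressions that are supported are implemented in a completely non-standard manner, so results can differ from what would be expected coming from something like grep or perl.

Regex Line Position anchors ^ and $
^ matches the beginning of the input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.

$ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.

Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.

Any search string with characters before ^ or after $ will always fail to find a match.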
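The practical effect of the $ rule is easy to model. The Python sketch below (a hypothetical helper, and only a model of the behavior described above) treats a regex of the form text$, where text contains no metacharacters, as matching only when text is immediately followed by <CR>.

```python
def dollar_anchor_matches(line: bytes, text: bytes) -> bool:
    """Model the regex `text$`: it matches only where text is directly followed by <CR>."""
    return text + b"\r" in line


print(dollar_anchor_matches(b"the end\r\n", b"end"))  # True:  Windows EOL
print(dollar_anchor_matches(b"the end\n", b"end"))    # False: Unix EOL, no <CR>
print(dollar_anchor_matches(b"the end", b"end"))      # False: last line without EOL
```

This is why the same search can succeed on a file argument yet behave differently on piped input, where FINDSTR may have appended its own <CR><LF>.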

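Positional Options /B /E /X
The positional options work the same as ^ and $, except they also work for literal search strings.

/B functions the same as ^ at the start of a regex search string.

/E functions the same as $ at the end of a regex search string.

/X functions the same as having both ^ at the beginning and $ at the end of a regex search string.

Regex word boundary
\< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.

\> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.

Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.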

001   028   063   179   204   230
002   029   064   180   205   231
003   030   091   181   206   232
004   031   092   182   207   233
005   032   093   183   208   234
006   033   094   184   209   235
007   034   096   185   210   236
008   035   123   186   211   237
009   036   124   187   212   238
011   037   125   188   213   239
012   038   126   189   214   240
014   039   127   190   215   241
015   040   155   191   216   242
016   041   156   192   217   243
017   042   157   193   218   244
018   043   158   194   219   245
019   044   168   195   220   246
020   045   169   196   221   247
021   046   170   197   222   248
022   047   173   198   223   249
023   058   174   199   224   250
024   059   175   200   226   251
025   060   176   201   227   254
026   061   177   202   228   255
027   062   178   203   229

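Regex character class ranges [x-y]
Character class ranges do not work as expected. See this question: Why does findstr not handle case properly (in some circumstances)?, along with this answer: https://stackoverflow.com/a/8767815/1012053.

The problem is that FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.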
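A cut-down model shows why [A-Z] behaves this way. The Python sketch below keeps only the 52 English letters from the collation sequence, in the order a A b B ... z Z, and ignores the accented characters that FINDSTR interleaves between them. The names are made up for illustration.

```python
import string

# Simplified collation order, English letters only: a A b B c C ... z Z
COLLATION = "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def in_findstr_range(ch: str, start: str, end: str) -> bool:
    """Return True if ch falls inside [start-end] under the simplified collation."""
    return COLLATION.index(start) <= COLLATION.index(ch) <= COLLATION.index(end)


matched = [c for c in string.ascii_letters if in_findstr_range(c, "A", "Z")]
print("a" in matched, "b" in matched, len(matched))   # False True 51
```

Because "a" sorts just before "A", it is the only English letter outside [A-Z]; by the same logic [a-z] covers everything except "Z".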

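Below is a complete list of all characters supported by FINDSTR, sorted in the collation sequence used by FINDSTR to establish regex character class ranges. The characters are represented as their decimal byte code value. I believe the collation sequence makes the most sense if the characters are viewed using code page 437. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.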

001
002
003
004
005
006
007
008
014
015
016
017
018           
019
020
021
022
023
024
025
026
027
028
029
030
031
127
039
045
032
255
009
010
011
012
013
033
034
035
036
037
038
040
041
042
044
046
047
058
059
063
064
091
092
093
094
095
096
123
124
125
126
173
168
155
156
157
158
043
249
060
061
062
241
174
175
246
251
239
247
240
243
242
169
244
245
254
196
205
179
186
218
213
214
201
191
184
183
187
192
212
211
200
217
190
189
188
195
198
199
204
180
181
182
185
194
209
210
203
193
207
208
202
197
216
215
206
223
220
221
222
219
176
177
178
170
248
230
250
048
172
171
049
050
253
051
052
053
054
055
056
057
236
097
065
166
160
133
131
132
142
134
143
145
146
098
066
099
067
135
128
100
068
101
069
130
144
138
136
137
102
070
159
103
071
104
072
105
073
161
141
140
139
106
074
107
075
108
076
109
077
110
252
078
164
165
111
079
167
162
149
147
148
153
112
080
113
081
114
082
115
083
225
116
084
117
085
163
151
150
129
154
118
086
119
087
120
088
121
089
152
122
090
224
226
235
238
233
227
229
228
231
237
232
234

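Regex character class term limit and BUG
Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it also fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail: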

echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]

This bug was reported by DosTips user Judago here. It has been confirmed on XP, Vista, and Windows 7.
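Since exceeding the limit crashes FINDSTR rather than producing a clean error, a script that builds regexes dynamically may want to count the character class terms first. Here is a rough Python sketch with made-up names. It counts each un-escaped [ ... ] group as one term and does not handle a literal ] placed inside a class, so it is an approximation.

```python
MAX_CLASS_TERMS = 15  # limit described above


def count_class_terms(pattern: str) -> int:
    """Count un-escaped [...] character class terms in a FINDSTR regex."""
    count = 0
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2                      # skip the escaped character
            continue
        if ch == "[":
            count += 1
            close = pattern.find("]", i + 1)
            if close == -1:
                break                   # unterminated class, stop counting
            i = close + 1
            continue
        i += 1
    return count


pattern = "[0-9]" * 16
print(count_class_terms(pattern), count_class_terms(pattern) > MAX_CLASS_TERMS)  # 16 True
```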

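Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)
Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence, between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.

The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message - "The process tried to write to a non-existent pipe."

I no longer have access to a Vista machine, so I haven't been able to test on Vista.

Regex bug: . and [^anySet] can match End-Of-File
The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.

For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following: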

findstr /r x......... test.txt

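This bug has been confirmed on XP and Win7.

The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.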


Dis*_*ned 7

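findstr sometimes hangs unexpectedly when searching large files.

I haven't confirmed the exact conditions or boundary sizes. I suspect any file larger than 2GB may be at risk.

I have had mixed experiences with this, so it is about more than just file size. This looks like it may be a variation on "FINDSTR hangs on XP and Windows 7 if redirected input does not end with LF", but as demonstrated, this particular problem manifests when input is not redirected.

The following command line session (Windows 7) demonstrates how findstr can hang when searching a 3GB file.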

C:\Data\Temp\2014-04>echo 1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890> T100B.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,10) do @type T100B.txt >> T1KB.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1KB.txt >> T1MB.txt

C:\Data\Temp\2014-04>for /L %i in (1,1,1000) do @type T1MB.txt >> T1GB.txt

C:\Data\Temp\2014-04>echo find this line>> T1GB.txt

C:\Data\Temp\2014-04>copy T1GB.txt + T1GB.txt + T1GB.txt T3GB.txt
T1GB.txt
T1GB.txt
T1GB.txt
        1 file(s) copied.

C:\Data\Temp\2014-04>dir
 Volume in drive C has no label.
 Volume Serial Number is D2B2-FFDF

 Directory of C:\Data\Temp\2014-04

2014/04/08  04:28 PM    <DIR>          .
2014/04/08  04:28 PM    <DIR>          ..
2014/04/08  04:22 PM               102 T100B.txt
2014/04/08  04:28 PM     1 020 000 016 T1GB.txt
2014/04/08  04:23 PM             1 020 T1KB.txt
2014/04/08  04:23 PM         1 020 000 T1MB.txt
2014/04/08  04:29 PM     3 060 000 049 T3GB.txt
               5 File(s)  4 081 021 187 bytes
               2 Dir(s)  51 881 050 112 bytes free
C:\Data\Temp\2014-04>rem Findstr on the 1GB file does not hang

C:\Data\Temp\2014-04>findstr "this" T1GB.txt
find this line

C:\Data\Temp\2014-04>rem On the 3GB file, findstr hangs and must be aborted... even though it clearly reaches end of file

C:\Data\Temp\2014-04>findstr "this" T3GB.txt
find this line
find this line
find this line
^C
C:\Data\Temp\2014-04>

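Note, I've verified in a hex editor that all lines are terminated with CRLF. The only anomaly is that the file is terminated with 0x1A due to the way copy works. Note however, that this anomaly doesn't cause a problem on "small" files.

With additional testing I have confirmed the following:

  • Using copy with the /b option for binary files prevents the addition of the 0x1A character, and findstr doesn't hang on the 3GB file.
  • Terminating the 3GB file with a different character also causes findstr to hang.
  • The 0x1A character doesn't cause any problems on a "small" file. (Similarly for other terminating characters.)
  • Adding CRLF after 0x1A resolves the problem. (LF by itself would probably suffice.)
  • Using type to pipe the file into findstr works without hanging. (This might be due to a side effect of either type or | that inserts an additional End Of Line.)
  • Using redirected input < also causes findstr to hang. But this is expected; as explained in dbenham's post: "redirected input must end in LF".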


Aac*_*ini 6

When several commands are enclosed in parentheses and files are redirected for the whole block:

< input.txt (
   command1
   command2
   . . .
) > output.txt

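... then the files remain open as long as the commands in the block are active, so the commands may move the file pointer of the redirected files. Both the MORE and FIND commands move the Stdin file pointer to the beginning of the file before processing it, so the same file may be processed several times inside the block. For example, this code: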

more < input.txt >  output.txt
more < input.txt >> output.txt

... produces the same result as this one:

< input.txt (
   more
   more
) > output.txt

This code:

find    "search string" < input.txt > matchedLines.txt
find /V "search string" < input.txt > unmatchedLines.txt

... produces the same result as this one:

< input.txt (
   find    "search string" > matchedLines.txt
   find /V "search string" > unmatchedLines.txt
)

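FINDSTR is different; it does not move the Stdin file pointer from its current position. For example, this code inserts a new line after a search line: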

call :ProcessFile < input.txt
goto :EOF

:ProcessFile
   rem Read the next line from Stdin and copy it
   set /P line=
   echo %line%
   rem Test if it is the search line
   if "%line%" neq "search line" goto ProcessFile
rem Insert the new line at this point
echo New line
rem And copy the rest of lines
findstr "^"
exit /B

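We can make good use of this feature with the aid of an auxiliary program that lets us move the file pointer of a redirected file, as shown in this example.

This behavior was first reported by jeb in this post.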


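EDIT 2018-08-18: New FINDSTR bug reported

The FINDSTR command has a strange bug that happens when the command is used to show characters in color and the output of such a command is redirected to the CON device. For details on how to use the FINDSTR command to show text in color, see this topic.

When the output of this form of FINDSTR command is redirected to CON, something strange happens after the text is output in the desired color: all the text after it is output as "invisible" characters, although a more precise description is that the text is output as black text on a black background. The original text will appear if you use the COLOR command to reset the foreground and background colors of the entire screen. However, while the text is "invisible" we can execute a SET /P command, so none of the characters entered will appear on the screen. This behavior can be used to enter passwords.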

@echo off
setlocal

set /P "=_" < NUL > "Enter password"
findstr /A:1E /V "^$" "Enter password" NUL > CON
del "Enter password"
set /P "password="
cls
color 07
echo The password read is: "%password%"