How does the Windows Command Interpreter (CMD.EXE) parse scripts?

Ben*_*oit 132 windows parsing cmd batch-file variable-expansion

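I came across ss64.com, which provides good help on writing batch scripts that the Windows Command Interpreter will run.

However, I have been unable to find a good explanation of the grammar of batch scripts, how things expand or do not expand, and how to escape things.

Here are sample questions that I have not been able to solve:

  • How is the quote system managed? I made a TinyPerl script
    (foreach $i (@ARGV) { print '*' . $i ; }), compiled it and called it this way:
    • my_script.exe "a ""b"" c" → output is *a "b*c
    • my_script.exe """a b c""" → output is *"a*b*c"
  • How does the internal echo command work? What is expanded inside that command?
  • Why do I have to use for [...] %%I in file scripts, but for [...] %I in interactive sessions?
  • What are the escape characters, and in what context? How do I escape a percent sign? For example, how can I echo %PROCESSOR_ARCHITECTURE% literally? I found that echo.exe %""PROCESSOR_ARCHITECTURE% works; is there a better solution?
  • How do pairs of % match? Example:
    • set b=a, echo %a %b% c% → %a a c%
    • set a =b, echo %a %b% c% → bb c%
  • How do I ensure that a variable is passed to a command as a single argument if the variable contains double quotes?
  • How are variables stored when using the set command? For example, if I do set a=a" b and then echo.%a%, I obtain a" b. If I use echo.exe from UnxUtils, however, I get a b. Why does %a% expand differently?

Thank you for any light you can shed.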

jeb*_*jeb 179

I performed many experiments to investigate the grammar of batch scripts. I also investigated the differences between batch and command line mode.

The Batch Line Parser:

Processing a line of code in a batch file involves multiple phases.

Here is a brief overview of the various phases:

Phase 0) Read Line:

Phase 1) Percent Expansion:

Phase 1.5) Remove <CR>: Remove all Carriage Return (0x0D) characters

Phase 2) Process special characters, tokenize, and build a cached command block: This is a complex process that is affected by things such as quotes, special characters, token delimiters, and caret escapes.

Phase 3) Echo the parsed command(s) Only if the command block did not begin with @, and ECHO was ON at the start of the preceding step.

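Phase 4) FOR %X variable expansion: Only if a FOR command is active and the commands after DO are being processed.

Phase 5) Delayed Expansion: Only if delayed expansion is enabled

Phase 5.3) Pipe processing: Only if commands are on either side of a pipe

Phase 5.5) Execute Redirection:

Phase 6) CALL processing/Caret doubling: Only if the command token is CALL

Phase 7) Execute: The command is executed


Here are details for each phase:

Note that the phases described below are only a model of how the batch parser works. The actual cmd.exe internals may not reflect these phases. But this model is effective at predicting the behavior of batch scripts.

Phase 0) Read Line: Read a line of input.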

  • When reading a line to be parsed as a command, <Ctrl-Z> (0x1A) is read as <LF> (LineFeed 0x0A)
  • When GOTO or CALL reads lines while scanning for a :label, <Ctrl-Z> is treated as itself - it is not converted to <LF>

Phase 1) Percent Expansion:

  • A double %% is replaced by a single %
  • Expansion of argument variables (%*, %1, %2, etc.)
  • Expansion of %var%; if var does not exist, replace it with nothing
  • For a complete explanation, read the first half of dbenham's answer in this same thread: Percent Phase

Phase 1.5) Remove <CR>: Remove all Carriage Returns (0x0D) from the line

Phase 2) Process special characters, tokenize, and build a cached command block: This is a complex process that is affected by things such as quotes, special characters, token delimiters, and caret escapes. What follows is an approximation of this process.

There are some concepts that are important throughout this phase.

  • A token is simply a string of characters that is treated as a unit.
  • Tokens are separated by token delimiters. The standard token delimiters are <space> <tab> ; , = <0x0B> <0x0C> and <0xFF>
    Consecutive token delimiters are treated as one - there are no empty tokens between token delimiters
  • There are no token delimiters within a quoted string. The entire quoted string is always treated as part of a single token. A single token may consist of a combination of quoted strings and unquoted characters.

The following characters may have special meaning in this phase, depending on context: = <0x0B> <0x0C> <0xFF> ^ ( @ & | < > <LF> <space> <tab> ; ,

Look at each character from left to right:

  • If it is a caret (^), the next character is escaped, and the escaping caret is removed. Escaped characters lose all special meaning (except for <LF>).
  • If it is a quote ("), toggle the quote flag. If the quote flag is active, then only " and <LF> are special. All other characters lose their special meaning until the next quote toggles the quote flag off. It is not possible to escape the closing quote. All quoted characters are always within the same token.
  • <LF> always turns off the quote flag. Other behaviors vary depending on context, but quotes never alter the behavior of <LF>.
    • Escaped <LF>
      • <LF> is stripped
      • The next character is escaped. If at the end of line buffer, then the next line is read and appended to the current one before escaping the next character. If the next character is <LF>, then it is treated as a literal, meaning this process is not recursive.
    • Unescaped <LF> not within parentheses
      • <LF> is stripped and parsing of the current line is terminated.
      • Any remaining characters in the line buffer are simply ignored.
    • Unescaped <LF> within a FOR IN parenthesized block
      • <LF> is converted into a <space>
      • If at the end of the line buffer, then the next line is read and appended to the current one.
    • Unescaped <LF> within a parenthesized command block
      • <LF> is converted into <LF><space>, and the <space> is treated as part of the next line of the command block.
      • If at the end of line buffer, then the next line is read and appended to the space.
  • If it is one of the special characters & | < or >, split the line at this point in order to handle pipes, command concatenation, and redirection.
    • In the case of a pipe (|), each side is a separate command (or command block) that gets special handling in phase 5.3
    • In the case of &, &&, or || command concatenation, each side of the concatenation is treated as a separate command.
    • In the case of <, <<, >, or >> redirection, the redirection clause is parsed, temporarily removed, and then appended to the end of the current command. A redirection clause consists of an optional file handle digit, the redirection operator, and the redirection destination token.
      • If the token that precedes the redirection operator is a single digit, then the digit specifies the file handle to be redirected. If the handle token is not found, then output redirection defaults to 1 (stdout), and input redirection defaults to 0 (stdin).
  • If the very first token for this command (prior to moving redirection to the end) begins with @, then the @ has special meaning. (@ is not special in any other context)
    • The special @ is removed.
    • If ECHO is ON, then this command, along with any following concatenated commands on this line, are excluded from the phase 3 echo. If the @ is before an opening (, then the entire parenthesized block is excluded from the phase 3 echo.
  • Process parenthesis (provides for compound statements across multiple lines):
    • If the parser is not looking for a command token, then ( is not special.
    • If the parser is looking for a command token and finds (, then start a new compound statement and increment the parenthesis counter
    • If the parenthesis counter is > 0 then ) terminates the compound statement and decrements the parenthesis counter.
    • If the line end is reached and the parenthesis counter is > 0 then the next line will be appended to the compound statement (starts again with phase 0)
    • If the parenthesis counter is 0 and the parser is looking for a command, then ) functions similar to a REM statement as long as it is immediately followed by a token delimiter, special character, newline, or end-of-file
      • All special characters lose their meaning except ^ (line concatenation is possible)
      • Once the end of the logical line is reached, the entire "command" is discarded.
  • Each command is parsed into a series of tokens. The first token is always treated as a command token (after special @ have been stripped and redirection moved to the end).
    • Leading token delimiters prior to the command token are stripped
    • When parsing the command token, ( functions as a command token delimiter, in addition to the standard token delimiters
    • The handling of subsequent tokens depends on the command.
  • Most commands simply concatenate all arguments after the command token into a single argument token. All argument token delimiters are preserved. Argument options are typically not parsed until phase 7.
  • Three commands get special handling - IF, FOR, and REM
    • IF is split into two or three distinct parts that are processed independently. A syntax error in the IF construction will result in a fatal syntax error.
      • The comparison operation is the actual command that flows all the way through to phase 7
        • All IF options are fully parsed in phase 2.
        • Consecutive token delimiters collapse into a single space.
        • Depending on the comparison operator, there will be one or two value tokens that are identified.
      • The True command block is the set of commands after the condition, and is parsed like any other command block. If ELSE is to be used, then the True block must be parenthesized.
      • The optional False command block is the set of commands after ELSE. Again, this command block is parsed normally.
      • The True and False command blocks do not automatically flow into the subsequent phases. Their subsequent processing is controlled by phase 7.
    • FOR is split in two after the DO. A syntax error in the FOR construction will result in a fatal syntax error.
      • The portion through DO is the actual FOR iteration command that flows all the way through phase 7
        • All FOR options are fully parsed in phase 2.
        • The IN parenthesized clause treats <LF> as <space>. After the IN clause is parsed, all tokens are concatenated together to form a single token.
        • Consecutive unescaped/unquoted token delimiters collapse into a single space throughout the FOR command through DO.
      • The portion after DO is a command block that is parsed normally. Subsequent processing of the DO command block is controlled by the iteration in phase 7.
    • REM detected in phase 2 is treated dramatically differently from all other commands.
      • Only one argument token is parsed - the parser ignores characters after the first argument token.
      • If there is only one argument token that ends with an unescaped ^ that ends the line, then the argument token is thrown away, and the subsequent line is parsed and appended to the REM. This repeats until there is more than one token, or the last character is not ^.
      • The REM command may appear in phase 3 output, but the command is never executed, and the original argument text is echoed - escaping carets are not removed.
  • If the command token begins with :, and this is the first round of phase 2 (not a restart due to CALL in phase 6) then
    • The token is normally treated as an Unexecuted Label.
      • The remainder of the line is parsed, however ), <, >, & and | no longer have special meaning. The entire remainder of the line is considered to be part of the label "command".
      • The ^ continues to be special, meaning that line continuation can be used to append the subsequent line to the label.
      • An Unexecuted Label within a parenthesized block will result in a fatal syntax error unless it is immediately followed by a command or Executed Label on the next line.
        • Note that ( no longer has special meaning for the first command that follows the Unexecuted Label in this context.
      • The command is aborted after label parsing is complete. Subsequent phases do not take place for the label
    • There are three exceptions that can cause a label found in phase 2 to be treated as an Executed Label that continues parsing through phase 7.
      • There is redirection that precedes the label token, and there is a | pipe or &, &&, or || command concatenation on the line.
      • There is redirection that precedes the label token, and the command is within a parenthesized block.
      • The label token is the very first command on a line within a parenthesized block, and the line above ended with an Unexecuted Label.
    • The following occurs when an Executed Label is discovered in phase 2
      • The label, its arguments, and its redirection are all excluded from any echo output in phase 3
      • Any subsequent concatenated commands on the line are fully parsed and executed.
    • For more information about Executed Labels vs. Unexecuted Labels, see https://www.dostips.com/forum/viewtopic.php?f=3&t=3803&p=55405#p55405
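
The caret and quote rules at the start of phase 2 can be illustrated with a minimal C sketch. It models only those two rules: line feeds, redirection, parentheses, labels and every other phase 2 step are left out, and strip_carets is a hypothetical helper name, not anything exported by cmd.exe.

```c
#include <stddef.h>

/*
 * Remove escaping carets from one line, following only the caret and
 * quote rules of phase 2. Everything else in phase 2 is ignored.
 * strip_carets is a hypothetical name used for illustration.
 */
void strip_carets(const char *line, char *out, size_t cap)
{
    size_t n = 0;
    int in_quotes = 0;

    if (cap == 0)
        return;

    for (const char *p = line; *p; p++) {
        if (*p == '"') {
            in_quotes = !in_quotes;   /* an unescaped quote toggles the flag */
        } else if (*p == '^' && !in_quotes) {
            p++;                      /* drop the escaping caret itself */
            if (*p == '\0')
                break;                /* nothing left to escape in this buffer */
        }
        if (n + 1 < cap)
            out[n++] = *p;            /* copy the (possibly escaped) character */
    }
    out[n] = '\0';
}
```

Under this model, a caret inside quotes is copied unchanged, while ^& outside quotes becomes a plain & that is no longer treated as special. Likewise ^^^! comes out as ^!, leaving one caret for the delayed expansion phase to consume.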

Phase 3) Echo the parsed command(s) Only if the command block did not begin with @, and ECHO was ON at the start of the preceding step.

Phase 4) FOR %X variable expansion: Only if a FOR command is active and the commands after DO are being processed.

  • At this point, phase 1 of batch processing will have already converted a FOR variable like %%X into %X. The command line has different percent expansion rules for phase 1. This is the reason that command lines use %X but batch files use %%X for FOR variables.
  • FOR variable names are case sensitive, but ~modifiers are not case sensitive.
  • ~modifiers take precedence over variable names. If a character following ~ is both a modifier and a valid FOR variable name, and there exists a subsequent character that is an active FOR variable name, then the character is interpreted as a modifier.
  • FOR variable names are global, but only within the context of a DO clause. If a routine is CALLed from within a FOR DO clause, then the FOR variables are not expanded within the CALLed routine. But if the routine has its own FOR command, then all currently defined FOR variables are accessible to the inner DO commands.
  • FOR variable names can be reused within nested FORs. The inner FOR value takes precedence, but once the INNER FOR closes, then the outer FOR value is restored.
  • If ECHO was ON at the start of this phase, then phase 3) is repeated to show the parsed DO commands after the FOR variables have been expanded.

---- From this point onward, each command identified in phase 2 is processed separately.
---- Phases 5 through 7 are completed for one command before moving on to the next.

Phase 5) Delayed Expansion: Only if delayed expansion is on

  • If the command is within a parenthesized block on either side of a pipe, then skip this step.
  • Each token for a command is parsed for delayed expansion independently.
    • Most commands parse two or more tokens - the command token, the arguments token, and each redirection destination token.
    • The FOR command parses the IN clause token only.
    • The IF command parses the comparison values only - either one or two, depending on the comparison operator.
  • For each parsed token, first check if it contains any !. If not, then the token is not parsed - important for ^ characters. If the token does contain !, then scan each character from left to right:
    • If it is a caret (^) the next character has no special meaning, the caret itself is removed
    • If it is an exclamation mark, search for the next exclamation mark (carets are not observed anymore), expand to the value of the variable.
      • Consecutive opening ! are collapsed into a single !
      • Any remaining ! that cannot be paired is removed
    • Important: At this phase quotes and other special characters are ignored
    • Expanding vars at this stage is "safe", because special characters are not detected anymore (even <CR> or <LF>)
    • For a more complete explanation, read the 2nd half of this from dbenham same thread - Exclamation Point Phase
    • There are some edge cases where these rules seem to fail:
      See Delayed expansion fails in some cases

Phase 5.3) Pipe processing: Only if commands are on either side of a pipe
Each side of the pipe is processed independently.

  • If dealing with a parenthesized command block, then all <LF> with a command before and after are converted to <space>&. Other <LF> are stripped.
  • The command (or command block) is executed asynchronously in a new cmd.exe thread via
    %comspec% /S /D /c" commandBlock". This means the command block gets a phase restart, but this time in command line mode.
  • This is the end of processing for the pipe commands.
  • For more info on how pipes are parsed and processed, look at this question and answers: Why does delayed expansion fail when inside a piped block of code?

Phase 5.5) Execute Redirection: Any redirection that was discovered in phase 2 is now executed.

Phase 6) CALL processing/Caret doubling: Only if the command token is CALL, or if the text before the first occurring standard token delimiter is CALL. If CALL is parsed from a larger command token, then the unused portion is prepended to the arguments token bef

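  • Hello jeb, thank you for your insight... It might be hard to understand, but I will try to think it through! You seem to have performed a lot of tests! Thank you for the translation (http://www.administrator.de/Die_Geheimnisse_des_Batch_Zeilen_Interpreters.html) (4 upvotes)
  • jeb - perhaps phase 0 could be moved and combined with phase 6? That makes more sense to me, or is there a reason to keep them separate? (3 upvotes)
  • Updated phases 2 and 5 (3 upvotes)
  • Batch phase 5) - %%a will already have been changed to %a in phase 1, so the for-loop expansion really expands %a. Also, I added a more detailed explanation of batch phase 1 in an answer below (I don't have edit privilege) (2 upvotes)
  • @dbenham - you are right, I never liked phase 0 (2 upvotes)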

Mik*_*ark 60

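When invoking a command from a command window, tokenization of the command line arguments is not done by cmd.exe (a.k.a. "the shell"). Most often the tokenization is done by the newly formed process's C/C++ runtime, but this is not necessarily so - for example, if the new process was not written in C/C++, or if the new process chooses to ignore argv and process the raw command line itself (e.g. with GetCommandLine()). At the OS level, Windows passes the command line untokenized, as a single string, to the new process. This is in contrast to most *nix shells, where the shell tokenizes arguments in a consistent, predictable way before passing them to the newly formed process. All this means that you may experience wildly divergent argument tokenization behavior across different programs on Windows, as individual programs often take argument tokenization into their own hands.

If it sounds like anarchy, it kind of is. However, since a large number of Windows programs do use the Microsoft C/C++ runtime's argv, it is generally useful to understand how the MSVCRT tokenizes arguments. Here is an excerpt:

  • Arguments are delimited by white space, which is either a space or a tab.
  • A string surrounded by double quotation marks is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument. Note that the caret (^) is not recognized as an escape character or delimiter.
  • A double quotation mark preceded by a backslash, \", is interpreted as a literal double quotation mark (").
  • Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
  • If an even number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\), and the double quotation mark (") is interpreted as a string delimiter.
  • If an odd number of backslashes is followed by a double quotation mark, then one backslash (\) is placed in the argv array for every pair of backslashes (\\), and the double quotation mark is interpreted as an escape sequence by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.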
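
The excerpted rules can be sketched in C roughly as follows. This is an illustration under stated assumptions, not the runtime's actual code: split_args, MAX_ARGS and MAX_LEN are hypothetical names, the function handles only the text after the program name, and the handling of "" inside a quoted run (emit one literal quote and stay in quotes) is an assumption that the excerpt does not state. It was picked only because it reproduces the outputs reported in this answer; other runtime versions are reported to behave differently.

```c
#include <stddef.h>

#define MAX_ARGS 16   /* illustrative limits, not CRT limits */
#define MAX_LEN  128

/* Append one character to a bounded argument buffer. */
static void put_char(char *arg, size_t *n, char c)
{
    if (*n + 1 < MAX_LEN)
        arg[(*n)++] = c;
}

/*
 * Split the argument portion of a command line using the excerpted rules.
 * Assumption: inside a quoted run, "" yields one literal quote and the
 * run stays open (not part of the excerpt).
 */
int split_args(const char *cmd, char out[MAX_ARGS][MAX_LEN])
{
    int argc = 0;
    const char *p = cmd;

    while (argc < MAX_ARGS) {
        while (*p == ' ' || *p == '\t')
            p++;                                   /* skip separators */
        if (*p == '\0')
            break;

        size_t n = 0;
        int in_quotes = 0;
        while (*p && (in_quotes || (*p != ' ' && *p != '\t'))) {
            size_t slashes = 0;
            while (*p == '\\') {                   /* count a backslash run */
                slashes++;
                p++;
            }
            if (*p == '"') {
                for (size_t i = 0; i < slashes / 2; i++)
                    put_char(out[argc], &n, '\\'); /* one per pair */
                if (slashes % 2) {
                    put_char(out[argc], &n, '"');  /* \" is a literal quote */
                } else if (in_quotes && p[1] == '"') {
                    put_char(out[argc], &n, '"');  /* assumed "" rule */
                    p++;
                } else {
                    in_quotes = !in_quotes;        /* quote is a delimiter */
                }
                p++;
            } else {
                for (size_t i = 0; i < slashes; i++)
                    put_char(out[argc], &n, '\\'); /* literal backslashes */
                if (*p && (in_quotes || (*p != ' ' && *p != '\t')))
                    put_char(out[argc], &n, *p++);
            }
        }
        out[argc][n] = '\0';
        argc++;
    }
    return argc;
}
```

Calling split_args on the text "a ""b"" c" (quotes included) yields the single argument a "b" c under these assumptions, and a \"b\" c yields the three arguments a, "b" and c.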

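The Microsoft "batch language" (.bat) is no exception to this anarchic environment, and it has developed its own unique rules for tokenization and escaping. It also looks like cmd.exe's command prompt does some preprocessing of the command line arguments (mostly for variable substitution and escaping) before passing them to the newly executing process. You can read more about the low-level details of the batch language and cmd escaping in the excellent answers by jeb and dbenham on this page.


Let's build a simple command line utility in C and see what it says about your test cases: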

#include <stdio.h>

/* Print each argument exactly as the C runtime tokenized it. */
int main(int argc, char* argv[]) {
    int i;
    for (i = 0; i < argc; i++) {
        printf("argv[%d][%s]\n", i, argv[i]);
    }
    return 0;
}

(Notes: argv[0] is always the name of the executable, and is omitted below for brevity. Tested on Windows XP SP3. Compiled with Visual Studio 2005.)

> test.exe "a ""b"" c"
argv[1][a "b" c]

> test.exe """a b c"""
argv[1]["a b c"]

> test.exe "a"" b c
argv[1][a" b c]

And a few of my own tests:

> test.exe a "b" c
argv[1][a]
argv[2][b]
argv[3][c]

> test.exe a "b c" "d e
argv[1][a]
argv[2][b c]
argv[3][d e]

> test.exe a \"b\" c
argv[1][a]
argv[2]["b"]
argv[3][c]

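  • "Remember that from Win32's point of view, the command line is just a string that is copied into the address space of the new process. How the launching process and the new process interpret this string is governed not by rules but by convention." -Raymond Chen http://blogs.msdn.com/b/oldnewthing/archive/2009/11/25/9928372.aspx (4 upvotes)
  • This is good information, but the Microsoft documentation is incomplete! (Big surprise.) The rules that are actually missing are documented at http://www.daviddeley.com/autohotkey/parameters/parameters.htm#WINCRULES. (3 upvotes)
  • Thank you for the really good answer. It explains a lot, in my opinion. It also explains why I sometimes find working with Windows really awful... (2 upvotes)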

dbe*_*ham 44

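Here is an expanded explanation of Phase 1 in jeb's answer (valid for both batch mode and command line mode).

Phase 1) Percent Expansion: Starting from the left, scan each character for %. If found, then

  • 1.1 (escape %) skipped if command line mode
    • If batch mode and followed by another % then
      replace %% with a single % and continue the scan
  • 1.2 (expand argument) skipped if command line mode
    • Else if batch mode then
      • If followed by * and command extensions are enabled then
        replace %* with the text of all command line arguments (replace with nothing if there are no arguments) and continue the scan.
      • Else if followed by <digit> then
        replace %<digit> with the argument value (replace with nothing if undefined) and continue the scan.
      • Else if followed by ~ and command extensions are enabled then
        • If followed by an optional valid list of argument modifiers followed by a required <digit> then
          replace %~[modifiers]<digit> with the modified argument value (replace with nothing if not defined or if the specified $PATH: modifier is not defined) and continue the scan.
          Note: modifiers are case insensitive and can appear multiple times in any order, except that the $PATH: modifier can only appear once and must be the last modifier before the <digit>
        • Else invalid modified argument syntax raises a fatal error: all parsed commands are aborted, and batch processing aborts if in batch mode!
  • 1.3 (expand variable)
    • Else if command extensions are disabled then
      look at the next string of characters, breaking before % or <LF>, and call them VAR (may be an empty list)
      • If the next character is % then
        • If VAR is defined then
          replace %VAR% with the value of VAR and continue the scan
        • Else if batch mode then
          remove %VAR% and continue the scan
        • Else goto 1.4
      • Else goto 1.4
    • Else if command extensions are enabled then
      look at the next string of characters, breaking before % : or <LF>, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is %, then include : as the last character in VAR and break before %.
      • If the next character is % then
        • If VAR is defined then
          replace %VAR% with the value of VAR and continue the scan
        • Else if batch mode then
          remove %VAR% and continue the scan
        • Else goto 1.4
      • Else if the next character is : then
        • If VAR is undefined then
          • If batch mode then
            remove %VAR: and continue the scan.
          • Else goto 1.4
        • Else if the next character is ~ then
          • If the next string of characters matches the pattern [integer][,[integer]]% then
            replace %VAR:~[integer][,[integer]]% with a substring of the value of VAR (possibly resulting in an empty string) and continue the scan.
          • Else goto 1.4
        • Else if followed by = or *= then
          invalid variable search and replace syntax raises a fatal error: all parsed commands are aborted, and batch processing aborts if in batch mode!
        • Else if the next string of characters matches the pattern [*]search=[replace]%, where search may include any set of characters except =, and replace may include any set of characters except %, then
          replace %VAR:[*]search=[replace]% with the value of VAR after performing the search and replace (possibly resulting in an empty string) and continue the scan
        • Else goto 1.4
  • 1.4 (strip %)
    • Else if batch mode then
      remove the % and continue the scan
    • Else preserve the % and continue the scan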

The above helps explain why this batch file

@echo off
setlocal enableDelayedExpansion
set "1var=varA"
set "~f1var=varB"
call :test "arg1"
exit /b  
::
:test "arg1"
echo %%1var%% = %1var%
echo ^^^!1var^^^! = !1var!
echo --------
echo %%~f1var%% = %~f1var%
echo ^^^!~f1var^^^! = !~f1var!
exit /b

gives these results:

%1var% = "arg1"var
!1var! = varA
--------
%~f1var% = P:\arg1var
!~f1var! = varB

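Note 1 - Phase 1 occurs before REM statements are recognized. This is very important because it means that even a remark can generate a fatal error if it has invalid argument expansion syntax or invalid variable search and replace syntax!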

@echo off
rem %~x This generates a fatal argument expansion error
echo this line is never reached

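Note 2 - Another interesting consequence of the % parsing rules: variables containing : in the name can be defined, but they cannot be expanded unless command extensions are disabled. There is one exception - a variable name containing a single colon at the end can be expanded while command extensions are enabled. However, you cannot perform substring or search and replace operations on variable names ending with a colon. The batch file below (courtesy of jeb) demonstrates this behavior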

@echo off
setlocal
set var=content
set var:=Special
set var::=double colon
set var:~0,2=tricky
set var::~0,2=unfortunate
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%
echo Now with DisableExtensions
setlocal DisableExtensions
echo %var%
echo %var:%
echo %var::%
echo %var:~0,2%
echo %var::~0,2%

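Note 3 - An interesting outcome of the order of the parsing rules that jeb lays out in his post: when performing search and replace with normal expansion, special characters should NOT be escaped (though they may be quoted). But when performing search and replace with delayed expansion, special characters MUST be escaped (unless they are quoted).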

@echo off
setlocal enableDelayedExpansion
set "var=this & that"
echo %var:&=and%
echo "%var:&=and%"
echo !var:^&=and!
echo "!var:&=and!"

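Here is an expanded, and more accurate, explanation of phase 5 in jeb's answer (valid for both batch mode and command line mode)

Note that there are some edge cases where these rules fail:
See Checking for line feeds with CALL

Phase 5) Delayed Expansion: This phase only occurs if delayed expansion is enabled and the line contains at least one !. Then, starting from the left, scan each character for ^ or !, and if found, then

  • 5.1 (caret escape) needed for ! or ^ literals
    • If the character is a caret ^ then
      • Remove the ^
      • Scan the next character and preserve it as a literal
      • Continue the scan
  • 5.2 (expand variable)
    • If the character is !, then
      • If command extensions are disabled then
        look at the next string of characters, breaking before ! or <LF>, and call them VAR (may be an empty list)
        • If the next character is ! then
          • If VAR is defined, then
            replace !VAR! with the value of VAR and continue the scan
          • Else if batch mode then
            remove !VAR! and continue the scan
          • Else goto 5.2.1
        • Else goto 5.2.1
      • Else if command extensions are enabled then
        look at the next string of characters, breaking before !, :, or <LF>, and call them VAR (may be an empty list). If VAR breaks before : and the subsequent character is !, then include : as the last character in VAR and break before !
        • If the next character is ! then
          • If VAR exists, then
            replace !VAR! with the value of VAR and continue the scan
          • Else if batch mode then
            remove !VAR! and continue the scan
          • Else goto 5.2.1
        • Else if the next character is : then
          • If VAR is undefined then
            • If batch mode then
              remove !VAR: and continue the scan
            • Else goto 5.2.1
          • Else if the next string of characters matches the pattern
            ~[integer][,[integer]]! then
            replace !VAR:~[integer][,[integer]]! with a substring of the value of VAR (possibly resulting in an empty string) and continue the scan
          • Else if the next string of characters matches the pattern [*]search=[replace]!, where search may include any set of characters except =, and replace may include any set of characters except !, then
            replace !VAR:[*]search=[replace]! with the value of VAR after performing the search and replace (possibly resulting in an empty string) and continue the scan
          • Else goto 5.2.1
        • Else goto 5.2.1
      • 5.2.1
        • If batch mode then remove the leading !
          Else preserve the leading !
        • Continue the scan, starting with the next character after the leading !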

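  • +1, only the colon syntax and the rules for `%definedVar:a=b%` vs `%undefinedVar:a=b%` and the `%var:~0x17,-010%` forms are missing here (3 upvotes)
  • Good point - I expanded the variable expansion section to address your concerns. I also expanded the argument expansion section to fill in some missing details. (2 upvotes)
  • After getting some additional private feedback from jeb, I added a rule for variable names that end with a colon, and added note 2. I also added note 3 because I thought it was interesting and important. (2 upvotes)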

bob*_*ogo 7

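As pointed out, commands are passed the entire argument string in μSoft land, and it is up to them to parse it into separate arguments for their own use. There is no consistency in this between different programs, and therefore there is no single set of rules to describe the process. You really need to check each corner case for whatever C library your program uses.

As far as the system .bat files go, here is the test: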

c> type args.cmd
@echo off
echo cmdcmdline:[%cmdcmdline%]
echo 0:[%0]
echo *:[%*]
set allargs=%*
if not defined allargs goto :eof
setlocal
@rem Wot about a nice for loop?
@rem Then we are in the land of delayedexpansion, !n!, call, etc.
@rem Plays havoc with args like %t%, a"b etc. ugh!
set n=1
:loop
    echo %n%:[%1]
    set /a n+=1
    shift
    set param=%1
    if defined param goto :loop
endlocal

Now we can run some tests. See if you can figure out just what μSoft are trying to do:

C>args a b c
cmdcmdline:[cmd.exe ]
0:[args]
*:[a b c]
1:[a]
2:[b]
3:[c]

Fine so far. (I'll leave out the uninteresting %cmdcmdline% and %0 from now on.)

C>args *.*
*:[*.*]
1:[*.*]

No filename expansion.

C>args "a b" c
*:["a b" c]
1:["a b"]
2:[c]

No quote stripping, though quotes do prevent argument splitting.

c>args ""a b" c
*:[""a b" c]
1:[""a]
2:[b" c]

Consecutive double quotes cause them to lose any special parsing abilities they may have had. @Beniot's example:

C>args "a """ b "" c"""
*:["a """ b "" c"""]
1:["a """]
2:[b]
3:[""]
4:[c"""]

Quiz: How do you pass the value of any environment var as a single argument (i.e., as %1) to a bat file?

c>set t=a "b c
c>set t
t=a "b c
c>args %t%
1:[a]
2:["b c]
c>args "%t%"
1:["a "b]
2:[c"]
c>Aaaaaargh!

Sane parsing seems forever broken.
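
One way to read the outputs above is that each double quote simply toggles a "quoted" flag, delimiters split parameters only while the flag is off, and nothing is ever stripped. The C sketch below encodes just that reading. It is a guess that happens to match the tests in this answer, not a description of cmd.exe internals: split_batch_params and the size limits are hypothetical names, the delimiter set (space, tab, comma, semicolon, equals) is an assumption taken from the standard token delimiters mentioned elsewhere on this page, and carets, percent signs and redirection are ignored.

```c
#include <stddef.h>

#define MAX_PARAMS 16   /* illustrative limits only */
#define PARAM_LEN  128

/* Assumed delimiter set; the tests in this answer only exercise spaces. */
static int is_delim(char c)
{
    return c == ' ' || c == '\t' || c == ',' || c == ';' || c == '=';
}

/*
 * Split the text after the batch file name into %1, %2, ...
 * Quotes toggle a flag and are kept; delimiters split only outside quotes.
 */
int split_batch_params(const char *tail, char out[MAX_PARAMS][PARAM_LEN])
{
    int count = 0;
    const char *p = tail;

    while (count < MAX_PARAMS) {
        while (*p && is_delim(*p))
            p++;                         /* skip leading delimiters */
        if (*p == '\0')
            break;

        size_t n = 0;
        int in_quotes = 0;
        while (*p && (in_quotes || !is_delim(*p))) {
            if (*p == '"')
                in_quotes = !in_quotes;  /* every quote toggles the flag */
            if (n + 1 < PARAM_LEN)
                out[count][n++] = *p;    /* quotes are copied, never stripped */
            p++;
        }
        out[count][n] = '\0';
        count++;
    }
    return count;
}
```

With this model, the quiz input "a "b c" splits into "a "b and c", the same two parameters shown above, because the second quote switches the flag off just before the space.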

For your entertainment, try adding miscellaneous ^, \, ', & (&c.) characters to these examples.


SS6*_*S64 5

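You already have some great answers, but to answer one part of your question:

set a =b, echo %a %b% c% → bb c%

What is happening there is that, because you have a space before the =, a variable is created called %a<space>%, so when you echo %a % it is evaluated correctly as b.

The remaining part, b% c%, is then evaluated as plain text plus an undefined variable % c%, which should be echoed as typed. For me, echo %a %b% c% returns bb% c%

I suspect that the ability to include spaces in variable names is more of an oversight than a planned "feature".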
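
The pairing of % signs can be made concrete with a small C sketch of a simplified version of the percent expansion rules described on this page. It is a model only: expand_percents, struct env_var and find_var are hypothetical names, and the sketch assumes no %1-style arguments, no colon syntax (substring or search and replace), and no command extension differences. Variable names are compared case-insensitively and may contain spaces.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct env_var {
    const char *name;    /* may contain spaces, e.g. "a " */
    const char *value;
};

/* Case-insensitive lookup of name[0..len) in a small table. */
static const char *find_var(const struct env_var *vars, size_t nvars,
                            const char *name, size_t len)
{
    for (size_t i = 0; i < nvars; i++) {
        if (strlen(vars[i].name) != len)
            continue;
        size_t k = 0;
        while (k < len && tolower((unsigned char)vars[i].name[k]) ==
                          tolower((unsigned char)name[k]))
            k++;
        if (k == len)
            return vars[i].value;
    }
    return NULL;
}

/* Simplified percent expansion; batch_mode is 0 for the interactive prompt. */
void expand_percents(const char *line, const struct env_var *vars,
                     size_t nvars, int batch_mode, char *out, size_t cap)
{
    size_t n = 0;
    const char *p = line;

    if (cap == 0)
        return;

    while (*p) {
        if (*p != '%') {
            if (n + 1 < cap)
                out[n++] = *p;
            p++;
            continue;
        }
        if (batch_mode && p[1] == '%') {      /* %% -> % in batch mode only */
            if (n + 1 < cap)
                out[n++] = '%';
            p += 2;
            continue;
        }
        const char *close = strchr(p + 1, '%');
        if (close) {
            const char *val = find_var(vars, nvars, p + 1,
                                       (size_t)(close - (p + 1)));
            if (val) {                        /* defined: substitute value */
                for (; *val; val++)
                    if (n + 1 < cap)
                        out[n++] = *val;
                p = close + 1;
                continue;
            }
            if (batch_mode) {                 /* undefined: %VAR% vanishes */
                p = close + 1;
                continue;
            }
        }
        if (!batch_mode && n + 1 < cap)
            out[n++] = '%';                   /* prompt keeps a lone % */
        p++;                                  /* batch mode drops it */
    }
    out[n] = '\0';
}
```

In interactive mode with only b=a defined, the model turns echo %a %b% c% into echo %a a c%; with a variable named "a " set to b it produces echo bb% c%. In batch mode the same model drops the undefined % c% entirely, leaving echo bb.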