A customer reported that our software was "hanging". I eventually managed to track down the code responsible:
While strShortenedText.IndexOf("  ") > -1                   'if there are two contiguous spaces anywhere in this string ...
    strShortenedText = strShortenedText.Replace("  ", " ")  '... replace each occurrence with a single space ...
End While                                                   '... and do this in a loop in order to handle situations with 3 or more contiguous spaces
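I'm afraid I can't include the actual string here, because it contains data covered by GDPR legislation, but I'm just looking for an explanation of how this could possibly get into an infinite loop.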
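What I have found is this: strShortenedText.IndexOf("  ") returns a value greater than -1, so there are two contiguous spaces somewhere in the string (I have made sure that both of these spaces are 0x20, i.e. char(32)).
If the string contains "  " (2 contiguous spaces) anywhere, and I replace "  " (2 contiguous spaces) with " " (one space), then surely (??????) the contents of strShortenedText after the .Replace will be shorter than before it? Instead, I find that the Replace leaves the string unchanged, and the length of the variable never changes no matter how many times the loop executes that statement.
How is this possible? And what can I do to handle this situation?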
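The problem is that IndexOf and Replace find text in different ways by default: IndexOf(String) performs a culture-sensitive search, whereas Replace(String, String) performs an ordinal one. That means that if, for example, you have a string containing U+200C (zero-width non-joiner), IndexOf will skip over it and find the two spaces on either side, but Replace won't.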
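The mismatch can be observed directly with a small test program. This is only a sketch: the module name ComparisonDemo and the helper SearchesDisagree are hypothetical, and the result of the culture-sensitive search depends on the runtime and the current culture, so no output is claimed for it.

```vbnet
Imports System

Module ComparisonDemo
    ' Hypothetical helper: True when the default (culture-sensitive) IndexOf
    ' reports a double space that an ordinal search cannot find.
    Function SearchesDisagree(text As String) As Boolean
        Dim cultureIndex As Integer = text.IndexOf("  ")
        Dim ordinalIndex As Integer = text.IndexOf("  ", StringComparison.Ordinal)
        Return cultureIndex > -1 AndAlso ordinalIndex = -1
    End Function

    Sub Main()
        ' A space, then U+200C (zero-width non-joiner), then another space.
        Dim sample As String = "x " & ChrW(&H200C) & " y"

        ' Culture-sensitive search: the result depends on runtime and culture.
        Console.WriteLine(sample.IndexOf("  "))
        ' Ordinal search compares UTF-16 code units, so it prints -1 here.
        Console.WriteLine(sample.IndexOf("  ", StringComparison.Ordinal))
        ' Replace(String, String) is ordinal, so the string keeps its length.
        Console.WriteLine(sample.Replace("  ", " ").Length = sample.Length)
        ' True on runtimes where the culture-sensitive search ignores U+200C.
        Console.WriteLine(SearchesDisagree(sample))
    End Sub
End Module
```

If the last line prints True, the original loop can never terminate for that input: the condition stays satisfied while the body changes nothing.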
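You need to make IndexOf and Replace find text in a consistent way. The simplest option to understand (though possibly not the ideal one) is to use ordinal checks: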
Imports System

Module Program
    Sub Main(args As String())
        ' Input has a space, U+200C, then another space between x and y
        Console.WriteLine(ReplaceMultipleSpaces("x " & ChrW(&H200C) & " y"))
    End Sub

    Function ReplaceMultipleSpaces(text As String) As String
        ' Search and replace both use ordinal comparison, so they always agree
        While text.IndexOf("  ", StringComparison.Ordinal) > -1
            text = text.Replace("  ", " ", StringComparison.Ordinal)
        End While
        Return text
    End Function
End Module
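(In general, using String.Contains rather than String.IndexOf is probably preferable, but if you don't want to just use the default, you need to specify the kind of check to perform.)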
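A minimal sketch of the Contains variant follows. The module name ContainsSketch and the function CollapseSpaces are hypothetical, and the sketch assumes a runtime that provides the Contains(String, StringComparison) and Replace(String, String, StringComparison) overloads (.NET Core and later, not .NET Framework).

```vbnet
Imports System

Module ContainsSketch
    ' Collapses every run of spaces to a single space, using ordinal
    ' comparison for both the test and the replacement.
    Function CollapseSpaces(text As String) As String
        While text.Contains("  ", StringComparison.Ordinal)
            text = text.Replace("  ", " ", StringComparison.Ordinal)
        End While
        Return text
    End Function

    Sub Main()
        ' Prints "a b c"
        Console.WriteLine(CollapseSpaces("a    b  c"))
    End Sub
End Module
```

Contains(String) without a comparison argument is already ordinal, so passing StringComparison.Ordinal here mainly makes the intent explicit.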
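As mentioned in the comments, the cause of the problem was a non-printing character (character code 173, i.e. U+00AD, the soft hyphen, which strictly speaking lies outside 7-bit ASCII) sitting between two spaces. Apparently the .IndexOf method does not see this character, whereas the .Replace method does. So .IndexOf finds a position where it believes there are two contiguous spaces, but .Replace cannot replace anything there, because the actual text at that position is not two characters with code 32 but a character 32, followed by a character 173, followed by a second character 32.
The solution, then, is to code around that possibility: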
nDoubleSpace = strShortenedText.IndexOf("  ")
'Okay - we found a situation where there may be non-printing characters between two spaces,
'where this IndexOf method returns a value > -1 .... And this caused an infinite loop as the Replace did nothing.
While nDoubleSpace > -1
    If strShortenedText.Substring(nDoubleSpace, 2) <> "  " Then
        'The IndexOf found a location with two spaces separated by non-printing characters (for example, character 173),
        'which means a .Replace won't work. Sadly this software targets .NET Framework, not .NET Core so we don't have
        'the option to use a StringComparison override in the .Replace. This is the next best thing we can do...
        'Basically, we know that the location we found with the .IndexOf STARTS and ENDS with a space. So instead of
        'doing a .Replace, we just concatenate everything before the FIRST space with everything
        'from the SECOND space onward.
        Dim nNextSpace As Integer
        nNextSpace = 1
        'Find the position of the second space
        While strShortenedText.Substring(nDoubleSpace + nNextSpace, 1) <> " "
            nNextSpace = nNextSpace + 1
        End While
        'Keep everything before the first space, plus everything from the second space onward
        strShortenedText = strShortenedText.Substring(0, nDoubleSpace) + strShortenedText.Substring(nDoubleSpace + nNextSpace)
    Else
        'Now we know that the IndexOf and the Substring are in agreement, we can be happy that this
        'simple Replace will do the job
        strShortenedText = strShortenedText.Replace("  ", " ")
    End If
    'And check again to handle more than 2 contiguous spaces
    nDoubleSpace = strShortenedText.IndexOf("  ")
End While
Yes, it's ugly. But it works.
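If the hidden character does not need to be removed, a shorter variant may be enough on .NET Framework as well. The sketch below relies on two assumptions worth checking against your target framework: that the IndexOf(String, StringComparison) overload is available, and that Replace(String, String) compares ordinally. The module name OrdinalLoopSketch and the function CollapseSpacesOrdinal are hypothetical.

```vbnet
Imports System

Module OrdinalLoopSketch
    ' Collapses runs of real U+0020 spaces only. Unlike the workaround
    ' above, it leaves a character such as U+00AD between two spaces alone.
    Function CollapseSpacesOrdinal(text As String) As String
        ' Ordinal search, so the loop condition agrees with what Replace can match
        While text.IndexOf("  ", StringComparison.Ordinal) > -1
            text = text.Replace("  ", " ")
        End While
        Return text
    End Function

    Sub Main()
        ' Prints "a b c"
        Console.WriteLine(CollapseSpacesOrdinal("a    b  c"))
        ' The soft hyphen (character 173) stays in place, so the length is still 5
        Console.WriteLine(CollapseSpacesOrdinal("a " & ChrW(173) & " b").Length)
    End Sub
End Module
```

The trade-off is that a space, a soft hyphen and another space are kept as they are, whereas the longer workaround above cuts out the soft hyphen together with the first space.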