Sha*_*elt 2 delphi string replace delphi-run-time-library
我有这三个函数可以成功地从给定字符串中删除所有非数字字符:
第一个函数循环输入字符串的字符,如果当前字符是数字,则将其添加到作为函数结果返回的新字符串中。
function RemoveNonNumericChars(const s: string): string;
begin
Result := '';
for var i := 1 to Length(s) do
begin
if s[i] in ['0'..'9'] then
Result := Result + s[i];
end;
end;
Run Code Online (Sandbox Code Playgroud)
第二个函数从右到左循环输入字符串的字符,如果当前字符不是数字,则使用该Delete函数将其从字符串中删除
function RemoveNonNumericChars(const s: string): string;
begin
Result := s;
for var i := Length(Result) downto 1 do
begin
if not(Result[i] in ['0'..'9']) then
Delete(Result, i, 1);
end;
end;
Run Code Online (Sandbox Code Playgroud)
第三个函数使用正则表达式将所有非数字字符替换为空,从而将其删除。TRegEx是来自System.RegularExpressions单位的。
function RemoveNonNumericChars(const s: string): string;
begin
var RegEx := TRegEx.Create('[^0-9]');
Result := RegEx.Replace(s, '');
end;
Run Code Online (Sandbox Code Playgroud)
他们三个都做了我需要的事情,但我想知道 Delphi 中是否有一个内置函数可以做到这一点......或者甚至可能是比我正在做的方式更好的方法。在 Delphi 中从字符串中删除非数字字符的最佳和/或最快方法是什么?
你的两种方法都很慢,因为你不断改变字符串的长度。此外,他们只识别阿拉伯数字。
为了解决性能问题,预先分配最大结果长度:
function RemoveNonDigits(const S: string): string;
begin
SetLength(Result, S.Length);
var LActualLength := 0;
for var i := 1 to S.Length do
if CharInSet(S[i], ['0'..'9']) then
begin
Inc(LActualLength);
Result[LActualLength] := S[i];
end;
SetLength(Result, LActualLength);
end;
Run Code Online (Sandbox Code Playgroud)
要支持非阿拉伯数字,请使用以下TCharacter.IsDigit函数:
function RemoveNonDigits(const S: string): string;
begin
SetLength(Result, S.Length);
var LActualLength := 0;
for var i := 1 to S.Length do
if S[i].IsDigit then
begin
Inc(LActualLength);
Result[LActualLength] := S[i];
end;
SetLength(Result, LActualLength);
end;
Run Code Online (Sandbox Code Playgroud)
为了进一步优化,按照Stefan Glienke的建议,您可以绕过 RTL 的字符串处理机制并直接写入每个字符,但会损失一些代码可读性:
function RemoveNonDigits(const S: string): string;
begin
SetLength(Result, S.Length);
var ResChr := PChar(Result);
var LActualLength := 0;
for var i := 1 to S.Length do
if CharInSet(S[i], ['0'..'9']) then
begin
Inc(LActualLength);
ResChr^ := S[i];
Inc(ResChr);
end;
SetLength(Result, LActualLength);
end;
Run Code Online (Sandbox Code Playgroud)
基准
只是为了好玩,我对长度 < 100 的随机输入字符串做了一个非常原始的基准测试,并且 char 是数字的可能性约为 24%:
program Benchmark;
{$APPTYPE CONSOLE}
{$R *.res}
uses
System.SysUtils, System.RegularExpressions, Windows;
function OP1(const s: string): string;
begin
Result := '';
for var i := 1 to Length(s) do
begin
if s[i] in ['0'..'9'] then
Result := Result + s[i];
end;
end;
function OP2(const s: string): string;
begin
Result := s;
for var i := Length(Result) downto 1 do
begin
if not(Result[i] in ['0'..'9']) then
Delete(Result, i, 1);
end;
end;
function OP3(const s: string): string;
begin
var RegEx := TRegEx.Create('[^0-9]');
Result := RegEx.Replace(s, '');
end;
function AR1(const S: string): string;
begin
SetLength(Result, S.Length);
var LActualLength := 0;
for var i := 1 to S.Length do
if CharInSet(S[i], ['0'..'9']) then
begin
Inc(LActualLength);
Result[LActualLength] := S[i];
end;
SetLength(Result, LActualLength);
end;
function AR2(const S: string): string;
begin
SetLength(Result, S.Length);
var ResChr := PChar(Result);
var LActualLength := 0;
for var i := 1 to S.Length do
if CharInSet(S[i], ['0'..'9']) then
begin
Inc(LActualLength);
ResChr^ := S[i];
Inc(ResChr);
end;
SetLength(Result, LActualLength);
end;
function AR3(const S: string): string;
begin
SetLength(Result, S.Length);
var ResChr := PChar(Result);
for var i := 1 to S.Length do
if CharInSet(S[i], ['0'..'9']) then
begin
ResChr^ := S[i];
Inc(ResChr);
end;
SetLength(Result, ResChr - PChar(Result));
end;
function RandomInputString: string;
begin
SetLength(Result, Random(100));
for var i := 1 to Result.Length do
Result[i] := Chr(Ord('0') + Random(42));
end;
begin
Randomize;
const N = 1000000;
var Data := TArray<string>(nil);
SetLength(Data, N);
for var i := 0 to N - 1 do
Data[i] := RandomInputString;
var f, c0, cOP1, cOP2, cOP3, cAR1, cAR2, cAR3: Int64;
QueryPerformanceFrequency(f);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
OP1(Data[i]);
QueryPerformanceCounter(cOP1);
Dec(cOP1, c0);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
OP2(Data[i]);
QueryPerformanceCounter(cOP2);
Dec(cOP2, c0);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
OP3(Data[i]);
QueryPerformanceCounter(cOP3);
Dec(cOP3, c0);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
AR1(Data[i]);
QueryPerformanceCounter(cAR1);
Dec(cAR1, c0);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
AR2(Data[i]);
QueryPerformanceCounter(cAR2);
Dec(cAR2, c0);
QueryPerformanceCounter(c0);
for var i := 0 to High(Data) do
AR3(Data[i]);
QueryPerformanceCounter(cAR3);
Dec(cAR3, c0);
Writeln('Computations per second:');
Writeln('OP1: ', Round(N / (cOP1 / f)));
Writeln('OP2: ', Round(N / (cOP2 / f)));
Writeln('OP3: ', Round(N / (cOP3 / f)));
Writeln('AR1: ', Round(N / (cAR1 / f)));
Writeln('AR2: ', Round(N / (cAR2 / f)));
Writeln('AR3: ', Round(N / (cAR3 / f)));
Readln;
end.
Run Code Online (Sandbox Code Playgroud)
结果:
Computations per second:
OP1: 1398134
OP2: 875116
OP3: 39162
AR1: 3406172
AR2: 4063260
AR3: 4032343
Run Code Online (Sandbox Code Playgroud)
正如您所看到的,至少在这个测试中,正则表达式是迄今为止最慢的方法。预分配会产生重大影响,而避免该_UniqueStringU问题似乎只会带来相对较小的改进。
但即使使用非常慢的 RegEx 方法,您每秒也可以执行 40 000 次调用。在我用了 13 年的电脑上。