Extracting keywords from text in .NET

Sha*_*air 4 .net linq sorting search keyword

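I need to count how many times each keyword occurs in a string and sort the results by highest count. What is the fastest algorithm available in .NET for this?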

Ste*_*end 6

Edit: the code below groups the unique tokens together with their counts:

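string[] target = src.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

// Group identical tokens once, count each group, and sort by descending count.
// This avoids rescanning the whole array for every token.
var results = target
    .GroupBy(t => t)
    .Select(g => new { str = g.Key, count = g.Count() })
    .OrderByDescending(x => x.count);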
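Since the question asks for speed, a single pass with a `Dictionary<string, int>` is a common alternative to grouping. The following is only a minimal sketch: the class name `KeywordCounter`, the method `CountWords`, the separator list, and the case-insensitive comparison are all assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordCounter
{
    // Hypothetical helper: counts whole-word tokens in one pass over the text,
    // then sorts the distinct tokens by descending count.
    public static List<KeyValuePair<string, int>> CountWords(string src)
    {
        // Assumed separators; adjust them for the text being processed.
        char[] separators = { ' ', ',', ';', '.', '\t', '\r', '\n' };

        // Assumption: keywords are compared without regard to case.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string token in src.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            int current;
            counts.TryGetValue(token, out current); // current stays 0 when the token is new
            counts[token] = current + 1;
        }

        return counts.OrderByDescending(pair => pair.Value).ToList();
    }
}
```

Counting is linear in the number of tokens, and only the distinct tokens are sorted. Calling `KeywordCounter.CountWords(src)` returns the pairs with the most frequent token first.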

This is finally starting to make more sense to me...

Edit: the code below produces counts paired with their target substrings:

string src = "for each character in the string, take the rest of the " +
    "string starting from that character " +
    "as a substring; count it if it starts with the target string";
string[] target = {"string", "the", "in"};

// For each target, count the positions in src at which it starts.
var results = target.Select(t => new {str = t, 
    count = src.Select((c, i) => src.Substring(i)).
    Count(sub => sub.StartsWith(t))});

The results are now:

+       [0] { str = "string", count = 4 }   <Anonymous Type>
+       [1] { str = "the", count = 4 }  <Anonymous Type>
+       [2] { str = "in", count = 6 }   <Anonymous Type>
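The substring approach above allocates a new string for every character position. A lighter variant, sketched here under assumptions, walks the text with `String.IndexOf` and also returns the keywords sorted by count. The names `SubstringCounter`, `CountOccurrences`, and `Rank` are hypothetical, and the sketch uses ordinal comparison, whereas `StartsWith(t)` in the code above is culture-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringCounter
{
    // Hypothetical helper: counts occurrences of keyword in src, including
    // overlapping ones, by searching again from the position after each match.
    public static int CountOccurrences(string src, string keyword)
    {
        if (string.IsNullOrEmpty(keyword)) return 0;

        int count = 0;
        int index = src.IndexOf(keyword, 0, StringComparison.Ordinal);
        while (index >= 0)
        {
            count++;
            // Advance by one character so overlapping matches are counted too,
            // mirroring the StartsWith-at-every-position approach.
            index = src.IndexOf(keyword, index + 1, StringComparison.Ordinal);
        }
        return count;
    }

    // Pairs each keyword with its count and sorts by descending count.
    public static List<KeyValuePair<string, int>> Rank(string src, IEnumerable<string> keywords)
    {
        return keywords
            .Select(k => new KeyValuePair<string, int>(k, CountOccurrences(src, k)))
            .OrderByDescending(pair => pair.Value)
            .ToList();
    }
}
```

For the sample `src` and `target` above, `SubstringCounter.Rank(src, target)` gives "in" with 6 first, followed by "string" and "the" with 4 each.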

The original code follows:

string src = "for each character in the string, take the rest of the " +
    "string starting from that character " +
    "as a substring; count it if it starts with the target string";
string[] target = {"string", "the", "in"};

var results = target.Select(t => src.Select((c, i) => src.Substring(i)).
    Count(sub => sub.StartsWith(t))).OrderByDescending(t => t);

Thanks for the earlier replies.

Results from the debugger (extra logic is needed to include each matched string alongside its count):

-       results {System.Linq.OrderedEnumerable<int,int>}    
-       Results View    Expanding the Results View will enumerate the IEnumerable   
        [0] 6   int
        [1] 4   int
        [2] 4   int