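Suppose I have a text file containing:
"dont run if you cant hide, or you will be broken in two strings, your a evil man"
I want to count how many times the word "you" occurs in the text file and put that value into an int variable.
How do I go about doing something like that?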
Sco*_*hic 13
With a regular expression (note that this also matches the "you" inside "your"):
Console.WriteLine((new Regex(@"(?i)you")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);
Or, if you need to match only the whole word:
Console.WriteLine((new Regex(@"(?i)\byou\b")).Matches("dont run if you cant hide, or you will be broken in two strings, your a evil man").Count);
Edit: replaced \s+you\s+ with (?i)\byou\b for correctness.
jas*_*son 10
string s = "dont run if you cant hide, or you will be broken in two strings, your a evil man";
var wordCounts = from w in s.Split(' ')
                 group w by w into g
                 select new { Word = g.Key, Count = g.Count() };
int youCount = wordCounts.Single(w => w.Word == "you").Count;
Console.WriteLine(youCount);
Ideally, punctuation should be ignored. I'll leave messy details like that for you to handle.
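One way to handle the punctuation is to split on every character that is not a letter, digit, or apostrophe instead of splitting on spaces only. The following is a minimal sketch under that assumption; the names WordCounter and CountWord are hypothetical and do not come from the answers above, and the case-insensitive comparison is a choice made here to match the (?i) regex rather than the Split(' ') version.

```csharp
using System;
using System.Linq;

static class WordCounter
{
    // Hypothetical helper (not part of the original answers).
    // Counts whole-word, case-insensitive occurrences of `word` in `text`.
    public static int CountWord(string text, string word)
    {
        // Assumption: anything that is not a letter, digit, or apostrophe
        // separates words, so "hide," yields the token "hide".
        char[] separators = text
            .Where(c => !char.IsLetterOrDigit(c) && c != '\'')
            .Distinct()
            .ToArray();

        return text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Count(w => string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
    }
}
```

For the sentence from the question, WordCounter.CountWord(s, "you") gives 2, because "your" is a different token. A contraction such as "you're" stays a single token under this rule and is not counted.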
Assuming the file contains regular line breaks, this will be less memory-intensive than the other approaches when the file is large, because it reads one line at a time. Using Jason's counting method:
var total = 0;
using (StreamReader sr = new StreamReader("log.log"))
{
    while (!sr.EndOfStream)
    {
        var counts = sr
            .ReadLine()
            .Split(' ')
            .GroupBy(s => s)
            .Select(g => new { Word = g.Key, Count = g.Count() });
        var wc = counts.SingleOrDefault(c => c.Word == "you");
        total += (wc == null) ? 0 : wc.Count;
    }
}
Alternatively, combine Scoregraphic's answer with an IEnumerable method:
static IEnumerable<string> Lines(string filename)
{
    using (var sr = new StreamReader(filename))
    {
        while (!sr.EndOfStream)
        {
            yield return sr.ReadLine();
        }
    }
}
and you get a nice one-liner:
Lines("log.log")
    .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
    .Sum();
Or, using the framework method File.ReadLines(), you can reduce it to:
File.ReadLines("log.log")
    .Select(line => Regex.Matches(line, @"(?i)\byou\b").Count)
    .Sum();