How can I make this word frequency counter more efficient?

cod*_*ous 3 f# word-count

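I wrote this F# code to count word frequencies in a list and return the result as tuples to C#. Can you tell me how to make the code more efficient or shorter?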

let rec internal countword2 (tail : string list) wrd ((last : string list), count) =
    match tail with
    | [] -> last, wrd, count
    | h::t -> countword2 t wrd (if h = wrd then (last, count + 1) else (last @ [h], count))

let internal countword1 (str : string list) wrd =
    let temp, wrd, count = countword2 str wrd ([], 0) in
    temp, wrd, count

let rec public countword (str : string list) =
    match str with
    | [] -> []
    | h::_ ->
        let temp, wrd, count = countword1 str h in
        [(wrd, count)] @ countword temp

Dan*_*iel 16

Even pad's version can be made more efficient and concise:

let countWords = Seq.countBy id

Example:

countWords ["a"; "a"; "b"; "c"] //returns: seq [("a", 2); ("b", 1); ("c", 1)]
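Since the question hands a `(string * int) list` back to C#, a minimal sketch that keeps that shape could use `List.countBy` (available in FSharp.Core 4.0 and later) instead of the `Seq` version. The name `countWordsList` is made up for illustration and does not appear in the question or the answers.

```fsharp
// Hypothetical helper: same idea as Seq.countBy id, but it takes and returns
// a list, matching the signature of countword in the question.
let countWordsList (words : string list) : (string * int) list =
    List.countBy id words

// Usage: turn the result into a Map to look up a single word's count.
let counts = countWordsList ["a"; "a"; "b"; "c"] |> Map.ofList
printfn "%d" counts.["a"] // prints 2
```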


pad*_*pad 7

If you want to count word frequencies in a list of strings, your approach seems like overkill. Seq.groupBy is a good fit for this purpose:

let public countWords (words: string list) = 
   words |> Seq.groupBy id
         |> Seq.map (fun (word, sq) -> word, Seq.length sq)
         |> Seq.toList
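A quick usage sketch for the function above, cross-checked against `Seq.countBy id`. The definition is repeated so the snippet runs on its own, and the names `sample`, `viaGroupBy` and `viaCountBy` are only for illustration. The results are compared as maps, so the check does not depend on the order in which the keys come back.

```fsharp
// Repeated from the answer above so the snippet is self-contained.
let countWords (words : string list) =
    words |> Seq.groupBy id
          |> Seq.map (fun (word, sq) -> word, Seq.length sq)
          |> Seq.toList

let sample = ["a"; "a"; "b"; "c"]

// Both approaches should produce the same word/count pairs.
let viaGroupBy = countWords sample |> Map.ofList
let viaCountBy = sample |> Seq.countBy id |> Map.ofSeq

printfn "%b" (viaGroupBy = viaCountBy) // prints true
```

Seq.groupBy collects every occurrence of a word into its group before Seq.length counts them, whereas Seq.countBy only keeps a running count per key, so the shorter version also tends to use less memory on large inputs.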