Code Golf, 4th of July Edition: Counting the Top Ten Occurring Words

ojb*_*ass 11 code-golf counting text-files

Given the following list of presidents, do a top-ten word count in the smallest program possible:

INPUT FILE

    Washington
    Washington
    Adams
    Jefferson
    Jefferson
    Madison
    Madison
    Monroe
    Monroe
    John Quincy Adams
    Jackson
    Jackson
    Van Buren
    Harrison DIES
    Tyler
    Polk
    Taylor DIES
    Fillmore
    Pierce
    Buchanan
    Lincoln
    Lincoln DIES
    Johnson
    Grant
    Grant
    Hayes
    Garfield DIES
    Arthur
    Cleveland
    Harrison
    Cleveland
    McKinley
    McKinley
    DIES
    Teddy Roosevelt
    Teddy Roosevelt
    Taft
    Wilson
    Wilson
    Harding
    Coolidge
    Hoover
    FDR
    FDR
    FDR
    FDR
    Dies
    Truman
    Truman
    Eisenhower
    Eisenhower
    Kennedy DIES
    Johnson
    Johnson
    Nixon
    Nixon ABDICATES
    Ford
    Carter
    Reagan
    Reagan
    Bush
    Clinton
    Clinton
    Bush
    Bush
    Obama

To start it off, bash in 97 characters:

cat input.txt | tr " " "\n" | tr -d "\t " | sed 's/^$//g' | sort | uniq -c | sort -n | tail -n 10

Output:

      2 Nixon
      2 Reagan
      2 Roosevelt
      2 Truman
      2 Washington
      2 Wilson
      3 Bush
      3 Johnson
      4 FDR
      7 DIES

Break ties however you like! Happy Fourth!

For those of you who care, more information on the presidents can be found here.

jas*_*son 12

C#, 153:

Reads in the file at p and prints the results to the console:

File.ReadLines(p)
    .SelectMany(s=>s.Split(' '))
    .GroupBy(w=>w)
    .OrderBy(g=>-g.Count())
    .Take(10)
    .ToList()
    .ForEach(g=>Console.WriteLine(g.Count()+"|"+g.Key));

If it only has to produce the list without printing it to the console, it comes to 93 characters.

6|DIES
4|FDR
3|Johnson
3|Bush
2|Washington
2|Adams
2|Jefferson
2|Madison
2|Monroe
2|Jackson

  • This is clean and neat, I think, and at least semi-understandable, certainly compared to the Perl versions. (2 upvotes)

Sti*_*set 11

A shorter shell version:

xargs -n1 < input.txt | sort | uniq -c | sort -nr | head

If you want a case-insensitive ranking, change uniq -c to uniq -ci.
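A minimal sketch of the case-insensitive variant on a made-up sample. Note that uniq only merges adjacent lines, so this sketch also folds case in the sort step with sort -f, which is an addition of mine and not part of the one-liner above. The -i flag exists in GNU and BSD uniq but is not in POSIX; the function name count_ci is hypothetical.

```shell
# count_ci: hypothetical helper wrapping the case-insensitive pipeline.
count_ci() {
    # sort -f folds case so that "DIES" and "Dies" end up next to each other;
    # uniq -ci then counts them as one word (-i is a GNU/BSD extension).
    xargs -n1 | sort -f | uniq -ci | sort -nr | head
}

# Made-up sample: "DIES" twice and "Dies" once should give a combined count of 3.
printf 'FDR\nFDR Dies\nKennedy DIES\nTaylor DIES\n' | count_ci
```

Which spelling (DIES or Dies) is printed for the merged group depends on the locale's sort order, so only the count should be relied on.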

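Slightly shorter still, if you are happy with the ranking being reversed and with readability suffering from the lack of spaces. This clocks in at 46 characters: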

xargs -n1<input.txt|sort|uniq -c|sort -n|tail

(You could trim this down to 38 if you were allowed to first rename the input file to simply "i".)

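Observing that, in this particular case, no word occurs more than 9 times, we can shave off 3 more characters by dropping the '-n' argument from the final sort: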

xargs -n1<input.txt|sort|uniq -c|sort|tail

That takes this solution down to 43 characters without renaming the input file. (Or 35, if you do.)
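To see why dropping -n leans on every count being a single digit, here is a small sketch with made-up counts. It feeds sort unpadded numbers to isolate the comparison itself; uniq -c right-aligns its counts, and depending on the locale that padding may hide the effect, so treat this as an illustration of the principle and not of the exact pipeline. The helper name counts is hypothetical.

```shell
# counts: hypothetical helper printing made-up "count word" lines.
counts() {
    printf '%s\n' '9 DIES' '10 FDR' '2 Bush'
}

# Numeric sort: 10 ranks above 9, so tail picks FDR as the top entry.
counts | sort -n | tail -n 1

# Plain sort compares character by character: "10" sorts before "2" and "9",
# so tail now picks DIES even though FDR has the higher count.
counts | sort | tail -n 1
```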

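Using xargs -n1 to split the file into one word per line is preferable to the tr \  \\n approach, because tr leaves a large number of blank lines behind. That makes the tr-based solution incorrect: it drops Nixon from the top ten and reports an empty string occurring 256 times, and an empty string is not a "word".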
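The difference is easy to reproduce on a tiny sample. The sketch below assumes lines indented with four spaces, as the input file appears to be; the two sample lines and the helper name sample are made up for illustration.

```shell
# sample: hypothetical helper printing two indented lines.
sample() {
    printf '    Nixon ABDICATES\n    Ford\n'
}

# tr turns every single space into a newline, so each run of leading
# spaces becomes a run of empty lines: 11 lines in total, 8 of them empty.
sample | tr ' ' '\n' | wc -l

# xargs -n1 splits on any run of whitespace and echoes one word per line:
# 3 lines, none of them empty.
sample | xargs -n1 | wc -l
```

One caveat with xargs is that it also interprets quotes and backslashes in its input, which does not matter for this word list but could for other text.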


ojb*_*ass 7

vim 60

    :1,$!tr " " "\n"|tr -d "\t "|sort|uniq -c|sort -n|tail -n 10


Jos*_*ger 7

Vim 36

:%s/\W/\r/g|%!sort|uniq -c|sort|tail

  • You can lose 4 characters, because 'tail' is equivalent to 'tail -10' or 'tail -n10'. (2 upvotes)

eph*_*ent 5

Haskell, 102 characters (wow, so close to matching the original):

import List
(take 10.map snd.sort.map(\(x:y)->(-length y,x)).group.sort.words)`fmap`readFile"input.txt"

J, only 55 characters:

10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'

(I have yet to figure out how to do text manipulation elegantly in J... it is much better at array-structured data.)


   NB. read the file
   <1!:1<'input.txt'
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...
|    Washington     Washington     Adams     Jefferson     Jefferson     Madison     Madison     Monroe     Monroe     John Quincy Adams     Jackson     Jackson     Van Buren     Harrison DIES     Tyler     Polk     Taylor DIES     Fillmore     Pierce     ...
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...
   NB. split into lines
   <;._2[1!:1<'input.txt'
+--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----...
|    Washington|    Washington|    Adams|    Jefferson|    Jefferson|    Madison|    Madison|    Monroe|    Monroe|    John Quincy Adams|    Jackson|    Jackson|    Van Buren|    Harrison DIES|    Tyler|    Polk|    Taylor DIES|    Fillmore|    Pierce|    ...
+--------------+--------------+---------+-------------+-------------+-----------+-----------+----------+----------+---------------------+-----------+-----------+-------------+-----------------+---------+--------+---------------+------------+----------+----...
   NB. split into words
   ;;:&.><;._2[1!:1<'input.txt'
+----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---...
|Washington|Washington|Adams|Jefferson|Jefferson|Madison|Madison|Monroe|Monroe|John|Quincy|Adams|Jackson|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|DIES|Fillmore|Pierce|Buchanan|Lincoln|Lincoln|DIES|Johnson|Grant|Grant|Hayes|Garfield|DIES|Arthur|Cle...
+----------+----------+-----+---------+---------+-------+-------+------+------+----+------+-----+-------+-------+---+-----+--------+----+-----+----+------+----+--------+------+--------+-------+-------+----+-------+-----+-----+-----+--------+----+------+---...
   NB. count repetitions
   |:~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
|2         |2    |2        |2      |2     |1   |1     |2      |1  |1    |2       |6   |1    |1   |1     |1       |1     |1       |2      |3      |2    |1    |1       |1     |2        |2       |2        |1   |2     |1      |1       |1     |4  |2     |2     ...
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
|Washington|Adams|Jefferson|Madison|Monroe|John|Quincy|Jackson|Van|Buren|Harrison|DIES|Tyler|Polk|Taylor|Fillmore|Pierce|Buchanan|Lincoln|Johnson|Grant|Hayes|Garfield|Arthur|Cleveland|McKinley|Roosevelt|Taft|Wilson|Harding|Coolidge|Hoover|FDR|Truman|Eisenh...
+----------+-----+---------+-------+------+----+------+-------+---+-----+--------+----+-----+----+------+--------+------+--------+-------+-------+-----+-----+--------+------+---------+--------+---------+----+------+-------+--------+------+---+------+------...
   NB. sort
   |:\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
|6   |4  |3      |3   |2     |2         |2     |2        |2     |2    |2     |2       |2      |2      |2        |2      |2       |2    |2         |2      |2        |2    |1  |1    |1     |1   |1     |1   |1     |1    |1      |1   |1     |1    |1      |1   ...
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
|DIES|FDR|Johnson|Bush|Wilson|Washington|Truman|Roosevelt|Reagan|Nixon|Monroe|McKinley|Madison|Lincoln|Jefferson|Jackson|Harrison|Grant|Eisenhower|Clinton|Cleveland|Adams|Van|Tyler|Taylor|Taft|Quincy|Polk|Pierce|Obama|Kennedy|John|Hoover|Hayes|Harding|Garf...
+----+---+-------+----+------+----------+------+---------+------+-----+------+--------+-------+-------+---------+-------+--------+-----+----------+-------+---------+-----+---+-----+------+----+------+----+------+-----+-------+----+------+-----+-------+----...
   NB. take 10
   10{.\:~~.(,.~[:<"0@(+/)=/~);;:&.><;._2[1!:1<'input.txt'
+-+----------+
|6|DIES      |
+-+----------+
|4|FDR       |
+-+----------+
|3|Johnson   |
+-+----------+
|3|Bush      |
+-+----------+
|2|Wilson    |
+-+----------+
|2|Washington|
+-+----------+
|2|Truman    |
+-+----------+
|2|Roosevelt |
+-+----------+
|2|Reagan    |
+-+----------+
|2|Nixon     |
+-+----------+