我尝试编写一个程序,它将计算列表中每个元素的频率.
In: "aabbcabb"
Out: [("a",3),("b",4),("c",1)]
Run Code Online (Sandbox Code Playgroud)
您可以在以下链接中查看我的代码:http://codepad.org/nyIECIT2 在此代码中,唯一函数的输出将是这样的
In: "aabbcabb"
Out: "abc"
Run Code Online (Sandbox Code Playgroud)
使用唯一的输出我们将计算目标列表的频率.你也可以在这里看到代码:
frequencyOfElt xs=ans
where ans=countElt(unique xs) xs
unique []=[]
unique xs=(head xs):(unique (filter((/=)(head xs))xs))
countElt ref target=ans'
where ans'=zip ref lengths
lengths=map length $ zipWith($)(map[(=='a'),(==',b'),(==',c')](filter.(==))ref)(repeat target)
Error:Syntax error in input (unexpected symbol "unique")
Run Code Online (Sandbox Code Playgroud)
但在ghci 6.13中也出现了其他类型的错误
很少有人问我使用[(=='a'),(==',b'),(==',c')]的目的是什么.我的期望:如果ref ="abc"和target ="aabbaacc"那么
zipWith($) (map filter ref)(repeat target)
Run Code Online (Sandbox Code Playgroud)
将显示["aaaa","bb","cc"]然后我可以使用这里的地图长度来获得频率这里的过滤列表根据ref我使用[(=='a'),(==' ,b '),(==',C')]
我假设一些逻辑错误在于[(=='a'),(==',b'),(==',c')]这里..
Wil*_*ess 16
你没有说你是想自己编写它,还是从一些标准函数中编写它是否可以.
import Data.List
g s = map (\x -> ([head x], length x)) . group . sort $ s
-- g = map (head &&& length) . group . sort -- without the [...]
Run Code Online (Sandbox Code Playgroud)
是编码它的标准快速脏方法.
好吧,所以你最初的想法就是编码点免费风格(某些曲调在我脑海中播放......):
frequencyOfElt :: (Eq a) => [a] -> [(a,Int)]
frequencyOfElt xs = countElt (unique xs) xs -- change the result type
where
unique [] = []
unique (x:xs) = x : unique (filter (/= x) xs)
countElt ref target = -- Code it Point-Free Style (your original idea)
zip
ref $ -- your original type would need (map (:[]) ref) here
map length $
zipWith ($) -- ((filter . (==)) c) === (filter (== c))
(zipWith ($) (repeat (filter . (==))) ref)
(repeat target)
Run Code Online (Sandbox Code Playgroud)
我把这里的类型更改为更合理的[a] -> [(a,Int)]
btw.注意
zipWith ($) fs (repeat z) === map ($ z) fs
zipWith ($) (repeat f) zs === map (f $) zs === map f zs
Run Code Online (Sandbox Code Playgroud)
因此代码简化为
countElt ref target =
zip
ref $
map length $
map ($ target)
(zipWith ($) (repeat (filter . (==))) ref)
Run Code Online (Sandbox Code Playgroud)
然后
countElt ref target =
zip
ref $
map length $
map ($ target) $
map (filter . (==)) ref
Run Code Online (Sandbox Code Playgroud)
但是map f $ map g xs === map (f.g) xs
,所以
countElt ref target =
zip
ref $
map (length . ($ target) . filter . (==)) ref -- (1)
Run Code Online (Sandbox Code Playgroud)
用列表理解写的更清楚(根据我的口味),
countElt ref target =
[ (c, (length . ($ target) . filter . (==)) c) | c <- ref]
== [ (c, length ( ($ target) ( filter (== c)))) | c <- ref]
== [ (c, length $ filter (== c) target) | c <- ref]
Run Code Online (Sandbox Code Playgroud)
这让我们有了进一步重写(1)的想法
countElt ref target =
zip <*> map (length . (`filter` target) . (==)) $ ref
Run Code Online (Sandbox Code Playgroud)
但是这种对无点代码的迷恋在这里变得毫无意义.
所以回到可读的列表理解,使用nub
相当于你的标准函数unique
,你的想法就变成了
import Data.List
frequencyOfElt xs = [ (c, length $ filter (== c) xs) | c <- nub xs]
Run Code Online (Sandbox Code Playgroud)
这个算法实际上是二次方(~ n^2
),所以它比上面的第一个版本更糟糕,因为它sort
是线性的(~ n log(n)
).
这段代码可以通过等效转换原理进一步操作:
= [ (c, length . filter (== c) $ sort xs) | c <- nub xs]
Run Code Online (Sandbox Code Playgroud)
...因为在列表中搜索与在列表中搜索相同,已排序.在这里做更多的工作 - 它会得到回报吗?..
= [ (c, length . filter (== c) $ sort xs) | (c:_) <- group $ sort xs]
Run Code Online (Sandbox Code Playgroud)
... 对?但是现在,group
已经将它们分组了(==)
,所以不需要filter
调用重复已经完成的工作group
:
= [ (c, length . get c . group $ sort xs) | (c:_) <- group $ sort xs]
where get c gs = fromJust . find ((== c).head) $ gs
= [ (c, length g) | g@(c:_) <- group $ sort xs]
= [ (head g, length g) | g <- group (sort xs)]
= (map (head &&& length) . group . sort) xs
Run Code Online (Sandbox Code Playgroud)
不是吗?这里,它就是从这篇文章的开头处具有相同linearithmic算法,实际上衍生从您通过分解出其隐藏的共同计算,使它们可重用和代码简化代码.
使用multiset-0.1:
import Data.Multiset
freq = toOccurList . fromList
Run Code Online (Sandbox Code Playgroud)
归档时间: |
|
查看次数: |
8162 次 |
最近记录: |