Cassava parse error in Haskell

nat*_*nat 5 csv parsing haskell

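I am trying to use cassava to convert a CSV file into a vector. The CSV I am trying to convert is the Fisher iris data set used for machine learning. Each row consists of four doubles and a string. My code is as follows: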

{-# LANGUAGE OverloadedStrings #-}

module Main where
import Data.Csv
import qualified Data.ByteString.Lazy as BS
import qualified Data.Vector as V

data Iris = Iris
  { sepal_length  :: !Double
  , sepal_width   :: !Double
  , petal_length  :: !Double
  , petal_width   :: !Double
  , iris_type     :: !String
 } deriving (Show, Eq, Read)

instance FromNamedRecord Iris where
  parseNamedRecord r =
    Iris
      <$> r .: "sepal_length"
      <*> r .: "sepal_width"
      <*> r .: "petal_length"
      <*> r .: "petal_width"
      <*> r .: "iris_type"

printIris :: Iris -> IO ()
printIris r  = putStrLn $  show (sepal_length r) ++ show (sepal_width r)
   ++ show(petal_length r) ++ show(petal_width r) ++ "hola"

main :: IO ()
main = do
  csvData <- BS.readFile "./iris/test-iris"
  print csvData
  case decodeByName csvData of
    Left err -> putStrLn err
    -- forM_ : O(n) Apply the monadic action to all elements of the vector,
    -- discarding the results.
    Right (_, v) -> V.forM_ v printIris

When I run it, csvData appears to be formatted correctly. The print csvData line outputs the following:

"5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n4.7,3.2,1.3,0.2,Iris-setosa\n4.6,3.1,1.5,0.2,Iris-setosa\n5.0,3.6,1.4,0.2,Iris-setosa\n5.4,3.9,1.7,0.4,Iris-setosa\n4.6,3.4,1.4,0.3,Iris-setosa\n5.0,3.4,1.5,0.2,Iris-setosa\n4.4,2.9,1.4,0.2,Iris-setosa\n4.9,3.1,1.5,0.1,Iris-setosa\n5.4,3.7,1.5,0.2,Iris-setosa\n4.8,3.4,1.6,0.2,Iris-setosa\n4.8,3.0,1.4,0.1,Iris-setosa\n4.3,3.0,1.1,0.1,Iris-setosa\n5.8,4.0,1.2,0.2,Iris-setosa\n5.7,4.4,1.5,0.4,Iris-set

But I get the following error:

parse error (Failed reading: conversion error: no field named "sepal_length")  at 
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4 (truncated)

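Does anyone know why I am getting this error? The CSV has no missing values, and if I replace the line that produces the error with another line, I get the same error.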

Li-*_*Xia 4

Your data does not seem to have a header, which decodeByName assumes:

The data is assumed to be preceded by a header.

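Because the first row is then read as the header, no column is called "sepal_length". Either add a header row to the file, or use decode NoHeader together with the FromRecord type class.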
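
As a rough illustration of the second option, here is a minimal sketch that decodes header-less rows by position. It reuses the Iris record from the question; the helper name decodeIris and the inline sample data are assumptions made for this example, not part of the original code.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Csv (FromRecord (..), HasHeader (NoHeader), decode, (.!))
import qualified Data.ByteString.Lazy as BL
import qualified Data.Vector as V

data Iris = Iris
  { sepal_length :: !Double
  , sepal_width  :: !Double
  , petal_length :: !Double
  , petal_width  :: !Double
  , iris_type    :: !String
  } deriving (Show, Eq)

-- Fields are looked up by column index rather than by name,
-- so the input does not need a header row.
instance FromRecord Iris where
  parseRecord r
    | V.length r == 5 =
        Iris <$> r .! 0 <*> r .! 1 <*> r .! 2 <*> r .! 3 <*> r .! 4
    | otherwise = fail "expected exactly 5 fields per row"

-- Hypothetical helper: decode every row as data (NoHeader).
decodeIris :: BL.ByteString -> Either String (V.Vector Iris)
decodeIris = decode NoHeader

main :: IO ()
main = do
  -- Two rows copied from the question, used here in place of the file.
  let csvData = "5.1,3.5,1.4,0.2,Iris-setosa\n4.9,3.0,1.4,0.2,Iris-setosa\n"
  case decodeIris csvData of
    Left err -> putStrLn err
    Right v  -> V.forM_ v print
```

To read the real file, replace the inline sample with BL.readFile "./iris/test-iris" as in the question. The other option keeps decodeByName and the FromNamedRecord instance unchanged, and only requires a first line in the file whose column names match the ones used in the instance.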