ValueError: could not convert string to float:

Tho*_*ott 4 python csv machine-learning scikit-learn

I am following this tutorial to write a Naive Bayes classifier: http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/

I keep getting this error:

dataset[i] = [float(x) for x in dataset[i]]
ValueError: could not convert string to float: 

This is the part of my code where the error occurs:

def loadDatasetNB(filename):
    lines = csv.reader(open(filename, "rt"))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

This is how the file is loaded:

def NB_Analysis():
    filename = 'fvectors.csv'
    splitRatio = 0.67
    dataset = loadDatasetNB(filename)
    trainingSet, testSet = splitDatasetNB(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
    # prepare model
    summaries = summarizeByClassNB(trainingSet)
    # test model
    predictions = getPredictionsNB(summaries, testSet)
    accuracy = getAccuracyNB(testSet, predictions)
    print('Accuracy: {0}%'.format(accuracy))

NB_Analysis()

My file fvectors.csv looks like this:

What is going wrong here, and how can I fix it?

Tar*_*syk 5

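Try skipping the header row. The empty header cell in the first column is what causes the problem, because float() cannot parse an empty or whitespace-only string: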

>>> float(' ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: could not convert string to float:
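To confirm which cells are responsible before changing the loader, a small diagnostic pass can help. The following is a minimal sketch, not part of the original answer: the helper name `find_unconvertible_cells` is hypothetical, and the in-memory sample only stands in for fvectors.csv, whose real contents are not shown here. It assumes a header row with an empty first cell, as described above.

```python
import csv
import io


def find_unconvertible_cells(rows):
    """Return (row_index, column_index, value) for every cell float() rejects."""
    bad = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            try:
                float(value)
            except ValueError:
                bad.append((r, c, value))
    return bad


# Hypothetical sample standing in for fvectors.csv (assumed layout:
# a header row whose first cell is empty, followed by numeric rows).
sample = io.StringIO(",f1,f2\n0,1.5,2.5\n1,3.0,4.0\n")
rows = list(csv.reader(sample))

bad_cells = find_unconvertible_cells(rows)
for r, c, value in bad_cells:
    print("row {0}, column {1}: {2!r}".format(r, c, value))
```

If every reported cell sits in row 0, the header is the only culprit and skipping it should be enough. Hits in later rows would point to missing or textual values in the data itself.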

(1) If you want to skip the header, you can do it like this:

import csv

def loadDatasetNB(filename):
    with open(filename, "rt") as f:
        lines = csv.reader(f)
        next(lines, None)  # <<- skip the header row
        dataset = list(lines)
    # convert every remaining cell from string to float
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

(2) Or you can ignore the exception:

try:
    float(element)
except ValueError:
    pass

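If you decide to go with option (2), make sure you skip only the first row, or only rows that contain text, and that you know for certain which rows those are. Otherwise you may silently drop real data.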
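One way to apply option (2) without losing track of what was dropped is to record the index of every skipped row. This is a rough sketch under assumptions: `load_numeric_rows` is a hypothetical name, and the sample data is made up for illustration (a header row with an empty first cell), not taken from the real fvectors.csv.

```python
import csv
import io


def load_numeric_rows(lines):
    """Convert each CSV row to floats; collect the indices of rows that fail."""
    dataset = []
    skipped = []
    for index, row in enumerate(csv.reader(lines)):
        try:
            dataset.append([float(x) for x in row])
        except ValueError:
            # The whole row is skipped if any cell is not numeric.
            skipped.append(index)
    return dataset, skipped


# Hypothetical sample data; with a real file, pass the open file object instead.
sample = io.StringIO(",f1,f2\n0,1.5,2.5\n1,3.0,4.0\n")
dataset, skipped = load_numeric_rows(sample)

print("loaded {0} rows, skipped rows: {1}".format(len(dataset), skipped))
```

Checking that `skipped` contains only the header index (0 here) is a quick way to verify that no data rows were discarded.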