ML.NET - Schema mismatch for feature column 'Features': expected Vector<Single>, got Vector<Int32>

Eva*_*Eva 5 c# classification multiclass-classification ml.net

I am trying to build my first ML.NET project. I previously built this project with Azure ML, the visual interface, Python, and so on, but now I want to do it in C#.

I am following a tutorial, but with a completely different dataset and goal.

The dataset has many extra columns, but my data model looks like this (the LoadColumn indices point to the columns in the dataset):

using Microsoft.ML.Data;

namespace ML_Net
{

    public class Earthquake
    {
        [LoadColumn(1)]
        public int geo_level_1_id { get; set; }
        [LoadColumn(2)]
        public int geo_level_2_id { get; set; }
        [LoadColumn(3)]
        public int geo_level_3_id { get; set; }
        [LoadColumn(4)]
        public int count_floors_pre_eq { get; set; }
        [LoadColumn(5)]
        public int age { get; set; }
        [LoadColumn(6)]
        public int area { get; set; }
        [LoadColumn(7)]
        public int height { get; set; }
        [LoadColumn(8)]
        public int count_families { get; set; }
        [LoadColumn(26)]
        public int has_secondary_use { get; set; }
        [LoadColumn(27)]
        public double square { get; set; }
        [LoadColumn(39)]
        public double difference { get; set; }
        [LoadColumn(40)]
        public int damage_grade { get; set; }
    }

    public class DamagePrediction
    {
        [ColumnName("PredictedLabel")]
        public int damage_grade;
    }
}

The error comes from the training function:

public static IEstimator<ITransformer> BuildAndTrainModel(IDataView trainingDataView, IEstimator<ITransformer> pipeline)
{
    var trainingPipeline = pipeline
        .Append(_mlContext.MulticlassClassification.Trainers
        .SdcaMaximumEntropy("Label", "Features"))
        .Append(_mlContext.Transforms.Conversion
        .MapKeyToValue("PredictedLabel"));

    _trainedModel = trainingPipeline.Fit(trainingDataView);
    _predEngine = _mlContext.Model
        .CreatePredictionEngine<Earthquake, DamagePrediction>(_trainedModel);

    Earthquake building = new Earthquake()
    {
        geo_level_1_id = 1,
        geo_level_2_id = 42,
        geo_level_3_id = 941,
        count_floors_pre_eq = 2,
        age = 0,
        area = 24,
        height = 4,
        count_families = 2,
        has_secondary_use = 0,
        square = 4.898979485566356,
        difference = 0.8989794855663558
    };

    var prediction = _predEngine.Predict(building);
    Console.WriteLine($"=============== Single Prediction just-trained-model - Result: {prediction.damage_grade} ===============");


    return trainingPipeline;
}

It says:

Exception thrown: 'System.ArgumentOutOfRangeException' in Microsoft.ML.Data.dll
An unhandled exception of type 'System.ArgumentOutOfRangeException' occurred in Microsoft.ML.Data.dll
Schema mismatch for feature column 'Features': expected Vector<Single>, got Vector<Int32>

I can't work out what the problem is. Could you give me some ideas?

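I am only working with numeric data, which is why I did not add any transformations or featurization, but maybe normalization could help, since I have some floating-point values.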

Thanks in advance for all your ideas!
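For reference, the error message itself says the trainer wants the Features column to be a vector of Single (C# float), while the data model above declares its inputs as int and double. The following is only a sketch of one possible way to bridge that gap, not a confirmed fix: it assumes the pipeline passed to BuildAndTrainModel (not shown in the post) builds "Features" by concatenating the raw columns and maps damage_grade to a "Label" key column. The class name FeaturePipelineSketch, the FeatureColumns array, and the Build method are hypothetical names introduced here for illustration. Declaring the feature properties as float in the data model would be another route.

```csharp
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical helper, not part of the original post.
public static class FeaturePipelineSketch
{
    // Input columns from the Earthquake class that are used as features.
    public static readonly string[] FeatureColumns =
    {
        "geo_level_1_id", "geo_level_2_id", "geo_level_3_id",
        "count_floors_pre_eq", "age", "area", "height",
        "count_families", "has_secondary_use", "square", "difference"
    };

    public static IEstimator<ITransformer> Build(MLContext mlContext)
    {
        // One pair per column; with no input name given, each column is
        // converted in place, so the Single version hides the original.
        var pairs = FeatureColumns
            .Select(name => new InputOutputColumnPair(name))
            .ToArray();

        return mlContext.Transforms.Conversion
            // Convert every int/double feature column to Single (float).
            .ConvertType(pairs, DataKind.Single)
            // All inputs now share one type, so the result is Vector<Single>.
            .Append(mlContext.Transforms.Concatenate("Features", FeatureColumns))
            // Assumption: the multiclass trainer reads a key-typed "Label" column.
            .Append(mlContext.Transforms.Conversion.MapValueToKey("Label", "damage_grade"));
    }
}
```

Under these assumptions, the result of FeaturePipelineSketch.Build(_mlContext) could be passed as the pipeline argument of BuildAndTrainModel. A normalizer such as NormalizeMinMax could be appended after the Concatenate step if scaling turns out to matter, but it does not change the column type by itself.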