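I'm trying to adapt the following ML.NET F# product recommendation sample to my own use case: https://github.com/dotnet/machinelearning-samples/tree/master/samples/fsharp/getting-started/MatrixFactorization_ProductRecommendation
However, my dataset does not have two numeric IDs. Instead, I have a UserId (numeric) and a ProductId (string). Since keys apparently have to be numeric, I tried using the MapValueToKey function to map it. However, I still get the following error: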
Unhandled Exception: System.InvalidOperationException: Column 'UserId' with role MatrixColumnIndex should be a known cardinality U4 key, but is instead 'UInt32'
at Microsoft.ML.Recommender.RecommenderUtils.CheckRowColumnType(RoleMappedData data, ColumnRole role, Column& col, Boolean isDecode)
at Microsoft.ML.Recommender.RecommenderUtils.CheckAndGetMatrixIndexColumns(RoleMappedData data, Column& matrixColumnIndexColumn, Column& matrixRowIndexColumn, Boolean isDecode)
at Microsoft.ML.Trainers.MatrixFactorizationTrainer.TrainCore(IChannel ch, RoleMappedData data, RoleMappedData validData)
at Microsoft.ML.Trainers.MatrixFactorizationTrainer.Fit(IDataView trainData, IDataView validationData)
at Microsoft.ML.Trainers.MatrixFactorizationTrainer.Fit(IDataView input)
at <StartupCode$Recommender>.$Program.main@() in /Users/nat/Projects/Recommender/Recommender/Program.fs:line 75
The schema of my data looks like this:
UserId,ProductId
1,test-product-id
Here is the failing code, adapted from the linked sample:
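Another link I have been using as a guide is https://medium.com/machinelearningadvantage/build-a-product-recommender-using-c-and-ml-net-machine-learning-ab890b802d25
I've been trying to get this working for hours. What exactly am I doing wrong?
Update
By making my program more similar to the official .NET sample, I've made some progress. This is what I have now: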
It now fails on this line:
let predictionengine = mlContext.Model.CreatePredictionEngine<ProductEntry, Prediction>(model)
with the error
Unhandled Exception: System.ArgumentOutOfRangeException: UserIdEncoded column 'MatrixColumnIndex' not found
Parameter name: schema
at Microsoft.ML.Data.RoleMappedSchema.MapFromNames(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
at Microsoft.ML.Data.RoleMappedSchema..ctor(DataViewSchema schema, IEnumerable`1 roles, Boolean opt)
at Microsoft.ML.Data.GenericScorer.Bindings.Create(IHostEnvironment env, ISchemaBindableMapper bindable, DataViewSchema input, IEnumerable`1 roles, String suffix, Boolean user)
at Microsoft.ML.Data.GenericScorer.Bindings.ApplyToSchema(IHostEnvironment env, DataViewSchema input)
at Microsoft.ML.Data.GenericScorer..ctor(IHostEnvironment env, GenericScorer transform, IDataView data)
at Microsoft.ML.Data.GenericScorer.ApplyToDataCore(IHostEnvironment env, IDataView newSource)
at Microsoft.ML.Data.RowToRowScorerBase.ApplyToData(IHostEnvironment env, IDataView newSource)
at Microsoft.ML.Data.PredictionTransformerBase`1.Microsoft.ML.ITransformer.GetRowToRowMapper(DataViewSchema inputSchema)
at Microsoft.ML.PredictionEngineBase`2..ctor(IHostEnvironment env, ITransformer transformer, Boolean ignoreMissingColumns, SchemaDefinition inputSchemaDefinition, SchemaDefinition outputSchemaDefinition)
at Microsoft.ML.PredictionEngine`2..ctor(IHostEnvironment env, ITransformer transformer, Boolean ignoreMissingColumns, SchemaDefinition inputSchemaDefinition, SchemaDefinition outputSchemaDefinition)
at Microsoft.ML.PredictionEngineExtensions.CreatePredictionEngine[TSrc,TDst](ITransformer transformer, IHostEnvironment env, Boolean ignoreMissingColumns, SchemaDefinition inputSchemaDefinition, SchemaDefinition outputSchemaDefinition)
at Microsoft.ML.ModelOperationsCatalog.CreatePredictionEngine[TSrc,TDst](ITransformer transformer, Boolean ignoreMissingColumns, SchemaDefinition inputSchemaDefinition, SchemaDefinition outputSchemaDefinition)
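I believe you have already cleared the initial hurdle: you trained the model successfully, and now you need to assemble all the trained pieces into a prediction engine.
Note that you have "trained" two transformers: the preprocessing pipeline (the result of calling pipeline.Fit(traindata)) and the recommender itself (the result of calling est.Fit(mappedDataView)).
However, the prediction engine you create takes only the second transformer, so it will work only if it is fed the output of the first transformer. That is why it cannot find the UserIdEncoded column: the raw input has only UserId and ProductId.
A better approach is to combine the preprocessor and the recommender into a single estimator (apologies for any mistakes; F# is not my native language):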
let pipeline =
    EstimatorChain().Append(
        mlContext.Transforms.Conversion
            .MapValueToKey(inputColumnName="UserId", outputColumnName="UserIdEncoded"))
        .Append(
            mlContext.Transforms.Conversion
                .MapValueToKey(inputColumnName="ProductId", outputColumnName="ProductIdEncoded"))

let traindata =
    let columns =
        [|
            TextLoader.Column("Label", DataKind.Single, 0)
            TextLoader.Column("UserId", DataKind.UInt32, source = [|TextLoader.Range(0)|], keyCount = KeyCount 6248UL)
            TextLoader.Column("ProductId", DataKind.String, source = [|TextLoader.Range(1)|])
        |]
    mlContext.Data.LoadFromTextFile(trainDataPath, columns, hasHeader=true, separatorChar=',')

// No need to do it:
// let mappedDataView = pipeline.Fit(traindata).Transform(traindata)

let options =
    MatrixFactorizationTrainer.Options(
        MatrixColumnIndexColumnName = "UserIdEncoded",
        MatrixRowIndexColumnName = "ProductIdEncoded",
        LossFunction = MatrixFactorizationTrainer.LossFunctionType.SquareLossOneClass,
        LabelColumnName = "Label",
        Alpha = 0.01,
        Lambda = 0.025)

// Rather than this:
// let est = mlContext.Recommendation().Trainers.MatrixFactorization(options)
// Do this:
let est = pipeline.Append(mlContext.Recommendation().Trainers.MatrixFactorization(options))

// Now train the whole pipeline.
let model = est.Fit(traindata)

// The rest should now work.
let predictionengine = mlContext.Model.CreatePredictionEngine<ProductEntry, Prediction>(model)
let prediction = predictionengine.Predict {ProductId = "farfetch-13470673"; UserId = (uint32 13854); Label = 0.f}
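
To illustrate why the recommender alone could not accept raw rows, here is a minimal sketch in plain F# that makes no ML.NET calls. Everything in it is an assumption made for illustration: the names fitKeyMap, encode, recommenderOnly and fullModel are hypothetical, and the scoring is a made-up placeholder, not matrix factorization. It only shows that a stage trained on encoded columns needs the encoding stage in front of it.

```fsharp
// Toy illustration only: plain F#, no ML.NET. All names are hypothetical.

type RawRow = { UserId: uint32; ProductId: string }
type EncodedRow = { UserIdEncoded: int; ProductIdEncoded: int }

// "Fit" a toy value-to-key mapping: each distinct value gets a 1-based index.
let fitKeyMap values =
    values
    |> Seq.distinct
    |> Seq.mapi (fun i v -> v, i + 1)
    |> Map.ofSeq

let train =
    [ { UserId = 1u; ProductId = "test-product-id" }
      { UserId = 2u; ProductId = "other-product-id" } ]

let userKeys = fitKeyMap (train |> List.map (fun r -> r.UserId))
let productKeys = fitKeyMap (train |> List.map (fun r -> r.ProductId))

// Stage 1: the fitted preprocessing step (raw -> encoded).
// In this toy, values not seen during fitting map to 0.
let encode (row: RawRow) : EncodedRow =
    { UserIdEncoded = userKeys |> Map.tryFind row.UserId |> Option.defaultValue 0
      ProductIdEncoded = productKeys |> Map.tryFind row.ProductId |> Option.defaultValue 0 }

// Stage 2: a fake "recommender" that only understands encoded rows.
// The formula is a placeholder so that the result is easy to check.
let recommenderOnly (row: EncodedRow) : float32 =
    float32 (row.UserIdEncoded * 10 + row.ProductIdEncoded)

// The whole chain: the only function here that accepts raw rows.
let fullModel : RawRow -> float32 = encode >> recommenderOnly

let score = fullModel { UserId = 2u; ProductId = "test-product-id" }
printfn "%f" score
```

In this toy, passing a RawRow straight to recommenderOnly does not even type-check, which loosely mirrors the "column not found" error from the question. fullModel plays the role of the single fitted chain that gets handed to the prediction engine.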