Does LabelEncoder() not store its parameters?

Question

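LabelEncoder does not seem to "remember" its parameters. When I fit and transform data with it and then ask for the parameters, all I get is {}. This makes it impossible to reuse the encoder on new data.

Example: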

from sklearn.preprocessing import LabelEncoder

encode = LabelEncoder()
encode.fit_transform(['one', 'two', 'three'])
print(encode.get_params())

I'm not sure what format to expect, but I was hoping for something like {['one', 0], ['two', 1], ['three', 2]}

Actual result: {}

My environment:

Darwin-16.7.0-x86_64-i386-64bit
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.12.1
SciPy 0.19.0
Scikit-Learn 0.18.1

Answer 1

Mil*_*o G 5

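LabelEncoder stores what it learned during fitting in the classes_ attribute. You can get the encoded values by transforming those classes, and then build a dictionary from the two. The fitted encoder works on new data that contains the same labels; otherwise it raises a ValueError. Just call the transform method on the labels you want to encode.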

from sklearn import preprocessing

encode = preprocessing.LabelEncoder()
encode.fit_transform(['one', 'two', 'three'])
keys = encode.classes_
values = encode.transform(encode.classes_)
dictionary = dict(zip(keys, values))
print(dictionary)

Output: {'three': 1, 'two': 2, 'one': 0}
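
As a minimal sketch of the reuse described in this answer, the snippet below fits the encoder once and then applies it to new data. The label lists and variable names are made up for illustration, and it assumes a scikit-learn version where transform raises ValueError for labels that were not seen during fit.

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoder.fit(['one', 'two', 'three'])  # classes_ is stored in sorted order

# Reuse the fitted encoder on new data that contains only known labels.
new_codes = encoder.transform(['two', 'one', 'two'])
print(list(new_codes))

# Map the integer codes back to the original labels.
decoded = encoder.inverse_transform(new_codes)
print(list(decoded))

# A label that was not present during fit is rejected.
try:
    encoder.transform(['four'])
    unseen_rejected = False
except ValueError as err:
    unseen_rejected = True
    print('ValueError:', err)
```

Because the mapping lives in classes_, the same fitted object (or a pickled copy of it) can be applied to later batches as long as they only contain labels seen during fit.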

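The `get_params` method is inherited from `BaseEstimator` and returns only constructor parameters; `LabelEncoder` has none, so it essentially does nothing there. There is an open issue on the project's GitHub page (https://github.com/scikit-learn/scikit-learn/issues/9662) (2 upvotes)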
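
To make the distinction in this comment concrete, here is a small sketch contrasting constructor parameters with fitted state. StandardScaler is used only as a comparison estimator that does have constructor parameters; it is not part of the original question.

```python
from sklearn.preprocessing import LabelEncoder, StandardScaler

# get_params() reports constructor arguments, not anything learned from data.
label_params = LabelEncoder().get_params()      # LabelEncoder takes no arguments
scaler_params = StandardScaler().get_params()   # includes with_mean, with_std, copy

print(label_params)
print(sorted(scaler_params))

# Learned state lives in attributes with a trailing underscore, set by fit().
fitted = LabelEncoder().fit(['one', 'two', 'three'])
learned_classes = list(fitted.classes_)
print(learned_classes)
```

By scikit-learn convention, anything learned from data is exposed through trailing-underscore attributes such as classes_, which is why get_params() stays empty even after fitting.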
