use*_*777 10 python featuretools
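The featuretools documentation states in its first sentence:
"Featuretools is a framework to perform automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning."
This seems to imply that the dataset must have a datetime column. I just want to confirm that this is in fact the case. That is, could I not use it on the "iris" dataset, for example, to generate new features? If the dataset does not need a temporal variable, how would I go about using it to generate features on the "iris" dataset? I'd appreciate some direction. Thanks.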
wil*_*llk 18
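Featuretools works with relational datasets with or without datetimes, and to answer your question, Featuretools can make features for a single table with no datetimes. For the iris dataset there is only one table, and it has no immediate features on which to normalize (that is, to create new tables out of the existing table), so you will be using transform primitives to create new features.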
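You first create an EntitySet, add the table to it as an entity, and then run deep feature synthesis with transform primitives. Here is a complete working example: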
from sklearn.datasets import load_iris
import pandas as pd
import featuretools as ft
# Load data and put into dataframe
iris = load_iris()
df = pd.DataFrame(iris.data, columns = iris.feature_names)
df['species'] = iris.target
df['species'] = df['species'].map({0: 'setosa', 1: 'versicolor', 2: 'virginica'})
# Make an entityset and add the entity
# (Featuretools < 1.0 API; in 1.0+ this method is es.add_dataframe,
# with dataframe_name in place of entity_id)
es = ft.EntitySet(id = 'iris')
es.entity_from_dataframe(entity_id = 'data', dataframe = df,
                         make_index = True, index = 'index')
# Run deep feature synthesis with transformation primitives
# (in Featuretools 1.0+, target_entity is named target_dataframe_name)
feature_matrix, feature_defs = ft.dfs(entityset = es, target_entity = 'data',
                                      trans_primitives = ['add_numeric', 'multiply_numeric'])
feature_matrix.head()
The first five rows of the feature matrix, shown transposed (one feature per line, one column per index value 0 to 4) because the matrix is 56 columns wide. Values for index 4 beyond the first 11 features are cut off in the original output and are shown as "...":
0       1       2       3       4       feature
5.1     4.9     4.7     4.6     5.0     sepal length (cm)
3.5     3.0     3.2     3.1     3.6     sepal width (cm)
1.4     1.4     1.3     1.5     1.4     petal length (cm)
0.2     0.2     0.2     0.2     0.2     petal width (cm)
setosa  setosa  setosa  setosa  setosa  species
3.7     3.2     3.4     3.3     3.8     petal width (cm) + sepal width (cm)
1.6     1.6     1.5     1.7     1.6     petal length (cm) + petal width (cm)
6.5     6.3     6.0     6.1     6.4     petal length (cm) + sepal length (cm)
4.9     4.4     4.5     4.6     5.0     petal length (cm) + sepal width (cm)
8.6     7.9     7.9     7.7     8.6     sepal length (cm) + sepal width (cm)
5.3     5.1     4.9     4.8     5.2     petal width (cm) + sepal length (cm)
4.90    4.20    4.16    4.65    ...     petal length (cm) * sepal width (cm)
17.85   14.70   15.04   14.26   ...     sepal length (cm) * sepal width (cm)
1.02    0.98    0.94    0.92    ...     petal width (cm) * sepal length (cm)
0.70    0.60    0.64    0.62    ...     petal width (cm) * sepal width (cm)
7.14    6.86    6.11    6.90    ...     petal length (cm) * sepal length (cm)
0.28    0.28    0.26    0.30    ...     petal length (cm) * petal width (cm)
31.82   25.28   26.86   25.41   ...     petal width (cm) + sepal width (cm) * sepal length (cm) + sepal width (cm)
18.87   15.68   15.98   15.18   ...     petal width (cm) + sepal width (cm) * sepal length (cm)
0.32    0.32    0.30    0.34    ...     petal length (cm) + petal width (cm) * petal width (cm)
27.03   24.99   23.03   22.08   ...     petal width (cm) + sepal length (cm) * sepal length (cm)
7.42    7.14    6.37    7.20    ...     petal length (cm) * petal width (cm) + sepal length (cm)
1.72    1.58    1.58    1.54    ...     petal width (cm) * sepal length (cm) + sepal width (cm)
22.75   18.90   19.20   18.91   ...     petal length (cm) + sepal length (cm) * sepal width (cm)
8.16    7.84    7.05    7.82    ...     petal length (cm) + petal width (cm) * sepal length (cm)
24.05   20.16   20.40   20.13   ...     petal length (cm) + sepal length (cm) * petal width (cm) + sepal width (cm)
55.90   49.77   47.40   46.97   ...     petal length (cm) + sepal length (cm) * sepal length (cm) + sepal width (cm)
12.04   11.06   10.27   11.55   ...     petal length (cm) * sepal length (cm) + sepal width (cm)
42.14   34.76   35.55   35.42   ...     petal length (cm) + sepal width (cm) * sepal length (cm) + sepal width (cm)
24.99   21.56   21.15   21.16   ...     petal length (cm) + sepal width (cm) * sepal length (cm)
10.40   10.08   9.00    10.37   ...     petal length (cm) + petal width (cm) * petal length (cm) + sepal length (cm)
1.30    1.26    1.20    1.22    ...     petal length (cm) + sepal length (cm) * petal width (cm)
17.15   13.20   14.40   14.26   ...     petal length (cm) + sepal width (cm) * sepal width (cm)
7.84    7.04    6.75    7.82    ...     petal length (cm) + petal width (cm) * petal length (cm) + sepal width (cm)
30.10   23.70   25.28   23.87   ...     sepal length (cm) + sepal width (cm) * sepal width (cm)
34.45   32.13   29.40   29.28   ...     petal length (cm) + sepal length (cm) * petal width (cm) + sepal length (cm)
45.58   40.29   38.71   36.96   ...     petal width (cm) + sepal length (cm) * sepal length (cm) + sepal width (cm)
5.92    5.12    5.10    5.61    ...     petal length (cm) + petal width (cm) * petal width (cm) + sepal width (cm)
25.97   22.44   22.05   22.08   ...     petal length (cm) + sepal width (cm) * petal width (cm) + sepal length (cm)
9.10    8.82    7.80    9.15    ...     petal length (cm) * petal length (cm) + sepal length (cm)
0.74    0.64    0.68    0.66    ...     petal width (cm) * petal width (cm) + sepal width (cm)
13.76   12.64   11.85   13.09   ...     petal length (cm) + petal width (cm) * sepal length (cm) + sepal width (cm)
5.18    4.48    4.42    4.95    ...     petal length (cm) * petal width (cm) + sepal width (cm)
0.98    0.88    0.90    0.92    ...     petal length (cm) + sepal width (cm) * petal width (cm)
2.24    2.24    1.95    2.55    ...     petal length (cm) * petal length (cm) + petal width (cm)
5.60    4.80    4.80    5.27    ...     petal length (cm) + petal width (cm) * sepal width (cm)
33.15   30.87   28.20   28.06   ...     petal length (cm) + sepal length (cm) * sepal length (cm)
18.55   15.30   15.68   14.88   ...     petal width (cm) + sepal length (cm) * sepal width (cm)
6.86    6.16    5.85    6.90    ...     petal length (cm) * petal length (cm) + sepal width (cm)
12.95   9.60    10.88   10.23   ...     petal width (cm) + sepal width (cm) * sepal width (cm)
31.85   27.72   27.00   28.06   ...     petal length (cm) + sepal length (cm) * petal length (cm) + sepal width (cm)
43.86   38.71   37.13   35.42   ...     sepal length (cm) * sepal length (cm) + sepal width (cm)
1.06    1.02    0.98    0.96    ...     petal width (cm) * petal width (cm) + sepal length (cm)
18.13   14.08   15.30   15.18   ...     petal length (cm) + sepal width (cm) * petal width (cm) + sepal width (cm)
8.48    8.16    7.35    8.16    ...     petal length (cm) + petal width (cm) * petal width (cm) + sepal length (cm)
19.61   16.32   16.66   15.84   ...     petal width (cm) + sepal length (cm) * petal width (cm) + sepal width (cm)
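To make the generated column names concrete, here is a minimal pure-Python sketch of what the two transform primitives compute for a single row. This is not Featuretools code: the helper name pairwise_features and the dictionary layout are made up for illustration. It takes the numeric values of row 0 and reproduces the one-step sums and products.

```python
from itertools import combinations


def pairwise_features(row):
    """Return the sum and product of every pair of numeric columns.

    Hypothetical helper, not part of Featuretools. Pairs are taken in
    alphabetical order of column name, which matches the feature names
    that appear in the matrix.
    """
    features = {}
    for a, b in combinations(sorted(row), 2):
        features[f"{a} + {b}"] = row[a] + row[b]
        features[f"{a} * {b}"] = row[a] * row[b]
    return features


# Numeric columns of row 0 of the iris data.
row0 = {
    "sepal length (cm)": 5.1,
    "sepal width (cm)": 3.5,
    "petal length (cm)": 1.4,
    "petal width (cm)": 0.2,
}

features = pairwise_features(row0)
for name, value in features.items():
    print(f"{name} = {value:.2f}")
```

The remaining columns in the matrix are stacked features: multiply_numeric applied to an add_numeric result. For example, "petal width (cm) + sepal width (cm) * sepal length (cm)" is (0.2 + 3.5) * 5.1 = 18.87 for row 0, even though the name is printed without parentheses.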
-
You can list all primitives with ft.list_primitives(). (2 upvotes)