It looks like my code has the same form as the scikit-survival documentation.
data_y = df[['sensored', 'sensored_2']].to_numpy()
data_x = df.drop(['sensored', 'sensored_2'], axis = 1)
data_y
array([[True, 481],
[True, 424],
[True, 519],
...,
[True, 13],
[True, 96],
[True, 6]], dtype=object)
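According to the scikit-survival documentation, this array is created from the dataset when it is loaded. I am trying to create the array from a DataFrame instead, but when I try to fit the model with it I still get the error in the title.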
from sksurv.linear_model import CoxPHSurvivalAnalysis
estimator = CoxPHSurvivalAnalysis()
estimator.fit(df_dummy_3, data_y)
ValueError: y must be a structured array with the first field being a binary
class event indicator and the second field the time of the event/censoring
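For comparison, here is a minimal sketch of what a structured array of the kind the error message describes could look like, using NumPy only. The three rows are just the first values from the object array printed above, and the field names reuse my column names `sensored` and `sensored_2`; the variable names `events`, `times`, and `structured_y` are made up for illustration.

```python
import numpy as np

# Illustrative rows only, copied from the object array printed above.
events = np.array([True, True, True])
times = np.array([481, 424, 519])

# A structured array is 1-D with named, individually typed fields,
# unlike the 2-D dtype=object array that DataFrame.to_numpy() returns.
structured_y = np.empty(
    len(events),
    dtype=[("sensored", bool), ("sensored_2", float)],
)
structured_y["sensored"] = events      # first field: binary event indicator
structured_y["sensored_2"] = times     # second field: time of event/censoring

print(structured_y.dtype.names)
```

As far as I can tell, scikit-survival also ships a helper for this, `sksurv.util.Surv.from_dataframe(event, time, data)`, which should build the same kind of array directly from the DataFrame columns, but I have not confirmed that here.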
Documentation:
from sksurv.datasets import load_veterans_lung_cancer
data_x, data_y = load_veterans_lung_cancer()
data_y
array([( True, 72.), ( …