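I have 3 columns: Models (which should be the index), accuracy without normalization, and accuracy with normalization (zscore, minmax, maxabs, robust). They need to be laid out like this: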
-------------------------------------------------------------------------------
| Models | Accuracy without normalization |    Accuracy with normalization    |
|        |                                |-----------------------------------|
|        |                                | zscore | minmax | maxabs | robust |
-------------------------------------------------------------------------------
dfmod-> Models column
dfacc-> Accuracy without normalization
dfacc1-> Accuracy with normalization - zscore
dfacc2-> Accuracy with normalization - minmax
dfacc3-> Accuracy with normalization - maxabs
dfacc4-> Accuracy with normalization - robust
dfout=pd.DataFrame({('Accuracy without Normalization'):{dfacc},
('Accuracy using Normalization','zscore'):{dfacc1},
('Accuracy using Normalization','minmax'):{dfacc2},
('Accuracy using Normalization','maxabs'):{dfacc3},
('Accuracy using Normalization','robust'):{dfacc4},
},index=dfmod
)
I tried something like this, but I could not work out how to get any further.
Test data:
qda 0.6333 0.6917 0.5917 0.6417 0.5833
svm 0.5333 0.6917 0.5333 0.575 0.575
lda 0.5333 0.6583 0.5333 0.5667 0.5667
lr 0.5333 0.65 0.4917 0.5667 0.5667
dt 0.5333 0.65 0.4917 0.5667 0.5667
rc 0.5083 0.6333 0.4917 0.525 0.525
nb 0.5 0.625 0.475 0.5 0.4833
rfc 0.5 0.625 0.4417 0.4917 0.4583
knn 0.3917 0.6 0.4417 0.4833 0.45
et 0.375 0.5333 0.4333 0.4667 0.45
dc 0.375 0.5333 0.4333 0.4667 0.425
qds 0.3417 0.5333 0.4 0.4583 0.3667
lgt 0.3417 0.525 0.3917 0.45 0.3583
lt 0.2333 0.45 0.3917 0.4167 0.3417
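These are the values for each sub-column, in the order given in the table above.
There is a quick-and-dirty way to do this, which I will write down until someone answers with a better idea. Here goes: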
import pandas as pd

# Assumes the raw data can be read from a file named test.csv. The test data
# has no header row, so pass header=None. If the file is whitespace-separated,
# as it appears in the question, also pass sep=r"\s+".
df = pd.read_csv("test.csv", header=None)

# Define the two-level column labels.
MyColumns = pd.MultiIndex.from_tuples([
    ("Models", ""),
    ("Accuracy without normalization", ""),
    ("Accuracy with normalization", "zscore"),
    ("Accuracy with normalization", "minmax"),
    ("Accuracy with normalization", "maxabs"),
    ("Accuracy with normalization", "robust"),
])

# Copy the data and relabel its columns; the six columns of df map onto the
# six labels in order, so the values and dtypes are carried over unchanged.
New_DataFrame = df.copy()
New_DataFrame.columns = MyColumns
This gives a DataFrame whose columns carry the two-level header described in the question.
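Since test.csv itself is not included, here is a minimal sketch that reproduces this step on three rows copied from the question's test data, read from an in-memory string. The whitespace separator and the names `raw`, `columns`, and `result` are assumptions made for illustration.

```python
import io

import pandas as pd

# Three rows copied from the question's test data
# (assumed to be whitespace-separated, with no header row).
raw = """qda 0.6333 0.6917 0.5917 0.6417 0.5833
svm 0.5333 0.6917 0.5333 0.575 0.575
lda 0.5333 0.6583 0.5333 0.5667 0.5667
"""

df = pd.read_csv(io.StringIO(raw), header=None, sep=r"\s+")

# Two-level labels: the first two columns have an empty second level.
columns = pd.MultiIndex.from_tuples([
    ("Models", ""),
    ("Accuracy without normalization", ""),
    ("Accuracy with normalization", "zscore"),
    ("Accuracy with normalization", "minmax"),
    ("Accuracy with normalization", "maxabs"),
    ("Accuracy with normalization", "robust"),
])

# Relabel a copy so the original frame is left untouched.
result = df.copy()
result.columns = columns
print(result)
```

Selecting `result["Accuracy with normalization"]` then returns just the four normalized sub-columns.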
Finally, if you want to set Models as the index of New_DataFrame, you can continue with:
New_DataFrame.set_index(New_DataFrame.columns[0][0] , inplace=True)
New_DataFrame
This gives the same table with Models as the row index.
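For comparison, a sketch that stays closer to the attempt in the question: pandas builds MultiIndex columns when every key of the dict passed to `pd.DataFrame` is a tuple, so the single-level column needs a two-element key with an empty second level, and the values should be list-like rather than wrapped in `{}`. The plain lists below stand in for `dfmod`, `dfacc`, and `dfacc1` to `dfacc4`, whose actual types are not shown in the question, so treat them as assumptions.

```python
import pandas as pd

# Stand-ins for dfmod, dfacc, dfacc1..dfacc4 (first three rows of the test data).
models = ["qda", "svm", "lda"]
acc = [0.6333, 0.5333, 0.5333]
acc_zscore = [0.6917, 0.6917, 0.6583]
acc_minmax = [0.5917, 0.5333, 0.5333]
acc_maxabs = [0.6417, 0.575, 0.5667]
acc_robust = [0.5833, 0.575, 0.5667]

top = "Accuracy with normalization"

# Every key is a 2-tuple, so the resulting columns form a two-level MultiIndex.
dfout = pd.DataFrame(
    {
        ("Accuracy without normalization", ""): acc,
        (top, "zscore"): acc_zscore,
        (top, "minmax"): acc_minmax,
        (top, "maxabs"): acc_maxabs,
        (top, "robust"): acc_robust,
    },
    index=pd.Index(models, name="Models"),
)
print(dfout)
```

If the inputs are Series or single-column DataFrames rather than lists, they would likely need converting first (for example with `.to_numpy().ravel()`) so the values are not aligned against a mismatched index.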