How to change a string matrix into an integer matrix

4da*_*ong 6 python numpy python-3.x pandas

I have a voting dataset like this:

republican,n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y
republican,n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,?
democrat,?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n
democrat,n,y,y,n,?,y,n,n,n,n,y,n,y,n,n,y

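But the values are all strings, so I want to convert them into an integer matrix and compute some statistics:

import pandas as pd

hou_dat = pd.read_csv("house.data", header=None)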

for i in range (0, hou_dat.shape[0]):
    for j in range (0, hou_dat.shape[1]):
        if hou_dat[i, j] == "republican":
            hou_dat[i, j] = 2
        if hou_dat[i, j] == "democrat":
            hou_dat[i, j] = 3
        if hou_dat[i, j] == "y":
            hou_dat[i, j] = 1
        if hou_dat[i, j] == "n":
            hou_dat[i, j] = 0
        if hou_dat[i, j] == "?":
            hou_dat[i, j] = -1

hou_sta = hou_dat.apply(pd.value_counts)
print(hou_sta)

However, it raises an error. How can I fix it?

Exception has occurred: KeyError
(0, 0)

Dat*_*ice 4

IIUC, you need map and stack:

map_dict = {'republican' : 2,
           'democrat' : 3,
           'y' : 1,
           'n' : 0,
           '?' : -1}

# df is the DataFrame loaded from house.data (hou_dat in the question)
df1 = df.stack().map(map_dict).unstack()

print(df1)

   0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
0   2   0   1   0   1   1   1   0   0   0   1  -1   1   1   1   0   1
1   2   0   1   0   1   1   1   0   0   0   0   0   1   1   1   0  -1
2   3  -1   1   1  -1   1   1   0   0   0   0   1   0   1   1   0   0
3   3   0   1   1   0  -1   1   0   0   0   0   1   0   1   0   0   1
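
For completeness, here is a minimal end-to-end sketch that ties the answer back to the question's code. It assumes the four sample rows from the question stand in for house.data (read through io.StringIO rather than from disk), and the names hou_int and first_cell are made up for illustration. The KeyError most likely comes from hou_dat[i, j]: on a DataFrame, square brackets look up a column labelled (i, j), not the cell at row i and column j; positional access goes through .iloc. The counting step uses Series.value_counts per column, since the top-level pd.value_counts is deprecated in recent pandas releases.

```python
import io

import pandas as pd

# Sample rows copied from the question; io.StringIO stands in for "house.data".
raw = """republican,n,y,n,y,y,y,n,n,n,y,?,y,y,y,n,y
republican,n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,?
democrat,?,y,y,?,y,y,n,n,n,n,y,n,y,y,n,n
democrat,n,y,y,n,?,y,n,n,n,n,y,n,y,n,n,y
"""
hou_dat = pd.read_csv(io.StringIO(raw), header=None)

map_dict = {"republican": 2, "democrat": 3, "y": 1, "n": 0, "?": -1}

# hou_dat[0, 0] would look for a column labelled (0, 0) and raise KeyError.
# Positional cell access uses .iloc instead.
first_cell = hou_dat.iloc[0, 0]

# Replace every string at once: flatten to a Series, map, reshape back.
hou_int = hou_dat.stack().map(map_dict).unstack().astype(int)

# Count how often each code appears in each column; codes that never
# occur in a column show up as NaN, so fill those with 0.
hou_sta = hou_int.apply(lambda col: col.value_counts()).fillna(0).astype(int)

print(hou_int)
print(hou_sta)
```

In hou_sta the row labels are the integer codes (-1, 0, 1, 2, 3) and the columns are the original column positions, so hou_sta.loc[1, 2] is the number of "y" votes in column 2.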