The axis argument to unique is not supported for dtype object

Abh*_*yal 7 python numpy unique frequency python-3.x

I'm trying to get unique value counts column by column, but my array holds categorical variables (dtype object):

val, count = np.unique(x, axis=1, return_counts=True)

However, I get this error:

TypeError: The axis argument to unique is not supported for dtype object

How can I fix this?

Sample x:

array([[' Private', ' HS-grad', ' Divorced'],
       [' Private', ' 11th', ' Married-civ-spouse'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Married-civ-spouse'],
       [' Private', ' 9th', ' Married-spouse-absent'],
       [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Never-married'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

These are the counts I need:

for x_T in x.T:
    val, count = np.unique(x_T, return_counts=True)
    print(val, count)


[' Private' ' Self-emp-not-inc'] [8 1]
[' 11th' ' 9th' ' Bachelors' ' HS-grad' ' Masters' ' Some-college'] [1 1 2 2 2 1]
[' Divorced' ' Married-civ-spouse' ' Married-spouse-absent'
 ' Never-married'] [1 6 1 1]

小智 6

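You can use itemfreq. Its output looks different from yours, but it contains the required counts. Note that itemfreq counts values over the whole flattened array rather than per column (the totals match here only because no value appears in more than one column), and that it was deprecated in SciPy 1.0 and is no longer available in recent releases, so this runs only on older SciPy versions: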

import numpy as np
from scipy.stats import itemfreq

x = np.array([[' Private', ' HS-grad', ' Divorced'],
       [' Private', ' 11th', ' Married-civ-spouse'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Married-civ-spouse'],
       [' Private', ' 9th', ' Married-spouse-absent'],
       [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse'],
       [' Private', ' Masters', ' Never-married'],
       [' Private', ' Bachelors', ' Married-civ-spouse'],
       [' Private', ' Some-college', ' Married-civ-spouse']], dtype=object)

itemfreq(x)

Output:

array([[' 11th', 1],
       [' 9th', 1],
       [' Bachelors', 2],
       [' Divorced', 1],
       [' HS-grad', 2],
       [' Married-civ-spouse', 6],
       [' Married-spouse-absent', 1],
       [' Masters', 2],
       [' Never-married', 1],
       [' Private', 8],
       [' Self-emp-not-inc', 1],
       [' Some-college', 1]], dtype=object)
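Where itemfreq is unavailable, roughly the same table can be rebuilt with np.unique, which flattens the input when no axis is given and therefore accepts dtype object. This is a minimal sketch on a shortened stand-in array; the helper name item_counts is made up for illustration and is not part of NumPy or SciPy:

```python
import numpy as np


def item_counts(arr):
    """Return an (n_unique, 2) object array of [value, count] rows over all entries."""
    # Without an axis argument, np.unique works on the flattened array,
    # so it does not raise the TypeError for dtype=object.
    values, counts = np.unique(arr, return_counts=True)
    table = np.empty((len(values), 2), dtype=object)
    table[:, 0] = values
    table[:, 1] = counts
    return table


# Shortened stand-in for the array in the question (assumption: same layout)
x = np.array([[' Private', ' HS-grad'],
              [' Private', ' 11th'],
              [' Self-emp-not-inc', ' HS-grad']], dtype=object)

table = item_counts(x)
print(table)
```

Like itemfreq, this counts over the whole array, so a value that occurs in several columns would be merged into one row.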

Otherwise, you can try casting to another dtype, for example:

val, count = np.unique(x.astype("<U22"), axis=1, return_counts=True)

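For this to be useful, though, your array would have to be different: with axis=1, np.unique treats each column as a single item and returns the unique columns with their counts, not the unique values inside each column.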
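The difference between the two calls can be seen on a small example. The sketch below uses a shortened stand-in array (an assumption, not the asker's full data) and compares the cast-then-axis=1 call with the per-column loop from the question:

```python
import numpy as np

# Shortened stand-in for the array in the question
x = np.array([[' Private', ' HS-grad', ' Divorced'],
              [' Private', ' 11th', ' Married-civ-spouse'],
              [' Self-emp-not-inc', ' HS-grad', ' Married-civ-spouse']],
             dtype=object)

# Casting to a fixed-width string dtype lets axis=1 run,
# but np.unique then compares whole columns with each other.
cols, col_counts = np.unique(x.astype("<U22"), axis=1, return_counts=True)
print(cols.shape, col_counts)  # the three columns are all distinct

# Per-column value counts: call np.unique once per column instead.
per_column = [np.unique(col, return_counts=True) for col in x.T]
for values, counts in per_column:
    print(values, counts)
```

For value counts within each column, the per-column loop is the variant that matches the output requested in the question; the cast is not needed for it, since np.unique without an axis accepts dtype object.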