Posts by I.M*_*.M.

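How to get the unique values of a DataFrame column when it contains a list - Python

I have the following DataFrame, and I want to print the unique values of the colors column.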

df = pd.DataFrame({'colors': ['green', 'green', 'purple', ['yellow , red'], 'orange'], 'names': ['Terry', 'Nor', 'Franck', 'Pete', 'Agnes']})

Output:
           colors   names
0           green   Terry
1           green     Nor
2          purple  Franck
3  [yellow , red]    Pete
4          orange   Agnes


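df.colors.unique() would work fine if it weren't for the [yellow , red] row. Because of that row, I keep getting TypeError: unhashable type: 'list', which is understandable.

Is there a way to still get the unique values while ignoring this row?

I tried the following, but none of it worked: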

df = df[~df.colors.str.contains(',', na=False)] # Nothing happens
df = df[~df.colors.str.contains('[', na=False)] # Output: error: unterminated character set at position 0
df = df[~df.colors.str.contains(']', na=False)] # Nothing happens
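
One possible approach, sketched here as an illustration rather than as the accepted answer: the .str accessor returns NaN for cells that are not strings, so with na=False the list row is never flagged and nothing gets filtered out. The '[' attempt fails separately because the pattern is treated as a regular expression. Testing the Python type of each cell avoids both problems. The names is_list and unique_colors are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'colors': ['green', 'green', 'purple', ['yellow , red'], 'orange'],
    'names': ['Terry', 'Nor', 'Franck', 'Pete', 'Agnes'],
})

# Flag the cells that hold a list instead of a plain string.
is_list = df['colors'].apply(lambda value: isinstance(value, list))

# Drop the flagged rows, then take the unique values of what is left.
unique_colors = df.loc[~is_list, 'colors'].unique()
print(unique_colors)  # ['green' 'purple' 'orange']
```

If the colors inside the list should be counted as well, df['colors'].explode().unique() may be an alternative, although here it would return the single string 'yellow , red' because the list holds one element.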


python unique pandas

I.M*_*.M.

2019-10-17

5
Votes

1
Answer

66
Views

How to select rows that contain a specific substring at a given position - python

I am working with a large DataFrame that looks like this:

     id      time1      time2   data    
0   id1   06:24:00   06:24:00      A
1   id2   07:24:00   07:24:00      A
2   id3   08:24:00   08:24:00      B


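I want to select all rows where time1 and/or time2 is in the 23:xx:yy format.

I tried the following code, but it is very slow, so I am looking for a more efficient approach: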

list_ = list()

for idx in df.index:
    if ('23' in df.time1[:2]) | ('23' in df.time2[:2]):
        list_.append(df.loc[df.index == idx])  ###--- Here I wanted to get a list of indexes so I could do a simple df.loc[] afterward

I also tried the following, but every attempt raised an error:

df.loc[df.time1[:2] == '23']
df.loc['23' in df.time1[:2]]
df[df.time1[:2].str.contains('23')]

> IndexingError: Unalignable boolean Series provided as indexer …
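
A minimal vectorized sketch, assuming time1 and time2 are stored as strings (if they are datetime or timedelta values, a different accessor would be needed). Note that df.time1[:2] slices the first two rows of the Series, not the first two characters of each value, which is why the boolean indexer cannot be aligned; per-value string operations go through the .str accessor. The sample rows below are invented for illustration, and mask and selected are hypothetical names.

```python
import pandas as pd

# Hypothetical sample data: two rows were given a 23:xx:yy time on purpose.
df = pd.DataFrame({
    'id': ['id1', 'id2', 'id3', 'id4'],
    'time1': ['06:24:00', '23:24:00', '08:24:00', '09:00:00'],
    'time2': ['06:24:00', '07:24:00', '08:24:00', '23:59:59'],
    'data': ['A', 'A', 'B', 'B'],
})

# Build one boolean mask for the whole column instead of looping over rows.
mask = df['time1'].str.startswith('23:') | df['time2'].str.startswith('23:')

# Rows where either time column starts with hour 23.
selected = df.loc[mask]
print(selected)
```

An equivalent mask can be written with character slicing, df['time1'].str[:2] == '23', which is the closest working form of the original attempts.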


python substring python-3.x pandas

I.M*_*.M.

lucky-day

1
Vote

1
Answer

63
Views