Posts by Mik*_*ike

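Pandas warning when using map: A value is trying to be set on a copy of a slice from a DataFrame

I have the following code, and it works. It basically renames values in a column so that they can be merged later.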

pop = pd.read_csv('population.csv')
pop_recent = pop[pop['Year'] == 2014]

mapping = {
        'Korea, Rep.': 'South Korea',
        'Taiwan, China': 'Taiwan'
}
f= lambda x: mapping.get(x, x)
pop_recent['Country Name'] = pop_recent['Country Name'].map(f)

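Warning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy pop_recent['Country Name'] = pop_recent['Country Name'].map(f)

I did Google it! But none of the examples seem to use map, so I'm at a loss...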
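One common way to avoid this warning is to take an explicit copy of the filtered rows before assigning to them. The sketch below is only an illustration under assumptions: the small DataFrame is a made-up stand-in for population.csv, which is not shown in the question.

```python
import pandas as pd

# Made-up stand-in for population.csv (the real file is not shown in the question).
pop = pd.DataFrame({
    "Country Name": ["Korea, Rep.", "Taiwan, China", "Japan", "Korea, Rep."],
    "Year": [2014, 2014, 2014, 2013],
})

mapping = {
    "Korea, Rep.": "South Korea",
    "Taiwan, China": "Taiwan",
}

# .copy() makes pop_recent an independent DataFrame, so the assignment
# below is unambiguous and does not touch pop.
pop_recent = pop[pop["Year"] == 2014].copy()

# Names missing from the mapping are left unchanged.
pop_recent["Country Name"] = pop_recent["Country Name"].map(
    lambda name: mapping.get(name, name)
)
print(pop_recent)
```

The original `pop` is left untouched, which is usually what is wanted when the filtered frame is only an intermediate step before a merge.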

python pandas

6
votes
1
answer
30k
views

Installing Biopython: ImportError: No module named Bio

Trying to install Biopython on Fedora 21, Python 2.7. I did the following:

[mike@localhost Downloads](17:32)$ sudo pip2.7 install biopython
You are using pip version 6.1.1, however version 7.1.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting biopython
/usr/lib/python2.7/site-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading biopython-1.65.tar.gz (12.6MB)
    100% |████████████████████████████████| 12.6MB 33kB/s 
Installing collected packages: biopython
  Running setup.py install …
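A frequent cause of this kind of ImportError is that pip installed the package for a different interpreter than the one running the script. Whether that applies here is an assumption, since the question does not show the failing import. A minimal check, with `can_import` as a hypothetical helper name, might look like this:

```python
import importlib
import sys


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Shows which interpreter is actually running, to compare against
# the one that pip2.7 installed into.
print(sys.executable)
print(can_import("Bio"))
```

Running this with the same `python` command that raised the error shows whether that interpreter can see the `Bio` package.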

python fedora python-install biopython

5
votes
1
answer
9349
views

With Pandas in Python, select only the rows where the group-by group count is 1

I have filtered my data as suggested here: With Pandas in Python, select the highest value row for each group

    author        cat  val
0  author1  category2   15
1  author2  category4    9
2  author3  category1    7
3  author3  category3    7  

Now I want to keep only the authors that appear exactly once in this data frame. I wrote this, but it doesn't work:

def where_just_one_exists(group):
        return group.loc[group.count() == 1]
most_expensive_single_category = most_expensive_for_each_model.groupby('author', as_index = False).apply(where_just_one_exists).reset_index(drop = True)
print most_expensive_single_category

Error:

  File "/home/mike/anaconda/lib/python2.7/site-packages/pandas/core/indexing.py", line 1659, in check_bool_indexer
    raise IndexingError('Unalignable boolean Series key provided')
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided

My desired output is:

    author        cat  val
0  author1  category2   15
1  author2  category4    9
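The attempt above likely fails because `group.count()` returns a Series indexed by column names, which cannot be aligned with the row index inside `.loc`. One possible alternative, sketched here with the data frame from the question rebuilt by hand, is to broadcast each group's size back onto its rows and filter on that:

```python
import pandas as pd

# The filtered data frame from the question, rebuilt by hand.
df = pd.DataFrame({
    "author": ["author1", "author2", "author3", "author3"],
    "cat": ["category2", "category4", "category1", "category3"],
    "val": [15, 9, 7, 7],
})

# transform("size") gives every row the number of rows in its author group.
group_sizes = df.groupby("author")["author"].transform("size")

# Keep only rows whose author occurs exactly once.
single = df[group_sizes == 1].reset_index(drop=True)
print(single)
```

`df.groupby("author").filter(lambda g: len(g) == 1)` should give the same rows; the `transform` version avoids calling a Python function per group.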

python dataframe pandas

3
votes
1
answer
2410
views

With Pandas in Python, select the highest value row for each group

Using Pandas, for the following dataset:

author1,category1,10.00
author1,category2,15.00
author1,category3,12.00
author2,category1,5.00
author2,category2,6.00
author2,category3,4.00
author2,category4,9.00
author3,category1,7.00
author3,category2,4.00
author3,category3,7.00

I want to get the row(s) with the highest value for each author:

author1,category2,15.00
author2,category4,9.00
author3,category1,7.00
author3,category3,7.00

(Sorry, I'm a Pandas newbie.)
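A minimal sketch of one way to do this, keeping ties as in the desired output: compare each value with the maximum of its author group. The column names `author`, `cat` and `val` are assumptions, since the dataset above has no header row.

```python
import io

import pandas as pd

raw = """author1,category1,10.00
author1,category2,15.00
author1,category3,12.00
author2,category1,5.00
author2,category2,6.00
author2,category3,4.00
author2,category4,9.00
author3,category1,7.00
author3,category2,4.00
author3,category3,7.00
"""

# Column names are assumed; the raw data has no header line.
df = pd.read_csv(io.StringIO(raw), header=None, names=["author", "cat", "val"])

# transform("max") gives every row the maximum val of its author group.
is_group_max = df["val"] == df.groupby("author")["val"].transform("max")

# Rows that tie for the maximum are all kept.
top = df[is_group_max].reset_index(drop=True)
print(top)
```

Using `idxmax` instead would return a single row per author, which would drop one of the two tied author3 rows.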

python pandas

1
vote
1
answer
2529
views

Tag statistics

python ×4

pandas ×3

biopython ×1

dataframe ×1

fedora ×1

python-install ×1