I am wondering whether it is possible to call idxmin and min at the same time (in the same call/loop).
Suppose the following dataframe:
id option_1 option_2 option_3 option_4
0 0 10.0 NaN NaN 110.0
1 1 NaN 20.0 200.0 NaN
2 2 NaN 300.0 30.0 NaN
3 3 400.0 NaN NaN 40.0
4 4 600.0 700.0 50.0 50.0
I want to compute the minimum value (min) across the option_ columns and the column that contains that minimum (idxmin):
id option_1 option_2 option_3 option_4 min_column min_value
0 0 10.0 NaN NaN 110.0 option_1 10.0
1 1 NaN 20.0 200.0 NaN option_2 20.0
2 2 NaN 300.0 30.0 NaN option_3 30.0
3 …
I have a dataframe containing datetimes and integers:
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['dt'] = pd.date_range("2017-01-01 12:00", "2017-01-01 12:30", freq="1min")
df['val'] = np.random.choice(range(1, 100), df.shape[0])
This gives me:
dt val
0 2017-01-01 12:00:00 33
1 2017-01-01 12:01:00 42
2 2017-01-01 12:02:00 44
3 2017-01-01 12:03:00 6
4 2017-01-01 12:04:00 70
5 2017-01-01 12:05:00 94*
6 2017-01-01 12:06:00 42*
7 2017-01-01 12:07:00 97*
8 2017-01-01 12:08:00 12
9 2017-01-01 12:09:00 11
10 2017-01-01 12:10:00 66
11 2017-01-01 12:11:00 71
12 2017-01-01 12:12:00 25
13 …
I want to select the rows labelled 'Mid' without losing their 'Site' index level.
The following code shows the dataframe:
m.commodity
price max maxperstep
Site Commodity Type
Mid Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN NaN
Gas Stock 27.0 inf inf
Hydro SupIm NaN NaN NaN
Lignite Stock 4.0 inf inf
Slack Stock 999.0 inf inf
Solar SupIm NaN NaN NaN
Wind SupIm NaN NaN NaN
North Biomass Stock 6.0 inf inf
CO2 Env 0.0 inf inf
Coal Stock 7.0 inf inf
Elec Demand NaN NaN …
I created a DataFrame by doing the following with a .fits file:
data_dict = dict()
for obj in sortedpab:
    for key in ['FIELD', 'ID', 'RA', 'DEC', 'Z_50', 'Z_84', 'Z_16', 'PAB_FLUX', 'PAB_FLUX_ERR']:
        data_dict.setdefault(key, list()).append(obj[key])
gooddf = pd.DataFrame(data_dict)
gooddf['Z_ERR'] = ((gooddf['Z_84'] - gooddf['Z_50']) + (gooddf['Z_50'] - gooddf['Z_16'])) / (2 *
    gooddf['Z_50'])
gooddf['OBS_PAB'] = 12820 * (1 + gooddf['Z_50'])
gooddf.loc[gooddf['FIELD'] == "ERS", 'FIELD'] = "ERSPRIME"
gooddf = gooddf[['FIELD', 'ID', 'RA', 'DEC', 'Z_50', 'Z_ERR', 'PAB_FLUX', 'PAB_FLUX_ERR',
    'OBS_PAB']]
gooddf = gooddf[gooddf.OBS_PAB <= 16500]
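This gives me one that has …
I am a bit confused by pandas' iloc indexer: I want to select a range of columns, but the output differs from what I expected. The same thing happens with row selection, so I wrote a small example: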
template = pd.DataFrame(
{'Headline': ['Subheading', '', 'Animal', 'Tiger', 'Bird', 'Lion'],
'Headline2': ['', 'Weight', 2017, 'group1', 'group2', 'group3'],
'Headline3': ['', '', 2018, 'group1', 'group2', 'group3']
})
Headline Headline2 Headline3
0 Subheading
1 Weight
2 Animal 2017 2018
3 Tiger group1 group1
4 Bird group2 group2
5 Lion group3 group3
I want to select rows 1 through 2; print(template.loc[1:2]) gives the result I expect:
Headline Headline2 Headline3
1 Weight
2 Animal 2017 2018
If I do print(template.iloc[1:2]), I would expect the same result, but that is not what I get:
Headline Headline2 Headline3
1 Weight
I am a bit confused because I expected both indexers to behave the same way, but when I select a range (FROM:TO) their outputs differ.
It seems that using iloc …
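For reference, the difference described above follows from how the two indexers treat slice endpoints: loc slices by label and includes the end label, while iloc slices by integer position and excludes the end position, like ordinary Python slicing. A minimal sketch using the same template frame (the variable names by_label, by_position, same_as_loc and the two column selections are illustrative, not from the question):

```python
import pandas as pd

template = pd.DataFrame({
    'Headline': ['Subheading', '', 'Animal', 'Tiger', 'Bird', 'Lion'],
    'Headline2': ['', 'Weight', 2017, 'group1', 'group2', 'group3'],
    'Headline3': ['', '', 2018, 'group1', 'group2', 'group3'],
})

# loc: label-based slice, the end label 2 is included
by_label = template.loc[1:2]

# iloc: position-based slice, the end position 2 is excluded
by_position = template.iloc[1:2]

# To get the same rows with iloc, extend the end position by one
same_as_loc = template.iloc[1:3]

# The same rule applies to columns
cols_by_position = template.iloc[:, 0:2]                   # positions 0 and 1
cols_by_label = template.loc[:, 'Headline':'Headline2']    # both labels included

print(list(by_label.index))        # [1, 2]
print(list(by_position.index))     # [1]
print(same_as_loc.equals(by_label))  # True
```

With the default integer index the labels and positions happen to coincide, which is why the two calls look as if they should be interchangeable.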
I am using Python 3.6 and pandas 0.24.1.
I have a pandas dataframe df1:
col1 col2
0 8388611 3.9386
1 8388612 1.9386
I need to find the value of col1 at a specific index:
print(df1['col1'][1])
Error:
Traceback (most recent call last):
File "/home/runner/.site-packages/pandas/core/indexes/base.py", line 2656, in get_loc
return self._engine.get_loc(key)
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 24, in …
To get a single cell from a pandas DataFrame, which method is better (in terms of performance and reliability): get_value() or loc[]?
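As a point of reference for this question: get_value() was deprecated in pandas 0.21 and removed in pandas 1.0, and the pandas documentation points to the scalar accessors at (label-based) and iat (position-based) for single-cell access. A minimal sketch, assuming a small made-up frame (the data and the variable names value_loc, value_at, value_iat are illustrative only):

```python
import pandas as pd

# Illustrative data only
df = pd.DataFrame({'col1': [8388611, 8388612], 'col2': [3.9386, 1.9386]})

# loc: general label-based indexer, also works for a single cell
value_loc = df.loc[1, 'col1']

# at: label-based accessor restricted to a single scalar
value_at = df.at[1, 'col1']

# iat: position-based accessor restricted to a single scalar
value_iat = df.iat[1, 0]

print(value_loc, value_at, value_iat)
```

All three return the same cell here. Because at and iat only handle scalars, they typically carry less overhead than loc for this use; timing them on your own data is the reliable way to compare.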