I can't seem to figure out how to pass the handles and labels of matplotlib.patches.Patch objects to a legend.
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
a_val = 0.6
colors = ['#EA5739','#FEFFBE','#4BB05C']
circ1 = mpatches.Patch( facecolor=colors[0],alpha=a_val,hatch=['\\\\'],label='Label1')
circ2= mpatches.Patch( facecolor=colors[1],alpha=a_val,hatch='o',label='Label2')
circ3 = mpatches.Patch(facecolor=colors[2],alpha=a_val,hatch='+',label='Label3')
fig,(ax) = plt.subplots()
ax.legend(handles = [circ1,circ2,circ3],loc=2)
plt.tight_layout()
Why is the legend blank in the example above?
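For comparison, here is a minimal sketch of the same legend with every hatch passed as a plain string. Treating the list-valued hatch on circ1 as the cause of the blank legend is an assumption, not something the question confirms. The Agg backend and the names hatches, labels, handles and legend_labels are additions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

a_val = 0.6
colors = ['#EA5739', '#FEFFBE', '#4BB05C']
hatches = ['\\\\', 'o', '+']          # each hatch is a str, not a list
labels = ['Label1', 'Label2', 'Label3']

# Build one proxy patch per legend entry.
handles = [
    mpatches.Patch(facecolor=color, alpha=a_val, hatch=hatch, label=label)
    for color, hatch, label in zip(colors, hatches, labels)
]

fig, ax = plt.subplots()
legend = ax.legend(handles=handles, loc=2)

# Collect the legend text so the result can be inspected without viewing the figure.
legend_labels = [text.get_text() for text in legend.get_texts()]
fig.tight_layout()
```

If the legend still comes out empty with string hatches, the problem probably lies elsewhere, for example in the matplotlib version in use.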
In a dataframe,
item_#, status, field1, field2
123, "A", "val1", "val2"
223, "B", "val3", "val4"
123, "B", "val5", "val6"
323, "A", "val7", "val8"
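What I want is the list of item_# values that have both status "A" and status "B". Something like df.groupby('item_#')[(df.status.isin(['A', 'B']), but that doesn't actually work. It gets me all the items that have either value in the list.
Any suggestions would be greatly appreciated!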
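One possible approach, sketched under the assumption that only the item_# and status columns matter: count the distinct wanted statuses per item and keep the items that have all of them. The names wanted, subset, counts and both are illustrative and not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "item_#": [123, 223, 123, 323],
    "status": ["A", "B", "B", "A"],
    "field1": ["val1", "val3", "val5", "val7"],
    "field2": ["val2", "val4", "val6", "val8"],
})

wanted = ["A", "B"]

# Keep only rows with a wanted status, then count distinct statuses per item.
subset = df[df["status"].isin(wanted)]
counts = subset.groupby("item_#")["status"].nunique()

# An item qualifies only if it shows every wanted status at least once.
both = counts[counts == len(wanted)].index.tolist()
print(both)
```

With the sample rows above, only item 123 appears with both statuses.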
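I'm exploring pandas.DataFrame.interpolate() with different methods, linear vs. nearest, and I found that the two methods give different output when data is missing at the tail.
For example: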
import numpy as np
import pandas as pd # version: '0.16.2' or '0.20.3'
>>> a = pd.DataFrame({'col1': [np.nan, 1, np.nan, 3, np.nan, 5, np.nan]})
Out[1]:
col1
0 NaN
1 1.0
2 NaN
3 3.0
4 NaN
5 5.0
6 NaN
>>> a.interpolate(method='linear')
Out[2]:
col1
0 NaN
1 1.0
2 2.0
3 3.0
4 4.0
5 5.0
6 5.0
>>> a.interpolate(method='nearest')
Out[3]:
col1
0 NaN
1 1.0
2 1.0
3 3.0
4 3.0
5 5.0
6 NaN …
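As the output above shows, linear carries the last valid value forward into the trailing NaN while nearest leaves it alone. A minimal sketch of one way to stop linear from filling the tail, assuming a pandas version that supports the limit_area parameter (0.23 or later, which is newer than the versions named in the question):

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'col1': [np.nan, 1, np.nan, 3, np.nan, 5, np.nan]})

# limit_area='inside' restricts filling to NaNs surrounded by valid values,
# so the leading and trailing NaNs are left untouched.
result = a.interpolate(method='linear', limit_area='inside')
print(result)
```

Here the gaps at positions 2 and 4 are filled with 2.0 and 4.0, and positions 0 and 6 stay NaN.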
I have the following example, in which I'm trying to add comma separators to numbers greater than 999 (i.e. 1,000 2,000 3,000 ...)
#Dummy data
Data1 <- data.frame(flow = c(8000,8.5,6,7.1,9), SP_elev = c(20,11,5,25,50))
Data2 <- data.frame(flow = c(7000,7.2,6.5,8.2,8.5), SP_elev = c(13,15,18,25,19))
Data3 <- data.frame(flow = c(2000,3,5,7,9), SP_elev = c(20,25,28,30,35))
Data4 <- data.frame(flow = c(1000,4,6,8,9), SP_elev = c(13,15,18,25,19))
Data5 <- data.frame(flow = c(1000,4,6,8,9), SP_elev = c(13,15,18,25,19))
Data6 <- data.frame(flow = c(1000,4,6,8,9), SP_elev = c(22,23,25,27,29))
#Create Vector list (in place of list.files)
dataframes = list("Data1" = Data1,
"Data2" = Data2,
"Data3" = Data3,
"Data4" = Data4,
"Data5" = …
I have a dataframe, query2:
Site TripDate Volume
0 003l 1990-06-10 2202.571850
1 003l 1991-07-26 2543.566201
2 003l 1991-11-01 1702.228651
3 003l 1992-10-15 2753.163510
4 003l 1993-04-01 2550.538237
5 003l 1993-10-08 2241.329021
and another, table1:
TripDate Count
0 1990-06-10 35
1 1991-07-26 35
2 1992-10-15 34
3 1993-10-08 35
I need to filter query2 so that it includes only the TripDates that are in table1. The resulting filtered table looks like this:
Site TripDate Volume
0 003l 1990-06-10 2202.571850
1 003l 1991-07-26 2543.566201
2 003l 1992-10-15 2753.163510
3 003l 1993-10-08 2241.329021
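A common way to do this kind of filter is Series.isin. The sketch below rebuilds both frames from the listings above and assumes TripDate is parsed as a datetime in each (the question does not show the dtypes). The name filtered is illustrative.

```python
import pandas as pd

query2 = pd.DataFrame({
    "Site": ["003l"] * 6,
    "TripDate": pd.to_datetime([
        "1990-06-10", "1991-07-26", "1991-11-01",
        "1992-10-15", "1993-04-01", "1993-10-08",
    ]),
    "Volume": [2202.571850, 2543.566201, 1702.228651,
               2753.163510, 2550.538237, 2241.329021],
})

table1 = pd.DataFrame({
    "TripDate": pd.to_datetime(
        ["1990-06-10", "1991-07-26", "1992-10-15", "1993-10-08"]
    ),
    "Count": [35, 35, 34, 35],
})

# Keep only the rows of query2 whose TripDate also appears in table1,
# then renumber the index to match the desired output.
mask = query2["TripDate"].isin(table1["TripDate"])
filtered = query2[mask].reset_index(drop=True)
print(filtered)
```

If the two TripDate columns have different dtypes (for example strings in one and datetimes in the other), isin may match nothing, so converting both first is the safer route.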
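Right now I have some working code that meets my specification when my time series starts at the beginning of a decade (i.e. 1990, 2000, 2010, etc.), but I don't know how to adjust it to get the correct formatting when my time series starts in a year that does not fall on a decade boundary (i.e. 1993).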
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
def format_xaxis(fig):
    years = dates.YearLocator(10,month=1,day=1)
    years1=dates.YearLocator(2,month=1,day=1)
    dfmt = dates.DateFormatter('%Y')
    dfmt1 = dates.DateFormatter('%y')
    [i.xaxis.set_major_locator(years) for i in fig.axes]
    [i.xaxis.set_minor_locator(years1) for i in fig.axes]
    [i.xaxis.set_major_formatter(dfmt) for i in fig.axes]
    [i.xaxis.set_minor_formatter(dfmt1) for i in fig.axes]
    [i.get_xaxis().set_tick_params(which='major', pad=15) for i in fig.axes]
    for t in fig.axes:
        for tick in t.xaxis.get_major_ticks():
            tick.label1.set_horizontalalignment('center')
        for label in t.get_xmajorticklabels() :
            label.set_rotation(0)
            label.set_weight('bold')
        for label in t.xaxis.get_minorticklabels():
            label.set_fontsize('small')
        for label in t.xaxis.get_minorticklabels()[::5]: …
I'd like to take the result of pd.DataFrame.idxmax and use it to change the values that come before the index of the maximum value.
If I have df:
Mule Creek Saddle Mtn. Calvert Creek
Date
2011-05-01 23.400000 35.599998 8.6
2011-05-02 23.400000 35.599998 8.0
2011-05-03 23.400000 35.700001 7.6
2011-05-04 23.400000 50.000000 7.1
2011-05-05 23.100000 35.799999 6.4
2011-05-06 23.000000 35.799999 5.7
2011-05-07 40.000000 35.900002 4.7
2011-05-08 23.100000 36.500000 12.0
2011-05-09 23.299999 37.500000 4.4
2011-05-10 23.200001 37.500000 3.6
I find where the maximum of each column occurs:
max = df.idxmax()
I want to set all the values that come before the maximum identified in max to np.nan.
Desired result:
Mule Creek Saddle Mtn. Calvert Creek
Date
2011-05-01 NaN NaN NaN
2011-05-02 NaN NaN NaN
2011-05-03 NaN NaN …
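A minimal sketch of one way to do this, looping over the columns and blanking every row dated before that column's maximum. The frame is rebuilt from the listing above; the names max_idx and result are illustrative (max_idx is used instead of max to avoid shadowing the Python builtin).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Mule Creek": [23.4, 23.4, 23.4, 23.4, 23.1, 23.0, 40.0, 23.1, 23.299999, 23.200001],
        "Saddle Mtn.": [35.599998, 35.599998, 35.700001, 50.0, 35.799999,
                        35.799999, 35.900002, 36.5, 37.5, 37.5],
        "Calvert Creek": [8.6, 8.0, 7.6, 7.1, 6.4, 5.7, 4.7, 12.0, 4.4, 3.6],
    },
    index=pd.date_range("2011-05-01", periods=10, freq="D", name="Date"),
)

# Date of the maximum for each column.
max_idx = df.idxmax()

# For every column, set the rows strictly before its maximum to NaN.
result = df.copy()
for column, when in max_idx.items():
    result.loc[result.index < when, column] = np.nan

print(result)
```

The row holding each maximum is kept; only the rows strictly before it are replaced.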
I created several PDFs using the pdf() device, and I can't seem to delete them from my computer. I get the following error when I run the code.
> pdf(file="Appendix_B_1.pdf")
> plot(1:10)
> dev.off
function (which = dev.cur())
{
if (which == 1)
stop("cannot shut down device 1 (the null device)")
.External(C_devoff, as.integer(which))
dev.cur()
}
<bytecode: 0x0000000008c7ecd0>
<environment: namespace:grDevices>
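The above was my attempt to trick the computer into thinking it was a new file. It changed the file, and I can open it. I just can't delete it.
I don't know what this means; all I want to do is delete the files. Is there some workaround to close the device or delete the files from the hard drive?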
I'm trying to pass multiple aggfuncs to pd.pivot_table:
If I have new_df:
ATI ATIMR
0 Basin Creek 2.0 0.039893
Calvert Creek 0.0 0.006824
Lick Creek 0.0 0.017371
Mule Creek 0.0 0.041154
Rocker Peak 2.0 0.027903
Saddle Mtn. 0.0 0.052603
Shower Falls 1.0 0.035456
1 Basin Creek 3.0 0.039893
Calvert Creek 1.0 0.006824
Lick Creek 1.0 0.017371
This works:
pct_75 = lambda y: np.percentile(y, 75)
func_list = [np.median, np.mean ,pct_75]
new_df = pd.pivot_table(new_df, values='ATIMR', index='ATI', aggfunc=func_list)
But when I try to pass a second lambda function, e.g.:
pct_25 = lambda x: np.percentile(x, 25)
func_list = [pct_25, …
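The rest of the question is cut off, so the exact failure is unknown. One plausible cause is that both lambdas report the same name, "<lambda>", which makes their result columns collide. A hedged sketch using named functions instead, with a flat frame rebuilt from the ATI and ATIMR columns above (the flat layout and the name table are assumptions for illustration):

```python
import numpy as np
import pandas as pd

def pct_25(values):
    """25th percentile of a group."""
    return np.percentile(values, 25)

def pct_75(values):
    """75th percentile of a group."""
    return np.percentile(values, 75)

new_df = pd.DataFrame({
    "ATI": [2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 1.0, 3.0, 1.0, 1.0],
    "ATIMR": [0.039893, 0.006824, 0.017371, 0.041154, 0.027903,
              0.052603, 0.035456, 0.039893, 0.006824, 0.017371],
})

# Named functions give each aggregate a distinct column label.
func_list = [pct_25, "median", pct_75]
table = pd.pivot_table(new_df, values="ATIMR", index="ATI", aggfunc=func_list)
print(table)
```

Each aggregate ends up under its own top-level column label (pct_25, median, pct_75), which is what two anonymous lambdas cannot provide.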