After installing the latest Mac OSX 64-bit Anaconda Python distribution, I keep getting a ValueError whenever I try to start the IPython Notebook.
Starting ipython works fine:
3-millerc-~:ipython
Python 2.7.3 |Anaconda 1.4.0 (x86_64)| (default, Feb 25 2013, 18:45:56)
Type "copyright", "credits" or "license" for more information.
IPython 0.13.1 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
But starting ipython notebook:
4-millerc-~:ipython notebook
results in a ValueError (with traceback):
Traceback (most recent call last):
File "/Users/millerc/anaconda/bin/ipython", line 7, in <module>
launch_new_instance()
File "/Users/millerc/anaconda/lib/python2.7/site-packages/IPython/frontend/terminal/ipapp.py", …
I have the following Pandas DataFrame:
In [31]:
import pandas as pd
sample = pd.DataFrame({'Sym1': ['a','a','a','d'],'Sym2':['a','c','b','b'],'Sym3':['a','c','b','d'],'Sym4':['b','b','b','a']},index=['Item1','Item2','Item3','Item4'])
In [32]: print(sample)
Out [32]:
Sym1 Sym2 Sym3 Sym4
Item1 a a a b
Item2 a c c b
Item3 a b b b
Item4 d b d a
I would like to find an elegant way to get the distance between each Item according to this distance matrix:
In [34]:
DistMatrix = pd.DataFrame({'a': [0,0,0.67,1.34],'b':[0,0,0,0.67],'c':[0.67,0,0,0],'d':[1.34,0.67,0,0]},index=['a','b','c','d'])
print(DistMatrix)
Out[34]:
a b c d
a 0.00 0.00 0.67 1.34
b 0.00 0.00 0.00 0.67
c 0.67 0.00 0.00 0.00
d 1.34 0.67 0.00 0.00
For example, Item1, Item2 …
What is the most efficient way to swap the axes of a Pandas DataFrame?
For example, how can df1 be converted into df2 below?
In [2]: import pandas as pd
In [3]: df1 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})
In [4]: df1
Out[4]:
one two
0 1 4
1 2 3
2 3 2
3 4 1
In [5]: df2 = pd.DataFrame({0 : [1,4], 1 : [2,3], 2 : [3,2], 3 : [4,1]}, index=['one','two'])
In [6]: df2
Out[6]:
0 1 2 3
one 1 2 3 4
two 4 3 2 1
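For the axis-swap example above, a minimal sketch, assuming a plain transpose is what is wanted. `DataFrame.T` (equivalent to `DataFrame.transpose()`) turns the column labels into the index and the index into the column labels. The values stay floats here because df1 was built from floats.

```python
import pandas as pd

# Same df1 as in the question
df1 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})

# Transpose: rows become columns and columns become rows
df2 = df1.T

print(df2)
```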
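I am getting an error very similar to the one in this SO question. The solution given there, simply installing rpy2 with conda, does not work.
The main difference in my case is that rpy2 worked fine before I updated to Mac OSX 10.11 (El Capitan). My Python version is Python 2.7.10, conda: 3.18.4, R: R version 3.2.2 (2015-08-14) -- "Fire Safety", and all were installed using the anaconda distribution.
I get the following error: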
ImportError: dlopen(/Users/user/anaconda/lib/python2.7/site-packages/rpy2/rinterface/_rinterface.so, 2): Library not loaded: @rpath/R/lib/libR.dylib
Referenced from: /Users/user/anaconda/lib/python2.7/site-packages/rpy2/rinterface/_rinterface.so
Reason: image not found
when trying to load the rpy2.ipython extension:
%load_ext rpy2.ipython
I have a hunch that the fix is similar to the one in this question about loading the rJava R package.
I have a pandas DataFrame df:
Out[16]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 269850 entries, 2012-12-19 16:15:36 to 2012-12-20 14:36:55
Data columns:
X1 269850 non-null values
X2 269848 non-null values
X3 269848 non-null values
dtypes: float64(2), object(1)
I want to slice the DataFrame to return a four-hour window of data, from 2012-12-20 05:00:00 to 2012-12-20 09:00:00.
When I try:
Slicedf = df.truncate(before='12/20/2012 05:00:00',after='12/20/2012 09:00:00')
the following error occurs:
KeyError: datetime.datetime(2012, 12, 20, 5, 0)
I have also tried (following "Pandas DataFrame slicing by day/hour/minute"):
from datetime import datetime
x=datetime(2012,12,20,5,0,0)
y=datetime(2012,12,20,9,0,0)
Slicedf = df.ix[x:y]
which returns exactly the same error.
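One possible cause, offered as an assumption rather than a diagnosis of the frame above: label-based slicing on a DatetimeIndex can raise a KeyError when the index is not sorted and the boundary timestamps are not themselves present in it. The sketch below uses a small made-up frame (the values and timestamps are hypothetical), sorts the index first, and slices with `.loc`, since `.ix` was removed in pandas 1.0.

```python
import pandas as pd

# Hypothetical stand-in for df: a DatetimeIndex that is NOT in chronological order
idx = pd.to_datetime([
    "2012-12-20 10:30:00",
    "2012-12-19 16:15:36",
    "2012-12-20 05:30:00",
    "2012-12-20 08:45:00",
    "2012-12-20 14:36:55",
])
df = pd.DataFrame({"X1": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Sort chronologically so that range lookups on the index are well defined
df_sorted = df.sort_index()

# Slice the four-hour window; both endpoints are inclusive and
# need not exist as exact labels once the index is sorted
sliced = df_sorted.loc["2012-12-20 05:00:00":"2012-12-20 09:00:00"]

print(sliced)
```

If the real index is already sorted, this sketch does not explain the error and the cause lies elsewhere.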
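I have a Pandas DataFrame with a DatetimeIndex and columns of hourly values, and I want to convert one column and write it to a JSON file made up of arrays of hourly values, one array per day.
A simple example:
If I have the DataFrame: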
In [106]:
import pandas as pd
from numpy.random import randn
rng = pd.date_range('1/1/2011 01:00:00', periods=12, freq='H')
df = pd.DataFrame(randn(12, 1), index=rng, columns=['A'])
In [107]:
df
Out[107]:
A
2011-01-01 01:00:00 -0.067214
2011-01-01 02:00:00 0.820595
2011-01-01 03:00:00 0.442557
2011-01-01 04:00:00 -1.000434
2011-01-01 05:00:00 -0.760783
2011-01-01 06:00:00 -0.106619
2011-01-01 07:00:00 0.786618
2011-01-01 08:00:00 0.144663
2011-01-01 09:00:00 -1.455017
2011-01-01 10:00:00 0.865593
2011-01-01 11:00:00 1.289754
2011-01-01 12:00:00 0.601067
I want this JSON file:
[
[-0.0672138259,0.8205950583,0.4425568167,-1.0004337373,-0.7607833867,-0.1066187698,0.7866183048,0.1446634381,-1.4550165851,0.8655931982,1.2897541164,0.6010672247]
]
My actual DataFrame spans many days, so the output would look roughly like this:
[
[value@hour1day1, value@hour2day1.....value@hour24day1],
[value@hour1day2, value@hour2day2.....value@hour24day2],
[value@hour1day3, value@hour2day3.....value@hour24day3],
....
[value@hour1LastDay, value@hour2LastDay.....value@hour24LastDay]
]
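The nested layout above can be sketched by grouping the column by calendar day and collecting each group into a list. This is one possible approach, not necessarily the most efficient one. The two-day frame and the output file name are hypothetical, and a "day" is assumed to mean the calendar date of each timestamp, so a midnight reading counts toward the day that starts at that midnight.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical two-day hourly frame standing in for the real data
rng = pd.date_range("2011-01-01 00:00:00", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"A": np.arange(48, dtype=float)}, index=rng)

# Group rows by calendar day (timestamps normalized to midnight) and
# build one inner list of hourly values per day, in chronological order
daily = [day["A"].tolist() for _, day in df.groupby(df.index.normalize())]

# Serialize the list of lists as a JSON array of arrays
json_text = json.dumps(daily)

# Writing to disk (the file name is an assumption):
# with open("hourly_by_day.json", "w") as fh:
#     fh.write(json_text)
```

Days with missing hours would simply produce shorter inner arrays. Padding them to 24 entries would need an extra reindexing step.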
What is the most elegant way to shade pandas subplots based on a column in the DataFrame?
A simple example:
In [8]:
from random import randint
from numpy.random import randn
import pandas as pd
randBinList = lambda n: [randint(0,1) for b in range(1,n+1)]
rng = pd.date_range('1/1/2011', periods=72, freq='H')
ts = pd.DataFrame({'Value1': randn(len(rng)),'Value2': randn(len(rng)),'OnOff': randBinList(len(rng))}, index=rng)
ts.plot(subplots=True)
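which results in:
[Image: the three subplots produced by ts.plot(subplots=True)]
Ideally, I would like a plot with subplots for just Value1 and Value2, where both are shaded using axvspan so that On periods (values of 1.0 in OnOff) are shaded and Off periods are not.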
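A rough sketch of the shading described above, under a few assumptions: the helper `on_spans` is a hypothetical name, each hourly sample is taken to cover the interval up to the next sample, and the lines are drawn with matplotlib directly rather than with `ts.plot`, so the x axis uses ordinary date units. Consecutive On samples are merged into a single span so that each run gets one `axvspan` call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def on_spans(flags, step):
    """Return (start, end) timestamps for each contiguous run where flags == 1.

    `step` is the assumed duration covered by a single sample.
    """
    on = (flags.to_numpy() == 1).astype(int)
    # +1 marks the first sample of a run, -1 marks the position just past its end
    change = np.diff(on, prepend=0, append=0)
    starts = np.flatnonzero(change == 1)
    ends = np.flatnonzero(change == -1) - 1
    return [(flags.index[s], flags.index[e] + step) for s, e in zip(starts, ends)]


# Data shaped like the example in the question (seeded for repeatability)
rng = pd.date_range("2011-01-01", periods=72, freq=pd.Timedelta(hours=1))
rs = np.random.RandomState(0)
ts = pd.DataFrame({"Value1": rs.randn(72),
                   "Value2": rs.randn(72),
                   "OnOff": rs.randint(0, 2, 72)}, index=rng)

spans = on_spans(ts["OnOff"], pd.Timedelta(hours=1))

# One subplot per value column; OnOff is used only for shading
fig, axes = plt.subplots(nrows=2, sharex=True)
for ax, col in zip(axes, ["Value1", "Value2"]):
    ax.plot(ts.index.to_numpy(), ts[col].to_numpy())
    ax.set_ylabel(col)
    for start, end in spans:
        ax.axvspan(start, end, color="grey", alpha=0.3)

# fig.savefig("shaded_subplots.png")  # file name is an assumption
```

Merging runs before shading keeps the number of patches small when On periods last many hours.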