I have a DataFrame indexed by date:
transactions_ind
Out[25]:
Ticker Transaction Number_of_units Price
Date
2012-10-11 ROG VX Equity Buy 12000 182.00000
2012-10-16 ROG VX Equity Sell -5000 184.70000
2012-11-16 ROG VX Equity Sell -5000 175.51580
2012-12-07 ROG VX Equity Buy 5000 184.90000
2012-12-11 ROG VX Equity Sell -3000 188.50000
2012-12-11 ROG VX Equity Reversal: Sell 3000 188.50000
2012-12-11 ROG VX Equity Sell -3000 188.50000
2012-12-11 ROG VX Equity Reversal: Sell 3000 188.50000
2012-12-11 ROG VX Equity Sell -3000 188.50000
2012-12-20 ROG VX Equity Sell -5000 185.80000 …
How to select multiple rows of a DataFrame by a list of dates
import numpy as np
import pandas as pd
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))
In[1]: df
Out[1]:
A B C D
2013-01-01 0.084393 -2.460860 -0.118468 0.543618
2013-01-02 -0.024358 -1.012406 -0.222457 1.906462
2013-01-03 -0.305999 -0.858261 0.320587 0.302837
2013-01-04 0.527321 0.425767 -0.994142 0.556027
2013-01-05 0.411410 -1.810460 -1.172034 -1.142847
2013-01-06 -0.969854 0.469045 -0.042532 0.699582
myDates = ["2013-01-02", "2013-01-04", "2013-01-06"]
So the output should be
A B C D
2013-01-02 -0.024358 -1.012406 -0.222457 1.906462
2013-01-04 0.527321 0.425767 -0.994142 0.556027
2013-01-06 -0.969854 0.469045 -0.042532 0.699582
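One possible way to get that output, offered as a sketch rather than the definitive answer: convert the date strings to timestamps and select on the index. The names `wanted`, `selected`, and `selected_loc` are illustrative and not part of the question.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
myDates = ["2013-01-02", "2013-01-04", "2013-01-06"]

# Convert the strings to timestamps so they compare cleanly with the DatetimeIndex.
wanted = pd.to_datetime(myDates)

# Option 1: boolean mask; dates that are absent from the index are simply skipped.
selected = df[df.index.isin(wanted)]

# Option 2: label-based lookup; raises a KeyError if any date is missing.
selected_loc = df.loc[wanted]

print(selected)
```

The mask variant is the more forgiving of the two when the list may contain dates that are not in the index.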
By default, the annotations in a seaborn heatmap are placed in the middle of each cell. Is it possible to move the annotations to the top left corner?
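A minimal sketch of one approach, assuming seaborn draws each annotation as a matplotlib `Text` object centered in its cell and exposes it through `ax.texts`: shift each label by half a cell and change its alignment. The `pad` value and the sample data are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import seaborn as sns

np.random.seed(0)
data = np.random.rand(4, 5)
ax = sns.heatmap(data, annot=True, fmt=".2f")

pad = 0.05  # hypothetical offset from the cell corner, in cell units
for text in ax.texts:
    x, y = text.get_position()  # label starts at the cell center (col + 0.5, row + 0.5)
    # The heatmap's y axis is inverted, so a smaller y moves the label upward.
    text.set_position((x - 0.5 + pad, y - 0.5 + pad))
    text.set_ha("left")
    text.set_va("top")
```

Because only the existing text objects are modified, the colors and number formatting chosen by seaborn are left unchanged.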
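This is a beginner question, but it is driving me crazy. I have a character vector named bars.list that I downloaded from an FTP server. The vector looks like this: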
"\"\",\"times\",\"open\",\"high\",\"low\",\"close\",\"numEvents\",\"volume\"\r\n\"1\",2015-05-18 06:50:00,23.98,23.98,23.5,23.77,421,0\r\n\"2\",2015-05-18 07:50:00,23.77,23.9,23.34,23.6,720,0\r\n\"3\",2015-05-18 08:50:00,23.6,23.6,23.32,23.42,720,0\r\n\"4\",2015-05-18 09:50:00,23.44,23.91,23.43,23.66,720,0\r\n\"5\",2015-05-18 10:50:00,23.67,24.06,23.59,24.02,720,0\r\n\"6\",2015-05-18 11:50:00,24.02,24.04,23.32,23.33,720,0\r\n\"7\",2015-05-18 12:50:00,23.33,23.42,22.74,22.81,720,0\r\n\"8\",2015-05-18 13:50:00,22.79,22.92,22.49,22.69,720,0\r\n\"9\",2015-05-18 14:50:00,22.69,22.7,22.14,22.14,481,0\r\n\"10\",2015-05-19 06:50:00,21.09,21.49,20.82,21.47,421,0\r\n\"11\",2015-05-19 07:50:00,21.48,21.68,21.46,21.51,720,0\r\n\"12\",2015-05-19 08:50:00,21.51,21.93,21.45,21.92,720,0\r\n\"13\",2015-05-19 09:50:00,21.92,21.92,21.55,21.55,720,0\r\n\"
I need to convert this vector into a usable format, but
> read.table(bars.list, header = TRUE, sep = ",", quote = "", dec = ".")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file '"","times","open","high","low","close","numEvents","volume"
"1",2015-05-18 06:50:00,23.98,23.98,23.5,23.77,421,0
"2",2015-05-18 07:50:00,23.77,23.9,23.34,23.6,720,0
"3",2015-05-18 08:50:00,23.6,23.6,23.32,23.42,720,0
"4",2015-05-18 09:50:00,23.44,23.91,23.43,23.66,720,0
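I don't understand why R tells me that a connection cannot be opened, since the object was passed to the function as an argument. The output R shows in the warning message is already very close to what I need...
Is it possible to increase the line width for specific columns and rows in a seaborn heatmap?
For example, can this heatmap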
import numpy as np; np.random.seed(0)
import seaborn as sns; sns.set()
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, linewidths=1.0)
be turned into something like this:
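One way this might be done, as a sketch: `linewidths` applies to every cell border, so thicker separators can be drawn on top with `ax.axhline` and `ax.axvline`. Which boundaries to thicken (`thick_rows`, `thick_cols`) and the width of 5 are hypothetical choices, since the target picture is not available here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import seaborn as sns

np.random.seed(0)
uniform_data = np.random.rand(10, 12)
ax = sns.heatmap(uniform_data, linewidths=1.0)

# Cell borders sit at integer coordinates: y=5 is the border between rows 4 and 5.
thick_rows = [5]     # hypothetical row boundary to emphasize
thick_cols = [3, 9]  # hypothetical column boundaries to emphasize
row_lines = [ax.axhline(y, color="white", linewidth=5) for y in thick_rows]
col_lines = [ax.axvline(x, color="white", linewidth=5) for x in thick_cols]
```

Using the same color as the regular cell borders makes the thicker lines read as wider gaps rather than as extra overlays.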
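I want to extract the dates of an xts object at which the value changes, i.e. the dates on which the value of A changes from 1 to 0 or from 0 to 1: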
require(xts)
A <- xts(c(1,1,0,0,1,1,0,0,1,1), Sys.Date()-10:1)
colnames(A) <- c("A")
> A
A
2014-12-27 1
2014-12-28 1
2014-12-29 0
2014-12-30 0
2014-12-31 1
2015-01-01 1
2015-01-02 0
2015-01-03 0
2015-01-04 1
2015-01-05 1
The desired result looks like this
> from.one.to.zero
[1] "2014-12-29" "2015-01-02"
> from.zero.to.one
[1] "2014-12-31" "2015-01-04"
I am able to import the pandas package in the Spyder IDE; however, if I open a new Jupyter notebook, the import fails.
I am using the Anaconda distribution on Mac OS X.
Here is what I do:
In [1]: import pandas
Here is the response I get:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-5-97925edf8fb0> in <module>()
----> 1 import pandas
//anaconda/lib/python2.7/site-packages/pandas/__init__.py in <module>()
11 "pandas from the source directory, you may need to run "
12 "'python setup.py build_ext --inplace' to build the C "
---> 13 "extensions first.".format(module))
14
15 from datetime import datetime
ImportError: C extension: hashtable not built. If you want to import pandas from the source directory, you may need …
Is there a quick pythonic way to transform this table
import numpy as np
import pandas as pd
index = pd.date_range('2000-1-1', periods=36, freq='M')
df = pd.DataFrame(np.random.randn(36,4), index=index, columns=list('ABCD'))
In[1]: df
Out[1]:
A B C D
2000-01-31 H 1.368795 0.106294 2.108814
2000-02-29 -1.713401 0.557224 0.115956 -0.851140
2000-03-31 -1.454967 -0.791855 -0.461738 -0.410948
2000-04-30 1.688731 -0.216432 -0.690103 -0.319443
2000-05-31 -1.103961 0.181510 -0.600383 -0.164744
2000-06-30 0.216871 -1.018599 0.731617 -0.721986
2000-07-31 0.621375 0.790072 0.967000 1.347533
2000-08-31 0.588970 -0.360169 0.904809 0.606771
...
into this table
2001 2000
12 11 10 9 8 7 6 5 4 3 2 1 12 11 10 9 8 7 …
The task is to transform the following table
import pandas as pd
import numpy as np
index = pd.date_range('2000-1-1', periods=700, freq='D')
df = pd.DataFrame(np.random.randn(700), index=index, columns=["values"])
df = df.groupby(by=[df.index.year, df.index.month]).sum()
In[1]: df
Out[1]:
values
2000 1 1.181000
2 -8.005783
3 6.590623
4 -6.266232
5 1.266315
6 0.384050
7 -1.418357
8 -3.132253
9 0.005496
10 -6.646101
11 9.616482
12 3.960872
2001 1 -0.989869
2 -2.845278
3 -1.518746
4 2.984735
5 -2.616795
6 8.360319
7 5.659576
8 0.279863
9 -5.220678
10 5.077400
11 1.332519
into one that looks like this
Jan Feb Mar Apr May Jun …
I have two price series
require(quantmod)
require(TTR)
tickers = c("IBM","SPY")
getSymbols(tickers, from="2010-10-20", to="2014-09-22")
prices = do.call(merge, lapply(tickers, function(x) Cl(get(x))))
> head(prices)
IBM.Close SPY.Close
2010-10-20 139.07 117.87
2010-10-21 139.83 118.13
2010-10-22 139.67 118.35
2010-10-25 139.84 118.70
2010-10-26 140.67 118.72
2010-10-27 141.43 118.38
Now I want to smooth the series using the SMA function from the TTR package.
sma.IBM = SMA(prices[,1], n = 3)
sma.SPY = SMA(prices[,2], n = 3)
sma.prices = cbind(sma.IBM, sma.SPY)
> head(sma.prices)
IBM.Close.SMA.3 SPY.Close.SMA.3
2010-10-20 NA NA
2010-10-21 NA NA
2010-10-22 139.5233 118.1167
2010-10-25 139.7800 118.3933
2010-10-26 140.0600 118.5900
2010-10-27 140.6467 118.6000
This is very tedious when dealing with many assets, so I would like to shorten the process using apply
sma.prices = apply(prices, …
My goal is to create a stacked bar chart of a multi-level DataFrame. The DataFrame looks like this:
import pandas as pd
import numpy as np
arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux', 'qux']),
np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two', 'three'])]
s = pd.Series([10,20,10,22,10,24,10,26, 11], index=arrays)
In[1]: s
Out[1]:
bar one 10
two 20
baz one 10
two 22
foo one 10
two 24
qux one 10
two 26
three 11
dtype: int64
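I have two goals:
Create a stacked bar chart such that the values are stacked into 4 separate bars named "bar, baz, foo, qux".
The 4 bars should be sorted by size. In this example, the qux bar has height (10 + 26 + 11 =) 47 and should be first on the left, followed by the foo bar with size (10 + 24 =) 34.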
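Both goals could be approached roughly as follows. This is a sketch, not the only way: unstack the inner index level into columns, reorder the rows by their totals, and let pandas draw the stacked bars. The names `wide`, `order`, and `ordered` are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two', 'three'])]
s = pd.Series([10, 20, 10, 22, 10, 24, 10, 26, 11], index=arrays)

# Move the inner level into columns; combinations that do not exist become 0.
wide = s.unstack(fill_value=0)

# Order the outer labels by their row totals, largest first.
order = wide.sum(axis=1).sort_values(ascending=False).index
ordered = wide.loc[order]

# One bar per outer label, with the inner-level values stacked on top of each other.
ax = ordered.plot(kind="bar", stacked=True)
print(ordered)
```

With the sample data the totals are 47, 34, 32, and 30, so the bars appear in the order qux, foo, baz, bar.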
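I have a list of ISINs, which is my only source of information. In Excel, I can retrieve the Bloomberg ticker, which is needed in many cases because it includes the code of the exchange on which the asset is traded. To do so, I only need to add "... Equity isin" to the BDP() formula, where "..." is a placeholder for the ISIN. So with the new Rblpapi package (which is a great tool!) I can try to do the same:
Here is a random list of ISINs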
isins = c("LU0942970442", "LU0997545750" ,"CH0019597530" , "CH0017142719" , "CH0131872431", "VGG0475N1087", "US46429B6974",
"LU0911032141" , "DE000A1JCWS9")
Append "equity" in the bdp call and request "TICKER_AND_EXCH_CODE"
require(Rblpapi)
blpConnect()
portfolio_ticker = bdp(paste(c(isins),"equity"), "TICKER_AND_EXCH_CODE")
But some tickers are not returned.
> portfolio_ticker
TICKER_AND_EXCH_CODE
LU0942970442 equity XBAC SW
LU0997545750 equity AXESZHD LX
CH0019597530 equity
CH0017142719 equity
CH0131872431 equity
VGG0475N1087 equity ARIASII VI
US46429B6974 equity
LU0911032141 equity FCEUSMI LX
DE000A1JCWS9 equity CHOMCAR GR
My question is: is this an error in my thinking or a bug in the package?
Edit: As an example of how it looks in Excel, here is the corresponding picture.