I have a dataset of OHLC prices that I parsed from CSV into a Pandas DataFrame and resampled to 15-minute bars:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 500047 entries, 1998-05-04 04:45:00 to 2012-08-07 00:15:00
Freq: 15T
Data columns:
Close 363152 non-null values
High 363152 non-null values
Low 363152 non-null values
Open 363152 non-null values
dtypes: float64(4)
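I would like to add various calculated columns, starting with simple ones such as the period range (H-L), and then booleans that indicate the occurrence of price patterns I will define, e.g. a hammer candle pattern, for which a sample definition is: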
def closed_in_top_half_of_range(h,l,c):
    return c > l + (h-l)/2

def lower_wick(o,l,c):
    return min(o,c)-l

def real_body(o,c):
    return abs(c-o)

def lower_wick_at_least_twice_real_body(o,l,c):
    return lower_wick(o,l,c) >= 2 * real_body(o,c)

def is_hammer(row):
    return lower_wick_at_least_twice_real_body(row["Open"],row["Low"],row["Close"]) \
        and closed_in_top_half_of_range(row["High"],row["Low"],row["Close"])
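Basic question: how do I map a function to a column, specifically when I want to reference more than one other column, or the whole row?
This post deals with adding two calculated columns from a single source column, which is close but not quite the same thing.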
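One possible approach to the basic question, sketched under the assumption that the columns are named Open, High, Low and Close as in the summary above: arithmetic on whole columns is element-wise, so the helpers can take Series instead of scalars. The built-in `min` and the `and` operator expect scalars, so the sketch swaps in `numpy.minimum` and `&`. The sample prices and the `_vec` function names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample; the real frame would come from the resampled CSV.
df = pd.DataFrame(
    {
        "Open":  [10.0, 10.0, 10.0],
        "High":  [10.2, 11.0, 10.5],
        "Low":   [9.0, 9.5, 9.9],
        "Close": [10.1, 9.6, 10.4],
    },
    index=pd.date_range("2012-08-06 09:00", periods=3, freq="15min"),
)

# Simple calculated column: the period range.
df["HL"] = df["High"] - df["Low"]

def lower_wick_vec(o, l, c):
    # Element-wise minimum of two columns (built-in min() would not work here).
    return np.minimum(o, c) - l

def is_hammer_vec(frame):
    o, h, l, c = frame["Open"], frame["High"], frame["Low"], frame["Close"]
    wick_ok = lower_wick_vec(o, l, c) >= 2 * (c - o).abs()
    top_half = c > l + (h - l) / 2
    # `&` combines boolean columns element-wise; `and` would raise an error.
    return wick_ok & top_half

df["Hammer"] = is_hammer_vec(df)
print(df)
```

The row-wise alternative, `df.apply(is_hammer, axis=1)`, should also work with the original scalar functions, but it calls Python once per row, which tends to be slow on a frame of this size.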
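Slightly more advanced: for price patterns that are determined with reference to more than a single bar (T), how can I reference different rows (e.g. T-1, T-2, etc.) from within the function definition?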
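For multi-bar patterns, one common option is `Series.shift(n)`, which lines up the value from n rows earlier with the current row, so T-1 and T-2 become ordinary columns that can be compared element-wise. A minimal sketch with made-up data; the pattern itself (two consecutive higher highs) and the column name `TwoHigherHighs` are hypothetical examples, not taken from the question.

```python
import pandas as pd

# Made-up highs for five consecutive 15-minute bars.
df = pd.DataFrame(
    {"High": [10.0, 10.5, 11.0, 10.8, 11.2]},
    index=pd.date_range("2012-08-06 09:00", periods=5, freq="15min"),
)

prev_high = df["High"].shift(1)    # High at T-1, aligned with row T
prev2_high = df["High"].shift(2)   # High at T-2, aligned with row T

# Hypothetical pattern: the high at T exceeds T-1, which exceeded T-2.
# The first rows have no history; comparisons against NaN evaluate to False.
df["TwoHigherHighs"] = (df["High"] > prev_high) & (prev_high > prev2_high)
print(df)
```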
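I have a Pandas DataFrame stored via an HDFStore that essentially holds summary rows about the test runs I am doing.
Several fields in each row contain descriptive strings of variable length.
When I do a test run, I create a new DataFrame with a single row in it: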
def export_as_df(self):
    return pd.DataFrame(data=[self._to_dict()], index=[datetime.datetime.now()])
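I then call HDFStore.append(string, DataFrame) to add the new row to the existing DataFrame.
This works fine, except when one of the string columns has contents longer than the longest instance already stored, in which case I get the following error: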
File "<ipython-input-302-a33c7955df4a>", line 516, in save_pytables
store.append('tests', test.export_as_df())
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 532, in append
self._write_to_group(key, value, table=True, append=True, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 788, in _write_to_group
s.write(obj = value, append=append, complib=complib, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 2491, in write
min_itemsize=min_itemsize, **kwargs)
File "/Library/Frameworks/EPD64.framework/Versions/7.3/lib/python2.7/site-packages/pandas/io/pytables.py", line 2254, in create_axes
raise Exception("cannot find the correct atom type -> [dtype->%s,items->%s] %s" % (b.dtype.name, b.items, str(detail)))
Exception: cannot find the correct atom type …
I am trying to use the fluent Aggregate interface to select the latest record in a collection for each group, based on a multi-field key:
var matches = await Collection.Aggregate()
.Match(x => x.EffectiveDate >= minEffectiveDate)
.SortByDescending(x => x.LastUpdate)
.Group(key => new { key.EffectiveDate, key.ProductOid, key.InstrumentParentOid, key.ComponentOid, key.EventSummary }, g => g.First())
.ToListAsync();
However, I get the following exception:
System.InvalidCastException occurred
HResult=-2147467262
Message=Unable to cast object of type 'MongoDB.Driver.Linq.Expressions.SerializationExpression' to type 'System.Linq.Expressions.MethodCallExpression'.
Source=MongoDB.Driver
StackTrace:
at MongoDB.Driver.Linq.Processors.GroupSerializationInfoBinder.GetBodyFromSelector(MethodCallExpression node)
at MongoDB.Driver.Linq.Processors.GroupSerializationInfoBinder.GetAggregationArgument(MethodCallExpression node)
at MongoDB.Driver.Linq.Processors.GroupSerializationInfoBinder.VisitMethodCall(MethodCallExpression node)
at MongoDB.Driver.Linq.Translators.AggregateProjectionTranslator.BindSerializationInfo(SerializationInfoBinder binder, LambdaExpression node, IBsonSerializer parameterSerializer)
at MongoDB.Driver.Linq.Translators.AggregateProjectionTranslator.TranslateGroup[TKey,TDocument,TResult](Expression`1 idProjector, Expression`1 groupProjector, IBsonSerializer`1 parameterSerializer, IBsonSerializerRegistry serializerRegistry)
at MongoDB.Driver.IAggregateFluentExtensions.GroupExpressionProjection`3.Render(IBsonSerializer`1 documentSerializer, IBsonSerializerRegistry serializerRegistry)
at MongoDB.Driver.AggregateFluent`2.<>c__DisplayClass1`1.<Group>b__0(IBsonSerializer`1 s, IBsonSerializerRegistry sr)
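at …
Referring to the SO question about custom serialization of strings to enums and vice versa in Json.NET, where the enum members are decorated with the EnumMember attribute: is there a way to get MongoDB to perform the same feat?
I have just refactored some fields that were previously strings into enums, and I was wondering whether there is any way to instruct Mongo to also read the EnumMember values when (de)serializing, so that I can avoid going through the database and updating all the current text values.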
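I am researching/backtesting a trading system.
I have a Pandas DataFrame containing OHLC data and have added several calculated columns that identify price patterns I will use as signals to initiate positions.
I would now like to add a further column that keeps track of the current net position. I have tried using df.apply(), but passing the DataFrame itself as the argument instead of the row object, because with the latter I do not seem to be able to look back at previous rows to determine whether they produced any price patterns: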
from collections import namedtuple

open_campaigns = []
Campaign = namedtuple('Campaign', 'open position stop')

def calc_position(df):
    # sum of current positions + any new positions
    if entered_long(df):
        # lists have no .add(); .append() is the list method
        open_campaigns.append(
            Campaign(
                calc_long_open(df.High.shift(1)),
                calc_position_size(df),
                calc_long_isl(df)
            )
        )
    return sum(campaign.position for campaign in open_campaigns)

def entered_long(df):
    return buy_pattern(df) & (df.High > df.High.shift(1))

df["Position"] = df.apply(lambda row: calc_position(df), axis=1)
However, this returns the following error:
ValueError: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', u'occurred at index 1997-07-16 08:00:00')
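Rolling window functions seem like the natural fit, but as I understand it they only act on a single time series or column, so they would not work here, because I need to access the values of multiple columns at multiple points in time.
How should I do this?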
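For what it is worth, the error above seems to come from `if entered_long(df):`, since `entered_long` returns a whole boolean Series and `if` needs a single truth value. One possible direction, sketched below, is to compute the entry signal once for all rows using `shift`, and only then derive the position, either with a cumulative sum or with an explicit loop that carries state from row to row. Everything here is an assumption for illustration: the `BuyPattern` column stands in for `buy_pattern(df)`, the position size is a fixed 1, and exits and stops are ignored.

```python
import pandas as pd

# Made-up data; BuyPattern is a stand-in for the result of buy_pattern(df).
df = pd.DataFrame(
    {
        "High": [10.0, 10.5, 10.4, 10.9, 11.0],
        "BuyPattern": [False, True, True, True, False],
    },
    index=pd.date_range("1997-07-16 08:00", periods=5, freq="15min"),
)

# Entry signal for every row at once: pattern present AND high above previous high.
df["EnteredLong"] = df["BuyPattern"] & (df["High"] > df["High"].shift(1))

POSITION_SIZE = 1  # assumption: fixed size per entry, no exits

# Option 1: if positions only accumulate, a cumulative sum is enough.
df["Position"] = (df["EnteredLong"] * POSITION_SIZE).cumsum()

# Option 2: an explicit loop, useful once state such as stops must be tracked.
position = 0
positions = []
for row in df.itertuples():
    if row.EnteredLong:  # a single boolean here, so `if` is unambiguous
        position += POSITION_SIZE
    positions.append(position)
df["PositionLoop"] = positions

print(df)
```

The loop is slower than the cumulative sum, but it is the easier place to add logic that depends on earlier outcomes, such as closing a campaign when its stop is hit.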
python algorithmic-trading quantitative-finance dataframe pandas