Posts by Wil*_*llZ

What is the difference between pandas.qcut and pandas.cut?

The documentation says:

http://pandas.pydata.org/pandas-docs/dev/basics.html

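"Continuous values can be discretized using the cut (bins based on values) and qcut (bins based on sample quantiles) functions"

That sounds very abstract... I can see the difference in the example below, but what does qcut (sample quantiles) actually do or mean? When would you use qcut rather than cut?

Thanks.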

factors = np.random.randn(30)

In [11]:
pd.cut(factors, 5)
Out[11]:
[(-0.411, 0.575], (-0.411, 0.575], (-0.411, 0.575], (-0.411, 0.575], (0.575, 1.561], ..., (-0.411, 0.575], (-1.397, -0.411], (0.575, 1.561], (-2.388, -1.397], (-0.411, 0.575]]
Length: 30
Categories (5, object): [(-2.388, -1.397] < (-1.397, -0.411] < (-0.411, 0.575] < (0.575, 1.561] < (1.561, 2.547]]

In [14]:
pd.qcut(factors, 5)
Out[14]:
[(-0.348, 0.0899], (-0.348, 0.0899], (0.0899, 1.19], (0.0899, 1.19], (0.0899, 1.19], ..., (0.0899, 1.19], (-1.137, -0.348], (1.19, 2.547], [-2.383, -1.137], (-0.348, 0.0899]] …
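One way to see the distinction drawn in the quoted documentation is to count how many points land in each bin. The following is only a minimal sketch under assumptions: the seed and the variable names (`by_width`, `by_quantile`, `cut_counts`, `qcut_counts`) are made up for illustration and are not from the post.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)      # arbitrary seed, chosen only for repeatability
factors = rng.standard_normal(30)   # 30 samples, as in the question

# cut: split the value range min..max into 5 bins of (nearly) equal width.
by_width = pd.cut(factors, 5)

# qcut: pick bin edges at the 0/20/40/60/80/100% sample quantiles,
# so each bin holds roughly the same number of points.
by_quantile = pd.qcut(factors, 5)

cut_counts = pd.Series(by_width).value_counts(sort=False)
qcut_counts = pd.Series(by_quantile).value_counts(sort=False)

print("cut  (equal-width bins):\n", cut_counts, sep="")
print("qcut (equal-count bins):\n", qcut_counts, sep="")
```

With `cut` the bins have the same width but uneven counts (most points fall in the middle bins for normal data); with `qcut` the counts are balanced but the bin widths differ. That suggests `cut` when the bin boundaries themselves matter, and `qcut` when you want groups of comparable size, such as quintiles.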

python pandas

Wil*_*llZ

lucky-day

76
votes

3
answers

50k
views

Cassandra cqlsh - how to show microseconds/milliseconds for a timestamp column?

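I am inserting into a Cassandra table that has a timestamp column. My data has microsecond precision, so the time data string looks like this:

2015-02-16T18:00:03.234+00:00

However, when I run a select query in cqlsh, the microsecond data is not shown; I can only see the time down to second precision. The 234 microseconds part is not displayed.

I guess I have two questions:

1) Does Cassandra capture microseconds with the timestamp data type? My guess is yes?

2) How can I verify that with cqlsh?

Table definition: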

create table data (
  datetime timestamp,
  id text,
  type text,
  data text,
  primary key (id, type, datetime)
) 
with compaction = {'class' : 'DateTieredCompactionStrategy'};

The insert query is run using a Java PreparedStatement:

insert into data (datetime, id, type, data) values(?, ?, ?, ?);

The select query is simply:

select * from data;
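On question 1, the CQL `timestamp` type stores a count of milliseconds since the Unix epoch, so the `.234` in the sample string is a millisecond value, and anything finer would be dropped. The sketch below only mimics that truncation in plain Python; it does not talk to Cassandra, and the helper names and the `234567` microsecond value are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(dt):
    """Whole milliseconds since the Unix epoch; finer detail is discarded."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

def from_epoch_millis(ms):
    """Rebuild a datetime from a millisecond count."""
    return EPOCH + timedelta(milliseconds=ms)

# Hypothetical input carrying real microseconds (234567 us), not from the post.
original = datetime(2015, 2, 16, 18, 0, 3, 234567, tzinfo=timezone.utc)

stored = to_epoch_millis(original)    # what a millisecond-resolution column can hold
restored = from_epoch_millis(stored)  # what can be read back from it

print(original.isoformat())
print(restored.isoformat())
```

The restored value keeps the 234 milliseconds but loses the trailing 567 microseconds, which is the behaviour to expect from a millisecond-resolution column regardless of how cqlsh formats it.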

timestamp cql cassandra cqlsh

Wil*_*llZ

2016 07-03

21
votes

2
answers

30k
views

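How to install gcc 7.3 for Ubuntu 16.04?

It looks like Jonathan F's gcc-7.3 build is not available for Ubuntu 16.04. The amd64 build is in a failed state. See here.

At the moment the Ubuntu toolchain here only has gcc-7.2.

Is there any relatively simple alternative way to upgrade an existing gcc to gcc 7.3?

Thanks.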

ubuntu gcc

Wil*_*llZ

lucky-day

8
votes

1
answer

30k
views

Python Pandas DataFrame - cannot plot bars and lines on the same axes

I may be doing something wrong, but I am trying to achieve the following:

# plot bars and lines in the same figure, sharing both x and y axes.
import matplotlib.pyplot as plt

# df is some DataFrame with multiple columns; col1 and col2 are two of its column names
_, ax = plt.subplots()
df[col1].plot(kind='bar', ax=ax)
df[col2].plot(ax=ax, marker='o', ls='-')
ax.legend(loc='best')

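I expected to see a chart with both bars and lines. However, all I end up with is the line for df[col2]; the bars for df[col1] are not on the chart. Whatever was drawn before df[col2] seems to have been overwritten.

I worked around this with: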

df[col1].plot(kind='bar', ax=ax, label=bar_labels)
ax.plot(df[col2], marker='o', ls='-', label=line_labels)
ax.legend(loc='best')

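However, this is not ideal, because I have to pass label explicitly, otherwise the legend does not include df[col2]...

Does anyone have a more elegant solution to make both the bars and the line show up?

**EDIT** Thanks to @DizietAsahi - it turns out this is an issue with using a DatetimeIndex as the x values. Filed the following with pandas: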

https://github.com/pydata/pandas/issues/10761#issuecomment-128671523
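For readers hitting the same DatetimeIndex mismatch, one possible workaround is to draw both series with plain matplotlib against shared integer positions and put the dates on the tick labels. This is a rough sketch, not the fix from the linked issue; the sample data, the column names `col1` and `col2`, and the date format are assumptions made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data with a DatetimeIndex, standing in for the DataFrame in the post.
idx = pd.date_range("2015-08-01", periods=5, freq="D")
df = pd.DataFrame(
    {"col1": [3, 5, 2, 6, 4], "col2": [2.5, 4.0, 3.0, 5.5, 4.5]},
    index=idx,
)

fig, ax = plt.subplots()
positions = np.arange(len(df))  # both artists use the same x positions 0..n-1

ax.bar(positions, df["col1"], label="col1")
ax.plot(positions, df["col2"], marker="o", ls="-", color="tab:red", label="col2")

# Show the dates as tick labels instead of using them as x coordinates.
ax.set_xticks(positions)
ax.set_xticklabels(df.index.strftime("%Y-%m-%d"), rotation=45, ha="right")
ax.legend(loc="best")
fig.tight_layout()
```

Because the bars and the line share the same x coordinates, neither is pushed off the visible range. Call `plt.show()` or `fig.savefig(...)` afterwards to view the result.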

python matplotlib pandas

Wil*_*llZ

2015 08-08

4
votes

2
answers

1646
views

Tag statistics

pandas ×2

python ×2

cassandra ×1

cql ×1

cqlsh ×1

gcc ×1

matplotlib ×1

timestamp ×1

ubuntu ×1
