See the example CSV file below:
A,B,C
d,e,f
g,h,i
The first row, in capital letters, is my header.
I tried this:
df = pd.read_csv("example.csv", header=0, sep=",", index_col=0, parse_dates=True)
The resulting DataFrame looks like this, with the header messed up:
B C
A
d e f
g h i
Does anyone know why this happens, or how to fix it manually?
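One likely explanation, offered as a sketch rather than a definitive diagnosis: `index_col=0` tells pandas to use the first column (A) as the row index, so A is printed as the index name on its own line below the B and C headers. The header row itself is parsed correctly. The example below reads an in-memory copy of the CSV (an assumption, so that it runs without `example.csv`) and compares the two ways of reading it.

```python
import io

import pandas as pd

# In-memory stand-in for example.csv (assumption: same three rows as above).
csv_text = "A,B,C\nd,e,f\ng,h,i\n"

# index_col=0 moves column A into the row index, so only B and C remain columns.
with_index = pd.read_csv(io.StringIO(csv_text), header=0, sep=",", index_col=0)

# Leaving index_col out keeps A as an ordinary column with a default 0..n-1 index.
plain = pd.read_csv(io.StringIO(csv_text), header=0, sep=",")

# A frame that was already read with index_col=0 can be repaired afterwards.
restored = with_index.reset_index()

print(plain)
```

If A is meant to be the index, the original output is already correct and only looks odd because the index name is printed on a separate line.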
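I am building my first artificial multilayer perceptron neural network with Keras.
Here is my input data:
This is the code I used to build my initial model; it basically follows the Keras example code: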
model = Sequential()
model.add(Dense(64, input_dim=14, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(64, init='uniform'))
model.add(Activation('tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, init='uniform'))
model.add(Activation('softmax'))
sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)
model.fit(X_train, y_train, nb_epoch=20, batch_size=16)
Output:
Epoch 1/20
1213/1213 [==============================] - 0s - loss: 0.1760
Epoch 2/20
1213/1213 [==============================] - 0s - loss: 0.1840
Epoch 3/20
1213/1213 [==============================] - 0s - loss: 0.1816
Epoch 4/20
1213/1213 [==============================] - 0s - loss: 0.1915
Epoch 5/20
1213/1213 [==============================] - 0s - loss: 0.1928
Epoch 6/20
1213/1213 [==============================] …
https://plot.ly/python/bar-charts/#bar-chart-with-line-plot
I want to use plotly and iPython to create a bar chart with a line plot, as in the example above. However, I want the bars to form a horizontal stacked bar chart, as in the plotly and iPython example below. How can I do this?
https://plot.ly/python/bar-charts/#colored-bar-chart
y_saving_yes = [1, 2, 4, 6, 7, 7]
y_saving_no = [10, 10, 10, 10, 10, 10]
y_net_worth = [93453, 81666, 69889, 78381, 141395, 92969]
x_saving = ['Premium', 'Spot Shadow', 'Slow Motion', 'Highlight Music','Extra Text', 'Top Play']
x_net_worth = ['Premium', 'Spot Shadow', 'Slow Motion', 'Highlight Music','Extra Text', 'Top Play']
trace1 = Bar(
x=y_saving_yes,
y=x_saving,
marker=Marker(
color='rgba(50, 171, 96, 0.6)',
line=Line(
color='rgba(50, 171, 96, 1.0)',
width=1,
),
),
name='Highlight Properties',
orientation='h',
)
trace2 = Bar(
x=y_saving_no,
y=x_saving,
marker=Marker(
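color='rgba(50, …
I created a graph with the following code. How can I return not only the nodes but also the relationships, for the cases where a user is connected to multiple videos and a video is connected to multiple users?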
CREATE CONSTRAINT ON (u:User) ASSERT u.user IS UNIQUE;
CREATE CONSTRAINT ON (v:Video) ASSERT v.video IS UNIQUE;
USING PERIODIC COMMIT 100000
LOAD CSV WITH HEADERS FROM 'asdfjkl;' AS line
WITH distinct line.user as user_data
MERGE (:User {user: user_data });
USING PERIODIC COMMIT 100000
LOAD CSV WITH HEADERS FROM 'asdfjkl;' AS line
WITH distinct line.video as video_data
MERGE (:Video {video: video_data });
USING PERIODIC COMMIT 100000
LOAD CSV WITH HEADERS FROM 'asdfjkl;' AS line
MATCH (u:User {user: line.user })
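MATCH (v:Video …
This is the bar chart I have created so far using Plotly and iPython. It is clearly hard to read. What I want to do is create bins on the x-axis: for example, one bar holding the total y value for x values from 0 to 50, another for 50 to 100, and so on.
Can this be done with matplotlib or Plotly?
Plotly code: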
data = Data([
Bar(
x=[tuples[0] for tuples in tuples_list],
y=[tuples[1] for tuples in tuples_list]
)
])
layout = dict(
title='Public Video Analysis',
yaxis=YAxis(
title = 'Views'),
xaxis1=XAxis(
title = "Duration in Seconds"),
)
fig = Figure(data=data, layout=layout)
py.iplot(fig)
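One way to get such bins, sketched here under assumptions: sum the y values per fixed-width x interval in plain Python before building the `Bar` trace. The helper name `bin_totals`, the bin width of 50, and the sample data are all hypothetical; `tuples_list` is assumed to hold `(duration, views)` pairs as in the code above.

```python
from collections import defaultdict


def bin_totals(tuples_list, bin_width=50):
    """Sum the y values of all points whose x falls in the same bin.

    Bins are half-open: [0, 50), [50, 100), and so on.
    """
    totals = defaultdict(int)
    for x, y in tuples_list:
        bin_start = int(x // bin_width) * bin_width
        totals[bin_start] += y
    return sorted(totals.items())


# Hypothetical sample data: (duration in seconds, views).
sample = [(10, 100), (49, 5), (50, 20), (99, 1), (120, 7)]
binned = bin_totals(sample, bin_width=50)

# Labels and heights that can be passed as x and y to a bar trace.
labels = ["%d-%d" % (start, start + 50) for start, _ in binned]
heights = [total for _, total in binned]
print(labels, heights)
```

The resulting `labels` and `heights` lists can replace the two list comprehensions in the `Bar(...)` call, which gives one bar per bin instead of one bar per distinct x value.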
How can I convert ticks to a datetime in Python?
I want to convert 52707330000 to 1 hour, 27 minutes and 50 seconds.
Somehow it works here: http://tickstodatetime.com/. I tried inspecting the element, but I don't understand JavaScript.
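A minimal sketch, assuming the value is a .NET-style tick count where one tick is 100 nanoseconds (that assumption is consistent with the expected result, since 52707330000 ticks is 5270.733 seconds). The helper name `ticks_to_timedelta` is hypothetical.

```python
from datetime import datetime, timedelta

TICKS_PER_MICROSECOND = 10  # assumption: one tick is 100 ns


def ticks_to_timedelta(ticks):
    """Convert a tick count to a timedelta, truncating to whole microseconds."""
    return timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)


duration = ticks_to_timedelta(52707330000)

# Split the whole seconds into hours, minutes and seconds.
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes, seconds = divmod(remainder, 60)
print(hours, minutes, seconds)  # 1 27 50

# If the ticks count from 0001-01-01 (as .NET DateTime does), add them to that date.
as_datetime = datetime(1, 1, 1) + duration
```

Integer division is used on purpose so that large tick counts are not rounded through floating point.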
Code:
w = Window().partitionBy("ticker").orderBy("date")
x = s_df.withColumn("daily_return", (col("close") - lag("close", 1).over(w)) / lag("close", 1).over(w))
s_df looks like this:
+----------+------+------+------+------+--------+------+
| date| open| high| low| close| volume|ticker|
+----------+------+------+------+------+--------+------+
|2016-11-02| 111.4|112.35|111.23|111.59|28331709| AAPL|
|2016-11-01|113.46|113.77|110.53|111.49|43825812| AAPL|
|2016-10-31|113.65|114.23| 113.2|113.54|26419398| AAPL|
+----------+------+------+------+------+--------+------+
Then x looks like this:
+----------+--------------------+
| date| avg(daily_return)|
+----------+--------------------+
|2015-12-28|0.004124786535090563|
|2015-11-20|0.006992226387807268|
|2015-12-29| 0.01730500286123971|
I want to find the standard deviation of avg(daily_return) for each ticker group.
What I tried:
x.agg(stddev("avg(daily_return)")).over(w)
I get this error:
AttributeError: 'DataFrame' object has no attribute 'over'
Is what I am trying to do impossible, or is there another way to do it?
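This is my first attempt at K-Means clustering with Python and scikit-learn, and I don't know how to produce my final cluster plot or how to fine-tune my K-Means clustering algorithm.
My end goal is to find a set of user categories that describe some interesting or useful behavioral traits.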
ATTEMPT 1:
Input: gender, age range, country (all one-hot encoded because the data is categorical) and account age (numeric, in weeks)
Code:
# Convert DataFrame to matrix
mat2 = all_dummy.values
# Using sklearn
km2 = sklearn.cluster.KMeans(n_clusters=6)
km2.fit(mat2)
# Get cluster assignment labels
labels2 = km2.labels_
# Format results as a DataFrame
results2 = pd.DataFrame([all_dummy.index,labels2]).T
plot_x2 = results2[0].tolist()
plot_y2 = results2[1].tolist()
pyplot.scatter(plot_x2,plot_y2)
pyplot.show()
Plot:
Specific questions:
I start with a list of tuples (a, all b). I want to end up with a list of tuples (b, all a).
For example:
FROM
(a1,[b1,b2,b3])
(a2,[b2])
(a3,[b1,b2])
TO
(b1,[a1,a3])
(b2,[a1,a2,a3])
(b3,[a1])
How can I do this with Python 2? Thanks for your help.
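One possible approach, sketched with `collections.defaultdict`: walk every (a, [b...]) pair and append a to the list kept for each b. The function name `invert` is hypothetical, and the strings stand in for whatever the real a and b values are. The code runs unchanged on Python 2.7 and Python 3.

```python
from collections import defaultdict


def invert(pairs):
    """Turn [(a, [b, ...]), ...] into [(b, [a, ...]), ...], sorted by b."""
    inverted = defaultdict(list)
    for a, bs in pairs:
        for b in bs:
            # Each a is recorded under every b it was listed with.
            inverted[b].append(a)
    return sorted(inverted.items())


pairs = [
    ("a1", ["b1", "b2", "b3"]),
    ("a2", ["b2"]),
    ("a3", ["b1", "b2"]),
]
result = invert(pairs)
print(result)
# [('b1', ['a1', 'a3']), ('b2', ['a1', 'a2', 'a3']), ('b3', ['a1'])]
```

Within each output list the a values keep the order in which they appear in the input; `sorted` only orders the outer list by b.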