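I want to read several CSV files from a directory into pandas and concatenate them into one big DataFrame. I have not been able to figure it out, though. Here is what I have so far: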
import glob
import pandas as pd
# get data file names
path =r'C:\DRO\DCL_rawdata_files'
filenames = glob.glob(path + "/*.csv")
dfs = []
for filename in filenames:
    dfs.append(pd.read_csv(filename))
# Concatenate all data into one DataFrame
big_frame = pd.concat(dfs, ignore_index=True)
I think I need some help with the for loop.
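One way the loop could be wrapped up is sketched below. The helper name concat_csv_files and the two throwaway files are assumptions made for illustration only; the sketch builds the glob pattern with os.path.join, sorts the matches so the row order is reproducible, and fails loudly when nothing matches (pd.concat raises on an empty list anyway, but with a less helpful message).

```python
import glob
import os
import tempfile

import pandas as pd


def concat_csv_files(directory, pattern="*.csv"):
    """Read every CSV in `directory` matching `pattern` and stack the rows."""
    # os.path.join avoids hand-building the separator; sorted() fixes the order
    filenames = sorted(glob.glob(os.path.join(directory, pattern)))
    if not filenames:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    frames = [pd.read_csv(name) for name in filenames]
    # ignore_index=True renumbers the rows 0..n-1 across all files
    return pd.concat(frames, ignore_index=True)


# Demo on two throwaway files (hypothetical data, not the asker's)
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(
        os.path.join(tmp, "part1.csv"), index=False
    )
    pd.DataFrame({"a": [5], "b": [6]}).to_csv(
        os.path.join(tmp, "part2.csv"), index=False
    )
    big_frame = concat_csv_files(tmp)

print(big_frame.shape)  # (3, 2)
```

If the files do not share the same columns, pd.concat fills the gaps with NaN, so it is worth checking big_frame.columns after the call.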
import numpy as np
y = np.array(((1,2,3),(4,5,6),(7,8,9)))
OUTPUT:
print(y.flatten())
[1 2 3 4 5 6 7 8 9]
print(y.ravel())
[1 2 3 4 5 6 7 8 9]
Both functions return the same result. So why do we need two different functions that do the same job?
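The printed values are identical, but the two calls differ in whether the result shares memory with the original array. A minimal sketch of that difference, reusing the array y from the question (the variable names flat_copy and flat_view are just illustrative):

```python
import numpy as np

y = np.array(((1, 2, 3), (4, 5, 6), (7, 8, 9)))

flat_copy = y.flatten()  # flatten always returns a new array
flat_view = y.ravel()    # ravel returns a view when it can, as it does here

print(np.shares_memory(y, flat_copy))  # False
print(np.shares_memory(y, flat_view))  # True

flat_copy[0] = -1  # leaves y untouched
flat_view[1] = -2  # writes through to y
print(y[0])        # [ 1 -2  3]
```

ravel only falls back to a copy when the requested layout cannot be expressed as a view, so it is usually the cheaper call, while flatten is the safer one when the result will be modified.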
I am a little confused about how this code works:
fig, axes = plt.subplots(nrows=2, ncols=2)
plt.show()
In this case, how does fig, axes work? What does it do?
Also, why doesn't this do the same thing:
fig = plt.figure()
axes = fig.subplots(nrows=2, ncols=2)
Thanks.
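A short sketch of what the two forms produce. plt.subplots returns a (Figure, array of Axes) pair, and fig, axes = ... simply unpacks that pair. The Agg backend line is only there so the sketch runs without a display, and the assumption that Figure.subplots is available (it was added in Matplotlib 2.1, as far as I know) is marked in the code; on an older install the second form would fail with an AttributeError.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# plt.subplots creates the figure AND a 2x2 array of Axes in one call
fig, axes = plt.subplots(nrows=2, ncols=2)
print(type(fig).__name__, axes.shape)  # Figure (2, 2)

# Each element of the array is one subplot
axes[0, 1].set_title("top right")

# Two-step form: create the figure first, then ask it for the grid.
# Assumption: Matplotlib >= 2.1, where Figure.subplots exists.
fig2 = plt.figure()
axes2 = fig2.subplots(nrows=2, ncols=2)
print(axes2.shape)  # (2, 2)
```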
What is the difference between the following?
>>> import numpy as np
>>> arr = np.array([[[ 0, 1, 2],
... [ 10, 12, 13]],
... [[100, 101, 102],
... [110, 112, 113]]])
>>> arr
array([[[ 0, 1, 2],
[ 10, 12, 13]],
[[100, 101, 102],
[110, 112, 113]]])
>>> arr.ravel()
array([ 0, 1, 2, 10, 12, 13, 100, 101, 102, 110, 112, 113])
>>> arr.ravel()[0] = -1
>>> arr
array([[[ -1, 1, 2],
[ 10, 12, 13]],
[[100, 101, 102],
[110, 112, 113]]])
>>> list(arr.flat)
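[-1, …
I wrote the following code to draw 6 pie charts in separate subplots, but I get an error. The code works fine if I use it to plot only 2 charts, but it produces an error for anything more than that.
My dataset has 6 categorical variables, whose names are stored in the list cat_cols. The charts are plotted from the training data, train.
Code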
fig, axes = plt.subplots(2, 3, figsize=(24, 10))
for i, c in enumerate(cat_cols):
    train[c].value_counts()[::-1].plot(kind = 'pie', ax=axes[i], title=c, autopct='%.0f', fontsize=18)
    axes[i].set_ylabel('')
plt.tight_layout()
Error
AttributeError: 'numpy.ndarray' object has no attribute 'get_figure'
How can we fix this?
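A likely cause is that plt.subplots(2, 3) returns a 2-D array of Axes, so axes[i] is a whole row (a numpy.ndarray) rather than a single Axes; with one row of 2 charts the array is 1-D and axes[i] happens to work. One possible fix is to iterate over axes.flat. In the sketch below, train and cat_cols are hypothetical stand-ins for the asker's data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the question's training data: 6 categorical columns
train = pd.DataFrame({f"cat{k}": list("aabbbc") for k in range(6)})
cat_cols = list(train.columns)

fig, axes = plt.subplots(2, 3, figsize=(24, 10))

# axes has shape (2, 3); axes.flat walks the six Axes one at a time
for ax, c in zip(axes.flat, cat_cols):
    train[c].value_counts()[::-1].plot(
        kind="pie", ax=ax, title=c, autopct="%.0f", fontsize=18
    )
    ax.set_ylabel("")

plt.tight_layout()
```

axes.ravel() or axes.flatten() would work equally well in place of axes.flat here.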
Seaborn's catplot does not seem to work with plt.subplots(). I am not sure what the problem is here, but I cannot seem to place the plots side by side.
#Graph 1
plt.subplot(121)
sns.catplot(x="HouseStyle",y="SalePrice",data=df,kind="swarm")
#Graph 2
plt.subplot(122)
sns.catplot(x="LandContour",y="SalePrice",data=df,kind="swarm")
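catplot is a figure-level function: it builds its own figure and does not draw into the subplot selected by plt.subplot. One possible workaround is to call the axes-level counterpart, sns.swarmplot, and pass each Axes explicitly. The sketch below uses a tiny made-up DataFrame that only borrows the question's column names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data reusing the question's column names
df = pd.DataFrame({
    "HouseStyle": ["1Story", "2Story", "1Story", "2Story", "1Story", "2Story"],
    "LandContour": ["Lvl", "Lvl", "HLS", "Bnk", "HLS", "Bnk"],
    "SalePrice": [208500, 181500, 223500, 140000, 250000, 143000],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5), sharey=True)

# Axes-level functions accept ax=, so each plot lands in its own subplot
sns.swarmplot(x="HouseStyle", y="SalePrice", data=df, ax=ax1)
sns.swarmplot(x="LandContour", y="SalePrice", data=df, ax=ax2)

fig.tight_layout()
```

If both panels showed the same x variable split by a third column, catplot's own col= parameter would be the more natural way to get side-by-side panels.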
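Whenever I try to use distplot from seaborn, I get this warning, and I cannot figure out what I am doing wrong. Sorry if this is something simple.
Warning:
FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use arr[tuple(seq)] instead of arr[seq]. In the future this will be interpreted as an array index, arr[np.array(seq)], which will result either in an error or a different result.
return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
Here is a reproducible example: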
import numpy as np
import pandas as pd
import random
import seaborn as sns
kde_data = np.random.normal(loc=0.0, scale=1, size=100) # fake data
kde_data = pd.DataFrame(kde_data)
kde_data.columns = ["value"]
#kde_data.head()
Now, the plot is correct, but I keep getting the warning above, and using arr[tuple(seq)] instead of arr[seq] does not help me.
sns.distplot(kde_data.value, hist=False, kde=True)
I am working in Jupyter, and these are the module versions:
seaborn==0.9.0
scipy==1.1.0
pandas==0.23.0
numpy==1.15.4
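The return line quoted in the warning is library code rather than anything in the example, which suggests the indexing happens inside a dependency and cannot be changed from the calling side. Upgrading the packages is probably the cleaner route (my understanding, not verified against these exact versions, is that newer SciPy releases no longer trigger it). Failing that, the warning can be filtered. A minimal sketch, in which noisy_library_call is a hypothetical stand-in for the plotting call:

```python
import warnings


def noisy_library_call():
    # Hypothetical stand-in for library code that emits the FutureWarning
    warnings.warn(
        "Using a non-tuple sequence for multidimensional indexing is deprecated",
        FutureWarning,
    )
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # baseline: report every warning
    # Ignore only FutureWarnings whose message starts with this text
    warnings.filterwarnings(
        "ignore",
        message="Using a non-tuple sequence",
        category=FutureWarning,
    )
    result = noisy_library_call()

print(result, len(caught))  # 42 0
```

Outside of a test like this, the single warnings.filterwarnings(...) call placed before the plotting code is enough; matching on the message keeps other FutureWarnings visible.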
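The seaborn documentation distinguishes between figure-level and axes-level functions: https://seaborn.pydata.org/introduction.html#figure-level-and-axes-level-functions
I know that functions like sns.boxplot can take an axes as an argument, and can therefore be used in subplots.
But what about sns.relplot()? Is there a way to put it into subplots?
More generally, is there any way to get seaborn to generate line plots in subplots?
For example, this does not work: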
fig,ax=plt.subplots(2)
sns.relplot(x,y, ax=ax[0])
because relplot does not take an axes as an argument.
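I want to arrange four Seaborn plots in a 2 x 2 grid. I tried the following code but got an exception. I would also like to know how to set titles and xlabel, ylabel in the subplots, as well as a title for the whole grid.
Some toy data: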
df
'{"age":{"76":33,"190":30,"255":36,"296":27,"222":19,"147":39,"127":23,"98":24,"168":29,"177":39,"197":27,"131":36,"36":30,"219":28,"108":38,"198":34,"40":32,"246":24,"109":26,"117":47,"20":26,"113":24,"279":35,"120":35,"7":26,"119":28,"272":24,"66":28,"87":28,"133":28},"Less_than_College":{"76":1,"190":1,"255":0,"296":1,"222":1,"147":1,"127":0,"98":0,"168":1,"177":1,"197":0,"131":1,"36":0,"219":0,"108":0,"198":0,"40":0,"246":0,"109":1,"117":1,"20":0,"113":0,"279":0,"120":0,"7":0,"119":1,"272":0,"66":1,"87":0,"133":0},"college":{"76":0,"190":0,"255":0,"296":0,"222":0,"147":0,"127":1,"98":1,"168":0,"177":0,"197":1,"131":0,"36":1,"219":1,"108":0,"198":1,"40":1,"246":0,"109":0,"117":0,"20":1,"113":1,"279":0,"120":1,"7":1,"119":0,"272":0,"66":0,"87":1,"133":1},"Bachelor":{"76":0,"190":0,"255":1,"296":0,"222":0,"147":0,"127":0,"98":0,"168":0,"177":0,"197":0,"131":0,"36":0,"219":0,"108":1,"198":0,"40":0,"246":1,"109":0,"117":0,"20":0,"113":0,"279":1,"120":0,"7":0,"119":0,"272":1,"66":0,"87":0,"133":0},"terms":{"76":30,"190":15,"255":30,"296":30,"222":30,"147":15,"127":15,"98":15,"168":30,"177":30,"197":15,"131":30,"36":15,"219":15,"108":30,"198":7,"40":30,"246":15,"109":15,"117":15,"20":15,"113":15,"279":15,"120":15,"7":15,"119":30,"272":15,"66":30,"87":30,"133":15},"Principal":{"76":1000,"190":1000,"255":1000,"296":1000,"222":1000,"147":800,"127":800,"98":800,"168":1000,"177":1000,"197":1000,"131":1000,"36":1000,"219":800,"108":1000,"198":1000,"40":1000,"246":1000,"109":1000,"117":1000,"20":1000,"113":800,"279":800,"120":800,"7":800,"119":1000,"272":1000,"66":1000,"87":1000,"133":1000}}'
fig = plt.figure()
fig.subplots_adjust(hspace=0.4, wspace=0.4)
ax = fig.add_subplot(2, 2, 1)
ax.sns.distplot(df.Principal)
ax = fig.add_subplot(2, 2, 2)
ax.sns.distplot(df.terms)
ax = fig.add_subplot(2, 2, 3)
ax.sns.barplot(data = df[['Less_than_College', 'college', 'Bachelor', ]])
ax = fig.add_subplot(2, 2, 4)
ax.sns.boxplot(data = df['age'])
plt.show()
AttributeError: 'AxesSubplot' object has no attribute 'sns'