I want to download a file with urllib and decompress it in memory before saving it.
This is what I have right now:
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO()
compressedFile.write(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile, mode='rb')
outfile = open(outFilePath, 'w')
outfile.write(decompressedFile.read())
This ends up writing empty files. How can I achieve what I'm after?
Updated answer:
#! /usr/bin/env python2
import urllib2
import StringIO
import gzip
baseURL = "https://www.kernel.org/pub/linux/docs/man-pages/"
# check filename: it may change over time, due to new updates
filename = "man-pages-5.00.tar.gz"
outFilePath = filename[:-3]
response = urllib2.urlopen(baseURL + filename)
compressedFile = StringIO.StringIO(response.read())
decompressedFile = gzip.GzipFile(fileobj=compressedFile)
# open in binary mode: the decompressed payload is a tar archive, not text
with open(outFilePath, 'wb') as outfile:
    outfile.write(decompressedFile.read())
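The code in the question writes an empty file because, after `compressedFile.write(...)`, the buffer's position sits at the end of the data, so `GzipFile` has nothing left to read; passing the bytes to the constructor (as above) or calling `seek(0)` avoids that. For readers on Python 3, a minimal sketch of the same idea follows. The helper names `gunzip_in_memory` and `download_and_gunzip` are made up for illustration, and `urllib.request` with `io.BytesIO` stands in for `urllib2` and `StringIO`:

```python
import gzip
import io
from urllib.request import urlopen


def gunzip_in_memory(compressed_bytes):
    """Decompress gzip data held in memory and return the raw bytes."""
    # BytesIO built from existing bytes starts at position 0,
    # so GzipFile can read it without an explicit seek(0).
    buffer = io.BytesIO(compressed_bytes)
    with gzip.GzipFile(fileobj=buffer, mode="rb") as gz:
        return gz.read()


def download_and_gunzip(url, out_path):
    """Download a .gz file, decompress it in memory, then save the result."""
    with urlopen(url) as response:
        data = gunzip_in_memory(response.read())
    # Binary mode, because the decompressed content is arbitrary bytes.
    with open(out_path, "wb") as outfile:
        outfile.write(data)
```

With the variables from the answer, the call would look like `download_and_gunzip(baseURL + filename, outFilePath)`.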
I wrote a script that calls functions from QIIME to build a bunch of plots. Everything runs to completion, but matplotlib always emits the following warning for every plot it creates (super annoying):
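/usr/local/lib/python2.7/dist-packages/matplotlib/pyplot.py:412: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_num_figures`).
  max_open_warning, RuntimeWarning)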
I found this page, which seems to explain how to fix this, but after following the instructions nothing changed:
import matplotlib as mpl
mpl.rcParams['figure.max_open_warning'] = 0
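After calling matplotlib directly from python, I went into the file to see which rcParams file I should be looking at, and manually changed the 20 to 0. Still no change. In case the documentation was incorrect, I also changed it to 1000, and I still get the same warning message.
I understand that this could be a problem for people running on computers with limited power, but that isn't an issue in my case. How can I make this feedback go away permanently?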
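One option that does not depend on which rcParams file gets loaded is to filter the warning itself with the standard `warnings` module, in the same process that creates the figures. The sketch below assumes the warning text begins with "More than N figures have been opened", as in the message quoted above; the helper name `silence_open_figure_warning` is made up for illustration. Closing each figure with `plt.close(fig)` once it has been saved addresses the underlying cause instead.

```python
import warnings

# Assumption: the message starts as quoted in the question.
# warnings matches this regex against the start of the warning text.
FIGURE_WARNING_PATTERN = r"More than \d+ figures have been opened"


def silence_open_figure_warning():
    """Ignore only the 'too many open figures' RuntimeWarning."""
    warnings.filterwarnings(
        "ignore",
        message=FIGURE_WARNING_PATTERN,
        category=RuntimeWarning,
    )


# Call once, before the plotting code runs.
silence_open_figure_warning()
```

Other `RuntimeWarning`s are left untouched, so unrelated problems still surface.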
C:\Users\Lenovo>conda info
Current conda install:
platform : win-64
conda version : 4.3.8
conda is private : False
conda-env version : 4.3.8
conda-build version : 1.21.3
python version : 3.5.2.final.0
requests version : 2.12.4
root environment : C:\Anaconda3 (writable)
default environment : C:\Anaconda3
envs directories : C:\Anaconda3\envs
package cache : C:\Anaconda3\pkgs
channel URLs : https://repo.continuum.io/pkgs/free/win-64
https://repo.continuum.io/pkgs/free/noarch
https://repo.continuum.io/pkgs/r/win-64
https://repo.continuum.io/pkgs/r/noarch
https://repo.continuum.io/pkgs/pro/win-64
https://repo.continuum.io/pkgs/pro/noarch
https://repo.continuum.io/pkgs/msys2/win-64
https://repo.continuum.io/pkgs/msys2/noarch
config file : None
offline mode : False
user-agent : conda/4.3.8 requests/2.12.4 CPython/3.5.2 Windows/7 Windows/6.1.7601
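Recently, after installing or updating packages via conda, and sometimes even pip, the following sequence is printed to the console: …
I have a dataframe merged_df_energy: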
+------------------------+------------------------+------------------------+--------------+
| ACT_TIME_AERATEUR_1_F1 | ACT_TIME_AERATEUR_1_F3 | ACT_TIME_AERATEUR_1_F5 | class_energy |
+------------------------+------------------------+------------------------+--------------+
| 63.333333 | 63.333333 | 63.333333 | low |
| 0 | 0 | 0 | high |
| 45.67 | 0 | 55.94 | high |
| 0 | 0 | 23.99 | low |
| 0 | 20 | 23.99 | medium |
+------------------------+------------------------+------------------------+--------------+
For each ACT_TIME_AERATEUR_1_Fx column (ACT_TIME_AERATEUR_1_F1, ACT_TIME_AERATEUR_1_F3 and ACT_TIME_AERATEUR_1_F5), I want to create a dataframe containing these columns: class_energy and sum_time.
For example, the dataframe corresponding to ACT_TIME_AERATEUR_1_F1:
+-----------------+-----------+
| class_energy | sum_time …
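
The expected output is cut off above, so the following is only a sketch under one assumption: that sum_time is the sum of the given time column within each class_energy group. The helper name `sum_time_per_class` and the `frames` dictionary are made up for illustration; the data is copied from the table in the question.

```python
import pandas as pd

# Sample data copied from the table in the question.
merged_df_energy = pd.DataFrame({
    "ACT_TIME_AERATEUR_1_F1": [63.333333, 0, 45.67, 0, 0],
    "ACT_TIME_AERATEUR_1_F3": [63.333333, 0, 0, 0, 20],
    "ACT_TIME_AERATEUR_1_F5": [63.333333, 0, 55.94, 23.99, 23.99],
    "class_energy": ["low", "high", "high", "low", "medium"],
})


def sum_time_per_class(df, time_column):
    """Return a frame with columns class_energy and sum_time for one time column."""
    return (
        df.groupby("class_energy")[time_column]
        .sum()
        .reset_index(name="sum_time")
    )


# One result frame per ACT_TIME_AERATEUR_1_Fx column, keyed by column name.
time_columns = [
    col for col in merged_df_energy.columns
    if col.startswith("ACT_TIME_AERATEUR_1_F")
]
frames = {col: sum_time_per_class(merged_df_energy, col) for col in time_columns}

print(frames["ACT_TIME_AERATEUR_1_F1"])
```

Keeping the results in a dictionary avoids creating one variable per column and scales to any number of Fx columns.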