Posts by Spa*_*ade

Execution time of SQLite queries: units

As described in the SQLite documentation, one can use:

sqlite> .timer ON

or add the same command to ~/.sqliterc.

Once this is done, the SQLite shell reports the user and sys components of CPU time for every query executed:

user@machine% sqlite3 test.db
-- Loading resources from ~/.sqliterc

SQLite version 3.7.14 2012-09-03 15:42:36
Enter ".help" for instructions
Enter SQL statements terminated with a ";"

sqlite> select count(*) from my_table;
count(*)
10143270
CPU Time: user 0.199970 sys 1.060838

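Although I found this answer, which offers evidence about the units of time, I find it hard to agree with it. When I time queries with a stopwatch, almost all of them take longer than the time the shell reports. For example, the query in the example above took about 1 minute 54 seconds of real time. What causes this discrepancy?

So, once again, what are the units? And what are the user and sys components?

I am running SQLite 3.7.14 on a Debian GNU/Linux 6.0.6 (squeeze) distribution that is accessed over NFS.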
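The gap described above is consistent with the difference between CPU time and wall-clock time: the user and sys figures count only the seconds in which the processor was running the process (in its own code and in the kernel on its behalf), whereas a stopwatch also counts time spent waiting, for example on disk or network I/O. The following is a minimal Python 3 sketch, not the SQLite shell's own implementation, that measures both for one query. The in-memory database and the `time_query` helper are assumptions made for illustration.

```python
import os
import sqlite3
import time


def time_query(conn, sql):
    """Run sql and return (rows, wall, user, system), with times in seconds."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    wall = time.perf_counter() - wall_before
    cpu_after = os.times()
    # CPU time spent in user code and in the kernel on behalf of this process
    user = cpu_after.user - cpu_before.user
    system = cpu_after.system - cpu_before.system
    return rows, wall, user, system


# Hypothetical stand-in for test.db: a small in-memory table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (x INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?)", ((i,) for i in range(100000)))

rows, wall, user, system = time_query(conn, "SELECT count(*) FROM my_table")
print("count=%d wall=%.6f user=%.6f sys=%.6f" % (rows[0][0], wall, user, system))
```

On a database file that lives on a slow or remote filesystem, the wall value can be much larger than user plus sys, because waiting for I/O does not consume CPU time.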

linux sqlite

Spa*_*ade

2017-05-23

19
votes

1
answer

10k
views

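Inconsistent behavior of BeautifulSoup

I am completely baffled by how the following HTML-scraping code, which I wrote, behaves in two different environments, and I need help finding the root cause of the difference.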

import sys
import bs4
import md5
import logging
from urllib2 import urlopen
from platform import platform

# Log particulars of the environment
logging.warning("OS platform is %s" %platform())
logging.warning("Python version is %s" %sys.version)
logging.warning("BeautifulSoup is at %s and its version is %s" %(bs4.__file__, bs4.__version__))

# Open web-page and read HTML
url = 'http://www.ncbi.nlm.nih.gov/Traces/wgs/?val=JXIG&size=all'
response = urlopen(url)
html = response.read()

# Calculate MD5 to ensure that the same string was downloaded
print "MD5 sum for …

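The excerpt above is cut off at the MD5 step. As a hedged illustration of that step only, here is a minimal Python 3 sketch that fingerprints downloaded bytes with the standard `hashlib` module, so two environments can confirm they fetched identical input before comparing parser output. The `md5_hex` helper and the sample bytes are assumptions, not part of the original code.

```python
import hashlib


def md5_hex(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


# Hypothetical stand-in for the bytes returned by response.read()
html = b"<html><body>example</body></html>"
print("MD5 sum is %s" % md5_hex(html))
```

If the digests match in both environments, the download can be ruled out as the source of the difference.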

python beautifulsoup html-parsing web-scraping python-2.7

Spa*_*ade

lucky-day

2
votes

1
answer

893
views

Updating the metadata of a gridfs file object

I use GridFS as follows:

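from pymongo import MongoClient
import gridfs

connection = MongoClient(host='localhost')
db = connection.gridfs_example
fs = gridfs.GridFS(db)
# Extra keyword arguments such as key are stored as fields on the file document
fileId = fs.put("Contents of my file", key='s1')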

After the file has been stored in GridFS, I have a process that computes additional metadata related to the file's contents.

def qcFile(fileId):
    #DO QC
    return "QC PASSED"

qcResult = qcFile(fileId)

It would be great if I could do:

fs.update(fileId, QC_RESULT = qcResult)

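But that option does not appear in the documentation. I found here (the question was updated with a solution) that the Java driver seems to offer a way to do this, but I could not find an equivalent in the Python gridfs module.

So, how can I use pymongo to tag my file with the newly computed metadata value qcResult? I could not find it in the documentation.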
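One possible approach, sketched here under assumptions rather than as an official GridFS feature: GridFS keeps one document per file in the `<prefix>.files` collection (`fs.files` with the default prefix), so extra fields can be set on that document with an ordinary `update_one` call (PyMongo 3.0 or later). The `tag_file` helper name is hypothetical.

```python
def tag_file(files_collection, file_id, **metadata):
    """Set extra top-level fields on the GridFS file document with this _id.

    files_collection is assumed to be the GridFS files collection,
    for example db.fs.files when the default "fs" prefix is used.
    """
    # $set adds or overwrites only the named fields; the file chunks are untouched
    return files_collection.update_one({"_id": file_id}, {"$set": metadata})
```

With the names from the question, the call would be `tag_file(db.fs.files, fileId, QC_RESULT=qcResult)`, after which the field should be readable from the object returned by `fs.get(fileId)`. Only the file document changes; the stored contents are not rewritten.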

python mongodb pymongo gridfs

Spa*_*ade

2017-05-23

2
votes

1
answer

3214
views

Tag statistics

python ×2

beautifulsoup ×1

gridfs ×1

html-parsing ×1

linux ×1

mongodb ×1

pymongo ×1

python-2.7 ×1

sqlite ×1

web-scraping ×1
