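I have read quite a few resources (among others 1, 2), but I cannot get PostgreSQL's ON CONFLICT IGNORE behaviour to work in SQLAlchemy.
I used this accepted answer as a basis, but it gives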
SAWarning: Can't validate argument 'append_string'; can't locate any SQLAlchemy dialect named 'append'
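I have tried adding the postgresql dialect to the @compiles clause and renaming my object, but it does not work. I also tried str(insert()) + " ON CONFLICT IGNORE", without results (which is not surprising, by the way).
How can I get ON CONFLICT IGNORE added to my inserts? I like the proposed solution, because I can see myself not wanting the IGNORE behaviour on every INSERT.
P.S. Using Python 2.7 (I don't mind upgrading to 3.4/3.5) and the latest SQLAlchemy (1.x).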
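For reference, PostgreSQL (9.5 and later) spells this clause `ON CONFLICT DO NOTHING`; there is no `ON CONFLICT IGNORE`. The following is a minimal sketch, assuming SQLAlchemy 1.1 or newer, where the PostgreSQL dialect ships its own `insert()` construct with `on_conflict_do_nothing()`. The `users` table and its columns are hypothetical and exist only for illustration. The statement is compiled to a string, so no database connection is needed to inspect the result.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert

metadata = MetaData()

# Hypothetical table, used only to illustrate the construct.
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String),
)

# Build an INSERT that silently skips rows whose "id" already exists.
stmt = (
    insert(users)
    .values(id=1, name="wvxvw")
    .on_conflict_do_nothing(index_elements=["id"])
)

# Render the statement for the PostgreSQL dialect without connecting.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

Because the clause is attached per statement, inserts that should still fail on a conflict can keep using the plain `insert()`.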
I am trying to create this function in a Postgres database (pq lib) using Goose.
My code is as follows:
CREATE OR REPLACE FUNCTION add_userlocation(user_id INT, location_id INT) RETURNS VOID AS
$BODY$
BEGIN
    LOOP
        UPDATE userslocations SET count = count+1 WHERE userid = user_id AND locationid = location_id;
        IF found THEN
            RETURN;
        END IF;
        BEGIN
            INSERT INTO userslocations(userid,locationid, count) VALUES (user_id, location_id, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
        END;
    END LOOP;
END;
$BODY$
LANGUAGE plpgsql;
When I try to run goose up on it, it gives an error:
(pq: unterminated dollar-quoted string at or near "$BODY$
BEGIN
LOOP
-- first try to …
I am new to PhantomJS and I am trying to run my Selenium tests (Python) with the phantomjs driver, but it does not find web elements.
Ghostdriver log:
[INFO - 2015-02-27T15:24:40.236Z] GhostDriver - Main - running on port 52653
[INFO - 2015-02-27T15:24:41.075Z] Session [bfd397f0-be94-11e4-ad03-b711254501c8] - page.settings - {"XSSAuditingEnabled":false,"javascriptCanCloseWindows":true,"javascriptCanOpenWindows":true,"javascriptEnabled":true,"loadImages":true,"localToRemoteUrlAccessEnabled":false,"userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.0.0 Safari/538.1","webSecurityEnabled":true}
[INFO - 2015-02-27T15:24:41.075Z] Session [bfd397f0-be94-11e4-ad03-b711254501c8] - page.customHeaders: - {}
[INFO - 2015-02-27T15:24:41.075Z] Session [bfd397f0-be94-11e4-ad03-b711254501c8] - Session.negotiatedCapabilities - {"browserName":"phantomjs","version":"2.0.0","driverName":"ghostdriver","driverVersion":"1.2.0","platform":"mac-10.9 (Mavericks)-64bit","javascriptEnabled":true,"takesScreenshot":true,"handlesAlerts":false,"databaseEnabled":false,"locationContextEnabled":false,"applicationCacheEnabled":false,"browserConnectionEnabled":false,"cssSelectorsEnabled":true,"webStorageEnabled":false,"rotatable":false,"acceptSslCerts":false,"nativeEvents":true,"proxy":{"proxyType":"direct"}}
[INFO - 2015-02-27T15:24:41.075Z] SessionManagerReqHand - _postNewSessionCommand - New Session Created: bfd397f0-be94-11e4-ad03-b711254501c8
[ERROR - 2015-02-27T15:24:47.242Z] WebElementLocator - _handleLocateCommand - Element(s) NOT Found: GAVE UP. Search Stop Time: 1425050687190
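:262 …
I found these related methods:
- find: does not work because this version of neo4j does not support labels.
- match: does not work because I cannot specify a relationship, since the node does not have any relationships yet.
- match_one: same as match.
- node: does not work because I do not know the node's id.
I need the equivalent of: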
start n = node(*) where n.name? = "wvxvw" return n;
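Cypher query. This looks like it should be basic, but it isn't...
P.S. I have too many reasons against using Cypher to list them all, so that is not an option either.
I am using autovivification to store data in a multiprocessing setup. However, I cannot figure out how to combine it with the multiprocessing Manager.
My autovivification code comes from "Multiple levels of 'collection.defaultdict' in Python" and works fine when no multiprocessing is involved.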
class Vividict(dict):
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Missing key: create a nested Vividict on the fly.
            value = self[item] = type(self)()
            return value
My multiprocessing code is relatively simple:
from multiprocessing import Manager, Process, Queue

if __name__ == "__main__":
    man = Manager()
    ngramDict = man.dict()
    print(ngramDict) # {}
    s_queue = Queue()
    aProces = Process(target=insert_ngram, args=(s_queue,ngramDict,))
    aProces.start()
    aProces.join()
    print(ngramDict) # {}
    write_to_file()
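The `multiprocessing` documentation notes that a dict proxy only sees assignments to its own keys; changes made inside a nested value are not sent back to the manager. One way around this, shown as a minimal sketch, is to build the nested counts in a local `Vividict` and then publish each top-level subtree as a plain dict. The helper names `to_plain` and `count_ngrams` are hypothetical, and a plain `dict` stands in for `Manager().dict()` so that the sketch runs in a single process.

```python
class Vividict(dict):
    """Autovivifying dict: a missing key creates a nested Vividict."""

    def __missing__(self, key):
        value = self[key] = type(self)()
        return value


def to_plain(node):
    """Recursively convert nested Vividicts into plain dicts."""
    if isinstance(node, dict):
        return {key: to_plain(child) for key, child in node.items()}
    return node


def count_ngrams(rows, shared):
    """Accumulate counts locally, then assign each top-level key once."""
    local = Vividict()
    for w in rows:
        leaf = local[w[0]][w[1]][w[2]][w[3]]
        leaf[w[4]] = leaf.get(w[4], 0) + int(w[5])
    # Only top-level assignments are needed, which a dict proxy can forward.
    for key, subtree in local.items():
        shared[key] = to_plain(subtree)


# Stand-in for Manager().dict(); an assumption made to keep this runnable.
shared = {}
rows = [
    ("a", "b", "c", "d", "e", "3"),
    ("a", "b", "c", "d", "e", "4"),
    ("a", "b", "c", "d", "f", "1"),
]
count_ngrams(rows, shared)
print(shared)
```

With several workers writing the same top-level key, the final assignment would overwrite earlier results, so the subtrees would need to be merged instead.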
The dictionary is read, written and updated in insert_ngram:
def insert_ngram(sanitize_queue, ngramDict):
    ngramDict = Vividict() # obviously this overwrites the manager
    try:
        for w in iter(sanitize_queue.get, None):
            if ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]]:
                ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]]+=int(w[5])
            else:
                ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]]=int(w[5])
        print(ngramDict) …
I am reading a valid JSON file (nested 5 levels deep), then adding some data to it, and then trying to use that data for some calculations.
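I get "int is not subscriptable" errors in a seemingly random fashion, and I cannot get my head around it. Casting to str() does not help, printing with pprint does not alleviate it, and casting the input to int() does not help either. I am desperately running out of options...
Main function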
import json

with open(rNgram_file, 'r', encoding='utf-8') as ngram_file:
    data = json.load(ngram_file)
data = rank_items(data)
data = probability_items(data)
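rank_items(data)
All values are computed at the fifth nesting level and added upwards in the tree. I added the int() cast on the input as a possible fix, but that did not help. The problem seems to occur at x_grams['_rank']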
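One way such an error can arise, offered here as an assumption rather than a diagnosis, is that the `_rank` integers become siblings of the child dicts, so any later loop over `.items()` receives an int where it expects a dict and fails when it subscripts it. A minimal sketch of the bottom-up summation that tolerates this is given below. The function name `add_ranks` and the sample data are hypothetical; the sketch checks the type of every value and skips `_rank` entries written by an earlier pass.

```python
def add_ranks(node):
    """Return the total count below node, storing it as '_rank' on each dict."""
    if not isinstance(node, dict):
        return int(node)  # leaf: a raw count
    total = 0
    # Iterate over a snapshot so that adding '_rank' cannot disturb the loop.
    for key, child in list(node.items()):
        if key == "_rank":
            continue  # skip a total written by an earlier pass
        total += add_ranks(child)
    node["_rank"] = total
    return total


# Hypothetical sample data, three levels deep instead of five.
data = {"the": {"cat": {"sat": 2, "ran": 3}, "dog": {"ran": 5}}}
total = add_ranks(data)
print(total, data)
```

Any later pass over the same tree, such as a probability calculation, would need the same `isinstance` check or `_rank` filter.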
for ngram, one_grams in data.items():
    ngram_rank = 0
    for one_gram, two_grams in one_grams.items():
        one_gram_rank = 0
        [..]
                for four_gram, values in four_grams.items():
                    # 4gram = of, values = 34
                    three_gram_rank += values
                four_grams['_rank'] = int(three_gram_rank)
                two_gram_rank += three_gram_rank
        [..]
        two_grams['_rank'] = int(one_gram_rank)
        ngram_rank += one_gram_rank …
I have a Neo4J Enterprise database running on a DigitalOcean VPS with 8 GB of RAM and an 80 GB SSD. The performance of the Neo4J instance is currently terrible:
match (n) where n.gram='0gram' AND n.word=~'a.' return n.word LIMIT 5 @ 349ms
match (n) where n.gram='0gram' AND n.word=~'a.*' return n.word LIMIT 25 @ 1588ms
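I understand that regular expressions are expensive, but when I run the same query with the 'a.' or 'a.*' part replaced by any other letter, Neo4j simply crashes. Before it does, I can see a massive increase in memory usage (up to 90%) and the CPU spikes.
My Neo4j is populated as follows: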
Number Of Relationship Type Ids In Use: 1,
Number Of Node Ids In Use: 172412046,
Number Of Relationship Ids In Use: 172219328,
Number Of Property Ids In Use: 344453742
The VPS runs only Neo4J (on Debian 7/amd64). I use the NUMA + parallelGC flags, as they are supposed to be faster. I have been tweaking my RAM settings, and although it no longer crashes often, I feel there should be some gains to be had
neostore.nodestore.db.mapped_memory=1024M
neostore.relationshipstore.db.mapped_memory=2048M
neostore.propertystore.db.mapped_memory=6144M
neostore.propertystore.db.strings.mapped_memory=512M
neostore.propertystore.db.arrays.mapped_memory=512M
# caching
cache_type=hpc
node_cache_array_fraction=7
relationship_cache_array_fraction=5
# node_cache_size=3G
# …