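I am trying to use Python to process a large amount of data and keep the processing state in MySQL. However, I am surprised that there is no standard connection pool for Python and MySQL (like HikariCP in Java).
I started with PyMySQL, and things went well for the first few hours of the run. After a few hours, things started failing, and I got a lot of errors like this: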
pymysql.err.OperationalError: (2003, "Can't connect to MySQL server on '127.0.0.1' ([Errno 99] Cannot assign requested address)")
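In addition, a large number of ports were stuck in the TIME_WAIT state because, without a connection pool, I was opening and closing connections too frequently: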
$ netstat -nt | wc -l
84752
Following advice I found in two other posts, I tried setting tcp_fin_timeout and ip_local_port_range, but it made almost no difference:
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 15000 65000 > /proc/sys/net/ipv4/ip_local_port_range
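A rough back-of-the-envelope check of why these two settings may not help much. This sketch assumes every connection goes to the same MySQL address and port, and that a closed client socket stays in TIME_WAIT for about 60 seconds on Linux (as far as I know that duration is fixed in the kernel, and tcp_fin_timeout governs FIN_WAIT_2 instead). The function name and its parameters are made up for illustration.

```python
def max_sustained_connect_rate(port_low, port_high, time_wait_seconds=60):
    """Upper bound on new connections per second to ONE destination.

    Assumption: each closed connection keeps its local port busy for
    `time_wait_seconds`, so the ephemeral port range is the limiting resource.
    """
    usable_ports = port_high - port_low + 1
    return usable_ports / time_wait_seconds


# Port range taken from the ip_local_port_range setting in the question.
rate = max_sustained_connect_rate(15000, 65000)
print(f"about {rate:.0f} new connections per second before ports run out")
```

Under those assumptions the widened range allows somewhere around 830 fresh connections per second in total across all worker processes, which 29 workers opening a connection per task could plausibly exceed. Reusing connections avoids the limit rather than raising it.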
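Then I found that MySQL provides mysql.connector, which has pooling support. After switching to it, performance actually got worse and more processes started failing. I am using Python's multiprocessing module to run 29 processes concurrently on a 24-core machine (multiprocessing.Pool picks this by default). The code is below; I pass all credentials through .my.cnf to avoid committing them to git: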
import mysql.connector
from mysql.connector import pooling

# MYSQL_CONFIG and MYSQL_GROUP_NODE1 are defined elsewhere
# (the .my.cnf option file and the option group to read from it).
conn_pool = pooling.MySQLConnectionPool(pool_name="mypool1",
                                        pool_size=pooling.CNX_POOL_MAXSIZE,
                                        option_files=MYSQL_CONFIG,
                                        option_groups=MYSQL_GROUP_NODE1,
                                        allow_local_infile=True)
conn = conn_pool.get_connection()
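In the end I reverted to the old code. I am still using PyMySQL; there are fewer errors now, but it remains a major problem. I looked at SQLAlchemy and could not really find documentation on pooling.
How do other people handle connection pooling between Python and MySQL? I really believe something should already exist so that I do not have to reinvent the wheel.
Any pointers are much appreciated.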
connection-pooling time-wait pymysql mysql-connector-python python-multiprocessing
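For what it is worth, the core idea of connection reuse can be sketched in a few lines. This is a minimal illustration, not the API of PyMySQL, mysql.connector, or SQLAlchemy: SimplePool and its `connect` factory argument are hypothetical names, and the sketch assumes only that a connection object has a close() method. A real pool would also need to check that an idle connection is still alive before handing it out.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep up to `size` idle connections open and hand them out for reuse."""

    def __init__(self, connect, size=4):
        self._connect = connect                     # hypothetical zero-argument factory
        self._idle = queue.LifoQueue(maxsize=size)  # idle connections, most recent first

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()          # reuse an idle connection if one exists
        except queue.Empty:
            conn = self._connect()                  # otherwise open a new one
        try:
            yield conn
        except Exception:
            conn.close()                            # do not reuse a possibly broken connection
            raise
        else:
            try:
                self._idle.put_nowait(conn)         # keep it open for the next caller
            except queue.Full:
                conn.close()                        # pool already holds `size` idle connections
```

With PyMySQL the factory could be something like lambda: pymysql.connect(read_default_file="~/.my.cnf"), and each task would run its queries inside "with pool.connection() as conn:". Connections cannot be shared between processes, so under multiprocessing each worker would need its own pool, created for example in the initializer passed to multiprocessing.Pool.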
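I have a NAT host in front of several servers, so from my local machine I have to SSH to the NAT host first and then from there to the other machines:
local -> NAT (abcuser@publicIP with key 1) -> server1 (xyzuser@localIP with key 2)
The NAT host has its own SSH key, and every server has a different SSH key. How can I do this kind of multi-hop SSH with Fabric? I tried the env.roledefs feature, but it did not seem to work, and I do not know how to define two SSH keys. I know env.key_filename accepts a list of keys, but will it then try every key against every server? How can I be more specific and match each key to only one server?
I tried running this command from my local machine: fab deploy -g 'ec2-user@54.251.151.39' -i '/home/aman/Download/aws_oms.pem'. My script is: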
from __future__ import with_statement
from fabric.api import local, run, cd, env, execute

# Target server behind the NAT; the gateway itself is passed with -g on the command line.
env.hosts = ['ubuntu@10.0.0.77']
env.key_filename = ['/home/ec2-user/varnish_cache.pem']

def deploy():
    run("uname -a")
"DF","00000000@11111.COM","FLTINT1000130394756","26JUL2010","B2C","6799.2"
"Rail","00000.POO@GMAIL.COM","NR251764697478","24JUN2011","B2C","2025"
"DF","0000650000@YAHOO.COM","NF2513521438550","01JAN2013","B2C","6792"
"Bus","00009.GAURAV@GMAIL.COM","NU27012932319739","26JAN2013","B2C","800"
"Rail","0000.ANU@GMAIL.COM","NR251764697526","24JUN2011","B2C","595"
"Rail","0000MANNU@GMAIL.COM","NR251277005737","29OCT2011","B2C","957"
"Rail","0000PRANNOY0000@GMAIL.COM","NR251297862893","21NOV2011","B2C","212"
"DF","0000PRANNOY0000@YAHOO.CO.IN","NF251327485543","26JUN2011","B2C","17080"
"Rail","0000RAHUL@GMAIL.COM","NR2512012069809","25OCT2012","B2C","5731"
"DF","0000SS0@GMAIL.COM","NF251355775967","10MAY2011","B2C","2000"
"DF","0001HARISH@GMAIL.COM","NF251352240086","22DEC2010","B2C","4006"
"DF","0001HARISH@GMAIL.COM","NF251742087846","12DEC2010","B2C","1000"
"DF","0001HARISH@GMAIL.COM","NF252022031180","09DEC2010","B2C","3439"
"Rail","000AYUSH@GMAIL.COM","NR2151120122283","25JAN2013","B2C","136"
"Rail","000AYUSH@GMAIL.COM","NR2151213260036","28NOV2012","B2C","41"
"Rail","000AYUSH@GMAIL.COM","NR2151313264432","29NOV2012","B2C","96"
"Rail","000AYUSH@GMAIL.COM","NR2151413266728","29NOV2012","B2C","96"
"Rail","000AYUSH@GMAIL.COM","NR2512912359037","08DEC2012","B2C","96"
"Rail","000AYUSH@GMAIL.COM","NR2517612385569","12DEC2012","B2C","96"
Above is the sample data. The data is sorted by email address, and the file is very large, about 1.5 GB.
I want the output in another CSV file to look like this:
"DF","00000000@11111.COM","FLTINT1000130394756","26JUL2010","B2C","6799.2",1,0 days
"Rail","00000.POO@GMAIL.COM","NR251764697478","24JUN2011","B2C","2025",1,0 days
"DF","0000650000@YAHOO.COM","NF2513521438550","01JAN2013","B2C","6792",1,0 days
"Bus","00009.GAURAV@GMAIL.COM","NU27012932319739","26JAN2013","B2C","800",1,0 days
"Rail","0000.ANU@GMAIL.COM","NR251764697526","24JUN2011","B2C","595",1,0 days
"Rail","0000MANNU@GMAIL.COM","NR251277005737","29OCT2011","B2C","957",1,0 days
"Rail","0000PRANNOY0000@GMAIL.COM","NR251297862893","21NOV2011","B2C","212",1,0 days
"DF","0000PRANNOY0000@YAHOO.CO.IN","NF251327485543","26JUN2011","B2C","17080",1,0 days
"Rail","0000RAHUL@GMAIL.COM","NR2512012069809","25OCT2012","B2C","5731",1,0 days
"DF","0000SS0@GMAIL.COM","NF251355775967","10MAY2011","B2C","2000",1,0 days
"DF","0001HARISH@GMAIL.COM","NF251352240086","09DEC2010","B2C","4006",1,0 days
"DF","0001HARISH@GMAIL.COM","NF251742087846","12DEC2010","B2C","1000",2,3 days
"DF","0001HARISH@GMAIL.COM","NF252022031180","22DEC2010","B2C","3439",3,10 days
"Rail","000AYUSH@GMAIL.COM","NR2151213260036","28NOV2012","B2C","41",1,0 days
"Rail","000AYUSH@GMAIL.COM","NR2151313264432","29NOV2012","B2C","96",2,1 days
"Rail","000AYUSH@GMAIL.COM","NR2151413266728","29NOV2012","B2C","96",3,0 days
"Rail","000AYUSH@GMAIL.COM","NR2512912359037","08DEC2012","B2C","96",4,9 days
"Rail","000AYUSH@GMAIL.COM","NR2512912359037","08DEC2012","B2C","96",5,0 days
"Rail","000AYUSH@GMAIL.COM","NR2517612385569","12DEC2012","B2C","96",6,4 days
"Rail","000AYUSH@GMAIL.COM","NR2517612385569","12DEC2012","B2C","96",7,0 days
"Rail","000AYUSH@GMAIL.COM","NR2151120122283","25JAN2013","B2C","136",8,44 days
"Rail","000AYUSH@GMAIL.COM","NR2151120122283","25JAN2013","B2C","136",9,0 days
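That is, the first time an email address appears I need to append 1, the second time 2, and so on; in other words, I need to count the occurrences of each email address in the file. If an email appears two or more times, I also want the difference between the dates. Keep in mind that the dates are not sorted, so they have to be sorted within each email address.
I am looking for a solution in Python, using numpy, pandas, or any other library that can handle data of this size without raising out-of-memory exceptions. I have a dual-core processor with CentOS 6.3 and 4 GB of RAM.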
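Because the file is already sorted by email address, one possible approach is to stream it and hold only one email's rows in memory at a time. The sketch below uses only the standard library (csv and itertools.groupby) rather than numpy or pandas. It assumes the column positions and the date format seen in the sample (for example 26JUL2010), and it assumes the gap is measured from the previous booking of the same email, with 0 days for the first one. The names annotate and process are made up for illustration.

```python
import csv
import io
from datetime import datetime
from itertools import groupby

DATE_FORMAT = "%d%b%Y"        # assumed from the sample, e.g. 26JUL2010
EMAIL_COL, DATE_COL = 1, 3    # assumed column positions from the sample


def parse_date(row):
    return datetime.strptime(row[DATE_COL], DATE_FORMAT)


def annotate(rows):
    """Yield each row plus its per-email occurrence number and day gap."""
    # groupby relies on the input already being sorted by email address.
    for _email, group in groupby(rows, key=lambda r: r[EMAIL_COL]):
        previous = None
        # Only the rows of a single email are sorted in memory here.
        for count, row in enumerate(sorted(group, key=parse_date), start=1):
            current = parse_date(row)
            gap = 0 if previous is None else (current - previous).days
            yield row + [str(count), f"{gap} days"]
            previous = current


def process(src, dst):
    """Read CSV rows from `src` and write the annotated rows to `dst`."""
    writer = csv.writer(dst, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(annotate(csv.reader(src)))


# Small in-memory demo using three rows from the sample data.
sample = (
    '"DF","0001HARISH@GMAIL.COM","NF251352240086","22DEC2010","B2C","4006"\n'
    '"DF","0001HARISH@GMAIL.COM","NF251742087846","12DEC2010","B2C","1000"\n'
    '"DF","0001HARISH@GMAIL.COM","NF252022031180","09DEC2010","B2C","3439"\n'
)
out = io.StringIO()
process(io.StringIO(sample), out)
print(out.getvalue())
```

For the real file, process would be called with two file objects opened with newline="" (input for reading, output for writing). Note that csv.QUOTE_ALL also quotes the two appended columns, which differs slightly from the desired output shown above.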
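We are trying to build an incoming request validation platform using HAProxy + Lua. Our use case is a Lua script that makes a socket call to a Validation API and, based on the Validation API's response, either redirects the request to the backend API or, if validation fails, returns the request right from the Lua script. For example, for a 200 response we want to redirect the request to the backend API, and for a 404 we want to return the request. From the documentation, I understand that the Lua-HAProxy integration provides various default functions: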
core.register_action() --> I'm using this. Takes TXN as input
core.register_converters() --> Essentially used for string manipulations.
core.register_fetches() --> Takes TXN as input and returns string; Mainly used for representing dynamic backend profiles in haproxy config
core.register_init() --> Used for initialization
core.register_service() --> You have to return the response mandatorily while using this function, which doesn't satisfy our requirements
core.register_task() --> For using normal functions. No mandatory input class. TXN is required to fetch header details …