Error using pymysql in Flask

Bha*_*gav 6 python mysql flask pymysql

I'm using the pymysql client to connect to MySQL in my Flask API. Everything works fine for a while (around 1-2 days), then it suddenly starts throwing this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1039, in _write_bytes
    self._sock.sendall(data)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Main.py", line 194, in post
    result={'resultCode':100,'resultDescription':'SUCCESS','result':self.getStudentATData(studentId,args['chapterId'])}
  File "Main.py", line 176, in getStudentATData
    cur.execute("my query")
  File "/usr/local/lib/python3.4/dist-packages/pymysql/cursors.py", line 166, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 855, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1092, in _execute_command
    self._write_bytes(packet)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1044, in _write_bytes
    "MySQL server has gone away (%r)" % (e,))
pymysql.err.OperationalError: (2006, "MySQL server has gone away (TimeoutError(110, 'Connection timed out'))")

If I restart the application it works again. I've tried everything but can't seem to get past this — can anyone help? As suggested, I implemented a retry mechanism, but it didn't solve the problem:

class DB:
    def connect(self):
        #db connect here
    def cursor(self):
        try:
            cursor = self.conn.cursor()
        except Exception as e:
            print(e)
            self.connect()
            cursor = self.conn.cursor()
        return cursor

and used it like DB().cursor()
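A likely reason the retry above doesn't help: "MySQL server has gone away" is raised when the query is sent over the socket (in execute()), not when the cursor is created, so the try/except around conn.cursor() never fires. A minimal sketch of retrying at execute time — names here (RetryingDB, the connect factory) are illustrative, not pymysql API:

```python
class RetryingDB:
    """Sketch of a retry wrapper: retry at execute() time, because the
    stale-connection error surfaces when the query is sent, not when
    the cursor object is created."""

    def __init__(self, connect):
        self._connect = connect      # factory returning a DB-API connection
        self.conn = connect()

    def execute(self, query, args=None):
        try:
            cur = self.conn.cursor()
            cur.execute(query, args)
        except Exception:
            # Stale connection: reconnect once and retry the query.
            self.conn = self._connect()
            cur = self.conn.cursor()
            cur.execute(query, args)
        return cur
```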

saa*_*aaj 6

First of all, you need to decide whether you want to maintain a persistent connection to MySQL. A persistent connection performs better, but requires a little maintenance.

The default wait_timeout in MySQL is 8 hours. Whenever a connection stays idle longer than wait_timeout, it is closed. MySQL also closes all established connections when the server is restarted. Thus, if you use a persistent connection, you need to check that it is still valid before using it (and reconnect if it isn't). If you use a per-request connection, there is no connection state to maintain, because the connection is always fresh.
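The "check before use" idea can be sketched without any driver specifics — a hypothetical wrapper (names are illustrative) that tracks idle time against wait_timeout and reconnects when the idle window may have expired:

```python
import time

class ReconnectingConnection:
    """Sketch: wrap a connect() factory and re-open the connection
    when it has been idle longer than the server's wait_timeout."""

    def __init__(self, connect, wait_timeout=28800):  # 8 h MySQL default
        self._connect = connect      # factory returning a DB-API connection
        self._wait_timeout = wait_timeout
        self._conn = None
        self._last_used = 0.0

    def connection(self):
        now = time.monotonic()
        # Reconnect if we never connected, or the idle window may have expired.
        if self._conn is None or now - self._last_used >= self._wait_timeout:
            self._conn = self._connect()
        self._last_used = now
        return self._conn
```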

Per-request connection

With non-persistent connections, the overhead of opening a connection, performing the handshake, and so on for every incoming HTTP request is noticeable (for both the database server and the client).

Here's a quote about database connections from Flask's official tutorial:

Creating and closing database connections all the time is very inefficient, so you will need to keep it around for longer. Because database connections encapsulate a transaction, you will need to make sure that only one request at a time uses the connection. An elegant way to do this is by utilizing the application context.

Note, however, that the application context is initialised per request (which is somewhat veiled by the efficiency concerns and Flask's lingo). Thus it's still quite inefficient, but it should solve your issue. Here's a stripped-down snippet of what it suggests, as applied to pymysql:

import pymysql
from flask import Flask, g, request    

app = Flask(__name__)    

def connect_db():
    return pymysql.connect(
        user = 'guest', password = '', database = 'sakila', 
        autocommit = True, charset = 'utf8mb4', 
        cursorclass = pymysql.cursors.DictCursor)

def get_db():
    '''Opens a new database connection per request.'''        
    if not hasattr(g, 'db'):
        g.db = connect_db()
    return g.db    

@app.teardown_appcontext
def close_db(error):
    '''Closes the database connection at the end of request.'''    
    if hasattr(g, 'db'):
        g.db.close()    

@app.route('/')
def hello_world():
    city = request.args.get('city')

    cursor = get_db().cursor()
    cursor.execute('SELECT city_id FROM city WHERE city = %s', city)
    row = cursor.fetchone()

    if row:
        return 'City "{}" is #{:d}'.format(city, row['city_id'])
    else:
        return 'City "{}" not found'.format(city)

Persistent connection

For a persistent database connection there are two major options: keep a pool of connections, or map connections to worker threads. Because Flask WSGI applications are normally served by threaded servers with a fixed number of threads (e.g. uWSGI), thread-mapping is easier and just as efficient.
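The thread-mapping approach can be sketched in a few lines with threading.local — this illustrates the idea behind thread-mapped connections, not any library's actual implementation:

```python
import threading

class ThreadMappedDB:
    """Sketch: one database connection per worker thread, stored in
    threading.local so threads never share a connection (or its
    transaction state)."""

    def __init__(self, connect):
        self._connect = connect      # factory returning a DB-API connection
        self._local = threading.local()

    def connection(self):
        # Each thread sees its own `conn` attribute on the local object.
        if not hasattr(self._local, 'conn'):
            self._local.conn = self._connect()
        return self._local.conn
```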

There's a package, DBUtils, which implements both: PooledDB for pooled connections and PersistentDB for thread-mapped ones.

One important caveat in maintaining a persistent connection is transactions. The API for reconnection is ping. It's safe for auto-committed single statements, but it can be disruptive in the middle of a transaction (a little more detail here). DBUtils takes care of this: it only reconnects on dbapi.OperationalError and dbapi.InternalError (by default, controlled by the failures argument to PersistentDB's initialiser) raised outside of a transaction.
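The ping-based validity check can be sketched as follows. pymysql's Connection.ping(reconnect=True) attempts a transparent reconnect itself; the fallback to a full connect() factory here is an illustrative extra guard, not pymysql API:

```python
def healthy_connection(conn, connect):
    """Sketch: validate a possibly stale connection before use.
    `conn` is any connection exposing ping(); `connect` is a factory
    used as a last resort if even ping fails."""
    try:
        conn.ping(reconnect=True)   # cheap liveness check + reconnect
    except Exception:
        conn = connect()            # give up on the old socket entirely
    return conn
```

Remember that this is only safe between transactions: a reconnect in the middle of one silently rolls it back.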

Here's how the above snippet looks with PersistentDB:

import pymysql
from flask import Flask, g, request
from DBUtils.PersistentDB import PersistentDB    

app = Flask(__name__)    

def connect_db():
    return PersistentDB(
        creator = pymysql, # the remaining keyword arguments are passed to pymysql
        user = 'guest', password = '', database = 'sakila', 
        autocommit = True, charset = 'utf8mb4', 
        cursorclass = pymysql.cursors.DictCursor)

def get_db():
    '''Returns a thread-mapped database connection (one per worker thread).'''

    if not hasattr(app, 'db'):
        app.db = connect_db()
    return app.db.connection()    

@app.route('/')
def hello_world():
    city = request.args.get('city')

    cursor = get_db().cursor()
    cursor.execute('SELECT city_id FROM city WHERE city = %s', city)
    row = cursor.fetchone()

    if row:
        return 'City "{}" is #{:d}'.format(city, row['city_id'])
    else:
        return 'City "{}" not found'.format(city)

Micro-benchmark

To give a little clue about the performance implications in numbers, here's a micro-benchmark.

I ran:

  • uwsgi --http :5000 --wsgi-file app_persistent.py --callable app --master --processes 1 --threads 16
  • uwsgi --http :5000 --wsgi-file app_per_req.py --callable app --master --processes 1 --threads 16

And load-tested them with concurrency 1, 4, 8, 16 via:

siege -b -t 15s -c 16 http://localhost:5000/?city=london

(benchmark chart)

Observations (for my local configuration):

  1. A persistent connection is ~30% faster,
  2. On concurrency 4 and higher, uWSGI worker process peaks at over 100% of CPU utilisation (pymysql has to parse MySQL protocol in pure Python, which is the bottleneck),
  3. On concurrency 16, mysqld's CPU utilisation is ~55% for per-request and ~45% for persistent connection.


nnd*_*dii 4

As far as I know, you have two options:

  • Create a new connection for each query, then close it. Like this:

    def db_execute(query):
        conn = MySQLdb.connect(*)  # connection parameters elided
        cur = conn.cursor()
        cur.execute(query)
        res = cur.fetchall()
        cur.close()
        conn.close()
        return res
    
  • A better approach is to use a connection pool such as SQLAlchemy's pool, together with the pool_pre_ping parameter and a custom connection function.
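The pool_pre_ping idea — validate a pooled connection on checkout and replace it if stale — can be sketched generically. This is a hypothetical pool, not SQLAlchemy's implementation; it works with any connection object exposing ping():

```python
import queue

class PrePingPool:
    """Sketch of the pool_pre_ping idea: before handing out a pooled
    connection, check that it is alive and replace it if not.
    `connect` is a factory returning a DB-API connection."""

    def __init__(self, connect, size=4):
        self._connect = connect
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(connect())

    def acquire(self):
        conn = self._pool.get()
        try:
            conn.ping()                 # cheap liveness check
        except Exception:
            conn = self._connect()      # stale: open a fresh connection
        return conn

    def release(self, conn):
        self._pool.put(conn)
```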