Bulk update MySQL with Python

kes*_*haw 4 mysql python-2.7 mysql-connector-python

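I have to update millions of rows in MySQL. I am currently using a for loop to execute the queries. To make the update faster, I want to use executemany() from the Python MySQL Connector, so that I can update in batches using a single query for each batch.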

Wes*_*ite 7

I don't believe mysqldb has a way to handle multiple UPDATE queries at once.

But you can use an INSERT query with an ON DUPLICATE KEY UPDATE clause at the end.
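As a rough sketch of what such a statement looks like when written by hand: the table `writers` and its columns `id`, `name` and `articles` are assumptions for illustration, with `id` assumed to be the primary key. The cursor call is shown only as a comment because it needs a live connection.

```python
# Hypothetical table: writers(id PRIMARY KEY, name, articles).
sql = (
    "INSERT INTO writers (`id`, `name`, `articles`) "
    "VALUES (%s, %s, %s) "
    "ON DUPLICATE KEY UPDATE name=VALUES(name), articles=VALUES(articles)"
)

# One tuple per row, in the same order as the column list above.
params = [
    (1, 'Marco', 1),
    (2, 'Keshaw', 8),
]

# With an open cursor, the whole batch is passed in a single call:
# cur.executemany(sql, params)
```

Rows whose `id` already exists get their `name` and `articles` overwritten; rows with a new `id` are inserted.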

I wrote the following example for ease of use and readability.

import MySQLdb

def update_many(data_list=None, mysql_table=None):
    """
    Inserts the provided rows into a mysql table. If a row's unique key
    already exists in the table, that row is updated instead.

    The dictionaries must all have the same keys due to how the query is built.

    Param:
        data_list (List):
            A list of dictionaries where the keys are the mysql table
            column names, and the values are the update values
        mysql_table (String):
            The mysql table to be updated.

    Returns True on success, False if there was nothing to do or on error.
    """
    if not data_list:
        return False

    # Fix one column order (sorted) and reuse it for every row, so the
    # values always line up with the placeholders.
    keys = sorted(data_list[0])
    columns = ', '.join('`{0}`'.format(k) for k in keys)
    duplicates = ', '.join('{0}=VALUES({0})'.format(k) for k in keys)
    place_holders = ', '.join('%s' for k in keys)
    query = "INSERT INTO {0} ({1}) VALUES ({2})".format(mysql_table, columns, place_holders)
    query = "{0} ON DUPLICATE KEY UPDATE {1}".format(query, duplicates)

    values = [[data_dict[k] for k in keys] for data_dict in data_list]

    # Connection and Cursor
    conn = MySQLdb.connect('localhost', 'jeff', 'atwood', 'stackoverflow')
    cur = conn.cursor()

    try:
        cur.executemany(query, values)
        conn.commit()
    except MySQLdb.Error as e:
        try:
            print("MySQL Error [%d]: %s" % (e.args[0], e.args[1]))
        except IndexError:
            print("MySQL Error: %s" % str(e))

        conn.rollback()
        return False
    finally:
        # Release the cursor and connection on success and on failure.
        cur.close()
        conn.close()

    return True

Explanation of the one-liners

columns = ', '.join('`{}`'.format(k) for k in data_dict)

is the same as

column_list = []
for k in data_dict:
    # Wrap each column name in backticks, as the one-liner does
    column_list.append('`{}`'.format(k))
columns = ", ".join(column_list)

Here's a usage example

test_data_list = []
test_data_list.append( {'id' : 1, 'name' : 'Marco', 'articles' : 1 } )
test_data_list.append( {'id' : 2, 'name' : 'Keshaw', 'articles' : 8 } )
test_data_list.append( {'id' : 3, 'name' : 'Wes', 'articles' : 0 } )

update_many(data_list=test_data_list, mysql_table='writers')

Query output

INSERT INTO writers (`articles`, `id`, `name`) VALUES (%s, %s, %s) ON DUPLICATE KEY UPDATE articles=VALUES(articles), id=VALUES(id), name=VALUES(name)

Values output

[[1, 1, 'Marco'], [8, 2, 'Keshaw'], [0, 3, 'Wes']]
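Since the question involves millions of rows, it may be worth sending them in fixed-size batches rather than as one giant list. A minimal sketch, assuming a hypothetical helper `chunked` and an arbitrary batch size of 1000 (neither is part of the original answer):

```python
def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # leftover rows that did not fill a whole batch
        yield batch


# Hypothetical usage with the update_many() function above
# (all_rows is an assumed list of row dictionaries):
# for batch in chunked(all_rows, 1000):
#     update_many(data_list=batch, mysql_table='writers')
```

Note that update_many() opens a new connection on every call, so with many batches it might be preferable to create the connection once and pass it in; the right batch size likely depends on row width and server settings.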