Efficient way to do batch INSERTs using JDBC

Aay*_*uri 59 java sql performance jdbc

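In my application I need to do a lot of INSERTs. It is a Java application, and I use plain JDBC to execute the queries. The database is Oracle. I have enabled batching, so it saves the network latency of executing each query. However, the queries are still executed serially as separate INSERTs: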

insert into some_table (col1, col2) values (val1, val2)
insert into some_table (col1, col2) values (val3, val4)
insert into some_table (col1, col2) values (val5, val6)

I would like to know whether the following form of INSERT might be more efficient:

insert into some_table (col1, col2) values (val1, val2), (val3, val4), (val5, val6)

That is, collapsing multiple INSERTs into one.

Are there any other tips for making batch INSERTs faster?

小智 120

This is a mix of the two previous answers:

  PreparedStatement ps = c.prepareStatement("INSERT INTO employees VALUES (?, ?)");

  ps.setString(1, "John");
  ps.setString(2,"Doe");
  ps.addBatch();

  ps.clearParameters();
  ps.setString(1, "Dave");
  ps.setString(2,"Smith");
  ps.addBatch();

  ps.clearParameters();
  int[] results = ps.executeBatch();

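  • In this particular case, `ps.clearParameters();` is unnecessary. (33 upvotes)
  • This is the ideal solution, because the statement is prepared (parsed) only once. (3 upvotes)
  • For MySQL, also add the following to the URL: "&useServerPrepStmts=false&rewriteBatchedStatements=true" (3 upvotes)
  • Be sure to measure. Depending on the JDBC driver implementation, this may be the expected one round trip per batch, but it may also end up being one round trip per statement. (2 upvotes)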
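For a large number of rows, a common variation on the answer above is to send the batch in chunks instead of accumulating everything before a single executeBatch() call. The following is only a minimal sketch under stated assumptions: the class name ChunkedBatchInsert, the method insertInChunks, and the chunkSize parameter are hypothetical, the two-column employees table is taken from the example above, and turning auto-commit off for the duration of the load is a design choice that should be measured against your own driver and database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public final class ChunkedBatchInsert {

    /**
     * Inserts all rows through one PreparedStatement, calling executeBatch()
     * every chunkSize rows. Returns the number of executeBatch() calls made.
     * (Hypothetical helper, not part of JDBC.)
     */
    public static int insertInChunks(Connection conn, List<String[]> rows, int chunkSize)
            throws SQLException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // commit once at the end instead of per statement
        int flushes = 0;
        try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO employees VALUES (?, ?)")) {
            int pending = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++pending == chunkSize) {
                    ps.executeBatch(); // send a full chunk to the driver
                    flushes++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the final, partial chunk
                flushes++;
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // discard everything inserted in this call
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
        return flushes;
    }
}
```

With 5 rows and a chunkSize of 2, this makes three executeBatch() calls (2 + 2 + 1 rows) and one commit. Whether each call really is a single round trip still depends on the driver, as the last comment above points out.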

pra*_*upd 21

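Although the question asks about inserting efficiently into Oracle with JDBC, I am currently working with DB2 (on an IBM mainframe). Inserts are conceptually similar, so I thought my metrics might be helpful. I compared:

  • Inserting one record at a time

  • Inserting a batch of records (very efficient)

Here are the metrics.

1) Inserting one record at a time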

public void writeWithCompileQuery(int records) {
    PreparedStatement statement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        statement = connection.prepareStatement(compiledQuery);

        long start = System.currentTimeMillis();

        // Use <= so that exactly `records` rows are inserted.
        for (int index = 1; index <= records; index++) {
            statement.setInt(1, index);
            statement.setString(2, "emp number-"+index);
            statement.setInt(3, index);
            statement.setInt(4, index);
            statement.setString(5, "username");

            long startInternal = System.currentTimeMillis();
            statement.executeUpdate();
            System.out.println("each transaction time taken = " + (System.currentTimeMillis() - startInternal) + " ms");
        }

        long end = System.currentTimeMillis();
        System.out.println("total time taken = " + (end - start) + " ms");
        System.out.println("avg total time taken = " + (end - start)/ records + " ms");

        statement.close();
        connection.close();

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
    }
}

Metrics for 100 transactions:

each transaction time taken = 123 ms
each transaction time taken = 53 ms
each transaction time taken = 48 ms
each transaction time taken = 48 ms
each transaction time taken = 49 ms
each transaction time taken = 49 ms
...
..
.
each transaction time taken = 49 ms
each transaction time taken = 49 ms
total time taken = 4935 ms
avg total time taken = 49 ms

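The first transaction takes around 120-150 ms, which covers parsing the query and then executing it; subsequent transactions take only around 50 ms each. (That is still high, but my database is on a different server, and I need to troubleshoot the network.)

2) Inserting in a batch (the efficient one), implemented with preparedStatement.executeBatch()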

public int[] writeInABatchWithCompiledQuery(int records) {
    PreparedStatement preparedStatement;

    try {
        Connection connection = getDatabaseConnection();
        connection.setAutoCommit(true);

        String compiledQuery = "INSERT INTO TESTDB.EMPLOYEE(EMPNO, EMPNM, DEPT, RANK, USERNAME)" +
                " VALUES" + "(?, ?, ?, ?, ?)";
        preparedStatement = connection.prepareStatement(compiledQuery);

        for(int index = 1; index <= records; index++) {
            preparedStatement.setInt(1, index);
            preparedStatement.setString(2, "empo number-"+index);
            preparedStatement.setInt(3, index+100);
            preparedStatement.setInt(4, index+200);
            preparedStatement.setString(5, "usernames");
            preparedStatement.addBatch();
        }

        long start = System.currentTimeMillis();
        int[] inserted = preparedStatement.executeBatch();
        long end = System.currentTimeMillis();

        System.out.println("total time taken to insert the batch = " + (end - start) + " ms");
        System.out.println("avg time taken per record = " + (end - start) / (double) records + " ms");

        preparedStatement.close();
        connection.close();

        return inserted;

    } catch (SQLException ex) {
        System.err.println("SQLException information");
        while (ex != null) {
            System.err.println("Error msg: " + ex.getMessage());
            ex = ex.getNextException();
        }
        throw new RuntimeException("Error");
    }
}

Metrics for a batch of 100 transactions:

total time taken to insert the batch = 127 ms

And for 1000 transactions:

total time taken to insert the batch = 341 ms

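So 100 transactions that took ~5000 ms (one transaction at a time) drop to ~150 ms (one batch of 100 records).

Note: my network is very slow, so ignore the absolute numbers; the relative difference is what matters.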
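To make the relative difference concrete, the reported totals can be turned into a per-row cost and a speed-up factor. This is just a worked calculation on the numbers measured above (4935 ms and 127 ms for 100 rows, 341 ms for 1000 rows); the class BatchSpeedup and its method names are made up for illustration.

```java
public final class BatchSpeedup {

    /** Average milliseconds per row for a run that inserted `rows` rows in `totalMs`. */
    public static double msPerRow(long totalMs, int rows) {
        return (double) totalMs / rows;
    }

    /** How many times faster the batched run was than the row-by-row run. */
    public static double speedup(long rowByRowMs, long batchMs) {
        return (double) rowByRowMs / batchMs;
    }

    public static void main(String[] args) {
        // Timings copied from the runs reported in this answer.
        System.out.printf("one at a time:  %.2f ms/row%n", msPerRow(4935, 100));
        System.out.printf("batch of 100:   %.2f ms/row%n", msPerRow(127, 100));
        System.out.printf("batch of 1000:  %.3f ms/row%n", msPerRow(341, 1000));
        System.out.printf("speed-up (100): %.1fx%n", speedup(4935, 127));
    }
}
```

On these figures the per-row cost falls from roughly 49 ms to a little over 1 ms for the batch of 100, and keeps falling for the batch of 1000, which suggests the fixed round-trip cost dominates in this setup.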


Boz*_*zho 6

Statement gives you the following option:

Statement stmt = con.createStatement();

stmt.addBatch("INSERT INTO employees VALUES (1000, 'Joe Jones')");
stmt.addBatch("INSERT INTO departments VALUES (260, 'Shoe')");
stmt.addBatch("INSERT INTO emp_dept VALUES (1000, 260)");

// submit a batch of update commands for execution
int[] updateCounts = stmt.executeBatch();

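  • Although the end result is the same, with this approach multiple statements are parsed, which is much slower for bulk work and in practice is no more efficient than executing each statement individually. Also, use PreparedStatement for repeated queries whenever possible, since it performs much better. (6 upvotes)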
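Whichever statement type is used, executeBatch() returns an int[] of update counts, and its entries are not always plain row counts. JDBC defines Statement.SUCCESS_NO_INFO for "succeeded, row count unknown" and Statement.EXECUTE_FAILED for a failed command (the latter typically shows up in the array from BatchUpdateException.getUpdateCounts() when a driver keeps processing after an error). As a rough sketch of one way to interpret the array, with BatchResults and totalAffected being hypothetical names:

```java
import java.sql.Statement;

public final class BatchResults {

    /**
     * Sums the update counts returned by executeBatch().
     * Returns -1 when the driver reported SUCCESS_NO_INFO for any entry,
     * because the real total is then unknown.
     * Throws if any entry is EXECUTE_FAILED.
     */
    public static long totalAffected(int[] updateCounts) {
        long total = 0;
        boolean unknown = false;
        for (int count : updateCounts) {
            if (count == Statement.EXECUTE_FAILED) {
                throw new IllegalStateException("at least one batched statement failed");
            } else if (count == Statement.SUCCESS_NO_INFO) {
                unknown = true; // succeeded, but the driver gave no row count
            } else {
                total += count;
            }
        }
        return unknown ? -1 : total;
    }
}
```

For the three single-row INSERTs above, a driver that reports exact counts would return {1, 1, 1} and totalAffected would give 3; a driver that reports SUCCESS_NO_INFO makes the total unknowable from the array alone.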

Bur*_*ear 5

Obviously you have to benchmark, but issuing multiple inserts over JDBC will be much faster if you use a PreparedStatement rather than a Statement.