Batching when using ActiveRecord::Base.connection.execute

Don*_*son 0 activerecord ruby-on-rails batching syck psychparser

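I'm busy writing a migration that will let us move our YAML engine (yamler) from Syck to Psych and, eventually, upgrade the project to Ruby 2. The migration will be very resource-intensive, though, so I will need to process the data in chunks.

I wrote the following method to confirm that the migration I plan to use produces the expected results and can be run without downtime. To keep Active Record from performing serialization automatically, I need to use ActiveRecord::Base.connection.execute.

The method I use to summarize the conversion is as follows: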

def show_summary(table, column_name)
  # Bypass Active Record's automatic (de)serialization by reading raw rows.
  result = ActiveRecord::Base.connection.execute <<-SQL
    SELECT id, #{column_name} FROM #{table}
  SQL
  all_rows = result.to_a

  # Keep only rows whose YAML does not survive a Syck -> Psych -> Syck round trip.
  problem_rows = all_rows.select do |row|
    original_string = Syck.dump(Syck.load(row[1]))
    original_object = Syck.load(original_string)

    new_string = Psych.dump(original_object)
    new_object = Syck.load(new_string)

    Syck.dump(new_object) != original_string rescue true
  end

  # Build a summary hash for each problematic row.
  problem_rows.map do |row|
    old_string = Syck.dump(Syck.load(row[1]))
    new_string = Psych.dump(Syck.load(old_string)) rescue "Parse failure"
    roundtrip_string = begin
      Syck.dump(Syck.load(new_string))
    rescue => e
      e.message
    end

    new_row = {}
    new_row[:id] = row[0]
    new_row[:original_encoding] = old_string
    new_row[:new_encoding] = roundtrip_string
    new_row
  end
end

How can I use batching when using ActiveRecord::Base.connection.execute?

For completeness, my update function is as follows:

  # Migrate the given serialized YAML column from Syck to Psych
  # (if any).
  def migrate_to_psych(table, column)
    table_name = ActiveRecord::Base.connection.quote_table_name(table)

    column_name = ActiveRecord::Base.connection.quote_column_name(column)

    fetch_data(table_name, column_name).each do |row|
      transformed = ::Psych.dump(convert(Syck.load(row[column])))

      ActiveRecord::Base.connection.execute <<-SQL
         UPDATE #{table_name}
         SET #{column_name} = #{ActiveRecord::Base.connection.quote(transformed)}
         WHERE id = #{row['id']};
      SQL
    end
  end

  def fetch_data(table_name, column_name)
    ActiveRecord::Base.connection.select_all <<-SQL
       SELECT id, #{column_name}
       FROM #{table_name}
       WHERE #{column_name} LIKE '---%'
    SQL
  end

I took it from http://fossies.org/linux/openproject/db/migrate/migration_utils/legacy_yamler.rb

Raf*_*ael 5

You can easily build something with the SQL LIMIT and OFFSET clauses:

def fetch_data(table_name, column_name)
  batch_size, offset = 1000, 0
  begin
    # ORDER BY keeps the pages stable; without it, LIMIT/OFFSET may skip or repeat rows.
    batch = ActiveRecord::Base.connection.select_all <<-SQL
      SELECT id, #{column_name}
      FROM #{table_name}
      WHERE #{column_name} LIKE '---%'
      ORDER BY id
      LIMIT #{batch_size}
      OFFSET #{offset}
    SQL
    batch.each do |row|
      yield row
    end
    offset += batch_size
  end until batch.empty?
end

You can use it almost exactly as before, just without the .each:

fetch_data(table_name, column_name) do |row| ... end
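
If the table is large, OFFSET tends to get slower with every batch, because the database typically still has to walk past all the skipped rows. A possible variation is to remember the last id seen and ask only for rows after it. The sketch below is just that, a sketch: `each_row_in_batches` and `fetch_batch` are names I made up, it assumes positive integer ids, and the lambda uses an in-memory array so it runs without a database. In the migration the lambda would presumably wrap `select_all` with something like `WHERE id > #{last_id} ORDER BY id LIMIT #{batch_size}` added to the query.

```ruby
# Sketch of keyset-style batching. `each_row_in_batches` and `fetch_batch`
# are hypothetical names, not part of ActiveRecord.
# `fetch_batch` is any callable taking (last_id, limit) and returning rows
# (hashes with an 'id' key) ordered by ascending id.
def each_row_in_batches(fetch_batch, batch_size: 1000)
  last_id = 0 # assumption: ids are positive integers
  loop do
    batch = fetch_batch.call(last_id, batch_size).to_a
    break if batch.empty?

    batch.each { |row| yield row }
    # Resume after the highest id seen so far instead of using OFFSET.
    last_id = batch.last['id']
  end
end

# In-memory stand-in for the table, so the sketch runs without a database.
rows = (1..25).map { |i| { 'id' => i, 'data' => "--- #{i}\n" } }

# A real fetcher would run a query equivalent to:
#   SELECT id, col FROM tbl WHERE id > last_id ORDER BY id LIMIT limit
in_memory_fetch = lambda do |last_id, limit|
  rows.select { |row| row['id'] > last_id }.first(limit)
end

seen_ids = []
each_row_in_batches(in_memory_fetch, batch_size: 10) { |row| seen_ids << row['id'] }
```

It is called the same way as `fetch_data` above, with a block per row. Using `loop` with an explicit `break` also avoids the `begin ... end until` form.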

HTH!

  • This works well, with just one note: [do not use the begin; end until syntax](/sf/ask/9575541/) (2 upvotes)