Don*_*son 0 activerecord ruby-on-rails batching syck psychparser
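I'm working on a migration that will let us switch our YAML engine from Syck to Psych and, eventually, upgrade the project to Ruby 2. The migration will be resource-intensive, though, so I will need to process the data in batches.
I wrote the following method to confirm that the migration I plan to use produces the expected results and can run without downtime. To keep Active Record from performing serialization automatically, I need to use ActiveRecord::Base.connection.execute.
My method for describing the conversion is as follows: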
def show_summary(table, column_name)
  result = ActiveRecord::Base.connection.execute <<-SQL
    SELECT id, #{column_name} FROM #{table}
  SQL
  all_rows = result.to_a

  # Keep only the rows whose YAML does not survive a Syck -> Psych -> Syck round trip.
  problem_rows = all_rows.select do |row|
    original_string = Syck.dump(Syck.load(row[1]))
    original_object = Syck.load(original_string)
    new_string = Psych.dump(original_object)
    new_object = Syck.load(new_string)
    Syck.dump(new_object) != original_string rescue true
  end

  # Report the original and the round-tripped encoding for each problem row.
  problem_rows.map do |row|
    old_string = Syck.dump(Syck.load(row[1]))
    new_string = Psych.dump(Syck.load(old_string)) rescue "Parse failure"
    roundtrip_string = begin
      Syck.dump(Syck.load(new_string))
    rescue => e
      e.message
    end
    new_row = {}
    new_row[:id] = row[0]
    new_row[:original_encoding] = old_string
    new_row[:new_encoding] = roundtrip_string
    new_row
  end
end
How can I process rows in batches when using ActiveRecord::Base.connection.execute?
For completeness, my update function is as follows:
# Migrate the given serialized YAML column from Syck to Psych
# (if any).
def migrate_to_psych(table, column)
  table_name = ActiveRecord::Base.connection.quote_table_name(table)
  column_name = ActiveRecord::Base.connection.quote_column_name(column)
  fetch_data(table_name, column_name).each do |row|
    transformed = ::Psych.dump(convert(Syck.load(row[column])))
    ActiveRecord::Base.connection.execute <<-SQL
      UPDATE #{table_name}
      SET #{column_name} = #{ActiveRecord::Base.connection.quote(transformed)}
      WHERE id = #{row['id']};
    SQL
  end
end

def fetch_data(table_name, column_name)
  ActiveRecord::Base.connection.select_all <<-SQL
    SELECT id, #{column_name}
    FROM #{table_name}
    WHERE #{column_name} LIKE '---%'
  SQL
end
I took this from http://fossies.org/linux/openproject/db/migrate/migration_utils/legacy_yamler.rb
You can easily build something with the SQL LIMIT and OFFSET clauses:
def fetch_data(table_name, column_name)
  batch_size, offset = 1000, 0
  begin
    batch = ActiveRecord::Base.connection.select_all <<-SQL
      SELECT id, #{column_name}
      FROM #{table_name}
      WHERE #{column_name} LIKE '---%'
      ORDER BY id
      LIMIT #{batch_size}
      OFFSET #{offset}
    SQL
    # ORDER BY keeps the pages stable; without it, LIMIT/OFFSET may skip or repeat rows.
    batch.each do |row|
      yield row
    end
    offset += batch_size
  end until batch.empty?
end
You can use it almost exactly as before, just without the .each:
fetch_data(table_name, column_name) do |row| ... end
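If the table is large, OFFSET can get slower with every page, because the database generally still has to walk past all the skipped rows. A possible alternative is keyset batching, where each query asks only for rows with an id greater than the last one seen. The following is a minimal sketch, not part of the original code: `each_row_in_batches`, `fetch_page` and `fetch_page_for` are hypothetical names, and it assumes `id` is a positive integer primary key.

```ruby
# Keyset batching: remember the last id seen and ask for the next page after it.
# `fetch_page` is a hypothetical callable taking (last_id, limit) and returning
# an array of row hashes ordered by id ascending.
def each_row_in_batches(fetch_page, batch_size = 1000)
  return enum_for(:each_row_in_batches, fetch_page, batch_size) unless block_given?

  last_id = 0 # assumes ids are positive integers
  loop do
    batch = fetch_page.call(last_id, batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    last_id = batch.last['id'].to_i
  end
end

# Assumed wiring to the database; only defined when ActiveRecord is loaded.
# table_name and column_name are expected to be quoted already, as in
# migrate_to_psych.
if defined?(ActiveRecord::Base)
  def fetch_page_for(table_name, column_name)
    lambda do |last_id, limit|
      ActiveRecord::Base.connection.select_all(<<-SQL).to_a
        SELECT id, #{column_name}
        FROM #{table_name}
        WHERE #{column_name} LIKE '---%' AND id > #{Integer(last_id)}
        ORDER BY id
        LIMIT #{Integer(limit)}
      SQL
    end
  end
end
```

Usage would look like `each_row_in_batches(fetch_page_for(table_name, column_name)) do |row| ... end`. Since every page starts after the last id already processed, rows updated during the migration are not visited twice.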
HTH!