How do I get Zlib to decompress from an S3 stream in Ruby?

gfp*_*eco 4 ruby compression stream amazon-s3

Zlib::GzipReader is supposed to be created by passing it an IO-like object (it must have a read method that behaves the same as IO#read).
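For reference, here is a minimal sketch of that contract using a plain File (which is a real IO); the 'example.gz' path is just a placeholder:

require 'zlib'

# Any object whose read behaves like IO#read will do; a File is the simplest case.
File.open('example.gz', 'rb') do |file|
  gz = Zlib::GzipReader.new(file)
  puts gz.read
  gz.close
end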

My problem is that I can't get this IO-like object out of the AWS::S3 lib. As far as I know, the only way to get a stream from it is by passing a block to S3Object#stream.

I have already tried:

Zlib::GzipReader.new(AWS::S3::S3Object.stream('file', 'bucket'))
# Which gives me the error: undefined method `read' for #<AWS::S3::S3Object::Value:0x000000017cbe78>

Does anyone know how I can achieve this?

Pat*_*ity 5

A simple solution is to write the downloaded data to a StringIO, then read it back:

require 'stringio'

io = StringIO.new
io.write AWS::S3::S3Object.value('file', 'bucket')
io.rewind

gz = Zlib::GzipReader.new(io)
data = gz.read
gz.close

# do something with data ...
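Note that this buffers the entire compressed object in memory before inflating it, so it is best suited to reasonably small files. The same idea can be written more compactly by handing the downloaded string straight to StringIO (same calls as above, just condensed):

require 'zlib'
require 'stringio'

data = Zlib::GzipReader.new(StringIO.new(AWS::S3::S3Object.value('file', 'bucket'))).read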

A more elaborate approach would be to start inflating the gzipped data while it is still being downloaded, which can be achieved with IO.pipe. Something along these lines:

reader, writer = IO.pipe

fork do
  reader.close # the child process only writes to the pipe
  AWS::S3::S3Object.stream('file', 'bucket') do |chunk|
    writer.write chunk
  end
end

writer.close # the parent only reads; closing its writer lets the reader see EOF once the child exits

gz = Zlib::GzipReader.new(reader)
while line = gz.gets
  # do something with line ...
end

gz.close
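One optional addition (not part of the snippet above, but it avoids leaving a zombie process behind in a long-running parent) is to reap the forked child once you are done reading:

Process.wait # wait for the forked child to exit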

You can also use a Thread instead of fork:

reader, writer = IO.pipe

thread = Thread.new do
  AWS::S3::S3Object.stream('file', 'bucket') do |chunk|
    writer.write chunk
  end
  writer.close # signal EOF to the reader once all chunks have been written
end

gz = Zlib::GzipReader.new(reader)
while line = gz.gets
  # do something with line
end

gz.close
thread.join