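I have been struggling to find an efficient way to convert doc, docx, ppt, and pptx documents to PDF. So far I have tried docsplit and oowriter, but both take more than 10 seconds to convert a 1.7 MB pptx file. Can anyone suggest a better approach, or ways to improve mine?
What I have tried: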
from subprocess import Popen, PIPE
import time

def convert(src, dst):
    d = {'src': src, 'dst': dst}
    commands = [
        '/usr/bin/docsplit pdf --output %(dst)s %(src)s' % d,
        'oowriter --headless -convert-to pdf:writer_pdf_Export %(dst)s %(src)s' % d,
    ]
    # Run each converter in turn and time it
    for command in commands:
        st = time.time()
        process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)  # I am aware of consequences of using `shell=True`
        out, err = process.communicate()
        errcode = process.returncode
        if errcode != 0: …

I am trying to get docsplit working with my Rails app. For now I only want it to run locally. I installed the gem and all of its dependencies. All of the basic examples work from the command line, and I was able to get
Docsplit.extract_pdf('example.doc')
working in my Rails app. But when I try to use extract_images, i.e.
Docsplit.extract_images('example.doc', :size => '1000x', :format => [:png, :jpg])
I get the following error:
Docsplit::ExtractionFailed (sh: pdfinfo: command not found):
docsplit (0.6.1) lib/docsplit/info_extractor.rb:23:in `extract'
(eval):3:in `extract_length'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:34:in `convert'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `extract'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `each'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `extract'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `each'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `each_with_index'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `extract'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:16:in `each'
docsplit (0.6.1) lib/docsplit/image_extractor.rb:16:in `extract'
docsplit (0.6.1) lib/docsplit.rb:58:in `extract_images'
app/controllers/sandbox_controller.rb:53:in `split_doc'
I double-checked that all of the dependencies are installed. My guess is that I am missing some configuration in Rails.
Thanks.
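The message `sh: pdfinfo: command not found` says that the shell spawned by the app could not find `pdfinfo` on its PATH. `pdfinfo` ships with Poppler, and one possible cause (an assumption, not confirmed by the question) is that the server process runs with a different PATH than the interactive shell where the examples worked. A minimal sketch of a lookup check, written in Python to match the first snippet on this page; the helper name `missing_tools` is hypothetical, and the script would need to run under the same user and environment as the Rails server to be meaningful:

```python
import os
import shutil


def missing_tools(tools, path=None):
    """Return the names in `tools` that cannot be found on the search path.

    `path` is a PATH-style string; when omitted, the current process's
    PATH environment variable is used.
    """
    search_path = os.environ.get("PATH", "") if path is None else path
    return [t for t in tools if shutil.which(t, path=search_path) is None]


if __name__ == "__main__":
    # `pdfinfo` is the only tool named in the error above.
    print("PATH seen by this process:", os.environ.get("PATH", ""))
    print("Not found:", missing_tools(["pdfinfo"]))
```

If `pdfinfo` turns up in a terminal but not in the server's environment, comparing the two PATH values is a reasonable next step.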
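
Does anyone know how to upload a multi-page PDF with Paperclip and convert each page to a JPEG?
So far, whenever I upload a PDF, I can only see its first page as a JPEG. I would like every page of the PDF to be uploaded and converted to a JPEG.
Are there any gems or plugins that would let me upload a 10-page PDF and have it converted and stored in the database as 10 JPEG files?
I have looked at the docsplit-images gem, but I am not sure whether it is the best solution or how it works.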
Post.rb
class Post < ActiveRecord::Base
  belongs_to :Blogs
  attr_accessible :content, :title, :pdf

  has_attached_file :pdf,
    :url => "/assets/products/:id/:style/:basename.:extension",
    :path => ":rails_root/public/assets/products/:id/:style/:basename.:extension"

  validates_attachment_content_type :pdf,
    :content_type => [ 'application/pdf' ],
    :message => "only pdf files are allowed"
end
_form.html.erb
<%= form_for ([@post]), :html => { :multipart => true } do |f| %>
  <%= f.file_field :pdf %>
<% end %>
show.html.erb
<%= image_tag @post.pdf.url(:original) %>

I am trying to use docsplit to split a PDF file into images, but something seems to be wrong with my Ruby installation. I get the following error every time:
/usr/lib/ruby/1.8/fileutils.rb:694:in `remove_entry_secure': parent directory is world writable
Here is the full command-line output:
$ docsplit images pdf-test.pdf
/usr/lib/ruby/1.8/fileutils.rb:694:in `remove_entry_secure': parent directory is world writable, FileUtils#remove_entry_secure does not work; abort: "/tmp/d20130207-6739-1f9i6b" (parent directory mode 42777) (ArgumentError)
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:51:in `convert'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `extract'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `each'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `extract'
from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `each_with_index'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `each'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `each_with_index'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `extract'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:16:in `each'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:16:in `extract'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit.rb:63:in `extract_images'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/../lib/docsplit/command_line.rb:44:in `run'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/../lib/docsplit/command_line.rb:37:in `initialize'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/docsplit:5:in `new'
from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/docsplit:5
from /usr/bin/docsplit:19:in `load'
from /usr/bin/docsplit:19 …
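
The error reports `parent directory mode 42777` for `/tmp`: a directory that is world-writable with the setgid bit set but without the sticky bit (octal 1000). `FileUtils#remove_entry_secure` refuses to operate in that situation, which is what the message says. A small sketch that checks a directory for the same condition, written in Python to match the first snippet on this page; the function names are hypothetical:

```python
import os
import stat
import tempfile


def world_writable_without_sticky(mode):
    """True if `mode` lets 'other' write and the sticky bit is not set."""
    return bool(mode & stat.S_IWOTH) and not (mode & stat.S_ISVTX)


def check_dir(path):
    """Apply the check to the mode bits of an existing directory."""
    return world_writable_without_sticky(os.stat(path).st_mode)


if __name__ == "__main__":
    tmp = tempfile.gettempdir()
    mode = os.stat(tmp).st_mode
    status = "world-writable without sticky bit" if check_dir(tmp) else "ok"
    print(tmp, oct(mode), status)
```

Temporary directories are conventionally mode 1777, so restoring the sticky bit (for example with `chmod +t /tmp`) is the usual remedy, though whether that is appropriate depends on how the system was set up.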