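Tag: docsplit

Efficient way to convert documents to PDF format

I have been trying to find an efficient way to convert doc, docx, ppt, and pptx documents to PDF. So far I have tried docsplit and oowriter, but both take more than 10 seconds to convert a 1.7 MB pptx file. Can anyone suggest a better approach, or ways to improve mine?

What I have tried: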

from subprocess import Popen, PIPE
import time

def convert(src, dst):
    d = {'src': src, 'dst': dst}
    commands = [
        '/usr/bin/docsplit pdf --output %(dst)s %(src)s' % d,
        'oowriter --headless -convert-to pdf:writer_pdf_Export %(dst)s %(src)s' % d,
    ]

    for command in commands:
        st = time.time()
        process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True) # I am aware of consequences of using `shell=True` 
        out, err = process.communicate()
        errcode = process.returncode
        if errcode != 0: …
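To compare converters fairly, it can help to time each one the same way and to pass arguments as a list rather than through a shell. The following is only a minimal sketch: the `docsplit` invocation mirrors the one in the question, while the `soffice --headless --convert-to pdf --outdir` call assumes a LibreOffice build that provides `soffice`. The helper names `build_commands`, `time_command`, and `compare` are hypothetical.

```python
import subprocess
import time


def build_commands(src, out_dir):
    """Return candidate converter invocations as argument lists (no shell)."""
    return {
        # Same docsplit call as in the question, split into arguments.
        "docsplit": ["docsplit", "pdf", "--output", out_dir, src],
        # Assumption: LibreOffice is installed and exposes `soffice`.
        "soffice": ["soffice", "--headless", "--convert-to", "pdf",
                    "--outdir", out_dir, src],
    }


def time_command(argv):
    """Run argv and return (exit code, elapsed seconds, stderr text)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    elapsed = time.perf_counter() - start
    return proc.returncode, elapsed, proc.stderr.decode(errors="replace")


def compare(src, out_dir):
    """Print one timing line per converter; skip tools that are not installed."""
    for name, argv in build_commands(src, out_dir).items():
        try:
            code, secs, err = time_command(argv)
        except FileNotFoundError:
            print(f"{name}: not installed")
            continue
        print(f"{name}: exit={code} time={secs:.2f}s")
        if code != 0:
            print(err.strip())
```

Calling `compare("slides.pptx", "/tmp/out")` (both paths are placeholders) would print one line per tool. A large part of the measured time is likely to be office-suite startup rather than the conversion itself, so the numbers are only meaningful when each tool is measured under the same conditions.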

python pdf ubuntu document-conversion docsplit

20 votes | 1 answer | 10k views

Docsplit Ruby on Rails

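I am trying to get docsplit working with my Rails app. For now I only want it running locally. I installed the gem and all of its dependencies. All the basic examples work from the command line, and I was able to get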

Docsplit.extract_pdf('example.doc')

working in my Rails app. But when I try to use extract_images, i.e.

Docsplit.extract_images('example.doc', :size => '1000x', :format => [:png, :jpg])

I get the following error:

Docsplit::ExtractionFailed (sh: pdfinfo: command not found):
  docsplit (0.6.1) lib/docsplit/info_extractor.rb:23:in `extract'
  (eval):3:in `extract_length'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:34:in `convert'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `extract'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `each'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:19:in `extract'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `each'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `each_with_index'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:18:in `extract'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:16:in `each'
  docsplit (0.6.1) lib/docsplit/image_extractor.rb:16:in `extract'
  docsplit (0.6.1) lib/docsplit.rb:58:in `extract_images'
  app/controllers/sandbox_controller.rb:53:in `split_doc'

I double-checked, and all the dependencies are installed. I am guessing that I am missing some configuration in Rails.

Thanks.
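The `sh: pdfinfo: command not found` line suggests that the process running the app searches a different PATH than the interactive shell does. As a language-agnostic illustration only (written in Python rather than Ruby), the sketch below checks which tools resolve on a given PATH string. The helper name `missing_tools` and the example PATH values are hypothetical.

```python
import shutil


def missing_tools(tools, path):
    """Return the tools that cannot be resolved on the given PATH string."""
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


# Hypothetical PATH values: an interactive shell vs. a server process.
shell_path = "/usr/local/bin:/usr/bin:/bin"
server_path = "/usr/bin:/bin"

if __name__ == "__main__":
    for label, path in (("shell", shell_path), ("server", server_path)):
        print(label, "is missing:", missing_tools(["pdfinfo"], path))
```

If a tool is reported missing only for the server-side PATH, the dependency is installed but is not visible to the app's process.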

ruby ruby-on-rails docsplit

6 votes | 1 answer | 2504 views

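How do I upload a multi-page PDF and convert it to JPEG with Paperclip?

Does anyone know how to upload a multi-page PDF with Paperclip and convert each page to a JPEG?

So far, every time I upload a PDF, I can only see its first page as a JPEG. I want every page of the PDF to be uploaded and converted to a JPEG.

Is there a gem or plugin that would let me upload a 10-page PDF and convert and store it in the database as 10 JPEG files?

I have looked at the docsplit-images gem, but I am not sure whether it is the best solution or how it works.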

Post.rb

class Post < ActiveRecord::Base
  belongs_to :Blogs

  attr_accessible :content, :title, :pdf

  has_attached_file :pdf,
                    :url  => "/assets/products/:id/:style/:basename.:extension",
                    :path => ":rails_root/public/assets/products/:id/:style/:basename.:extension"

  validates_attachment_content_type :pdf,
      :content_type => [ 'application/pdf' ],
      :message => "only pdf files are allowed"
end

_form.html.erb

<%= form_for ([@post]), :html => { :multipart => true } do |f| %>

    <%= f.file_field :pdf %>

<% end %>

show.html.erb

  <%= image_tag @post.pdf.url(:original) %>

pdf jpeg ruby-on-rails paperclip docsplit

6 votes | 1 answer | 4829 views

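remove_entry_secure error when using a Ruby application

I am trying to split a PDF file into images with docsplit, but something seems to be wrong with my Ruby installation. I get the following error every time: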

/usr/lib/ruby/1.8/fileutils.rb:694:in `remove_entry_secure': parent directory is world writable

Here is the full command-line output:

$ docsplit images pdf-test.pdf
/usr/lib/ruby/1.8/fileutils.rb:694:in `remove_entry_secure': parent directory is world writable, FileUtils#remove_entry_secure does not work; abort: "/tmp/d20130207-6739-1f9i6b" (parent directory mode 42777) (ArgumentError)
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:51:in `convert'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `extract'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `each'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:19:in `extract'
    from /usr/lib/ruby/1.8/rubygems/custom_require.rb:31:in `each_with_index'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `each'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `each_with_index'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:18:in `extract'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:16:in `each'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit/image_extractor.rb:16:in `extract'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/lib/docsplit.rb:63:in `extract_images'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/../lib/docsplit/command_line.rb:44:in `run'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/../lib/docsplit/command_line.rb:37:in `initialize'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/docsplit:5:in `new'
    from /var/lib/gems/1.8/gems/docsplit-0.6.4/bin/docsplit:5
    from /usr/bin/docsplit:19:in `load'
    from /usr/bin/docsplit:19 …
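The error message reports the parent directory mode as 42777 (octal). As far as I can tell, Ruby's FileUtils refuses to proceed when the parent directory is world-writable and lacks the sticky bit. The sketch below, in Python for illustration, decodes that mode and compares it with the 41777 usually seen on /tmp. The helper name `describe_mode` is hypothetical.

```python
import stat


def describe_mode(mode):
    """Decode the bits of a st_mode value that matter for this error."""
    return {
        "is_dir": stat.S_ISDIR(mode),
        "world_writable": bool(mode & stat.S_IWOTH),
        "sticky": bool(mode & stat.S_ISVTX),
        "setgid": bool(mode & stat.S_ISGID),
    }


reported = describe_mode(0o42777)  # mode taken from the error message
typical = describe_mode(0o41777)   # usual /tmp permissions (drwxrwxrwt)

if __name__ == "__main__":
    print("reported:", reported)
    print("typical: ", typical)
```

Under this reading, the reported mode is world-writable with the setgid bit set but without the sticky bit, whereas a typical /tmp has the sticky bit set. That would point to the permissions on /tmp rather than to the Ruby installation itself.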

ruby gem ubuntu-10.04 docsplit

6 votes | 1 answer | 3444 views