Getting HTML source or rich text from the X clipboard

int*_*ted 24 html browser linux clipboard xorg

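How does one get rich text or HTML source from the X clipboard? For example, if you copy some text from a web browser and paste it into kompozer, it pastes as HTML, with links and so on preserved. However, xclip -o for the same selection outputs only plain text, reformatted in a way similar to elinks -dump. I'd like to pull the HTML out and into a text editor (specifically vim).

I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses. The X clipboard API is still a mysterious beast to me; any tips on hacking something up are most welcome. My language of choice these days is Python, but pretty much anything is fine.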

Ste*_*las 41

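To complement @rkhayrov's answer, a command for this already exists: xclip. More precisely, a patch that does this was added to xclip in late 2010 but has not been released yet. So, assuming your OS, like Debian, ships the Subversion head of xclip:

To list the targets of the CLIPBOARD selection: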

$ xclip -selection clipboard -o -t TARGETS
TIMESTAMP
TARGETS
MULTIPLE
SAVE_TARGETS
text/html
text/_moz_htmlcontext
text/_moz_htmlinfo
UTF8_STRING
COMPOUND_TEXT
TEXT
STRING
text/x-moz-url-priv
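
Since the asker mentioned Python, the same query can be made from a script. The following is only a minimal sketch: the helper names (parse_targets, pick_target, list_targets) and the preference order are assumptions made up for illustration, and it presumes an xclip build that accepts -t as shown above.

```python
import subprocess

# Richest format first; this ordering is an illustrative assumption.
PREFERRED = ("text/html", "UTF8_STRING", "STRING")


def parse_targets(output):
    """Split the text printed by `xclip ... -t TARGETS` into target names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def pick_target(targets, preferred=PREFERRED):
    """Return the first preferred target that is offered, or None."""
    for name in preferred:
        if name in targets:
            return name
    return None


def list_targets(selection="clipboard"):
    """Ask xclip for the targets of a selection (needs a running X session)."""
    result = subprocess.run(
        ["xclip", "-selection", selection, "-o", "-t", "TARGETS"],
        capture_output=True, text=True, check=True,
    )
    return parse_targets(result.stdout)
```

With a live X session, pick_target(list_targets()) would give the name to pass to -t in the next step.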

To select a particular target:

$ xclip -selection clipboard -o -t text/html
 <a href="https://stackoverflow.com/users/200540/rkhayrov" title="3017 reputation" class="comment-user">rkhayrov</a>
$ xclip -selection clipboard -o -t UTF8_STRING
 rkhayrov
$ xclip -selection clipboard -o -t TIMESTAMP
684176350

xclip can also set and own a selection (-i instead of -o).
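
Both directions can be wrapped from Python in a few lines. This is a hedged sketch rather than anything xclip itself provides: xclip_command, read_target, and write_target are hypothetical names, and it assumes the installed xclip accepts -t together with -i as well as with -o.

```python
import subprocess


def xclip_command(target, write=False, selection="clipboard"):
    """Build the xclip argument list for reading (-o) or writing (-i) a target."""
    mode = "-i" if write else "-o"
    return ["xclip", "-selection", selection, mode, "-t", target]


def read_target(target="text/html"):
    """Return the raw bytes the selection owner provides for `target`."""
    result = subprocess.run(xclip_command(target), capture_output=True, check=True)
    return result.stdout


def write_target(data, target="text/html"):
    """Feed `data` (bytes) to xclip so that it owns the selection as `target`."""
    # stdout is deliberately not captured: xclip keeps running in the
    # background to serve the selection, and a captured pipe would stay open.
    subprocess.run(xclip_command(target, write=True), input=data, check=True)
```

The result of read_target() is bytes because the encoding depends on the application that owns the selection; decode it only once you know what that application sends.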

  • Great! Any idea why it hasn't been released yet? (7 upvotes)

rkh*_*rov 24

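In X11 you have to communicate with the selection owner, ask which formats it supports, and then request the data in a specific format. I think the easiest way to do this is to use an existing windowing toolkit. For example, with Python and GTK: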

#!/usr/bin/python

import glib, gtk

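def test_clipboard():
    # With no arguments, gtk.Clipboard() refers to the CLIPBOARD selection.
    clipboard = gtk.Clipboard()
    # Ask the selection owner which formats (targets) it offers;
    # the result is None when nothing owns the selection.
    targets = clipboard.wait_for_targets() or ()
    print "Targets available:", ", ".join(map(str, targets))
    for target in targets:
        print "Trying '%s'..." % str(target)
        # Request the data converted to this particular target.
        contents = clipboard.wait_for_contents(target)
        if contents:
            print contents.data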

def main():
    mainloop = glib.MainLoop()
    def cb():
        test_clipboard()
        mainloop.quit()
    glib.idle_add(cb)
    mainloop.run()

if __name__ == "__main__":
    main()

The output will look like this:

$ ./clipboard.py 
Targets available: TIMESTAMP, TARGETS, MULTIPLE, text/html, text/_moz_htmlcontext, text/_moz_htmlinfo, UTF8_STRING, COMPOUND_TEXT, TEXT, STRING, text/x-moz-url-priv
...
Trying 'text/html'...
I asked <a href="http://superuser.com/questions/144185/getting-html-source-or-rich-text-from-the-x-clipboard">the same question on superuser.com</a>, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/_moz_htmlcontext'...
<html><body class="question-page"><div class="container"><div id="content"><div id="mainbar"><div id="question"><table><tbody><tr><td class="postcell"><div><div class="post-text"><p></p></div></div></td></tr></tbody></table></div></div></div></div></body></html>
...
Trying 'STRING'...
I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/x-moz-url-priv'...
http://stackoverflow.com/questions/3261379/getting-html-source-or-rich-text-from-the-x-clipboard
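
The script above is written for Python 2 and PyGTK. On a system that only has Python 3, roughly the same loop can be sketched with PyGObject and GTK 3. Treat this as an assumption-laden sketch, not a tested port: the names dump_clipboard and decode_payload are made up here, and the byte-order-mark check reflects the assumption that some applications deliver text/html as UTF-16 while others use UTF-8.

```python
def decode_payload(data):
    """Best-effort decoding of raw selection bytes into text."""
    # Assumption: a leading byte-order mark means the payload is UTF-16.
    if data.startswith((b"\xff\xfe", b"\xfe\xff")):
        return data.decode("utf-16", errors="replace")
    return data.decode("utf-8", errors="replace")


def dump_clipboard():
    """Print every target of the CLIPBOARD selection with its contents."""
    # Imported here so the helper above stays usable without GTK installed.
    import gi
    gi.require_version("Gtk", "3.0")
    from gi.repository import Gdk, Gtk

    clipboard = Gtk.Clipboard.get(Gdk.SELECTION_CLIPBOARD)
    # Ask the selection owner which targets it supports.
    ok, targets = clipboard.wait_for_targets()
    if not ok:
        print("No targets available")
        return
    print("Targets available:", ", ".join(atom.name() for atom in targets))
    for atom in targets:
        print("Trying '%s'..." % atom.name())
        # Request the data converted to this particular target.
        contents = clipboard.wait_for_contents(atom)
        if contents is not None:
            print(decode_payload(contents.get_data()))

# Call dump_clipboard() from a session with a running display to try it.
```

Binary targets such as TIMESTAMP will not decode to anything meaningful this way; the sketch prints them anyway, as the original script does.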