How do I parse an encoded URI in Ruby?

bra*_*rad 2 ruby url-encoding

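I'm trying to parse a URI that contains square brackets, [ and ]. I tried parsing it directly with URI.parse, but the brackets cause it to fail. So I tried encoding the URI with CGI::escape to take care of the brackets, but when I pass the encoded URI to URI.parse, it doesn't seem to recognise it as a URI and puts the whole string into the path of the resulting object.

Demonstrated in an irb session: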

irb(main):001:0> require 'uri'
=> true
irb(main):002:0> require 'cgi'
=> true
irb(main):003:0> name = "http://www.website.com/dir1/dir[2]/file.txt"
=> "http://www.website.com/dir1/dir[2]/file.txt"
irb(main):004:0> encoded_name = CGI::escape(name)
=> "http%3A%2F%2Fwww.website.com%2Fdir1%2Fdir%5B2%5D%2Ffile.txt"
irb(main):005:0> parsed_name = URI.parse(encoded_name)
=> #<URI::Generic:0x00000001e8f520 URL:http%3A%2F%2Fwww.website.com%2Fdir1%2Fdir%5B2%5D%2Ffile.txt>
irb(main):006:0> parsed_name.scheme
=> nil
irb(main):007:0> parsed_name.host
=> nil
irb(main):008:0> parsed_name.path
=> "http%3A%2F%2Fwww.website.com%2Fdir1%2Fdir%5B2%5D%2Ffile.txt"
irb(main):009:0> URI.split(encoded_name)
=> [nil, nil, nil, nil, nil, "http%3A%2F%2Fwww.website.com%2Fdir1%2Fdir%5B2%5D%2Ffile.txt", nil, nil, nil]

Anyway, my current workaround is the following ugly but working hack:

encoded_name = name.gsub(/\[/,"%5B").gsub(/\]/,"%5D")

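Parsing this with URI.parse gives the desired result, but it won't cope if other strange characters find their way into my URIs. So my question is: is there a solid way to do this that won't fall over?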

Thi*_*ais 6

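The problem is that you're applying CGI::escape to the entire URI. When you do that, the front part of the URI that holds the scheme gets escaped too (:// becomes %3A%2F%2F), and without it the URI parser is lost. You might want to try something along the lines of mtyaka's answer: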

irb(main):015:0> encoded_name = URI.encode(name, '[]')
=> "http://www.website.com/dir1/dir%5B2%5D/file.txt"
irb(main):016:0> parsed_name = URI.parse(encoded_name)
=> #<URI::HTTP:0xb76ff358 URL:http://www.website.com/dir1/dir%5B2%5D/file.txt>
irb(main):017:0> parsed_name.scheme
=> "http"
irb(main):018:0> parsed_name.host
=> "www.website.com"
irb(main):019:0> parsed_name.path
=> "/dir1/dir%5B2%5D/file.txt"
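One caveat for newer Rubies: URI.encode (an alias of URI.escape) was deprecated and then removed in Ruby 3.0, so the session above only works on older versions. Below is a minimal sketch of the same idea that doesn't rely on it. The helper name `escape_unsafe` is hypothetical, not a standard library method. It percent-encodes anything outside the RFC 3986 unreserved and reserved sets, and it deliberately treats [ and ] as unsafe, so it is presumably not suitable for URIs with IPv6 host literals.

```ruby
require 'uri'

# Hypothetical helper (not part of the standard library): percent-encode every
# byte of any character that is not in the allowed set below. Existing "%XX"
# escapes and the usual URI delimiters are left untouched; "[" and "]" are
# intentionally NOT in the allowed set, so they get encoded.
def escape_unsafe(uri_string)
  uri_string.gsub(%r{[^A-Za-z0-9\-._~:/?@!$&'()*+,;=%\#]}) do |char|
    char.bytes.map { |byte| format('%%%02X', byte) }.join
  end
end

name = "http://www.website.com/dir1/dir[2]/file.txt"
encoded_name = escape_unsafe(name)
parsed_name = URI.parse(encoded_name)

puts encoded_name        # http://www.website.com/dir1/dir%5B2%5D/file.txt
puts parsed_name.scheme  # http
puts parsed_name.host    # www.website.com
puts parsed_name.path    # /dir1/dir%5B2%5D/file.txt
```

Because the scheme, colons and slashes are left alone, URI.parse still sees a normal http URI, which is exactly what escaping the whole string with CGI::escape destroys.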

To get the original path back, apply URI.decode to whatever you get from parsed_name.path.
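URI.decode was removed in Ruby 3.0 along with URI.encode. As a rough sketch of the decoding step on newer versions, URI.decode_www_form_component can stand in for it, on the assumption that the path contains no literal "+" (this method is form-oriented and turns "+" into a space).

```ruby
require 'uri'

parsed_name = URI.parse("http://www.website.com/dir1/dir%5B2%5D/file.txt")

# Decodes "%XX" sequences back into the original characters.
# Caveat: it also converts "+" to a space, which a plain path decoder would not.
original_path = URI.decode_www_form_component(parsed_name.path)

puts original_path  # /dir1/dir[2]/file.txt
```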