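Is there a built-in function that returns a URL like ../images.html, given a base URL such as http://www.example.com/faq/index.html and a target URL such as http://www.example.com/images.html?
I checked the urlparse module. What I want is the counterpart of the urljoin() function.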
unu*_*tbu 10
You can use urlparse.urlparse to extract the paths, and the posixpath version of os.path.relpath to compute the relative path.
(Warning: this works on Linux, but may not work on Windows):
import urlparse  # Python 2; in Python 3 use: import urllib.parse as urlparse
import posixpath

def relurl(target, base):
    base = urlparse.urlparse(base)
    target = urlparse.urlparse(target)
    if base.netloc != target.netloc:
        raise ValueError('target and base netlocs do not match')
    # Prefix with '.' so that an empty path still yields a valid relative path.
    base_dir = '.' + posixpath.dirname(base.path)
    target = '.' + target.path
    return posixpath.relpath(target, start=base_dir)

# Each entry is (target, base, expected relative URL).
tests = [
    ('http://www.example.com/images.html', 'http://www.example.com/faq/index.html', '../images.html'),
    ('http://google.com', 'http://google.com', '.'),
    ('http://google.com', 'http://google.com/', '.'),
    ('http://google.com/', 'http://google.com', '.'),
    ('http://google.com/', 'http://google.com/', '.'),
    ('http://google.com/index.html', 'http://google.com/', 'index.html'),
    ('http://google.com/index.html', 'http://google.com/index.html', 'index.html'),
]

for target, base, answer in tests:
    try:
        result = relurl(target, base)
    except ValueError as err:
        print('{t!r},{b!r} --> {e}'.format(t=target, b=base, e=err))
    else:
        if result == answer:
            print('{t!r},{b!r} --> PASS'.format(t=target, b=base))
        else:
            print('{t!r},{b!r} --> {r!r} != {a!r}'.format(
                t=target, b=base, r=result, a=answer))
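Since the question asks for the counterpart of urljoin(), one way to sanity-check a result is to feed it back into urljoin() and see whether the target comes out again. The following is only a sketch, assuming Python 3 (where the urlparse module lives at urllib.parse); the variable names base_parts, target_parts, relative and round_trip are introduced here for illustration and are not part of the answer above.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def relurl(target, base):
    """Return the path of `target` relative to the directory of `base`."""
    base_parts = urlparse(base)
    target_parts = urlparse(target)
    if base_parts.netloc != target_parts.netloc:
        raise ValueError('target and base netlocs do not match')
    # Same '.' prefix trick as above, so empty paths are handled.
    base_dir = '.' + posixpath.dirname(base_parts.path)
    target_path = '.' + target_parts.path
    return posixpath.relpath(target_path, start=base_dir)


base = 'http://www.example.com/faq/index.html'
target = 'http://www.example.com/images.html'

relative = relurl(target, base)
# Joining the relative URL back onto the base should give the target again.
round_trip = urljoin(base, relative)

print(relative)
print(round_trip)
```

This only checks the one pair of URLs from the question; it does not show that the round trip holds for every input.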
The first solution that comes to mind is:
>>> os.path.relpath('/images.html', os.path.dirname('/faq/index.html'))
'../images.html'
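Of course, this requires parsing the URLs, then comparing the domains (!!), then rewriting the path if they match, and finally re-adding the query and fragment.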
import urlparse  # Python 2; in Python 3 use: import urllib.parse as urlparse
import posixpath

def relative_url(destination, source):
    u_dest = urlparse.urlsplit(destination)
    u_src = urlparse.urlsplit(source)
    # Rebuild scheme://netloc only, so the two origins can be compared.
    _uc1 = urlparse.urlunsplit(u_dest[:2] + ('', '', ''))
    _uc2 = urlparse.urlunsplit(u_src[:2] + ('', '', ''))
    if _uc1 != _uc2:
        # Different scheme or domain: return the destination unchanged.
        return destination
    _relpath = posixpath.relpath(u_dest.path, posixpath.dirname(u_src.path))
    return urlparse.urlunsplit(('', '', _relpath, u_dest.query, u_dest.fragment))
Then:
>>> relative_url('http://www.example.com/images.html', 'http://www.example.com/faq/index.html')
'../images.html'
>>> relative_url('http://www.example.com/images.html?my=query&string=here#fragment', 'http://www.example.com/faq/index.html')
'../images.html?my=query&string=here#fragment'
>>> relative_url('http://www.example.com/images.html', 'http://www2.example.com/faq/index.html')
'http://www.example.com/images.html'
>>> relative_url('https://www.example.com/images.html', 'http://www.example.com/faq/index.html')
'https://www.example.com/images.html'
Edit: this now uses the posixpath implementation of os.path so that it also works on Windows.
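To illustrate why the choice of module matters, here is a small sketch comparing posixpath with ntpath, the module that os.path resolves to on Windows. Both can be imported on any platform; the variable names are made up for this example.

```python
import ntpath
import posixpath

target_path = '/images.html'
base_dir = posixpath.dirname('/faq/index.html')  # '/faq'

# posixpath always joins components with '/', which is what a URL needs.
posix_style = posixpath.relpath(target_path, base_dir)

# ntpath follows Windows conventions and joins components with backslashes,
# so its result is not usable as a URL path without further conversion.
windows_style = ntpath.relpath(target_path, base_dir)

print(posix_style)
print(windows_style)
```

Using posixpath explicitly keeps the separator fixed to the forward slash regardless of the platform the code runs on.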