I want:
<div data-a>
But the lxml API only seems to give me this:
<div data-a=''>
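How do I get an attribute with no value?
Annoyingly, lxml reports both valueless attributes and empty-valued attributes as empty strings.
Setting the value to None doesn't help; it raises a TypeError.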
In [19]: from lxml.html import fromstring, tostring
In [20]: b = fromstring('<body class="meow" data-a="haha" data-b data-x="">text-fef27e87389e466fb99b5421629323f6</body>')
In [21]: b.attrib
Out[21]: {'data-a': 'haha', 'data-x': '', 'data-b': '', 'class': 'meow'}
In [24]: b.attrib['data-y'] = None
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-24-1f55133e3dc4> in <module>()
----> …
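For what it's worth, one possible way around this is to go through `set()` on the element rather than assigning into `attrib`. The sketch below assumes a reasonably recent lxml release in which `lxml.html` elements accept `set(key)` with the value omitted and treat that as a request for a boolean (valueless) attribute. I have not checked which version introduced this, so take it as an assumption to verify against your install. The `div` and `serialized` names are just for illustration.

```python
from lxml.html import fromstring, tostring

# Parse a one-element fragment; fromstring hands back the <div> itself.
div = fromstring('<div></div>')

# Assumption: lxml.html elements in this lxml version allow the value to be
# omitted, which asks for a valueless attribute instead of data-a="".
div.set('data-a')

# Serialize with the default HTML method; the result is a bytes object.
serialized = tostring(div)
print(serialized)

# Reading the attribute back still goes through the normal attribute API.
print(repr(div.get('data-a')))
```

Note that this only changes how the attribute is written out. When reading, `attrib` and `get()` may still not let you tell a valueless attribute from an empty one, as the session above shows.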