BeautifulSoup: get the tag name of the element itself, not its children

Question

I have the following (simplified) code, which uses this source:

<html>
    <p>line 1</p>
    <div>
        <a>line 2</a>
    </div>
</html>

from bs4 import BeautifulSoup

soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>', 'html.parser')
ele = soup.find('p').nextSibling
# somehow print the tag name of ele here

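I want to get the tag name of ele, which in this case is "div". However, I only seem to be able to get the tag names of its children. Am I missing something simple? I thought I could use ele.tag.name, but that raises an exception because ele.tag is None.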

# Correctly prints the div element "<div><a>line 2</a></div>"
print(ele)

# Prints "None". Printing ele.tag.name raises an exception since ele.tag is None
print(ele.tag)

# Prints "a", the child of ele
allTags = ele.findAll(True)
for e in allTags:
    print(e.name)

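At this point I am considering getting ele's parent, listing the tags of the parent's children, counting how many preceding siblings ele has, and then counting down to the correct child tag. That seems ridiculous.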

Answer 1

Seb*_*Piu 26

ele is already a tag. Try this:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>', 'html.parser')
print(soup.find('p').nextSibling.name)
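
As background on why `ele.tag` printed None in the question: in bs4, reading an unknown attribute on a Tag is shorthand for calling `find()` with that name, so `ele.tag` searches for a child `<tag>` element, whereas `.name` holds the element's own tag name. A minimal sketch, assuming the `bs4` package and the standard `html.parser` backend (the variable names are illustrative only):

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>', 'html.parser')
ele = soup.find('p').next_sibling  # the <div> element (bs4 spelling of nextSibling)

own_name = ele.name   # the element's own tag name
child_a = ele.a       # shorthand for ele.find('a'): the first <a> descendant
missing = ele.tag     # shorthand for ele.find('tag'): there is no <tag> element

print(own_name)       # div
print(child_a.name)   # a
print(missing)        # None
```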

So in your example it would simply be:

print(ele.name)
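
One caveat when applying this to the indented source from the question rather than the one-line string: the parser keeps the newline and indentation between `</p>` and `<div>` as a text node, so the next sibling of `<p>` can be a whitespace string rather than a Tag. A hedged sketch, assuming bs4 with the `html.parser` backend, that uses `find_next_sibling()` to skip ahead to the next element:

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

html = """<html>
    <p>line 1</p>
    <div>
        <a>line 2</a>
    </div>
</html>"""

soup = BeautifulSoup(html, 'html.parser')
p = soup.find('p')

text_node = p.next_sibling        # whitespace between </p> and <div>
next_tag = p.find_next_sibling()  # first following sibling that is a Tag

print(isinstance(text_node, NavigableString))  # True
print(repr(text_node))                         # a whitespace-only string
print(next_tag.name)                           # div
```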

Is it possible to use ele.name in a condition? Something like `if ele.name is 'a':` does not work for me. (2 upvotes)
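
On the comment above: `is` tests object identity rather than equality, so `ele.name is 'a'` can be False even when the name is the string 'a' (Python 3.8 and later also emit a SyntaxWarning for `is` with a literal). Comparing with `==` is the usual approach. A minimal sketch, assuming bs4 and the `html.parser` backend:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<html><p>line 1</p><div><a>line 2</a></div></html>', 'html.parser')
ele = soup.find('p').next_sibling

# Compare tag names by value with ==, not by identity with `is`.
if ele.name == 'div':
    print('ele is a <div>')

# Membership tests compare by value too, which helps with several candidate names.
is_block = ele.name in ('div', 'p')
print(is_block)
```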

Asked:	13 years, 11 months ago
Viewed:	25,174 times
Last active:	6 years, 3 months ago