To get the parent of a node, use its parent attribute:
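# Tag attributes are omitted; only the nesting of the a node inside p matters here.
html = """
<html>
<head>
<title>The Dormouse's story</title>
</head>
<body>
<p>Once upon a time there were three little sisters; and their names were
<a>Elsie</a>
</p>
...
"""
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')
# soup.a is the first a node; parent is the node that directly contains it.
print(soup.a.parent)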
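The output is:

<p>Once upon a time there were three little sisters; and their names were
<a>Elsie</a>
</p>

Here we selected the parent of the first a node. Its parent is the p node, so the output is the p node together with everything inside it.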
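Because parent returns an ordinary node object, it can be inspected or chained like any other node. The following is a minimal sketch of that idea, not part of the original example: the markup is a hypothetical reduction with attributes omitted, and it uses the built-in 'html.parser' instead of 'lxml' purely so that it runs without an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: attributes are omitted, only the nesting matters.
html = """
<html><body>
<p>Once upon a time there were three little sisters; and their names were
<a>Elsie</a>
</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
first_link = soup.a

# parent is itself a node, so its tag name is available directly.
parent = first_link.parent
print(parent.name)        # p

# Chaining parent moves up exactly one level each time.
grandparent = parent.parent
print(grandparent.name)   # body

# The top-level BeautifulSoup object has no parent.
print(soup.parent)        # None
```

Each access to parent climbs a single level, which is why a separate attribute is needed to collect every ancestor at once.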
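Note that parent returns only the direct parent of the a node; it does not continue outward to that parent's own ancestors. To get all ancestors, use the parents attribute:

# Assumed minimal markup: an a node nested inside p, body, and html.
html = """
<html>
<body>
<p>
<a>Elsie</a>
</p>
</body>
</html>
"""
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')
# parents is lazy, so print its type first, then materialize it with list().
print(type(soup.a.parents))
print(list(enumerate(soup.a.parents)))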
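The output is (node contents abbreviated as ...):

<class 'generator'>
[(0, <p>...</p>), (1, <body>...</body>), (2, <html>...</html>), (3, <html>...</html>)]

As shown, parents returns a generator. Passing it through enumerate and list prints each ancestor with its index, so the elements of the list are the ancestors of the a node, ordered from the nearest outward. The last entry is the BeautifulSoup object that represents the whole document.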
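Printing whole nodes makes the ancestor chain hard to read, so it can help to print only the tag names. The sketch below assumes the same kind of minimal hypothetical markup and again uses 'html.parser' to avoid depending on lxml; it also shows that the generator can be consumed only once.

```python
from bs4 import BeautifulSoup

# Hypothetical minimal markup: a inside p inside body inside html.
html = "<html><body><p><a>Elsie</a></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# Collect just the name of each ancestor, nearest first.
ancestor_names = [ancestor.name for ancestor in soup.a.parents]
print(ancestor_names)  # ['p', 'body', 'html', '[document]']

# A generator is exhausted after one pass; a second pass yields nothing.
ancestors = soup.a.parents
first_pass = list(ancestors)
second_pass = list(ancestors)
print(len(first_pass), len(second_pass))  # 4 0
```

If the ancestors are needed more than once, store the result of list(...) rather than the generator itself.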
Sibling nodes are retrieved like this:
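# Tag attributes are omitted; the a nodes and the text between them are all children of p.
html = """
<html>
<body>
<p>
 Once upon a time there were little sisters; and their names were
<a>Elsie</a>
 Hello
<a>Lacie</a>
 and
<a>Tillie</a>
 and they lived at the bottom of a well.
</p>
</body>
</html>
"""
from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'lxml')
# Single neighbors of the first a node.
print('Next Sibling', soup.a.next_sibling)
print('Prev Sibling', soup.a.previous_sibling)
# All following and all preceding siblings, with their indices.
print('Next Siblings', list(enumerate(soup.a.next_siblings)))
print('Prev Siblings', list(enumerate(soup.a.previous_siblings)))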
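The output is:

Next Sibling
 Hello

Prev Sibling
 Once upon a time there were little sisters; and their names were

Next Siblings [(0, '\n Hello\n'), (1, <a>Lacie</a>), (2, '\n and\n'), (3, <a>Tillie</a>), (4, '\n and they lived at the bottom of a well.\n')]
Prev Siblings [(0, '\n Once upon a time there were little sisters; and their names were\n')]

Four attributes are used here. next_sibling and previous_sibling return the node immediately after and immediately before the current one; as the output shows, that neighbor may be a text node rather than a tag. next_siblings and previous_siblings return generators over all following siblings and all preceding siblings, respectively.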
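Since the sibling attributes return text nodes as well as tags, a common follow-up is to keep only the tags. The sketch below shows two ways to do that under the same hypothetical markup (attributes omitted, 'html.parser' chosen only to avoid an lxml dependency): filtering with isinstance, and Beautiful Soup's find_next_sibling and find_next_siblings methods, which match tags only.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical markup: three a nodes separated by plain text inside one p.
html = """
<html><body>
<p>
 Once upon a time there were little sisters; and their names were
<a>Elsie</a>
 Hello
<a>Lacie</a>
 and
<a>Tillie</a>
 and they lived at the bottom of a well.
</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
first_link = soup.a

# next_siblings mixes text nodes and tags; keep only the tags.
tag_siblings = [s for s in first_link.next_siblings if isinstance(s, Tag)]
tag_texts = [t.get_text() for t in tag_siblings]
print(tag_texts)  # ['Lacie', 'Tillie']

# find_next_sibling / find_next_siblings skip text nodes and match by tag name.
next_link = first_link.find_next_sibling("a")
all_next_links = [t.get_text() for t in first_link.find_next_siblings("a")]
print(next_link.get_text())  # Lacie
print(all_next_links)        # ['Lacie', 'Tillie']

# The immediate neighbor is a text node; strip() removes its surrounding whitespace.
neighbor_text = first_link.next_sibling.strip()
print(neighbor_text)  # Hello

# The first a node has no earlier a sibling, so the lookup returns None.
previous_link = first_link.find_previous_sibling("a")
print(previous_link)  # None
```

The find_* methods are usually the shorter option when only tags of a known name are wanted; the plain attributes are the better fit when the text between tags matters too.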
Reposted from: http://csyrz.baihongyu.com/