Python: scraping with XPath prints "[]"
import requests
from lxml import etree
url = 'https://movie.douban.com/chart'
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'}
rs = requests.get(url, headers=headers)
rs = rs.text
e1 = etree.HTML(rs)
info = e1.xpath("/html/body/div[@id='wrapper']/div[@id='content']/div[@class='grid-16-8 clearfix']/div[@class='article']/div[@class='indent']/div/table/tbody/tr/td/div/a/text()")
print(info)
[]
How do I get the correct output?
The XPath is wrong. It should be:
info = e1.xpath("/html/body/div[@id='wrapper']/div[@id='content']/div[@class='grid-16-8 clearfix']/div[@class='article']/div[@class='indent']/div/table/tr/td/div/a/text()")
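The HTML you see in the browser is the DOM after the browser has parsed and rendered the page, and it is not the same as what requests receives. Here the difference is the tbody element: browsers insert it into tables automatically, but it is not in the raw response, so a path that goes through tbody matches nothing and xpath() returns an empty list.
When writing an XPath, always base it on the HTML you actually received (for example, print rs.text and inspect it).
By the way, an absolute XPath that long is hard to read and brittle. Something like this is better: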
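To see the effect without hitting the network, here is a minimal sketch. The HTML string is a made-up fragment standing in for the raw response (it is an assumption, not Douban's actual markup), and the variable names are illustrative only. It compares the same query with and without tbody:

```python
from lxml import etree

# Hypothetical fragment imitating a raw response: the table has no <tbody>.
raw_html = """
<html><body>
  <div class="indent"><div>
    <table><tr><td><div><a href="#">Example Movie</a></div></td></tr></table>
  </div></div>
</body></html>
"""

tree = etree.HTML(raw_html)

# Path copied from browser devtools (includes tbody): nothing matches.
with_tbody = tree.xpath("//div[@class='indent']/div/table/tbody/tr/td/div/a/text()")

# Path written against the HTML actually received (no tbody): matches.
without_tbody = tree.xpath("//div[@class='indent']/div/table/tr/td/div/a/text()")

print(with_tbody)
print(without_tbody)
```

lxml's HTML parser does not add tbody the way a browser does, so only the second query finds the link text.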
info = e1.xpath("//div[@class='article']/div[@class='indent']/div/table/tr/td/div/a/text()")
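If you would rather not care whether tbody is present, one option is to use the descendant axis (//) between table and tr. The following is only a sketch: extract_titles is a hypothetical helper, the two HTML strings are invented test inputs, and the strip() call assumes the link text may carry surrounding whitespace.

```python
from lxml import etree

def extract_titles(html):
    """Return the stripped link texts, whether or not the table has a tbody."""
    tree = etree.HTML(html)
    # table//tr matches tr elements at any depth under table,
    # so it works for both table/tr and table/tbody/tr.
    texts = tree.xpath("//div[@class='indent']//table//tr/td/div/a/text()")
    return [t.strip() for t in texts if t.strip()]

# Invented inputs: one without tbody (like a raw response), one with it
# (like the DOM shown in browser devtools).
raw = ("<div class='indent'><div><table>"
       "<tr><td><div><a>\n  Example Movie \n</a></div></td></tr>"
       "</table></div></div>")
rendered = ("<div class='indent'><div><table><tbody>"
            "<tr><td><div><a>\n  Example Movie \n</a></div></td></tr>"
            "</tbody></table></div></div>")

print(extract_titles(raw))
print(extract_titles(rendered))
```

The trade-off is that // is less strict, so it can also match rows in nested tables if the page has any.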