```python
import requests
from lxml import etree
import json

url = "http://music.163.com/playlist?id=2182968685"
# Browser-like headers sent with the request
headers = {
    'Host': 'music.163.com',
    'Referer': 'http://music.163.com/',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36(KHTML, like Gecko)Chrome/79.10.3945.130 Safari/537.36',
}
s = requests.session()
r = s.get(url, headers=headers)
# Parse the response body into an lxml element tree
results = etree.HTML(r.content)
print(results)
# Collect the href of every <a> that is a direct child of a <span>
results = results.xpath('//span/a/@href')
#print(results)
for result in results:
    print(result)
```

That is the code. The output is as follows:

```
<Element html at 0x2246160fac8>
/user/home?id=60348755
javascript:void(0)
javascript:void(0)
javascript:void(0)
javascript:void(0)
javascript:void(0)
/artist?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
/song?id=${x.id}
```
I'd be grateful if someone could point out where exactly this is going wrong.
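Your rule matches too broad a range, so it also picks up links that are not part of the song list. Try changing your XPath rule to: `results = results.xpath('//ul[@class="f-hide"]/li/a/@href')`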
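To illustrate the difference between the two rules, here is a minimal sketch that runs both against a small made-up HTML snippet instead of the live page. The markup in `SAMPLE_HTML` (the song IDs, the `div` holding a `${x.id}` placeholder link) is an assumption for demonstration only; the only detail taken from this thread is that the song links sit under `ul` with class `f-hide`.

```python
from lxml import etree

# Hypothetical stand-in for the playlist page (NOT the real markup):
# one profile link, a hidden <ul class="f-hide"> with the song links,
# and a template-like link whose href still holds a ${x.id} placeholder.
SAMPLE_HTML = """
<html><body>
  <span><a href="/user/home?id=1">creator</a></span>
  <ul class="f-hide">
    <li><a href="/song?id=101">Song A</a></li>
    <li><a href="/song?id=102">Song B</a></li>
  </ul>
  <div><span><a href="/song?id=${x.id}">placeholder</a></span></div>
</body></html>
"""

tree = etree.HTML(SAMPLE_HTML)

# Broad rule from the question: every <a> directly inside any <span>
broad = tree.xpath('//span/a/@href')

# Narrow rule from the answer: only links inside the f-hide list
narrow = tree.xpath('//ul[@class="f-hide"]/li/a/@href')

# Pull the numeric id out of each song link
song_ids = [href.split('id=')[1] for href in narrow]

print(broad)
print(narrow)
print(song_ids)
```

With this sample, the broad rule returns the profile link and the placeholder link, while the narrow rule returns only the two song links. Whether the live page still uses the same structure is something you would need to check in the actual response.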
How can this be solved? Asking for help.
Same here. I'm using BeautifulSoup and get the same result.
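For BeautifulSoup, the same narrowing can probably be expressed with a CSS selector. This is only a sketch against a made-up snippet: `SAMPLE_HTML` and its song IDs are hypothetical, and it assumes the song links are inside `ul` with class `f-hide`, as suggested earlier in this thread.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the playlist page (NOT the real markup)
SAMPLE_HTML = """
<html><body>
  <span><a href="/user/home?id=1">creator</a></span>
  <ul class="f-hide">
    <li><a href="/song?id=101">Song A</a></li>
    <li><a href="/song?id=102">Song B</a></li>
  </ul>
  <div><span><a href="/song?id=${x.id}">placeholder</a></span></div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# Select only the links inside the hidden song list,
# skipping the profile link and the ${x.id} placeholder link
hrefs = [a['href'] for a in soup.select('ul.f-hide > li > a[href]')]

for href in hrefs:
    print(href)
```

On the real page you would pass `r.content` from the request to `BeautifulSoup` instead of `SAMPLE_HTML`.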