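Calling get directly from the command line returns the page, but when I put it in the program below I get a human-verification (CAPTCHA) page instead. Why is that?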
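import requests

keyword = "Python"
try:
    kv = {'wd': keyword}                   # query-string parameters
    kv2 = {'user-agent': 'Mozilla/5.0'}    # minimal browser-like User-Agent
    r = requests.get("http://www.baidu.com/s", params=kv, headers=kv2)
    r.raise_for_status()                   # raise on 4xx/5xx responses
    r.encoding = r.apparent_encoding       # use the encoding guessed from the body
    print(r.text)
except requests.RequestException:
    print("Request failed")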
You need to add Accept and Referer to the headers. Try writing it like this:
kv2 = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Referer': 'http://www.baidu.com/s'
}
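For reference, here is a rough sketch of the whole script with those headers plugged in. The names SEARCH_URL, BROWSER_HEADERS, preview_url and search are made up for this example, and the 10-second timeout is my own assumption. Sending browser-like headers is only a best guess at what the site checks, so the verification page may still show up.

```python
import requests

SEARCH_URL = "http://www.baidu.com/s"

# Header values copied from the answer above; nothing guarantees they are sufficient.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
    ),
    "Accept": (
        "text/html,application/xhtml+xml,application/xml;q=0.9,"
        "image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"
    ),
    "Referer": "http://www.baidu.com/s",
}


def preview_url(keyword):
    """Build the request without sending it and return the final URL."""
    request = requests.Request(
        "GET", SEARCH_URL, params={"wd": keyword}, headers=BROWSER_HEADERS
    )
    return request.prepare().url


def search(keyword, timeout=10):
    """Send the search request and return the decoded page text."""
    r = requests.get(
        SEARCH_URL,
        params={"wd": keyword},
        headers=BROWSER_HEADERS,
        timeout=timeout,  # assumed value, so a stalled connection does not hang forever
    )
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    return r.text


if __name__ == "__main__":
    print(preview_url("Python"))
    try:
        print(search("Python")[:500])  # only the first 500 characters
    except requests.RequestException as exc:
        print("Request failed:", exc)
```

preview_url is handy for checking what would be sent before hitting the server. Catching requests.RequestException and printing it also tells you whether the failure is an HTTP error, a timeout, or a connection problem.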
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
Give this a try; it may not be entirely standard.