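I'm taking a Python web-scraping course. Following the video lesson, I tried to scrape the free novel Journey to the West (西游记) from Baidu's reading site in order to fetch the contents of every chapter. In the video the request returns the book's data, but my request comes back without it, and adding headers and params did not help. I then downloaded the course's own code and ran it, and that does not work either.
Here is the code: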
# Target page: https://dushu.baidu.com/pc/reader?gid=4306063500&cid=11348571
import requests

# Note: headers and params are not shown in the post; presumably they are defined elsewhere in the script.
def getCatalog(url):
    resp = requests.get(url, headers=headers, params=params)
    print(resp.text)

if __name__ == '__main__':
    b_id = "4306063500"
    url = 'https://dushu.baidu.com/api/pc/getCatalog?data={"' + b_id + '"}'
    getCatalog(url)
Response:
{"errno":0,"data":{"novel":[],"errno":1},"logid":"2462003550","mac":"10.187.83.64","timestamp":"1627213262","s_log":"823b3751e32df627b858bcc57028c284","s_father_log":"823b3751e32df627b858bcc57028c284","s_root_log":"823b3751e32df627b858bcc57028c284"}
The "data":{"novel":[],"errno":1} part should contain the catalog data, with its errno equal to 0, but here "novel" is empty and the inner errno is 1. What is causing this, and how can I fix it?
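The URL you are requesting may be wrong. The address the page actually requests can differ slightly from the one you typed, for example after passing through several layers of encryption, so compare the two carefully.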
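The parameter format is wrong: it should be data={"book_id":"4306063500"}, not data={"4306063500"}. Without the "book_id" key the value is not a valid JSON object, so the API has no book ID to look up.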
import requests

# Target page: https://dushu.baidu.com/pc/reader?gid=4306063500&cid=11348571
def getCatalog(url):
    # Send a browser-like User-Agent along with the request.
    headers = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36"}
    resp = requests.get(url, headers=headers)
    print(resp.text)

if __name__ == '__main__':
    b_id = "4306063500"
    # The data parameter must be a JSON object keyed by "book_id".
    url = 'https://dushu.baidu.com/api/pc/getCatalog?data={"book_id":"' + b_id + '"}'
    getCatalog(url)
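If you would rather not assemble the JSON by hand (which is how the "book_id" key got lost in the first place), one option is to serialize a dict and let the standard library URL-encode it. The following is only a sketch: the helper names build_catalog_url and catalog_ok are made up for illustration, and the check assumes that a successful response carries a non-empty "novel" field with an inner errno of 0, which is inferred from the failing response quoted in the question rather than from any API documentation.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the thread above.
CATALOG_API = "https://dushu.baidu.com/api/pc/getCatalog"


def build_catalog_url(book_id):
    """Return the catalog URL with data={"book_id": ...} JSON-serialized and URL-encoded."""
    # json.dumps guarantees a well-formed object, so the key cannot be dropped by accident.
    data = json.dumps({"book_id": book_id}, separators=(",", ":"))
    return CATALOG_API + "?" + urlencode({"data": data})


def catalog_ok(payload):
    """Hypothetical sanity check: both errno fields are 0 and "novel" is non-empty."""
    data = payload.get("data") or {}
    return (
        payload.get("errno") == 0
        and data.get("errno") == 0
        and bool(data.get("novel"))
    )


if __name__ == "__main__":
    # Print the encoded URL; it can be passed to getCatalog() from the answer above.
    print(build_catalog_url("4306063500"))
```

The URL returned by build_catalog_url can be handed to getCatalog(url) unchanged, and catalog_ok(resp.json()) gives a quick way to tell a real catalog from the empty "novel":[] reply shown in the question.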