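Beginner here. I'm learning Python web scraping and following a tutorial to download a Bilibili video. Until now I've only scraped small sites with no anti-scraping measures, so I've never had to set request headers. Today I tried it and got an error on the Referer line. It's reported as a syntax error, and I can't see what's wrong with it.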
import requests  # send HTTP requests

if __name__ == '__main__':
    # target URLs
    url_30064 = 'https://upos-sz-mirrorkodoo1.bilivideo.com/upgcxcode/17/56/347885617/347885617-1-30064.m4s?e=ig8euxZM2rNcNbdlhoNvNC8BqJIzNbfqXBvEqxTEto8BTrNvN0GvT90W5JZMkX_YN0MvXg8gNEV4NC8xNEV4N03eN0B5tZlqNxTEto8BTrNvNeZVuJ10Kj_g2UB02J0mN0B5tZlqNCNEto8BTrNvNC7MTX502C8f2jmMQJ6mqF2fka1mqx6gqj0eN0B599M=&uipk=5&nbs=1&deadline=1624126563&gen=playurlv2&os=kodoo1bv&oi=1904523163&trid=5708ece4a138488c89e6b09d5660a15eu&platform=pc&upsig=54cf0562515ab061c77baad99e337a66&uparams=e,uipk,nbs,deadline,gen,os,oi,trid,platform&mid=4309805&bvc=vod&orderid=0,3&agrr=1&logo=80000000'
    url_30280 = 'https://xy222x208x115x43xy.mcdn.bilivideo.cn:4483/upgcxcode/17/56/347885617/347885617-1-30280.m4s?e=ig8euxZM2rNcNbdlhoNvNC8BqJIzNbfqXBvEqxTEto8BTrNvN0GvT90W5JZMkX_YN0MvXg8gNEV4NC8xNEV4N03eN0B5tZlqNxTEto8BTrNvNeZVuJ10Kj_g2UB02J0mN0B5tZlqNCNEto8BTrNvNC7MTX502C8f2jmMQJ6mqF2fka1mqx6gqj0eN0B599M=&uipk=5&nbs=1&deadline=1624126299&gen=playurlv2&os=mcdn&oi=1904523163&trid=000190cb941cfc9e4613a5cafb127561a0cfu&platform=pc&upsig=51640a5b79fc0a374fda3798365f20d9&uparams=e,uipk,nbs,deadline,gen,os,oi,trid,platform&mcdnid=9000453&mid=4309805&bvc=vod&orderid=0,3&agrr=1&logo=A0000100'
    # headers used to imitate a browser
    headers = {
        'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36'
        'Referer':'https://www.bilibili.com/video/BV1AK4y137SD?spm_id_from=333.851.b_62696c695f7265706f72745f646f756761.9'
    }
    # send the requests and get the response objects
    response_30064 = requests.get(url_30064, headers=headers)
    response_30280 = requests.get(url_30280, headers=headers)
    data_30064 = response_30064.content  # extract the raw bytes
    data_30280 = response_30280.content
    # save the bytes locally as MP4
    with open('bigeyes_girl_30064.mp4', 'wb') as f:
        f.write(data_30112)
    with open('bigeyes_girl_30064.mp4', 'wb') as f:
        f.write(data_30280)
# Below is the error output
C:\Users\lee\AppData\Local\Programs\Python\Python39\python.exe C:/Users/lee/Desktop/python_work/测试.py
File "C:\Users\lee\Desktop\python_work\测试.py", line 18
'Referer':'https://www.bilibili.com/video/BV1AK4y137SD?spm_id_from=333.851.b_62696c695f7265706f72745f646f756761.9'
^
SyntaxError: invalid syntax
Process finished with exit code 1
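A note on the traceback: a SyntaxError is raised while Python parses the file, before any request is sent. In the posted headers dict there is no comma after the 'User-Agent' value. Python joins adjacent string literals, so the value and 'Referer' are read as one string, and the colon that follows is where parsing fails. That is why the error is reported on the Referer line. The sketch below reproduces this with shortened, illustrative header values; the names BROKEN, FIXED and parses are hypothetical and only serve the demonstration.

```python
import ast

# Same shape as the posted dict: no comma after the first value.
# The header values are shortened placeholders, not the real ones.
BROKEN = """headers = {
    'User-Agent': 'Mozilla/5.0'
    'Referer': 'https://www.bilibili.com/'
}"""

# Identical, except for the comma separating the two entries.
FIXED = """headers = {
    'User-Agent': 'Mozilla/5.0',
    'Referer': 'https://www.bilibili.com/',
}"""


def parses(source):
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print(parses(BROKEN))  # the missing comma makes this invalid
print(parses(FIXED))   # valid once the entries are separated
```

Once the comma is added, two further problems in the posted script would likely show up: data_30112 is never defined (data_30064 is probably what was meant), and both open() calls use the same filename, so the second write would overwrite the first.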
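You are fetching two pages at once with no delay between them, so of course it fails; that is easily detected as a crawler. Put the two URLs in a list, fetch them one at a time in a loop, and add a time.sleep() between requests.
If this helps, please click Accept.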
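Pacing the requests will not remove the SyntaxError itself, since that is raised before any request is sent, but once the script parses, the list-and-loop suggestion could be sketched as follows. The function name fetch_in_turn, the fetch callable, the two-second default and the injectable sleep parameter are all assumptions made for illustration, not part of the original script.

```python
import time


def fetch_in_turn(urls, fetch, delay_seconds=2.0, sleep=time.sleep):
    """Call fetch(url) for each URL in order, pausing between requests.

    fetch is any callable that takes a URL and returns the downloaded data.
    sleep is injectable so the pacing can be checked without really waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

With requests, this might be called as `fetch_in_turn([url_30064, url_30280], lambda u: requests.get(u, headers=headers).content)`, after which each returned bytes object can be written to its own file.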
Try deleting the Referer and see what happens.