Scraped movie titles come out garbled

Problem and background

I am scraping the URLs and movie titles from a web page, but the titles come out garbled.

Code

import requests
import re
url = 'https://www.1905.com/vod/?=0.2791749943538798'
data = {}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36'
}
resp = requests.get(url, headers=headers)
nei_rong = resp.text

obj = re.compile(r'<a href=(?P<url>.*?) target="_blank" title="(?P<title>.*?)">.*?</a>')
result = obj.finditer(nei_rong)
for i in result:
    print(i.group('url'), i.group('title'))

Results and error output

D:\Python3.10\python.exe D:/setting/maa/chang.py
"https://www.1905.com/vod/play/1456417.shtml" 别哭!妈妈
"https://www.1905.com/vod/play/1451547.shtml" 打过长江去
"https://www.1905.com/vod/play/1451546.shtml" 亲密旅行
"https://www.1905.com/vod/play/1449745.shtml" 战斧行动
"https://www.1905.com/vod/play/1444908.shtml" 侠路相逢
"https://www.1905.com/vod/play/1069278.shtml" 谋杀似水年华
"https://www.1905.com/vod/dp/" 查看更多
"https://www.1905.com/vod/play/1391266.shtml" 杀破狼·贪狼
"https://www.1905.com/vod/play/1391165.shtml" 追龙
"https://www.1905.com/vod/play/1391230.shtml" 喵星人
"https://www.1905.com/vod/play/1391225.shtml" 侠盗联盟
"https://www.1905.com/vod/play/1391222.shtml" 拆弹专家
"https://www.1905.com/vod/play/969017.shtml" 华丽上班族
"https://www.1905.com/vod/gp/" 查看更多
"https://www.1905.com/vod/list/c_178/o3u1p1.html" 查看更多
"https://www.1905.com/vod/nd/" 查看更多

What I want to achieve

The titles should simply not be garbled.
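
A named-group pattern of this kind can be checked offline before pointing it at the live site. The sketch below runs one against a single made-up anchor tag; the sample markup is an assumption modeled on the printed output, not the site's actual HTML. It also places the quotes outside the `url` group, which should keep them out of the captured value.

```python
import re

# Made-up sample markup (an assumption, not copied from the real page).
sample = (
    '<a href="https://www.1905.com/vod/play/1391165.shtml" '
    'target="_blank" title="追龙">追龙</a>'
)

# The quotes sit outside the url group, so they are not captured.
pattern = re.compile(
    r'<a href="(?P<url>.*?)" target="_blank" title="(?P<title>.*?)">.*?</a>'
)

# Collect (url, title) pairs from every match in the sample.
rows = [(m.group("url"), m.group("title")) for m in pattern.finditer(sample)]
for url, title in rows:
    print(url, title)
```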

Change the nei_rong line to this:
nei_rong = resp.content.decode('utf8')
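
A likely reason this helps: when a server sends a text Content-Type header without a charset, requests falls back to ISO-8859-1 for `resp.text`, so UTF-8 bytes get misread. The sketch below reproduces that effect offline. The byte string `raw` is a hypothetical stand-in for `resp.content`, and it assumes the page really is UTF-8.

```python
# Hypothetical stand-in for resp.content: the UTF-8 bytes of one title.
raw = "别哭!妈妈".encode("utf-8")

# What the text looks like if the bytes are decoded as ISO-8859-1.
garbled = raw.decode("iso-8859-1")

# Decoding the same bytes as UTF-8 gives the intended title.
fixed = raw.decode("utf-8")

print(repr(garbled))
print(fixed)
```

If the page turns out to use another encoding such as GBK, the same idea applies with that codec name in place of `"utf-8"`.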

After the request, add one line to set the encoding: resp.encoding = 'utf-8'

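Garbled text is a common problem when scraping. The three methods in this blog post are worth trying; the most common fixes are decoding the raw bytes yourself or setting the encoding explicitly.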
https://blog.csdn.net/xx_nm98/article/details/123191514

If this helps, please accept the answer.

resp = requests.get(url, headers=headers)
resp.encoding = 'utf-8'
nei_rong = resp.text
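
The answers above hard-code UTF-8, which is fine when the page is known to use it. When the encoding is not known in advance, one option is to read the charset the page declares in its own `<meta>` tag and decode the raw bytes with that. The helper below is a minimal sketch of this idea; `decode_html` and `META_CHARSET` are hypothetical names, and the input is assumed to be raw bytes such as `resp.content`.

```python
import re

# Matches declarations such as <meta charset="utf-8"> or
# <meta http-equiv="Content-Type" content="text/html; charset=gbk">.
META_CHARSET = re.compile(rb'<meta[^>]*?charset=["\']?([\w-]+)', re.IGNORECASE)


def decode_html(raw: bytes, default: str = "utf-8") -> str:
    """Decode raw HTML bytes with the declared charset, else `default`."""
    # Only the start of the document is searched for the declaration.
    match = META_CHARSET.search(raw[:2048])
    encoding = match.group(1).decode("ascii") if match else default
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown or incorrect declaration: fall back to the default codec.
        return raw.decode(default, errors="replace")


# Usage with a made-up GBK page standing in for resp.content.
page = '<meta charset="gbk"><title>追龙</title>'.encode("gbk")
print(decode_html(page))
```

In the question's code this would replace `nei_rong = resp.text` with `nei_rong = decode_html(resp.content)`.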