My code doesn't produce the result I want. How do I fix it? (Language: Python)

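Target site: https://top.baidu.com/board?tab=movie
1. Fetch the page source
2. Parse the data with regular expressions to get the movie title, genre, and cast for every entry on the board
3. Save the data to a CSV file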
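
Step 3 is not covered by the code posted in this thread, so here is a minimal sketch of the CSV part. It assumes the parsed result is a list of (title, genre, cast) tuples; the function name `save_rows` and the column headers are placeholders for illustration, not anything from the original post.

```python
import csv


def save_rows(path, rows):
    """Write (title, genre, cast) tuples to a CSV file with a header row."""
    # utf-8-sig adds a BOM so spreadsheet programs detect the encoding;
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "genre", "cast"])  # hypothetical headers
        writer.writerows(rows)
```

Calling `save_rows("movies.csv", rows)` after parsing would write one line per movie.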


Your request looks fine to me, so the regular expression is probably what's wrong. Paste the code as text and I'll debug it.

Start by reading up on the re module and how regular expressions are used.
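
The parts of the re module that matter most for scraping are the difference between `match`, `search`, and `findall`, and the `re.S` flag. A small illustration on a made-up HTML string (the markup here is invented for the example, not taken from the Baidu page):

```python
import re

html = '<p>intro</p><div class="title">Movie A</div>\n<div class="title">Movie B</div>'
pat = re.compile(r'<div class="title">(.*?)</div>')

# match() only succeeds if the pattern matches at the very start of the string.
at_start = pat.match(html)
# search() finds the first match anywhere in the string.
first = pat.search(html)
# findall() returns the captured group of every match.
titles = pat.findall(html)

print(at_start)        # None, because the string starts with <p>
print(first.group(1))  # Movie A
print(titles)          # ['Movie A', 'Movie B']

# Without re.S, "." does not match a newline, so tags that span lines are missed.
multiline = "<li>a\nb</li>"
no_flag = re.findall(r"<li>(.*?)</li>", multiline)
with_flag = re.findall(r"<li>(.*?)</li>", multiline, re.S)
print(no_flag)    # []
print(with_flag)  # ['a\nb']
```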

Please paste the code as text so I can try running it on my side.

import csv
import requests
import re
url = 'https://top.baidu.com/board?tab=movie'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'}
response = requests.get(url, headers=headers)
data = response.text
# print(data)

div_tag = re.match('.*(<div style="margin-bottom:20px">.*</div>)', data, re.S)
div = div_tag.group(1)

name = re.findall('<div class=.*?>.*</div>', div, re.S)
# print(name)
lis = []
pat = re.compile('<div class="c-single-text-ellipsis">(.*?)</div>.*?<div class="intro_1l0wp">(.*?)</div><div class="intro_1l0wp">(.*?)</div>')
for k in name:
    res = pat.match(k)
    print(res.group(1), res.group(2), res.group(3))
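
A likely cause of the failure, judging only from the code above: the greedy `findall` returns one large block rather than one string per movie, and `pat.match(k)` only matches at the very start of that block. The pattern is also compiled without `re.S`, so `.*?` cannot cross line breaks. When `match` returns `None`, `res.group(1)` raises an AttributeError. A minimal sketch of one way around this is to run a single pattern with `findall` over the whole page. It assumes the class names from the post (`c-single-text-ellipsis`, `intro_1l0wp`) still appear in the page source, which may have changed; the function name `parse_movies` and the sample HTML are made up for the demonstration.

```python
import re

# One pattern for title, genre and cast; re.S lets "." cross line breaks.
PAT = re.compile(
    r'<div class="c-single-text-ellipsis">(.*?)</div>'
    r'.*?<div class="intro_1l0wp">(.*?)</div>'
    r'\s*<div class="intro_1l0wp">(.*?)</div>',
    re.S,
)


def parse_movies(html):
    """Return a list of (title, genre, cast) tuples found in html."""
    return [tuple(field.strip() for field in m) for m in PAT.findall(html)]


# Invented sample markup that mimics the structure assumed above.
sample = '''
<div class="item"><div class="c-single-text-ellipsis"> Movie A </div>
<div class="intro_1l0wp">Genre: Drama</div><div class="intro_1l0wp">Cast: X / Y</div></div>
<div class="item"><div class="c-single-text-ellipsis"> Movie B </div>
<div class="intro_1l0wp">Genre: Comedy</div>
<div class="intro_1l0wp">Cast: Z</div></div>
'''

for row in parse_movies(sample):
    print(row)
```

With the real page, `parse_movies(data)` would take the `response.text` from the request in the post. If it returns an empty list, print `data` and check whether those class names are actually present in the source.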