In the settings file I also changed the user agent to Google Chrome's; everything else is left at the defaults.
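For reference, a minimal sketch of what the relevant part of settings.py might look like. The user-agent string is only an example of a Chrome value, not the one actually used in this project, and ROBOTSTXT_OBEY = True is assumed because that is what the scrapy startproject template generates.

```python
# settings.py (fragment) -- illustrative values only, not copied from the project

# Example Chrome user-agent string (assumption: any recent Chrome UA would do)
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/103.0.0.0 Safari/537.36"
)

# Value written by the project template; with it enabled, Scrapy fetches
# robots.txt first and skips any request that robots.txt disallows.
ROBOTSTXT_OBEY = True
```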
import scrapy
from pachong2.items import MovieItem
from scrapy import Selector


class DoubanSpider(scrapy.Spider):
    name = 'douban'
    allowed_domains = ['movie.douban.com']
    start_urls = ['http://movie.douban.com/top250']

    def parse(self, response):
        print(2)  # debug marker: shows whether parse() is called at all
        sel = Selector(response)
        # one <li> per movie in the Top 250 list
        list_items = sel.css('#content > div > div.article > ol > li')
        print(list_items)
        for list_item in list_items:
            movie_item = MovieItem()
            movie_item['title'] = list_item.css('span.title::text').extract_first()
            # 'class.rating_num' would select a <class> element; the score is in span.rating_num
            movie_item['score'] = list_item.css('span.rating_num::text').extract_first()
            yield movie_item
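Copyright notice: this is an original article by CSDN blogger "weixin_42847617", licensed under CC 4.0 BY-SA. When reposting, please include the link to the original source and this notice.
Original link: https://blog.csdn.net/weixin_42847617/article/details/126064623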
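'2' is not printed, which means parse was never executed.
After running scrapy crawl douban --nolog I saw that nothing had been crawled. I then added print(2) and found that it is not printed either, so the parse method is not being executed.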
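One possible cause, which this thread does not confirm: if ROBOTSTXT_OBEY is still True (the value in the settings.py that scrapy startproject generates), Scrapy skips requests that the site's robots.txt disallows, so parse is never called, and --nolog hides the log message that would say so. The sketch below uses only the standard library's urllib.robotparser to show how such a rule check works. SAMPLE_ROBOTS is a hypothetical rule set for illustration, not Douban's actual robots.txt, and is_allowed is a helper name made up here.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only (NOT the real Douban file).
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /top250",
]

if __name__ == "__main__":
    for url in ("http://movie.douban.com/top250", "http://movie.douban.com/chart"):
        print(url, is_allowed(SAMPLE_ROBOTS, "Mozilla/5.0", url))
```

To check the real site, the actual robots.txt lines would have to be fetched and passed in instead of SAMPLE_ROBOTS. Running the crawl without --nolog is the more direct way to see why a request was dropped.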
Try deleting that statement
and see whether 2 gets printed.