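Trying to scrape wallpaper images, but running into an error

Today I wanted to scrape some nice wallpapers, but I ran into some errors. Could someone help?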

import requests
from lxml import etree
import os
url='https://wallhaven.cc/toplist?'
headers={
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62'
}
response=requests.get(url=url,headers=headers)
page_text=response.text
tree=etree.HTML(page_text)
li_list=tree.xpath('//*[@id="thumbs"]/section[1]/ul')
if not os.path.exists('./壁纸'):
    os.mkdir('./壁纸')
for li in li_list:
    img_src='//*[@id="thumbs"]/section[1]/ul/li[2]/figure/img@src'[0]
    img_data=requests.get(url=img_src, headers=headers).content
    img_path='picLibs/' + img_src
    with open('壁纸/'+img_src)as fp:
        fp.write(img_data)
        print(img_src,'下载完成!')

The error is:

[image]

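I just want to scrape the images of all the wallpapers on the toplist and save them into one folder.

OP, I changed your code a bit, and it works this way.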

import os
import re

import requests
from lxml import etree

url = 'https://wallhaven.cc/toplist?'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62'
}
# Fetch the toplist page and parse it.
response = requests.get(url=url, headers=headers)
page_text = response.text
tree = etree.HTML(page_text)
# Collect the links that lead from each thumbnail to its detail page.
as_href = tree.xpath('//*[@id="thumbs"]/section[1]/ul/li//a/@href')
if not os.path.exists('./壁纸'):
    os.mkdir('./壁纸')
for a_href in as_href:
    # The image URL is read from <img id="wallpaper"> on the detail page.
    rsp = requests.get(url=a_href, headers=headers)
    html2 = etree.HTML(rsp.text)
    img_src = html2.xpath("//img[@id='wallpaper']/@src")[0]
    # Use the part of the URL after the last '-' as the file name.
    img_name = re.findall('.*-(.*)', img_src)[0]
    content2 = requests.get(url=img_src, headers=headers).content
    # Image data is bytes, so the file is opened in binary write mode.
    with open(file='./壁纸/{}'.format(img_name), mode='wb') as f:
        f.write(content2)
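
For anyone comparing the two versions, here is a small offline sketch of the points that differ. It makes no network requests. The sample URL and the helper names `file_name_from_url` and `save_bytes` are made up for illustration; they are not from the site or from either post.

```python
import os
import re

# In the question, the XPath was written as a plain string and then indexed.
# Indexing a string literal just returns its first character, not an image URL.
xpath_literal = '//*[@id="thumbs"]/section[1]/ul/li[2]/figure/img@src'
first_char = xpath_literal[0]  # this is '/', so requests.get() cannot fetch it


def file_name_from_url(img_src):
    """Return everything after the last '-' in the URL (hypothetical helper)."""
    # The greedy '.*-' consumes up to the last '-', the group captures the rest.
    return re.findall('.*-(.*)', img_src)[0]


def save_bytes(folder, name, data):
    """Write raw bytes to folder/name, creating the folder if needed (hypothetical helper)."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, name)
    # 'wb' is required: open() defaults to read-only text mode,
    # which is what the question's open('壁纸/' + img_src) call used.
    with open(path, 'wb') as f:
        f.write(data)
    return path


# Assumed URL shape, for illustration only.
sample = 'https://example.com/full/ab/wallhaven-abc123.jpg'
name = file_name_from_url(sample)
```

With the sample URL above, `name` comes out as the last segment after the final hyphen, which is safe to use as a file name; the full URL is not, because it contains slashes.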

The images on this site are really great! Thanks to the OP for sharing!