I'm using a module to scrape wallpapers from a web page, but for some reason it only downloads one image.
from coffee_dou_requests import coffee_dou_requests
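a = 'qazplm75124'  # password required to use the module
u = '****r'
l = '/html/body/div[2]/div/div[2]/div[4]/div'  # outer XPath (container elements)
o = '壁纸'  # file name
i = ".//a/div[1]/img/@src"  # inner XPath (image src inside each container)
w = '*********'  # URL prefix missing from the image src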
w=coffee_dou_requests.coffee_dou_requests_2(a,u,l,o,i,w)
Output:
C:\Users\16662\PycharmProjects\pythonProject4\venv\Scripts\python.exe "C:/Users/16662/Desktop/pychon/壁纸爬取/shibai - 副本.py"
1.0+.jpg https://www.wallpapermaiden.com//image/2016/12/04/your-name-sky-stars-kimi-no-na-wa-lights-anime-10260-thumb.png 爬取完毕!!
总耗时: 5.41566276550293
爬取结束!!!
进程已结束,退出代码0
Any help would be appreciated.
w is assigned twice; delete the "w=" on the last line.
import requests
from lxml import etree
import os
url = 'https://www.wallpapermaiden.com/popular'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36 Edg/96.0.1054.62'
}
response=requests.get(url=url,headers=headers)
page_test=response.text
tree=etree.HTML(page_test)
a=tree.xpath('/html/body/div[2]/div/div[2]/div[4]/div')
for li in a:
    o = li.xpath('.//div/a/div[1]/img/@src')[0]
    name = li.xpath('.//div/a/div[1]/img/@alt')[0]
    print(o)
    print(name)
    o_url = 'https://www.wallpapermaiden.com/' + o
    img_data = requests.get(url=o_url, headers=headers).content
    img_path = '壁纸2.0/' + name
    with open(img_path, 'wb') as fp:
        fp.write(img_data)
    print(o, name, '爬取完毕')
#if not os.path.exists('./壁纸2.0'):
#    os.mkdir('./壁纸2.0')
This version behaves the same way: only one image is downloaded.
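A possible explanation, which I can't confirm without seeing the page's HTML: the outer XPath ends in `div[4]/div` and may match a single container element, so the loop body runs once, and the `[0]` after the inner XPath then keeps only the first image inside it. Below is a minimal sketch of collecting every image instead of the first one. It is not the module's code and not the site's real markup: `ImgCollector`, `plan_downloads`, `SAMPLE_HTML` and the `wallpapers` directory are all hypothetical names made up for illustration, and it uses the standard library's `html.parser` rather than lxml so it runs without extra packages. It only builds the list of (URL, local path) pairs; the actual download would still be done with `requests.get(...)` as in the code above.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImgCollector(HTMLParser):
    """Collect (src, alt) for every <img> tag, not only the first one."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            attr = dict(attrs)
            if attr.get("src"):
                self.images.append((attr["src"], attr.get("alt") or ""))


def plan_downloads(html, base_url, out_dir):
    """Return one (absolute_url, local_path) pair per image found in html."""
    parser = ImgCollector()
    parser.feed(html)
    plan = []
    for index, (src, alt) in enumerate(parser.images, start=1):
        # urljoin avoids the doubled "//" that plain string concatenation
        # produces when src already starts with "/".
        url = urljoin(base_url, src)
        # Reuse the extension from the URL; assume ".jpg" if there is none.
        ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
        # Keep only characters that are safe in file names; fall back to a
        # counter when alt is empty.
        stem = "".join(c for c in alt if c.isalnum() or c in " -_").strip()
        plan.append((url, os.path.join(out_dir, (stem or f"image_{index}") + ext)))
    return plan


# Hypothetical markup, only loosely modelled on the XPath in the question.
SAMPLE_HTML = """
<div class="list">
  <div><a href="/w/1"><div><img src="/image/a-thumb.png" alt="First wallpaper"></div></a></div>
  <div><a href="/w/2"><div><img src="/image/b-thumb.jpg" alt="Second wallpaper"></div></a></div>
</div>
"""

plan = plan_downloads(SAMPLE_HTML, "https://www.wallpapermaiden.com/", "wallpapers")
for url, path in plan:
    print(url, "->", path)
```

Two other things are worth checking in the original script: the output directory has to exist before `open(..., 'wb')` is called (the `os.mkdir` lines are commented out and placed after the loop), and the `alt` text is used as the file name without an extension. With lxml, the same idea would be to loop over the whole list returned by the inner XPath instead of indexing it with `[0]`.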