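The images my Python crawler downloads won't open, yet they display normally when I visit the scraped page in a browser. I can't figure out what is going wrong. Could someone please take a look?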
import os
from urllib.parse import urljoin

import requests
from lxml import etree

base = 'https://pic.netbian.com/'
url = 'https://pic.netbian.com/new/'

# Fetch the listing page and collect one <li> per thumbnail
page_text = requests.get(url).content
tree = etree.HTML(page_text)
div_list = tree.xpath('//div[@class="slist"]/ul/li')

if not os.path.exists('upian'):
    os.mkdir('upian')

for img in div_list:
    # Link to the detail page; urljoin avoids a doubled "/" when the href starts with one
    src = urljoin(base, img.xpath('./a/@href')[0])
    name = img.xpath('./a/img/@alt')[0] + '.jpg'
    print(src, name)

    # The detail page is HTML, not an image, so parse it to find the real image URL
    pic = requests.get(src).content
    tree2 = etree.HTML(pic)
    picurl = urljoin(base, tree2.xpath('//*[@id="img"]/img/@src')[0])
    print(picurl)

    # Download the image bytes and write them to disk
    response = requests.get(picurl).content
    pic_path = 'upian/' + name
    with open(pic_path, 'wb') as f:
        f.write(response)
    print(name, 'done*****************')
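If you don't give an absolute path, the path is resolved against the current working directory, so you can just write tupian; the leading ./ does nothing.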
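The point about relative paths can be checked with a short sketch. The helper name `same_location` is made up for illustration; it only compares what the two spellings resolve to and creates nothing on disk.

```python
import os


def same_location(a: str, b: str) -> bool:
    """Return True if both paths resolve to the same absolute path."""
    # abspath joins a relative path onto the current working directory
    # and normalizes away a leading "./"
    return os.path.abspath(a) == os.path.abspath(b)


# Both spellings point at the same folder under the current working directory
print(same_location('tupian', './tupian'))
```

Note that the current working directory is wherever the script is launched from, which is not necessarily the folder that contains the script.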
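Also, treating the URL you scraped as the image URL is wrong. That page contains many other pictures, text, and so on. The real image address is one that shows nothing but the image when you open it.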
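One way to tell whether a URL returned an actual image rather than a web page is to look at the first bytes of the response before saving it. The following is a minimal sketch, not a complete format detector: the function name `looks_like_image` is hypothetical, and it only recognizes JPEG, PNG, GIF, and WebP signatures.

```python
def looks_like_image(data: bytes) -> bool:
    """Return True if data starts with a common image file signature."""
    signatures = (
        b'\xff\xd8\xff',         # JPEG
        b'\x89PNG\r\n\x1a\n',    # PNG
        b'GIF87a',               # GIF (1987 variant)
        b'GIF89a',               # GIF (1989 variant)
    )
    if data.startswith(signatures):
        return True
    # WebP is a RIFF container with "WEBP" at byte offset 8
    return data[:4] == b'RIFF' and data[8:12] == b'WEBP'


# An HTML page saved with a .jpg extension fails the check
print(looks_like_image(b'<!DOCTYPE html><html>...'))
# Bytes beginning with the JPEG signature pass it
print(looks_like_image(b'\xff\xd8\xff\xe0' + b'\x00' * 16))
```

In the crawler, a check like this could be applied to the downloaded bytes just before `f.write(...)`; if it fails, printing the first few hundred bytes usually shows what the server actually sent back.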