I can get an image's src. How do I use Playwright in headed mode to download the image through that src?
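First, make sure the Playwright library and its browser binaries are installed. As the reference material notes, installation is simple. Run `pip install playwright` on the command line, then `python -m playwright install`.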
Import Playwright, create a browser instance, and open a new page:

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=False)  # launch Chromium in headed mode
    page = browser.new_page()
```
Still inside the `with` block, navigate to the target page with the page's `goto()` method: `page.goto('http://example.com')`
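Use the page's `query_selector_all()` method to find all image elements, then loop over them and download each image from its `src` attribute. Because `src` may be relative, resolve it against `page.url` first. `page.request.get()` sends the request with the page's cookies:

```python
from urllib.parse import urljoin

images = page.query_selector_all('img')
for index, image in enumerate(images):
    src = image.get_attribute('src')
    if not src or src.startswith('data:'):
        continue  # nothing to fetch over HTTP
    url = urljoin(page.url, src)  # resolve a relative src
    response = page.request.get(url)
    if response.ok:
        with open(f'image_{index}.jpg', 'wb') as file:
            file.write(response.body())
```

The code above saves every image on the page that has a fetchable `src` as `image_<index>.jpg`, so later images do not overwrite earlier ones. The `.jpg` extension is only a placeholder; the actual format depends on what the server returns.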
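Since `src` values are often relative and the real file type varies, it may help to resolve each `src` and derive a file name from the URL itself. The following is a minimal sketch using only the standard library; `resolve_src` and `filename_for` are hypothetical helper names (not part of Playwright), and the `.jpg` fallback is an assumption for URLs that carry no extension.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_src(page_url, src):
    """Turn a possibly relative src into an absolute URL."""
    return urljoin(page_url, src)


def filename_for(url, index, default_ext=".jpg"):
    """Build a file name such as '003_a.png' from an image URL."""
    # Use the last path component; the query string is not part of the path.
    base = os.path.basename(urlparse(url).path)
    stem, ext = os.path.splitext(base)
    if not stem:
        stem = "image"
    if not ext:
        ext = default_ext  # assumption: fall back when the URL has no extension
    # The index prefix keeps names unique even when stems repeat.
    return f"{index:03d}_{stem}{ext}"
```

Inside the download loop, `filename_for(resolve_src(page.url, src), index)` could then replace a hard-coded file name.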
Finally, close the browser with `browser.close()`.
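The complete code:

```python
from urllib.parse import urljoin

from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser = playwright.chromium.launch(headless=False)  # launch Chromium in headed mode
    page = browser.new_page()
    page.goto('http://example.com')
    images = page.query_selector_all('img')
    for index, image in enumerate(images):
        src = image.get_attribute('src')
        if not src or src.startswith('data:'):
            continue  # nothing to fetch over HTTP
        # Resolve a relative src, then download it with the page's cookies.
        response = page.request.get(urljoin(page.url, src))
        if response.ok:
            with open(f'image_{index}.jpg', 'wb') as file:
                file.write(response.body())
    browser.close()
```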
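One case worth keeping in mind: a `src` is not always an HTTP URL. Inline images use a `data:` URI, which carries the image bytes in the attribute itself and cannot be fetched over the network. Below is a rough sketch of decoding such a value with the standard library; `decode_data_uri` is a hypothetical helper name, and it handles only the common `data:<mime>[;base64],<payload>` shape.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_uri(src):
    """Return (mime_type, raw_bytes) for a data: URI, or None for other values."""
    if not src.startswith("data:"):
        return None
    # Split "image/png;base64" from the payload at the first comma.
    header, _, payload = src[len("data:"):].partition(",")
    parts = header.split(";")
    mime_type = parts[0] or "text/plain"  # default media type per RFC 2397
    if "base64" in parts[1:]:
        return mime_type, base64.b64decode(payload)
    # Without the base64 flag, the payload is percent-encoded.
    return mime_type, unquote_to_bytes(payload)
```

If the function returns a result, the bytes can be written straight to a file; if it returns `None`, the `src` can be downloaded as a normal URL.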
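Note: the code above assumes that the example page (http://example.com) contains image elements. To download images from a different page, replace the URL passed to `page.goto()`.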
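To avoid editing the script for every page, the URL could be passed in as a parameter. This is only a sketch under some assumptions: `download_images`, `out_dir`, and the `image_<index>.jpg` naming are illustrative choices, and Playwright is imported inside the function so the module still loads where it is not installed.

```python
import os
from urllib.parse import urljoin


def download_images(page_url, out_dir="images", headless=False):
    """Open page_url in Chromium and save each <img> that has an http(s) src."""
    # Imported lazily; requires `pip install playwright` plus browser binaries.
    from playwright.sync_api import sync_playwright

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=headless)
        page = browser.new_page()
        page.goto(page_url)
        for index, image in enumerate(page.query_selector_all("img")):
            src = image.get_attribute("src")
            if not src or src.startswith("data:"):
                continue  # no network URL to download
            response = page.request.get(urljoin(page.url, src))
            if response.ok:
                path = os.path.join(out_dir, f"image_{index}.jpg")
                with open(path, "wb") as file:
                    file.write(response.body())
                saved.append(path)
        browser.close()
    return saved
```

Calling `download_images('http://example.com')` would then open a headed browser and return the list of saved file paths.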
I hope this solution helps! If you have any questions, feel free to ask.