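My Python version is 3.9.0. I have tried many of the fixes I found online and none of them solved the problem. People online say the Python version is too new.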
Your date variable probably just holds the response object, doesn't it? You need to call read() to get the actual content, so inside findall pass date.read() instead.
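A minimal sketch of the read()-then-findall idea above. The variable name date comes from the question. The HTML snippet, the regular expression, and the io.BytesIO stand-in for the response are assumptions made so the example runs without network access. In Python 3, read() returns bytes, so the sketch decodes to str before matching with a str pattern.

```python
import io
import re

# Stand-in for the object returned by urllib.request.urlopen();
# like a real response, it exposes read() and returns bytes.
date = io.BytesIO('<a href="/provider/1">出版社</a>'.encode("utf-8"))

# read() gives bytes; decode to str before using a str regex pattern.
html = date.read().decode("utf-8")

# Hypothetical pattern: collect every href value in the page.
links = re.findall(r'href="([^"]+)"', html)
print(links)  # ['/provider/1']
```

Passing the response object itself to findall raises a TypeError, because findall expects a string or bytes-like object.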
This is caused by the site's anti-scraping measures. The key is to add a header.
import ssl
import urllib.request
context = ssl._create_unverified_context() # Disables certificate verification; works around the HTTPS error urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1056)>
url = "https://read.douban.com/provider/all"
header = { # Request headers; works around the HTTPError(req.full_url, code, msg, hdrs, fp) raised by the anti-scraping check
'User-Agent':'Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'
}
request = urllib.request.Request(url, headers=header)
conn = urllib.request.urlopen(request, context=context) # <class 'http.client.HTTPResponse'>
data = conn.read() # <class 'bytes'>
html = data.decode("utf8") # <class 'str'>
print(html)
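If it helps, the same steps can be wrapped in a function that reports the two errors mentioned above instead of crashing. This is only a sketch: the names build_request and fetch, the verify flag, and the timeout value are my own assumptions and not part of the original answer.

```python
import ssl
import urllib.error
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36"
)


def build_request(url):
    """Attach a browser-like User-Agent so the site does not reject the request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, verify=False, timeout=10):
    """Return the page as str, or None if the request fails.

    verify=False mirrors the answer above and skips certificate checks;
    pass verify=True to use the default, verified SSL context.
    """
    if verify:
        context = ssl.create_default_context()
    else:
        context = ssl._create_unverified_context()
    try:
        with urllib.request.urlopen(
            build_request(url), context=context, timeout=timeout
        ) as conn:
            return conn.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # HTTPError is a subclass of URLError, so it must be caught first.
        print("HTTP error:", err.code, err.reason)
    except urllib.error.URLError as err:
        print("Connection error:", err.reason)
    return None
```

Calling fetch("https://read.douban.com/provider/all") should then return the HTML text, or print the error and return None.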