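An asynchronously loaded page: http://q.10jqka.com.cn/thshy/index/field/199112/order/desc/page/1/ajax/1/
If I open the link above in a browser and right-click "View page source", I can see the content I need, but requesting it with requests.get() only returns two script tags from the server (see the screenshot below). How can I request the page content correctly?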
Take the cookie returned by the first response, then send the request again.
import requests
from bs4 import BeautifulSoup


def req():
    # A Session stores any cookies the server sets and resends them on later requests.
    s = requests.Session()
    url = "http://q.10jqka.com.cn/thshy/index/field/199112/order/desc/page/1/ajax/1/"
    # Browser-like headers (Chrome on Windows).
    headers = {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
        'Accept-Encoding': 'gzip, deflate',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Connection': 'keep-alive',
        'Host': 'q.10jqka.com.cn',
        'Referer': 'http://q.10jqka.com.cn/thshy/index/field/199112/order/desc/page/1/ajax/1/',
        'Upgrade-Insecure-Requests': '1',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.15 Safari/537.36',
    }
    s.headers.update(headers)
    html = s.get(url=url, verify=False)
    html.encoding = 'utf-8'
    soup = BeautifulSoup(html.text, 'lxml')
    print(soup)


req()
Tested: running this code returns the data.
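The "get the cookie, then request again" step can also be written out explicitly. The following is only a rough sketch under assumptions: the helper names make_session and fetch_after_cookie are hypothetical, and it assumes the server delivers the cookie in its first response (a Set-Cookie header). If the cookie is instead computed by the returned script in the browser, repeating the request like this will not be enough.

```python
def make_session(headers=None):
    """Build a requests.Session carrying browser-like headers (hypothetical helper)."""
    import requests  # third-party dependency, imported lazily

    session = requests.Session()
    session.headers.update(headers or {})
    return session


def fetch_after_cookie(session, url, timeout=10):
    """Request `url` twice on one session and return both responses.

    The first response may contain only <script> tags; any cookies it sets
    are stored on the session and sent automatically with the second request.
    Assumption: the cookie arrives with the first response.
    """
    first = session.get(url, timeout=timeout)
    second = session.get(url, timeout=timeout)
    second.encoding = "utf-8"
    return first, second
```

Usage would look like first, second = fetch_after_cookie(make_session(headers), url), where headers is the dict from the answer above; comparing first.text with second.text shows whether the second request actually returned the table.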