Python crawler fails with SSLError (TLS/SSL connection has been closed (EOF)): how can I fix it?

My Python crawler fails with requests.exceptions.SSLError: HTTPSConnectionPool(host='gs.amac.org.cn', port=443): Max retries exceeded with url: /amac-infodisc/api/pof/securities (Caused by SSLError(SSLZeroReturnError(6, 'TLS/SSL connection has been closed (EOF) (_ssl.c:1131)')))

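Searching suggested this is an SSL problem, but passing verify=False, i.e. response = requests.post(url, verify=False), still raises the same error. Sending or omitting headers makes no difference. I also tried pinning cryptography to 36.0.2 and pyopenssl to 22.0.0, which did not help either.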

The requests code is as follows:


    def get_data(self, page):
        # Assumes "from random import random" at module level.
        r = random()
        url = f'https://gs.amac.org.cn/amac-infodisc/api/pof/securities?rand={r}&page={page}&size=100'
        response = requests.post(url, verify=False, headers=self.headers)
        return response.json()

I then switched to Scrapy, with the following code:

    def start_requests(self):
        r = random.random()
        headers = {
            "Content-Length": "2",
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36 Edg/114.0.1823.79",
        }
        data={}
        url = f'https://gs.amac.org.cn/amac-infodisc/api/pof/securities'
        yield scrapy.Request(
            url=url,
            method='POST',
            callback=self.parse,
            headers=headers,
            # body=json.dumps(data)
        )

The error is: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

If I remove "Content-Length": "2" from the request headers, the error becomes: ERROR: Gave up retrying <POST https://gs.amac.org.cn/amac-infodisc/api/pof/securities> (failed 3 times): 500 Internal Server Error

If I add the body argument to the yielded scrapy.Request, I get: INFO: Ignoring response <400 https://gs.amac.org.cn/amac-infodisc/api/pof/securities>: HTTP status code is not handled or not allowed
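
One plausible reading of these three outcomes is that the declared Content-Length and the bytes actually sent never match what the server expects. The value 2 is exactly the size of an empty JSON object {}, so declaring it while sending no body may leave the server waiting for data that never arrives. The sketch below only illustrates how the header value relates to the encoded body; the helper names json_body and declared_length_matches are hypothetical, and the sample payload values are illustrative.

```python
import json


def json_body(data):
    """Serialize data the way a JSON POST body is sent: as UTF-8 bytes."""
    return json.dumps(data).encode("utf-8")


def declared_length_matches(content_length, body):
    """Return True when a Content-Length header value equals the body size in bytes."""
    return int(content_length) == len(body)


empty = json_body({})                         # b'{}'
paged = json_body({"page": 1, "size": 100})   # illustrative payload

print(declared_length_matches("2", empty))    # header matches the empty JSON object
print(declared_length_matches("2", b""))      # header promises 2 bytes, none are sent
print(declared_length_matches("2", paged))    # header is too small for a real payload
```

In practice it is usually safer not to set Content-Length by hand and to let the HTTP library compute it from the body.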

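The site being crawled is https://gs.amac.org.cn/amac-infodisc/res/pof/securities/index.html
The captured request looks like this:

[image: captured request]

[image: captured request]

Sending the POST request from an online request-testing site still returns no data:

[image: result from the online tool]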

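Problem: the POST request is built incorrectly.
The corrected code below sends the parameters as a JSON body; the same approach should carry over to the Scrapy request.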

import requests

headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Content-Type": "application/json",
    "Host": "gs.amac.org.cn",
    "Origin": "https://gs.amac.org.cn",
    "Referer": "https://gs.amac.org.cn/amac-infodisc/res/pof/securities/index.html",
    # Content-Length is not set by hand: requests computes it from the JSON body.
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}


def get_data(page):
    url = 'https://gs.amac.org.cn/amac-infodisc/api/pof/securities'
    data = {
        "rand": 0.18000891596398572,
        "page": page,
        "size": 20
    }

    response = requests.post(url, json=data, headers=headers)
    return response.json()


if __name__ == '__main__':
    print(get_data(1))