How can I rewrite a JS fetch network request using Python requests?

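I want to scrape the response of an API that a web page calls. When the page requests the endpoint, I copied the corresponding JS fetch code from Chrome DevTools; it runs fine in Node.js and I can capture what the API returns. But when I copy the request from Chrome as cURL (bash) and import it into Postman, I get a firewall message instead.
See the screenshot below: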

[Image: firewall message returned in Postman]

My JS code is as follows:

fetch("https://api.example.com/api", {
  headers: {
    accept: "application/json, text/plain, */*",
    "accept-language": "zh-CN,zh;q=0.9",
    "cache-control": "no-cache",
    "content-type": "application/json",
    "firebase-auth": "true",
    "firebase-token": "",
    pragma: "no-cache",
    "sec-ch-ua":
      '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"',
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-site",
  },
  referrer: "https://app.example.com/",
  referrerPolicy: "strict-origin-when-cross-origin",
  body: '{"action":"test","draftId":"","start":0,"end":17,"text":"hey what\'s up man","isBatch":false,"lookaheadIndex":0,"selection":{"bulletText":"","start":0,"end":17,"wholeText":"hey what\'s up man"},"languageCode":"en"}',
  method: "POST",
  mode: "cors",
  credentials: "omit",
})
  .then((res) => res.json())
  .then((d) => console.log(d));

I rewrote the JS code using Python requests, but running it gives the same error as Postman.

import requests
import json

url = "https://api.example.com/api"

payload = json.dumps({
  "action": "REWRITE",
  "draftId": "",
  "start": 0,
  "end": 8,
  "text": "hey,man.",
  "isBatch": False,
  "lookaheadIndex": 0,
  "selection": {
    "bulletText": "",
    "start": 0,
    "end": 8,
    "wholeText": "hey,man."
  },
  "languageCode": "en"
})

headers = {
        'authority': 'https://api.example.com/api',
        'accept': 'application/json, text/plain, */*',
        'accept-language': 'zh-CN,zh;q=0.9',
        'cache-control': 'no-cache',
        'content-type': 'application/json',
        'firebase-auth': 'true',
        'firebase-token': '',
        'origin': 'https://app.example.com/',
        'pragma': 'no-cache',
        'referer': 'https://app.example.com/',
        'sec-ch-ua': '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"Windows"',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-site',
        #'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

How should I change it so that I get the API's response correctly? Any pointers would be much appreciated.

As of today the request header has changed to token; the version below works.

import json
import http.client
from urllib import request, error

# Work around "http.client.HTTPException: got more than 100 headers"
http.client._MAXHEADERS = 1000

payload = {"action":"REWRITE","draftId":"","start":0,"end":7,"text":"hey man","isBatch":False,"lookaheadIndex":0,"selection":{"bulletText":"","start":0,"end":7,"wholeText":"hey man"},"languageCode":"en"}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
    'Content-Type': 'application/json',
    #'firebase-auth': 'true',
    # Check whether the browser actually sends "token" or "firebase-token";
    # this header name changes, so go by what the browser sends.
    'token': 'token value here',
}

url = 'https://api.wordtune.com/rewrite'

# urllib expects the request body as bytes
data = json.dumps(payload).encode('utf-8')
try:
    req = request.Request(url=url, data=data, headers=headers, method='POST')
    with request.urlopen(req) as response:
        #print(response.status, response.reason)
        print(response.read().decode('utf-8'))
except error.HTTPError as e:
    # urlopen raises HTTPError on a non-2xx status; the response body can still be read from it
    print(e.read().decode('utf-8'))

The following is a fallback; switch between the two and test for yourself.

import json
import http.client
from urllib import request, error

# Work around "http.client.HTTPException: got more than 100 headers"
http.client._MAXHEADERS = 1000

payload = {"action":"REWRITE","draftId":"","start":0,"end":7,"text":"hey man","isBatch":False,"lookaheadIndex":0,"selection":{"bulletText":"","start":0,"end":7,"wholeText":"hey man"},"languageCode":"en"}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
    'Content-Type': 'application/json',
    'firebase-auth': 'true',
    'firebase-token': 'aaa',
}

url = 'https://api.wordtune.com/rewrite'

# urllib expects the request body as bytes
data = json.dumps(payload).encode('utf-8')
try:
    req = request.Request(url=url, data=data, headers=headers, method='POST')
    with request.urlopen(req) as response:
        print(response.status, response.reason)
        print(response.read().decode('utf-8'))
except error.HTTPError as e:
    # urlopen raises HTTPError on a non-2xx status; the response body can still be read from it
    print(e.read().decode('utf-8'))
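
The two scripts above differ only in which auth header they send, so one way to switch between them while testing is to make the header name a parameter. The following is a minimal sketch under that assumption; build_request and its token_header parameter are names made up for illustration, and nothing is sent over the network, the request is only built and printed so it can be compared with what the browser sends.

```python
import json
from urllib import request

USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36")

def build_request(payload, token, token_header="token",
                  url="https://api.wordtune.com/rewrite"):
    """Build (but do not send) the POST request.

    token_header is hypothetical: pass "token" or "firebase-token",
    whichever name the browser is currently sending.
    """
    headers = {
        "User-Agent": USER_AGENT,
        "Content-Type": "application/json",
        token_header: token,
    }
    if token_header == "firebase-token":
        # The fallback script above sends this header alongside firebase-token.
        headers["firebase-auth"] = "true"
    data = json.dumps(payload).encode("utf-8")
    return request.Request(url=url, data=data, headers=headers, method="POST")

payload = {"action": "REWRITE", "draftId": "", "start": 0, "end": 7,
           "text": "hey man", "isBatch": False, "lookaheadIndex": 0,
           "selection": {"bulletText": "", "start": 0, "end": 7,
                         "wholeText": "hey man"},
           "languageCode": "en"}

req = build_request(payload, token="aaa", token_header="firebase-token")
print(req.get_method(), req.full_url)
# urllib stores header names via str.capitalize(), e.g. "Firebase-token";
# HTTP/1.1 header names are case-insensitive, so this is only cosmetic.
for name, value in req.header_items():
    print(name, "=", value)
```

To actually send it, pass req to request.urlopen inside the same try/except as in the scripts above.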

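I saved the JS code to a local file and it runs fine in a Node.js environment. It also runs fine in the DevTools Console of the target page.
That suggests the JS request headers are complete and nothing is missing. Yet the Python code, which carries the same set of headers, fails. I suspect the request body format in the Python code is wrong.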
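
One way to check the body-format hypothesis is to make Python emit exactly the same bytes as the body string in the fetch call and compare. By default json.dumps inserts a space after every comma and colon, while the body copied from the browser is compact. The sketch below is only an illustration: build_payload and to_compact_json are made-up helper names, and it assumes end is simply the length of text, which matches the three examples in this thread. A server normally parses both forms identically, so this mainly helps rule the body out as the cause.

```python
import json

def build_payload(text, action="REWRITE"):
    """Build the request body dict; `end` is ASSUMED to equal len(text)."""
    end = len(text)
    return {
        "action": action,
        "draftId": "",
        "start": 0,
        "end": end,
        "text": text,
        "isBatch": False,
        "lookaheadIndex": 0,
        "selection": {"bulletText": "", "start": 0, "end": end, "wholeText": text},
        "languageCode": "en",
    }

def to_compact_json(payload):
    # separators=(",", ":") removes the spaces json.dumps adds by default;
    # ensure_ascii=False keeps non-ASCII characters unescaped.
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)

body = to_compact_json(build_payload("hey man"))
print(body)
```

The printed string can be compared character by character with the body in the copied fetch call, and passed to the request as body.encode("utf-8").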

Step 1: In the browser's network capture, right-click the request and copy it as cURL.
Step 2: Paste it here: https://spidertools.cn/#/curl2Request
Step 3: Copy the generated Python code.
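
For anyone curious what such a conversion involves, here is a rough sketch of pulling the URL, headers and body out of a copied cURL command. It is not the implementation of the tool linked above; parse_curl is a made-up helper that understands only -H, -d/--data/--data-raw and -X, and it does not handle the $'...' quoting that Chrome uses when a value contains a single quote.

```python
import shlex

def parse_curl(command):
    """Extract URL, method, headers and body from a simple cURL command."""
    # Join backslash-continued lines before tokenizing.
    tokens = shlex.split(command.replace("\\\n", " "))
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")
    result = {"url": None, "method": None, "headers": {}, "data": None}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-H", "--header"):
            name, _, value = tokens[i + 1].partition(":")
            result["headers"][name.strip()] = value.strip()
            i += 2
        elif tok in ("-d", "--data", "--data-raw"):
            result["data"] = tokens[i + 1]
            i += 2
        elif tok in ("-X", "--request"):
            result["method"] = tokens[i + 1]
            i += 2
        elif tok.startswith("-"):
            # Unknown flag, assumed to take no argument (e.g. --compressed).
            i += 1
        else:
            result["url"] = tok
            i += 1
    if result["method"] is None:
        # curl switches to POST when a body is supplied.
        result["method"] = "POST" if result["data"] is not None else "GET"
    return result

example = """curl 'https://api.example.com/api' \\
  -H 'content-type: application/json' \\
  -H 'firebase-auth: true' \\
  --data-raw '{"action":"test"}' \\
  --compressed"""

parsed = parse_curl(example)
print(parsed)
```

The resulting url, headers and data can then be fed to whichever HTTP client is in use.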
