The crawled URL does not return what I expected

I am requesting this endpoint: https://u.y.qq.com/cgi-bin/musics.fcg?_=1694077217790&sign=zzb0df24438odoluuxidpdaal4czwyijgd1e14611

[image]

It should return this content:

[image]

But it returned this instead:
200
{"code":2000,"ts":1694077074205,"start_ts":1694077074205,"traceid":"1ea80fe99972b7a1","req_1":{"code":2000}}
The code is as follows:

import requests
import json

url = 'https://u.y.qq.com/cgi-bin/musics.fcg?_=1694076319336&sign=zzb0df24438odoluuxidpdaal4czwyijgd1e14611'

headers = {
    'accept': 'application/json',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    'content-length': '683',
    'content-type': 'application/x-www-form-urlencoded',
    'cookie': 'fqm_pvqid=f2a8177e-0876-470a-9bce-9c2cad77a2c8; pgv_pvid=412294480; ts_refer=cn.bing.com/; ts_uid=3186657065; RK=rlO4TGr8th; ptcz=638b13b7f4e81cf343d15f80c2d06d4f9f9d8f6f5a32316152269fddbdaff6e9; euin=oi4i7KCFoe6FNv**; tmeLoginType=2; music_ignore_pskey=202306271436Hn@vBj; ptui_loginuin=3535680189; qqmusic_key=Q_H_L_5StuCLuNmUpJwnB5QgATppYrs4higygOPsNWqii3MmnzhI1P8nDmCHw; psrf_qqaccess_token=33F1254894CBB2D748B65E0070CF287A; psrf_access_token_expiresAt=1701849079; uin=3535680189; psrf_musickey_createtime=1694073079; psrf_qqrefresh_token=58220DF18B7CEAD15702E3F0C3F4E97F; wxopenid=; wxrefresh_token=; psrf_qqunionid=349BE460788215216454C3923C682198; qm_keyst=Q_H_L_5StuCLuNmUpJwnB5QgATppYrs4higygOPsNWqii3MmnzhI1P8nDmCHw; wxunionid=; psrf_qqopenid=C07209D48834E4698C634D21363BAA65; fqm_sessionid=0e40cb51-87ca-4d16-879d-18cbc6cb1544; pgv_info=ssid=s2655128773; ts_last=y.qq.com/n/ryqq/player',
    'referer': 'https://y.qq.com/',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data ='{"comm": {"cv": 4747474, "ct": 24, "format": "json", "inCharset": "utf-8", "outCharset": "utf-8","notice": 0, "platform": "yqq.json", "needNewCode": 1, "uin": 3535680189,"g_tk_new_20200303": 1776277011, "g_tk": 1776277011},"req_1": {"module": "music.trackInfo.UniformRuleCtrl", "method": "CgiGetTrackInfo","param": {"ids": [217185670, 221655199, 352300412, 475408], "types": [0, 0, 0, 0]}}}'
r = requests.post(url, headers=headers, data=data)
print(r.status_code)
r.encoding = 'utf-8'
print(r.text)

Looking forward to a reply.

Try changing 'content-type' in headers to 'application/json', leave everything else unchanged, and send the request again.
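A minimal sketch of that suggestion, assuming the rest of the question's request stays the same. The helper name `build_post_kwargs` is hypothetical; it only assembles the arguments for `requests.post` and sends nothing. The cookie is left out for brevity, and so is the hard-coded `content-length`, because `requests` sets that header itself from the body it actually sends.

```python
import json

URL = "https://u.y.qq.com/cgi-bin/musics.fcg"


def build_post_kwargs(payload, timestamp_ms, sign):
    """Assemble keyword arguments for requests.post (hypothetical helper)."""
    # Compact separators: no spaces after ',' and ':' in the serialized body.
    body = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    headers = {
        "accept": "application/json",
        "content-type": "application/json",  # the change suggested above
        "referer": "https://y.qq.com/",
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
    }
    # Query-string values copied from the question's URL.
    params = {"_": str(timestamp_ms), "sign": sign}
    return {"params": params, "headers": headers, "data": body.encode("utf-8")}


# Same payload as the question, written as a dict instead of a string.
payload = {
    "comm": {
        "cv": 4747474, "ct": 24, "format": "json",
        "inCharset": "utf-8", "outCharset": "utf-8", "notice": 0,
        "platform": "yqq.json", "needNewCode": 1, "uin": 3535680189,
        "g_tk_new_20200303": 1776277011, "g_tk": 1776277011,
    },
    "req_1": {
        "module": "music.trackInfo.UniformRuleCtrl",
        "method": "CgiGetTrackInfo",
        "param": {"ids": [217185670, 221655199, 352300412, 475408],
                  "types": [0, 0, 0, 0]},
    },
}

kwargs = build_post_kwargs(
    payload, 1694076319336, "zzb0df24438odoluuxidpdaal4czwyijgd1e14611"
)
print(kwargs["headers"]["content-type"])
```

The result can be passed on as `requests.post(URL, **kwargs)`. Whether this change alone gets rid of the `"code":2000` reply is not confirmed anywhere in this thread.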




import requests


headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36',
}

params = {
    '_': '1694078753642',
    'sign': 'zzbdbb9ddfao4rsg7jeoahdesng69c9rac77a7f57',
}

data = '{"comm":{"cv":4747474,"ct":24,"format":"json","inCharset":"utf-8","outCharset":"utf-8","notice":0,"platform":"yqq.json","needNewCode":1,"uin":0,"g_tk_new_20200303":5381,"g_tk":5381},"req_1":{"module":"music.musicsearch.HotkeyService","method":"GetHotkeyForQQMusicMobile","param":{"searchid":"27694030791740212","remoteplace":"txt.yqq.top","from":"yqqweb"}},"req_2":{"module":"music.musicHall.MusicHallPlatform","method":"GetFocus","param":{}},"req_3":{"module":"newalbum.NewAlbumServer","method":"get_new_album_area","param":{}},"req_4":{"module":"newalbum.NewAlbumServer","method":"get_new_album_info","param":{"area":1,"sin":0,"num":20}},"req_5":{"module":"musicToplist.ToplistInfoServer","method":"GetAll","param":{}},"req_6":{"module":"MvService.MvInfoProServer","method":"GetNewMv","param":{"style":0,"tag":0,"start":0,"size":40}}}'

response = requests.post('https://u.y.qq.com/cgi-bin/musics.fcg', params=params, headers=headers, data=data)
print(response.status_code)
print(response.text)
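The body in the code above packs several calls into one POST, keyed `req_1` through `req_6`, next to a shared `comm` block. A small sketch of assembling such a body, assuming nothing beyond the key layout visible in that string. `build_batch_body` is a hypothetical helper, and the `sign` query parameter still has to come from elsewhere, since this thread does not show how it is generated.

```python
import json

# The "comm" block copied from the answer above.
COMM = {
    "cv": 4747474, "ct": 24, "format": "json",
    "inCharset": "utf-8", "outCharset": "utf-8", "notice": 0,
    "platform": "yqq.json", "needNewCode": 1, "uin": 0,
    "g_tk_new_20200303": 5381, "g_tk": 5381,
}


def build_batch_body(calls, comm=COMM):
    """Serialize (module, method, param) tuples as req_1, req_2, ... (hypothetical helper)."""
    body = {"comm": dict(comm)}
    for index, (module, method, param) in enumerate(calls, start=1):
        body[f"req_{index}"] = {"module": module, "method": method, "param": param}
    # Compact separators keep the same space-free layout as the original string.
    return json.dumps(body, separators=(",", ":"))


calls = [
    ("music.musicHall.MusicHallPlatform", "GetFocus", {}),
    ("musicToplist.ToplistInfoServer", "GetAll", {}),
]
body = build_batch_body(calls)
print(body)
```

The returned string can be used as the `data` argument of the `requests.post` call shown above.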

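[The following answer was generated by GPT]

I see that your code uses the requests library to send an HTTP request and passes the headers and data parameters. However, since I cannot access that URL, I cannot know for certain which parameters the request expects.

That said, a few places in your code can be improved:

  1. First, your URL contains HTML entity encoding, such as &quot;. You can use the unescape function from the standard library's html module to decode these entities into a normal string. For example: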
from html import unescape

url = 'https://u.y.qq.com/cgi-bin/musics.fcg?_=1694077217790&sign=zzb0df24438odoluuxidpdaal4czwyijgd1e14611'
decoded_url = unescape(url)
  2. Second, you can use the requests library's get method to send a GET request. This simplifies the code, and no data parameter is needed. For example:
import requests

response = requests.get(decoded_url, headers=headers)
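  3. Finally, you can check the status_code attribute of the response object to determine whether the request succeeded. For example:
if response.status_code == 200:
    # Request succeeded
    content = response.content
    print(content)
else:
    # Request failed
    print('Request failed, status code:', response.status_code)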
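For this endpoint a status of 200 does not settle the matter: the reply quoted in the question arrived with HTTP 200 yet carried `"code":2000` in its JSON body. A sketch of checking both levels follows. It assumes, without confirmation from this thread, that a `code` of 0 means success; `failed_parts` is a hypothetical helper.

```python
import json


def failed_parts(response_text, ok_code=0):
    """Return {key: code} for each part of the reply whose code differs from ok_code."""
    reply = json.loads(response_text)
    failures = {}
    # The envelope has its own code...
    if reply.get("code") != ok_code:
        failures["top-level"] = reply.get("code")
    # ...and so does every req_N entry inside it.
    for key, value in reply.items():
        if key.startswith("req_") and isinstance(value, dict):
            if value.get("code") != ok_code:
                failures[key] = value.get("code")
    return failures


# The exact reply body quoted in the question.
sample = ('{"code":2000,"ts":1694077074205,"start_ts":1694077074205,'
          '"traceid":"1ea80fe99972b7a1","req_1":{"code":2000}}')
print(failed_parts(sample))  # {'top-level': 2000, 'req_1': 2000}
```

With `requests`, the input would be `response.text`, inspected after the `status_code` check has passed.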

Putting it all together, here is the optimized code:

from html import unescape
import requests

url = 'https://u.y.qq.com/cgi-bin/musics.fcg?_=1694077217790&sign=zzb0df24438odoluuxidpdaal4czwyijgd1e14611'
decoded_url = unescape(url)

headers = {
    'accept': 'application/json',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    # 'content-length' and 'content-type' dropped: this GET request sends no body
    'cookie': 'fqm_pvqid=f2a8177e-0876-470a-9bce-9c2cad77a2c8; pgv_pvid=412294480; ts_refer=cn.bing.com/; ts_uid=3186657065; RK=rlO4TGr8th; ptcz=638b13b7f4e81cf343d15f80c2d06d4f9f9d8f6f5a32316152269fddbdaff6e9; euin=oi4i7KCFoe6FNv**; tmeLoginType=2; music_ignore_pskey=202306271436Hn@vBj; ptui_loginuin=3535680189; qqmusic_key=Q_H_L_5StuCLuNmUpJwnB5QgATppYrs4higygOPsNWqii3MmnzhI1P8nDmCHw; psrf_qqaccess_token=33F1254894CBB2D748B65E0070CF287A; psrf_access_token_expiresAt=1701849079; uin=3535680189; psrf_musickey_createtime=1694073079; psrf_qqrefresh_token=58220DF18B7CEAD15702E3F0C3F4E97F; wxopenid=; wxrefresh_token=; psrf_qqunionid=349BE460788215216454C3923C682198; qm_keyst=Q_H_L_5StuCLuNmUpJwnB5QgATppYrs4higygOPsNWqii3MmnzhI1P8nDmCHw; wxunionid=; psrf_qqopenid=C07209D48834E4698C634D21363BAA65; fqm_sessionid=0e40cb51-87ca-4d16-879d-18cbc6cb1544; pgv_info=ssid=s2655128773; ts_last=y.qq.com/n/ryqq/player',
    'referer': 'https://y.qq.com/',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

response = requests.get(decoded_url, headers=headers)

if response.status_code == 200:
    content = response.content
    print(content)
else:
    print('Request failed, status code:', response.status_code)

I hope these changes help you crawl the content of the URL. If you have any other questions, feel free to ask.


