Garbled Chinese text with the Python requests library

Symptoms and background

I'm just starting to learn the requests library, using VS Code as my editor.

Relevant code (please do not paste screenshots)

import requests

url = 'https://www.baidu.com/s?'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data = {'wd': '北京'}

# url: the resource path to request
# params: the query-string parameters
# kwargs: extra keyword arguments (a dict)
response = requests.get(url=url, params=data, headers=headers)

content = response.text

print(content)

Output and error messages
<!DOCTYPE html>
<html lang="zh-CN">
<head>
    <meta charset="utf-8">
    <title>ç¾åº¦å®å¨éªè¯</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta name="apple-mobile-web-app-capable" content="yes">
    <meta name="apple-mobile-web-app-status-bar-style" content="black">
    <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, minimum-scale=1.0, maximum-scale=1.0">
    <meta name="format-detection" content="telephone=no, email=no">
    <link rel="shortcut icon" href="https://www.baidu.com/favicon.ico" type="image/x-icon">
    <link rel="icon" sizes="any" mask href="https://www.baidu.com/img/baidu.svg">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
    <link rel="stylesheet" href="https://ppui-static-wap.cdn.bcebos.com/static/touch/css/api/mkdjump_0635445.css" />
</head>
<body>                                              è¯div>
        <button type="button" class="timeout-button">è¿å页button>
    div>iv class="timeout-img">div>  ç»å¼è¯·ç¨
    <div class="timeout-feedback hide">ä¸
        <div class="timeout-feedback-icon">p>
    div> class="timeout-feedback-title">é®é¢

<script src="https://wappass.baidu.com/static/machine/js/api/mkd.js"></script>
<script src="https://ppui-static-wap.cdn.bcebos.com/static/touch/js/mkdjump_eac1ee5.js"></script>
</body>
</html>

Why is the Chinese text garbled? I thought requests handled encoding without any extra setup.

Desired result

Get the page source with the Chinese text readable.

Just set the character encoding explicitly:

import requests

url = 'https://www.baidu.com/s?'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data = {'wd': '北京'}

# url: the resource path to request
# params: the query-string parameters
# kwargs: extra keyword arguments (a dict)
response = requests.get(url=url, params=data, headers=headers)

content = response.content.decode('utf8')

print(content)
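
One caveat when decoding response.content by hand: a strict decode raises UnicodeDecodeError if the body turns out not to be valid UTF-8. The sketch below illustrates the difference between strict decoding and errors="replace". The helper name decode_body is hypothetical, and the byte strings are made-up stand-ins for response.content, not real Baidu output.

```python
def decode_body(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw bytes, substituting U+FFFD for any undecodable byte."""
    return raw.decode(encoding, errors="replace")


# Made-up stand-ins for response.content (assumption, not real server output).
good = "北京".encode("utf-8")
bad = good + b"\xff"  # 0xFF can never appear in valid UTF-8

print(decode_body(good))
print(decode_body(bad))

# The strict form used in the answer above fails on the malformed body.
try:
    bad.decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decode failed:", exc.reason)
```

Whether to fail loudly or substitute a replacement character depends on the use case; for simply printing a page, errors="replace" is usually enough.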




import requests

url = 'https://www.baidu.com/s?'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data = {'wd': '北京'}

# url: the resource path to request
# params: the query-string parameters
# kwargs: extra keyword arguments (a dict)
response = requests.get(url=url, params=data, headers=headers)
# Set the encoding used to decode response.text
response.encoding = 'utf-8'

content = response.text

print(content)

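Cause: this is an encoding problem. When the response's Content-Type header has a text type but does not specify a charset, requests falls back to ISO-8859-1 when decoding response.text, so the UTF-8 Chinese text comes out garbled. For the fix, see the code: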

import requests

url = 'https://www.baidu.com/s?'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data = {'wd': '北京'}

# url: the resource path to request
# params: the query-string parameters
# kwargs: extra keyword arguments (a dict)
response = requests.get(url=url, params=data, headers=headers)

# Option 1: decode the raw bytes yourself
content = response.content.decode('utf-8')

# print(content)

# Option 2: override the default encoding with utf-8
# Show the encoding requests chose by default
print(response.encoding)
response.encoding = 'utf-8'
print(response.text)
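
The ISO-8859-1 fallback can be reproduced without any network request. The sketch below assumes the garbled title in the question is the UTF-8 encoding of 百度安全验证 read as ISO-8859-1. That is an inference from the byte pattern, not something stated in the thread, and the variable names are illustrative only.

```python
# Assumption: the text the server actually sent in <title>.
original = "百度安全验证"

raw = original.encode("utf-8")       # bytes on the wire (18 bytes)
garbled = raw.decode("iso-8859-1")   # what an ISO-8859-1 fallback produces

# ISO-8859-1 maps every byte to exactly one character, so the mistake is
# reversible: re-encode to get the bytes back, then decode them correctly.
repaired = garbled.encode("iso-8859-1").decode("utf-8")

print(repr(garbled))  # repr() keeps the invisible control characters visible
print(repaired)
```

Because the original bytes are kept in response.content, setting response.encoding = 'utf-8' after the request still works: response.text is decoded again from those bytes using the new encoding.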


Specify the encoding explicitly:


import requests

url = "https://www.baidu.com/s?"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36'
}

data = {'wd': '北京'}

# url: the resource path to request
# params: the query-string parameters
# kwargs: extra keyword arguments (a dict)
response = requests.get(url=url, params=data, headers=headers)
response.encoding = response.apparent_encoding  # let requests infer the encoding from the body; you can also hard-code 'utf-8'
content = response.text

print(content)
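
Besides guessing from the body with apparent_encoding, the page in the question declares its own encoding via <meta charset="utf-8">, so another option is to read that declaration from the raw bytes. The following is a rough sketch using only the standard library. The helpers sniff_meta_charset and decode_html are hypothetical names (not part of requests), the regular expression is deliberately simplistic, and the sample bytes are made up to stand in for response.content.

```python
import re
from typing import Optional

# Simplistic pattern: finds charset=... inside a <meta ...> tag.
_META_CHARSET = re.compile(rb"""<meta[^>]+charset=["']?([\w-]+)""", re.IGNORECASE)


def sniff_meta_charset(raw: bytes) -> Optional[str]:
    """Return the charset declared in a <meta> tag near the top, if any."""
    match = _META_CHARSET.search(raw[:2048])
    return match.group(1).decode("ascii") if match else None


def decode_html(raw: bytes, default: str = "utf-8") -> str:
    """Decode with the declared charset, falling back to `default`."""
    encoding = sniff_meta_charset(raw) or default
    try:
        return raw.decode(encoding)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or wrong declaration: fall back, never raise.
        return raw.decode(default, errors="replace")


# Made-up stand-in for response.content (assumption).
sample = '<html><head><meta charset="utf-8"><title>北京</title></head></html>'.encode("utf-8")

print(sniff_meta_charset(sample))
print(decode_html(sample))
```

With a real response, one would pass response.content to decode_html. A production parser would need to handle more cases (byte-order marks, declarations further down the page), so treat this as an illustration of the idea only.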

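This article may also have the answer you're looking for, so take a look: "python 解决requests中文乱码" (Python: fixing garbled Chinese text in requests).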