After scraping Chinese text with XPath and saving it to a txt file, the Chinese turns into codes, while the English is unaffected.


import json
from lxml import etree

html = """
page source goes here
"""
parse_html = etree.HTML(html)
ls = []
i = 1
while i <= 10:
    index = parse_html.xpath("/html/body/li/div/div[%d]/div[2]/div/div[2]/a/i[1]/text()" % i)
    ls.append(index)
    i += 1
print(ls)

with open('wordcloud.txt', 'w', encoding='utf-8') as f:
    f.write(json.dumps(ls))

# Now read the file back into a Python list object
with open('wordcloud.txt', 'r', encoding='utf-8') as f:
    ls = json.loads(f.read())

This is what ends up in the txt file:
[["\u6813Q\uff08\u6211\u771f\u7684\u4f1a\u8c22\uff09"], ["PUA\uff08CPU/KTV/ICU/PPT\uff09"],
Can anyone help me with this?

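The text is not actually corrupted, and nothing needs to be decoded. By default json.dumps() runs with ensure_ascii=True, which writes every non-ASCII character as a \uXXXX escape; that is why the Chinese shows up as codes while the English is untouched. The file is still valid JSON, and json.loads() turns the escapes back into the original Chinese when you read it. Note that in Python 3 a str has no decode() method; decode() exists only on bytes, and the strings returned by xpath() are already str.

The encoding passed to open() only controls how characters are turned into bytes on disk. Use the same encoding (utf-8 here) for writing and for reading, otherwise the file really will come out garbled.

If you want readable Chinese in the txt file, keep open() with 'w', encoding='utf-8' and pass ensure_ascii=False to json.dumps():

import json

with open('text.txt', 'w', encoding='utf-8') as f:
    content = ["爬取到的中文"]  # placeholder for the scraped list
    # ensure_ascii=False writes the characters themselves instead of \uXXXX escapes
    f.write(json.dumps(content, ensure_ascii=False))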
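
For reference, here is a minimal sketch of how json.dumps() treats non-ASCII text. The sample string is the first two entries of the file content shown in the question, closed off by hand so it parses; the variable names are only illustrative.

```python
import json

# First two entries of the file content from the question (closed with "]"
# by hand so that it is valid JSON on its own).
escaped_text = r'[["\u6813Q\uff08\u6211\u771f\u7684\u4f1a\u8c22\uff09"], ["PUA\uff08CPU/KTV/ICU/PPT\uff09"]]'

# json.loads() turns the \uXXXX escapes back into real characters.
ls = json.loads(escaped_text)

escaped = json.dumps(ls)                       # default: ensure_ascii=True
readable = json.dumps(ls, ensure_ascii=False)  # keep non-ASCII characters as-is

print(escaped)   # \uXXXX escapes, same as the file in the question
print(readable)  # human-readable Chinese

# Both forms load back into exactly the same Python list.
assert json.loads(escaped) == json.loads(readable) == ls
```

So the escaped file is only harder for a human to read; a program that reads it back with json.loads() gets the original Chinese either way.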

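Decoding belongs at the point where you fetch the page: if you download the HTML as bytes, decode it with the page's real charset before handing it to etree.HTML().
with open('wordcloud.txt', 'w', encoding='utf-8') as f: opening the output file as utf-8 like this is correct and is not what produces the \uXXXX codes; those come from json.dumps().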
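
As a rough illustration of what decoding at fetch time means, the sketch below fakes a downloaded page as GBK-encoded bytes. The GBK charset and the sample markup are assumptions made for the example; the original post does not say how the page was fetched or which charset it uses.

```python
# Bytes as they might arrive from the network; here a GBK page is simulated.
raw = "<p>中文</p>".encode("gbk")

# Decoding with the charset the page really uses restores the text.
text = raw.decode("gbk")

# Decoding with the wrong charset is what produces genuinely garbled output.
wrong = raw.decode("utf-8", errors="replace")

print(text)
print(wrong)
```

Only the decoded str should be passed on to the parser; this step is unrelated to the \uXXXX escapes, which appear later when the list is serialized.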

The decoding code is written up in detail in this article; if you get stuck, you can refer to it: https://blog.csdn.net/weixin_70445937/article/details/128684326?spm=1001.2014.3001.5502