XPath still returns garbled text even with text()?

Symptoms and background

Why does my XPath query still return garbled characters even though I use text()?

Relevant code (please do not paste screenshots)
from lxml import etree
tree = etree.parse('路径.html',etree.HTMLParser())
result = tree.xpath('//*[@data-week="3"]')
for i in result:
    io = i.xpath('./div/div[2]//text()')
    print(io)
Output and error messages

[]
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91å\x86\x9cä¸\x9aæ¦\x82论ã\x80\x8b[15]']
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91å\x86\x9cä¸\x9aæ¦\x82论ã\x80\x8b[15]']
[]
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91ç»\x8fæµ\x8eæ³\x95ã\x80\x8b[02]']
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91ç»\x8fæµ\x8eæ³\x95ã\x80\x8b[02]']
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91ä¸\xadå\x9b½ä¹¡æ\x9d\x91伦ç\x90\x86ã\x80\x8b[01]']
['ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91ä¸\xadå\x9b½ä¹¡æ\x9d\x91伦ç\x90\x86ã\x80\x8b[01]']
[]
[]
[]
[]
[]
[]

Result without text()
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
[]
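
The garbled strings in the first output follow a recognizable pattern: each Chinese character shows up as three Latin-1 characters, which is what UTF-8 bytes look like when they are decoded as Latin-1. A minimal check, using only the standard library and one string copied from that output, is to reverse the wrong decode. The variable names here are just for illustration.

```python
# One of the garbled strings, copied verbatim from the output above.
garbled = 'ã\x80\x8aã\x80\x90æ\x9c¬ã\x80\x91ç»\x8fæµ\x8eæ³\x95ã\x80\x8b[02]'

# Undo the wrong decode: turn the characters back into the raw bytes
# via Latin-1, then decode those bytes as UTF-8.
repaired = garbled.encode('latin-1').decode('utf-8')

print(repaired)
```

If this round trip yields readable text, the file on disk is most likely UTF-8 and the parser simply guessed the wrong encoding.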

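Using text() is fine; that part is correct. Your problem is the encoding: the parser is decoding the file with the wrong character set. Try one of the following two approaches: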

  1. tree = etree.parse('路径.html', etree.HTMLParser(encoding='utf-8'))
  2. tree = etree.parse('路径.html', etree.HTMLParser(encoding='gbk'))
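
As a sketch of option 1, here is a self-contained example. The sample HTML is made up for illustration (the real page is not shown in the question); it only mimics the data-week / div layout implied by the XPath expressions. It parses UTF-8 bytes that carry no charset declaration, once with the default parser and once with encoding='utf-8'. Whether the default parser garbles the text depends on the libxml2 version behind lxml, so only the explicit-encoding result is something to rely on.

```python
from lxml import etree

# Hypothetical sample page: UTF-8 bytes with no <meta charset> declaration.
html_bytes = (
    '<html><body>'
    '<div data-week="3"><div>'
    '<div>slot</div>'
    '<div>《【本】经济法》[02]</div>'
    '</div></div>'
    '</body></html>'
).encode('utf-8')


def extract(parser):
    """Parse the sample bytes with the given parser and collect the text nodes."""
    root = etree.fromstring(html_bytes, parser)
    texts = []
    for node in root.xpath('//*[@data-week="3"]'):
        texts.extend(node.xpath('./div/div[2]//text()'))
    return texts


# Default parser: the encoding is guessed, so the text may come back garbled.
default_result = extract(etree.HTMLParser())

# Explicit encoding: the bytes are decoded as UTF-8.
utf8_result = extract(etree.HTMLParser(encoding='utf-8'))

print(default_result)
print(utf8_result)
```

If the file were actually GBK, the same code with encoding='gbk' would be the one that yields readable text. The garbled output in the question is what UTF-8 bytes look like when read as Latin-1, which suggests trying option 1 first.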