Chinese text is garbled in the CSV file exported after scraping fund net asset values
import pandas as pd
import csv

for i in range(1, 2):
    url = 'https://fundf10.eastmoney.com/F10DataApi.aspx?type=lsjz&code=000434&sdate=2016-09-29&edate=2022-02-14&per=20&page=1'
    # No encoding is passed here, so the parser has to guess one
    tb = pd.read_html(url)[0]
    tb.to_csv(r'1.csv', mode='a', encoding='utf-8-sig', header=1, index=0)
    print('Page ' + str(i) + ' scraped')
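The script runs without errors, but the Chinese text in the exported file is garbled.
I tried changing the output encoding to utf-8, and the text is still garbled.
I would like the Chinese text in the exported table to display correctly.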
Specify the encoding in read_html.
import pandas as pd

for i in range(1, 2):
    url = 'https://fundf10.eastmoney.com/F10DataApi.aspx?type=lsjz&code=000434&sdate=2016-09-29&edate=2022-02-14&per=20&page=1'
    # Specify the encoding explicitly when parsing the page
    tb = pd.read_html(url, encoding='utf-8')[0]
    tb.to_csv(r'1.csv', mode='a', encoding='utf-8-sig', header=True, index=False)
    print('Page ' + str(i) + ' scraped')
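For context on why the encoding argument matters: garbling like this typically appears when UTF-8 bytes are decoded with a single-byte encoding. The sketch below reproduces that effect in plain Python, with no network access. It assumes the fallback was Latin-1, which is only a guess about what the HTML parser picked for this page, and the sample string is illustrative rather than taken from the actual response.

```python
# Illustrative column name; not copied from the real page.
original = "净值日期"

# The server sends UTF-8 bytes.
raw = original.encode("utf-8")

# Assumption: the parser decodes them as Latin-1 because no charset was given.
garbled = raw.decode("latin-1")

# Reversing the wrong decode recovers the text, which shows the bytes were fine.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(repaired)
```

If the text is already garbled at this stage, no choice of encoding in to_csv will repair it, which matches what the question reports.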
Have you tried printing the result to the console to see what it looks like?
import pandas as pd

for i in range(1, 2):
    # Fill in the page number so each iteration requests a different page
    url = 'http://fundf10.eastmoney.com/F10DataApi.aspx?type=lsjz&code=000434&sdate=2016-09-29&edate=2022-02-14&per=20&page={}'.format(i)
    tb = pd.read_html(url, encoding='utf-8')[0]
    # Write the CSV as gbk instead of utf-8-sig
    tb.to_csv('1.csv', mode='a', header=True, encoding='gbk', index=False)
    print('Page ' + str(i) + ' scraped')
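On the choice of output encoding, here is a small sketch of how the three options differ at the byte level. It uses only the standard library and an illustrative header row. The idea that a spreadsheet program relies on the BOM to recognize UTF-8, and otherwise falls back to the system code page (often GBK on Chinese-locale Windows), is an assumption about the viewer and is not confirmed anywhere in this thread. The helper fits_gbk is hypothetical, added only for the demonstration.

```python
import codecs

header = "净值日期,单位净值\n"  # illustrative header row

utf8_bytes = header.encode("utf-8")
utf8_sig_bytes = header.encode("utf-8-sig")
gbk_bytes = header.encode("gbk")

# Only utf-8-sig prepends the byte order mark.
print(utf8_bytes.startswith(codecs.BOM_UTF8))
print(utf8_sig_bytes.startswith(codecs.BOM_UTF8))


def fits_gbk(text):
    """Return True if every character in text can be encoded as GBK."""
    try:
        text.encode("gbk")
        return True
    except UnicodeEncodeError:
        return False


# GBK covers common Chinese characters but not everything in Unicode.
print(fits_gbk(header))
print(fits_gbk("\U0001F600"))
```

One trade-off to keep in mind: writing with gbk raises UnicodeEncodeError if the table ever contains a character outside that code page, while utf-8-sig can encode any text.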