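Using the tushare package, compute the correlation between the daily returns of hs300 (taken as the benchmark) and those of three other stocks, visualize the result as a heatmap, and determine which stock is most strongly correlated with hs300. The figure below shows the daily returns of hs300 and the other three stocks. (Please include code, thanks.)
Use the pandas corr function to compute the correlation coefficient between hs300 and each stock. Reference code: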
import matplotlib.pyplot as plt
import tushare as ts
import pandas as pd
from concurrent.futures import ThreadPoolExecutor
import seaborn as sns
def get_data(tick):
    # Download daily K-line data for one ticker over the study period
    return ts.get_k_data(tick, start='2020-09-01', end='2021-06-30')
ticks = ['hs300','002254', '002224', '000507']
ndf=pd.DataFrame()
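with ThreadPoolExecutor(max_workers=3) as ex:
    # Fetch all tickers concurrently; map() preserves the order of ticks
    res = list(ex.map(get_data, ticks))
for tick, df in zip(ticks, res):
    #df.to_excel(f'stock_{tick}.xlsx',index=False)
    # Index by trading date so the series align on dates, not row positions
    close = df.set_index('date')['close']
    # Daily return: (close_t - close_{t-1}) / close_{t-1}
    ndf[tick] = (close - close.shift(1)) / close.shift(1)
# Keep only the dates on which every series has a return
ndf = ndf.dropna()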
corr=ndf.corr()
print(corr)
corrs={}
for tick in ndf.columns[1:]:
    # Pairwise correlation of hs300 with each stock
    cor = ndf[['hs300', tick]].corr()
    corrs[f'hs300&{tick}'] = cor.loc['hs300', tick]
print(corrs)
sns.heatmap(corr, square=True, annot=True)
plt.show()
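To make clear what corr is computing, here is a minimal standard-library sketch of the same idea: turn closing prices into daily returns, then take the Pearson correlation of each stock's returns with the benchmark. The closing prices and the names stock_a, stock_b, stock_c are made up for illustration and are not tushare data; the helpers daily_returns and pearson are hypothetical names, not part of pandas or tushare.

```python
from math import sqrt

def daily_returns(closes):
    """Simple daily returns: (p_t - p_{t-1}) / p_{t-1}."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Made-up closing prices (assumption, for illustration only)
prices = {
    'hs300':   [100.0, 102.0, 101.0, 104.0, 103.0, 106.0],
    'stock_a': [10.0, 10.3, 10.1, 10.5, 10.4, 10.8],   # tends to move with the index
    'stock_b': [20.0, 19.8, 20.1, 19.7, 20.0, 19.6],   # tends to move against it
    'stock_c': [5.0, 5.0, 5.2, 5.1, 5.1, 5.3],
}

returns = {name: daily_returns(p) for name, p in prices.items()}

# Correlation of each stock's daily returns with the benchmark's
corrs = {
    name: pearson(returns['hs300'], r)
    for name, r in returns.items()
    if name != 'hs300'
}

# The stock with the highest correlation coefficient against hs300
best = max(corrs, key=corrs.get)
print(corrs)
print('most correlated with hs300:', best)
```

In the heatmap produced by the reference code, the same answer can be read off the hs300 row: the stock whose cell has the largest coefficient is the one most correlated with the index.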