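I have already extracted the daily quotes for a set of stocks into a DataFrame named DF, keyed by ts_code and trade_date. I also have financial-report data in a DataFrame named Df_qua, where net profit is in the column income, the announcement date is ann_date, and the reporting-period end date is end_date. I want to assign income to every daily row according to these dates. How can I do this faster in Python?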
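You can use pandas' merge_asof function. An exact merge on ts_code and trade_date would only match the few trading days that coincide with an announcement, and DF has no end_date column to join on. merge_asof instead matches, within each ts_code, every trade_date with the most recent report whose ann_date is on or before that day, so income is carried forward until the next announcement and no future data leaks into earlier days. It is vectorized, so it avoids a per-row loop or a groupby with a lambda. The steps are as follows (dates are assumed to be YYYYMMDD strings):
left = DF.assign(trade_dt=pd.to_datetime(DF['trade_date'], format='%Y%m%d')).sort_values('trade_dt')
right = Df_qua.dropna(subset=['ann_date'])
right = right.assign(ann_dt=pd.to_datetime(right['ann_date'], format='%Y%m%d')).sort_values(['ann_dt', 'end_date'])
DF_merged = pd.merge_asof(left, right[['ts_code', 'ann_dt', 'end_date', 'income']], left_on='trade_dt', right_on='ann_dt', by='ts_code', direction='backward')
DF_merged = DF_merged.sort_values(['ts_code', 'trade_dt']).reset_index(drop=True)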
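The complete code is as follows:
import pandas as pd
# Parse trade_date (assumed to be a YYYYMMDD string) and sort by it; merge_asof requires both inputs to be sorted by their key
left = DF.assign(trade_dt=pd.to_datetime(DF['trade_date'], format='%Y%m%d')).sort_values('trade_dt')
# Drop reports with no announcement date, since merge_asof does not accept missing keys
right = Df_qua.dropna(subset=['ann_date'])
# Parse ann_date and sort; ties on ann_date are ordered by end_date so that the latest reporting period wins
right = right.assign(ann_dt=pd.to_datetime(right['ann_date'], format='%Y%m%d')).sort_values(['ann_dt', 'end_date'])
# Within each ts_code, attach the most recent report announced on or before each trade date
# (pass allow_exact_matches=False if income should only apply from the day after the announcement)
DF_merged = pd.merge_asof(left, right[['ts_code', 'ann_dt', 'end_date', 'income']], left_on='trade_dt', right_on='ann_dt', by='ts_code', direction='backward')
# Restore per-stock chronological order; trading days before a stock's first announcement keep income as NaN
DF_merged = DF_merged.sort_values(['ts_code', 'trade_dt']).reset_index(drop=True)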
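To check the behaviour on a small case, here is a minimal sketch of an as-of join with pd.merge_asof. The stock codes, dates, prices and income figures are made up for illustration, and attach_income is a hypothetical helper name, not something from the question. It assumes the dates are YYYYMMDD strings.

```python
import pandas as pd

# Toy daily quotes for two stocks (illustrative values only).
DF = pd.DataFrame({
    "ts_code": ["000001.SZ"] * 4 + ["600000.SH"] * 4,
    "trade_date": ["20230427", "20230428", "20230829", "20230830"] * 2,
    "close": [10.0, 10.1, 11.0, 11.2, 7.0, 7.1, 7.5, 7.4],
})

# Toy financial reports: announcement date, period end, net profit.
Df_qua = pd.DataFrame({
    "ts_code": ["000001.SZ", "000001.SZ", "600000.SH"],
    "ann_date": ["20230428", "20230830", "20230829"],
    "end_date": ["20230331", "20230630", "20230630"],
    "income": [100.0, 220.0, 50.0],
})


def attach_income(daily: pd.DataFrame, reports: pd.DataFrame) -> pd.DataFrame:
    """Attach to each daily row the latest report announced on or before it."""
    # merge_asof needs both sides sorted by their datetime key.
    left = daily.assign(
        trade_dt=pd.to_datetime(daily["trade_date"], format="%Y%m%d")
    ).sort_values("trade_dt")
    right = reports.dropna(subset=["ann_date"])
    right = right.assign(
        ann_dt=pd.to_datetime(right["ann_date"], format="%Y%m%d")
    ).sort_values(["ann_dt", "end_date"])
    merged = pd.merge_asof(
        left,
        right[["ts_code", "ann_dt", "end_date", "income"]],
        left_on="trade_dt",
        right_on="ann_dt",
        by="ts_code",          # match only within the same stock
        direction="backward",  # look back to the most recent announcement
    )
    return merged.sort_values(["ts_code", "trade_dt"]).reset_index(drop=True)


result = attach_income(DF, Df_qua)
print(result[["ts_code", "trade_date", "end_date", "income"]])
```

Rows dated before a stock's first announcement come out with income as NaN, which makes the absence of lookahead easy to verify by eye.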