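I'm querying a report table in the database that holds daily statistics, and I need weekly, monthly, and quarterly metrics from it. How do I sum the daily data over each interval? The problem is that the time range is dynamic, so it may contain a week that is not fully covered. How can I handle that case?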
Here's a simple version that handles years:
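import pandas as pd

if __name__ == '__main__':
    # Load the daily report into df yourself with pandas, then process df.
    # The file name below is only a placeholder (an assumption), not the real source.
    df = pd.read_excel("daily_report.xlsx")

    # Build one summary row per year: covered date range plus yearly totals.
    yearList = []
    # Group by a plain column name so that key is the year itself, not a 1-tuple.
    for key, yearDf in df.groupby('year_msg'):
        yearDict = {'start_days': yearDf['days_msg'].min(),
                    'end_days': yearDf['days_msg'].max(),
                    'year_msg': key,
                    'year_url_nums_sum': yearDf['url_nums'].sum(),
                    'year_details_nums_sum': yearDf['details_nums'].sum()
                    }
        yearList.append(yearDict)
    yearDf = pd.DataFrame(yearList)
    # Running (cumulative) totals across years.
    yearDf['year_url_nums_sum_new'] = yearDf['year_url_nums_sum'].expanding(min_periods=1).sum()
    yearDf['year_details_nums_sum_new'] = yearDf['year_details_nums_sum'].expanding(min_periods=1).sum()
    yearDf.to_excel("year.xlsx", index=False)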
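The same grouping idea can probably be extended to weeks, months, and quarters, with a flag for periods that the dynamic date range only partly covers. The following is a minimal sketch, not a definitive solution. It assumes that `days_msg` can be parsed as a date, that there is one row per day, and that a week runs Monday to Sunday. The function name `summarize_by_period` and the columns `period`, `days_present`, `days_expected`, and `is_complete` are made up for illustration, and the sample data is synthetic.

```python
import pandas as pd


def summarize_by_period(df, freq, date_col="days_msg",
                        value_cols=("url_nums", "details_nums")):
    """Sum daily rows per calendar period and flag partly covered periods.

    freq is a pandas period alias: "W" (week), "M" (month) or "Q" (quarter).
    """
    data = df.copy()
    data[date_col] = pd.to_datetime(data[date_col])
    # Map every day to the calendar period that contains it.
    data["period"] = data[date_col].dt.to_period(freq)

    out = data.groupby("period").agg(
        start_days=(date_col, "min"),
        end_days=(date_col, "max"),
        days_present=(date_col, "nunique"),
        **{f"{col}_sum": (col, "sum") for col in value_cols},
    ).reset_index()

    # Calendar days in the full period (7 for a week, 28 to 31 for a month, ...).
    out["days_expected"] = [
        (p.end_time.normalize() - p.start_time.normalize()).days + 1
        for p in out["period"]
    ]
    # A period is complete only if every one of its days appears in the data.
    out["is_complete"] = out["days_present"] == out["days_expected"]
    return out


# Synthetic daily data: Wed 2024-01-03 through Sun 2024-01-21.
sample = pd.DataFrame({
    "days_msg": pd.date_range("2024-01-03", "2024-01-21"),
    "url_nums": 1,
    "details_nums": 2,
})

weekly = summarize_by_period(sample, "W")
monthly = summarize_by_period(sample, "M")
quarterly = summarize_by_period(sample, "Q")

# Keep only fully covered weeks if partial ones should be excluded.
full_weeks = weekly[weekly["is_complete"]]
```

With this sample, the first week (Jan 1 to Jan 7) has only five days of data, so it is flagged as incomplete. Whether to drop such rows, report them separately, or keep them with the flag depends on how the report is meant to be read.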
What exactly are you trying to do?