Implementing SQL's "select count(distinct case when" in Python

How can I implement the SQL statement "select count(distinct case when length(actno)>0 then actno else null end) from data_a group by community" in Python?

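The section "13. Conditional counting: count(distinct case when ... end)" of the blog post "python与SQL学习比较" (a comparison of Python and SQL) may solve your problem. You can read the excerpt below or go to the source blog: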

13. Conditional counting: count(distinct case when ... end)

We want to count how many distinct orders have a ts containing '2019-08-01', and how many have a ts containing '2019-08-02'.

#Hive SQL
select count(distinct case when ts like '%2019-08-01%' then orderid end) as 0801_cnt, 
count(distinct case when ts like '%2019-08-02%' then orderid end) as 0802_cnt
from t_order;
#Result:
5    11

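Implementation in pandas:
Define two functions. The first adds a column to the original data that marks which condition a row satisfies. The second adds another column that holds the row's orderid when a condition is satisfied. Both functions are applied to the whole dataframe; for rows we do not care about, both new columns are missing (None). The third step is a distinct count on the result.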

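#python
#Assumes `order` is a DataFrame with string column 'ts' and column 'orderid'
#Step 1: build a helper column that marks the condition
def func_1(x):
    if '2019-08-01' in x['ts']:
        return '2019-08-01'  #any other label could be returned here
    elif '2019-08-02' in x['ts']:
        return '2019-08-02'
    else:
        return None

#Step 2: put the orderid of matching rows into a new column
def func_2(x):
    if '2019-08-01' in x['ts'] or '2019-08-02' in x['ts']:
        return str(x['orderid'])
    else:
        return None

#Apply both functions and inspect the result
#Note: axis=1 is required so that each function receives one row at a time; try omitting it to see what happens
order['cnt_condition'] = order.apply(func_1, axis=1)
order['cnt'] = order.apply(func_2, axis=1)
order[order['cnt'].notnull()]

#Group by the condition label and count distinct order ids
order.groupby('cnt_condition').agg({'cnt': 'nunique'})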
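
The same counts can probably be obtained without row-wise apply. The following is only a sketch, not the blog author's method: the sample rows and the helper name count_distinct_when are made up for illustration, and ts is assumed to be stored as strings. It filters with a boolean mask and calls nunique, which skips missing values just as count(distinct ...) skips NULL.

```python
import pandas as pd

# Hypothetical sample rows standing in for the t_order table
order = pd.DataFrame({
    "orderid": [1, 1, 2, 3, 3, 4],
    "ts": [
        "2019-08-01 10:00:00", "2019-08-01 11:00:00", "2019-08-01 12:00:00",
        "2019-08-02 09:00:00", "2019-08-02 09:30:00", "2019-08-03 08:00:00",
    ],
})

def count_distinct_when(df, day):
    """Rough equivalent of count(distinct case when ts like '%day%' then orderid end)."""
    # regex=False treats `day` as a plain substring, like the %...% pattern
    mask = df["ts"].str.contains(day, regex=False)
    return df.loc[mask, "orderid"].nunique()

cnt_0801 = count_distinct_when(order, "2019-08-01")
cnt_0802 = count_distinct_when(order, "2019-08-02")
print(cnt_0801, cnt_0802)
```

On real data this avoids the two helper columns and is usually faster than apply with axis=1, since the string test runs on the whole column at once.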


What do you mean exactly? Do you want to connect to a database and execute the SQL, or implement this logic in your program?
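
If the goal is to reproduce the logic in the program, the query from the question could be sketched in pandas roughly as follows. The contents of data_a here are invented sample rows, and actno is assumed to be a string column that may contain empty strings or missing values.

```python
import pandas as pd

# Hypothetical sample rows standing in for the data_a table
data_a = pd.DataFrame({
    "community": ["A", "A", "A", "B", "B", "C"],
    "actno": ["x1", "x1", "x2", "", "y1", None],
})

# case when length(actno) > 0 then actno else null end:
# keep actno where it is non-empty, otherwise replace it with a missing value
has_actno = data_a["actno"].fillna("").str.len() > 0
valid_actno = data_a["actno"].where(has_actno)

# count(distinct ...) group by community: nunique ignores missing values
result = valid_actno.groupby(data_a["community"]).nunique()
print(result)
```

A community whose actno values are all empty or missing still appears in the result with a count of 0, which matches what the SQL group by would return.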