如何用mysql对用户进行分群,分为新客、活跃、不活跃、回头客四种类型,按月份统计每月这四种类型客户数量有多少?
新用户: 指首次消费。
活跃用户: 出现连续月份均有消费行为的用户。
不活跃用户: 之前有过消费的老客,后续在某时间窗口内没有消费。
回流用户:先前有过消费行为,中间隔了1或多个月后,再次回来消费的客户。即回头客。
用下表具体解释用户的分类如下:
针对提问所示源表,第一步:对用户消费月份pay_month字段偏移,两相比较。若差值diff为null,则为首次消费月,该月该客户定义为新客“new”;若差值diff为1,则该月与上一个消费月是连续月,则该月该客户被定义为活跃客户“active”;若差值diff不为1,则该月与上一个消费月是不连续的,则该月该客户被定义为回头客“return”。具体sql语句如下:
with t3 as (
select
t2.*,
case when diff is null then 'new'
when diff = 1 then 'active'
else 'return'
end as status
from
(
select
t1.*,
(t1.pay_month - t1.t) as diff
from
(SELECT user_id, pay_month, LAG(pay_month) over(PARTITION BY user_id ORDER BY pay_month) as t FROM temp_order2021 group by user_id, pay_month) as t1
) as t2)
查询结果如图:(因为源表很大,所以查询结果的截图只展示得了一部分,并非全部查询结果)
第二步就是统计每月各类型客户数了。
select
status, pay_month, count(status) as num
from t3
group by status, pay_month
order by status, pay_month
查询结果如图:
第三步:对第二步所查询结果按列表呈现即可
因此,整体sql语句如下:
create table cohort_table as
SELECT status,
sum( CASE WHEN pay_month = 1 THEN num ELSE null END ) AS '2021-01',
sum( CASE WHEN pay_month = 2 THEN num ELSE null END ) AS '2021-02',
sum( CASE WHEN pay_month = 3 THEN num ELSE null END ) AS '2021-03',
sum( CASE WHEN pay_month = 4 THEN num ELSE null END ) AS '2021-04',
sum( CASE WHEN pay_month = 5 THEN num ELSE null END ) AS '2021-05',
sum( CASE WHEN pay_month = 6 THEN num ELSE null END ) AS '2021-06',
sum( CASE WHEN pay_month = 7 THEN num ELSE null END ) AS '2021-07',
sum( CASE WHEN pay_month = 8 THEN num ELSE null END ) AS '2021-08',
sum( CASE WHEN pay_month = 9 THEN num ELSE null END ) AS '2021-09',
sum( CASE WHEN pay_month = 10 THEN num ELSE null END ) AS '2021-10',
sum( CASE WHEN pay_month = 11 THEN num ELSE null END ) AS '2021-11',
sum( CASE WHEN pay_month = 12 THEN num ELSE null END ) AS '2021-12'
FROM
(
select
status, pay_month, count(status) as num
from
(
select
t2.*,
case when diff is null then 'new'
when diff = 1 then 'active'
else 'return'
end as status
from
(
select
t1.*,
(t1.pay_month - t1.t) as diff
from
(SELECT user_id, pay_month, LAG(pay_month) over(PARTITION BY user_id ORDER BY pay_month) as t FROM temp_order2021 group by user_id, pay_month) as t1
) as t2
) as t3
group by status, pay_month
order by status, pay_month) as aa
group by status;
最终结果如图。至此,按月统计新客、活跃、回头客数完成,但不活跃客户数暂不知如何通过sql语句解决。
用4个子查询代表4个分组 就行了啊
【以下回答由 GPT 生成】
首先,我们可以按月份将用户分为新客、活跃客户、不活跃客户和回头客。
要实现这个功能,我们可以使用MySQL语句进行筛选和统计。
以下是解决问题的步骤:
首先,我们需要创建一个辅助表来存储每个月的日期,以便在查询中使用。我们可以使用MySQL的日期函数来生成这个表。
可以使用以下语句创建辅助表:
CREATE TABLE IF NOT EXISTS dates (day DATE);
INSERT INTO dates (day)
SELECT '2021-01-01' + INTERVAL (step.number) DAY
FROM
(SELECT (0 + units.i * 100 + tens.i * 10 + ones.i) as number
FROM
(SELECT 0 as i
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9) as ones
CROSS JOIN
(SELECT 0 as i
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9) as tens
CROSS JOIN
(SELECT 0 as i
UNION SELECT 1
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9) as units) as step
WHERE '2021-01-01' + INTERVAL (step.number) DAY <= CURDATE();
这将创建一个名为dates的辅助表,其中包含从开始日期(2021-01-01)到当前日期的所有日期。
接下来,我们将创建一个视图,以便我们可以根据购买日期和辅助表中的日期进行查询。
我们可以使用以下语句创建视图:
CREATE VIEW customer_status AS
SELECT customers.id,
customers.name,
customers.purchase_date,
CASE
WHEN DATEDIFF(dates.day, customers.purchase_date) = 0 THEN '新客'
WHEN EXISTS(SELECT 1
FROM customers AS c
WHERE c.id != customers.id
AND c.purchase_date BETWEEN DATE_SUB(customers.purchase_date, INTERVAL 1 MONTH) AND customers.purchase_date) THEN '回头客'
WHEN EXISTS(SELECT 1
FROM customers AS c
WHERE c.id != customers.id
AND MONTH(c.purchase_date) = MONTH(customers.purchase_date)
AND YEAR(c.purchase_date) = YEAR(customers.purchase_date)
AND NOT EXISTS(SELECT 1
FROM customers AS c2
WHERE c2.id != c.id
AND MONTH(c2.purchase_date) = MONTH(customers.purchase_date)
AND YEAR(c2.purchase_date) = YEAR(customers.purchase_date)
AND c2.purchase_date BETWEEN DATE_SUB(c.purchase_date, INTERVAL 1 MONTH) AND c.purchase_date)) THEN '活跃客户'
ELSE '不活跃客户'
END AS status,
dates.day AS date
FROM customers
JOIN dates
WHERE dates.day >= customers.purchase_date;
这将创建一个名为customer_status的视图,其中包含每个客户的ID、名称、购买日期、状态和查询日期。状态根据规则判断。
现在,我们可以针对视图customer_status执行查询来计算每个月份的客户数量。
以下是一个示例查询,返回每个月的新客户、活跃客户、不活跃客户和回头客户的数量:
SELECT DATE_FORMAT(date, '%Y-%m') AS month,
COUNT(CASE WHEN status = '新客' THEN 1 END) AS new_customers,
COUNT(CASE WHEN status = '活跃客户' THEN 1 END) AS active_customers,
COUNT(CASE WHEN status = '不活跃客户' THEN 1 END) AS inactive_customers,
COUNT(CASE WHEN status = '回头客' THEN 1 END) AS returning_customers
FROM customer_status
GROUP BY DATE_FORMAT(date, '%Y-%m');
这将返回一个结果集,其中包含每个月份的新客户、活跃客户、不活跃客户和回头客户的数量。
请注意,这只是一个示例查询,你可以根据实际需要对结果进行修改和调整。
由于这种方法涉及到辅助表和视图的创建,因此对于大量数据,可能会有一定的性能影响。如果数据量很大,可以根据实际情况对查询进行优化。
希望以上解决方案能对你有所帮助,如果你还有其他问题,请随时提问。
【相关推荐】