当统计表中提交时间不为空的数据条数时出现了问题,第一次发现count(列名)和count(列名 is not null)得出的数据条数居然不同。
因为count(列名)的统计是会自动过滤null值的,那count(列名 is not null)为什么会得出不一样的结果?实在不知道哪里出了问题。数据表和问题SQL如下:
sql表创建语句:
drop table if exists exam_record;
CREATE TABLE exam_record (
id int PRIMARY KEY AUTO_INCREMENT COMMENT '自增ID',
uid int NOT NULL COMMENT '用户ID',
exam_id int NOT NULL COMMENT '试卷ID',
start_time datetime NOT NULL COMMENT '开始时间',
submit_time datetime COMMENT '提交时间',
score tinyint COMMENT '得分'
)CHARACTER SET utf8 COLLATE utf8_general_ci;
INSERT INTO exam_record(uid,exam_id,start_time,submit_time,score) VALUES
(1001, 9001, '2021-07-02 09:01:01', null, null),
(1002, 9003, '2021-09-01 12:01:01', '2021-09-01 12:21:01', 60),
(1002, 9002, '2021-09-02 12:01:01', '2021-09-02 12:31:01', 70),
(1002, 9001, '2021-09-05 19:01:01', '2021-09-05 19:40:01', 81),
(1002, 9002, '2021-07-06 12:01:01', null, null),
(1003, 9003, '2021-09-07 10:01:01', '2021-09-07 10:31:01', 86),
(1003, 9003, '2021-09-08 12:01:01', '2021-09-08 12:11:01', 40),
(1003, 9001, '2021-09-08 13:01:01', null, null),
(1003, 9002, '2021-09-08 14:01:01', null, null),
(1003, 9003, '2021-09-08 15:01:01', null, null),
(1005, 9001, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1005, 9002, '2021-09-01 12:01:01', '2021-09-01 12:31:01', 88),
(1005, 9002, '2021-09-02 12:11:01', '2021-09-02 12:31:01', 89);
这些SQL创建好的数据表如下:
如下是问题SQL:
select uid from exam_record
group by uid , month(submit_time)
having count(submit_time is not null) >= 3
如果只是count(submit_time),那么最后的结果是1002,1005。这也是正确结果。但是为什么加上 is not null后,结果会变成1002,1003,1005,实在是不知道为什么。
你想知道为什么,那就给告诉你为什么,然后指出你代码问题:
count(submit_time) 得到的是 submit_time 非null值的数量,count(列名 is not null) =count(1) 得到的是列名对应的数据总量,包括null值
我知道你想得到什么,但是你写法不对:
两个问题:
2.代码有问题:group 中有 uid和month(submit_time) 但是你 select 中只有uid,会存在很大问题;你考虑下到底想要得到啥?有没有month(submit_time)完全是两个结果;
下面我给你贴下代码,相信你就明白了
select
uid
,month(submit_time) AS month
,count(1) as cnt
,count(submit_time is not null) AS submit_time_cnt1
,count(submit_time) AS submit_time_cnt2
from exam_record
group by uid,month(submit_time);
uid | month | cnt | submit_time_cnt1 | submit_time_cnt2
------+-------+-----+------------------+------------------
1002 | 9 | 3 | 3 | 3
1003 | 9 | 2 | 2 | 2
1005 | 9 | 3 | 3 | 3
1003 | NULL | 3 | 3 | 0
1002 | NULL | 1 | 1 | 0
1001 | NULL | 1 | 1 | 0
(6 rows)
select
uid
,count(1) AS cnt
,count(submit_time is not null) AS submit_time_cnt1
,count(submit_time) AS submit_time_cnt2
from exam_record
group by uid;
uid | cnt | submit_time_cnt1 | submit_time_cnt2
------+-----+------------------+------------------
1001 | 1 | 1 | 0
1002 | 4 | 4 | 3
1003 | 5 | 5 | 2
1005 | 3 | 3 | 3
(4 rows)
那么该怎么写呢?
count(if(submit_time is not null,1,null))
或者
sum(if(submit_time is not null,1,0))
希望对你有所帮助!
首先你直接执行
select count(submit_time is not null) from exam_record;
会发现查询出13条记录。也就是说你这个判断式是错的,判断的所有row都是true。
当你改成这个语句:
select count(*) from (select * from exam_record where submit_time is not null) as temp;
查询出8条记录。
我想到这你应该懂了。
select sum( case when submit_time is not null then 1 else 0 end) from exam_record;
或者
select count(*) from exam_record where submit_time is not null ;
1、我们都知道 COUNT(非NULL) = 1,COUNT(NULL) = 0
2、count()函数条件判断要带上if,原sql改成下面就行了
select uid from exam_record
group by uid , month(submit_time)
having count(if(submit_time is not null,true,null)) >= 3
或者用sum()函数也行
select uid from exam_record
group by uid , month(submit_time)
having sum(if(submit_time is not null,1,0)) >= 3