If countries were grouped by percent forestation in quartiles,
which group had the most countries in it in 2016?
如果国家根据森林覆盖率排序后再被划分为四分位(0-25%, 26-50%, 51-75%, 76-100%),
哪个区间段国家的个数是最多的。
我打算先用window function计算区间段然后再用COUNT排序
以下是我目前写出的代码,可是COUNT不能用在ranking上。
SELECT f.country_code, f.country_name, f.forest_area_sqkm,
l.total_area_sq_mi * 2.59 AS total_area_sq_km,
f.forest_area_sqkm/(l.total_area_sq_mi2.59) AS percent_of_forest_area,
CAST(f.forest_area_sqkm/(l.total_area_sq_mi2.59)100 AS DECIMAL(18,2)) AS PERCENT,
NTILE(4) OVER (ORDER BY f.forest_area_sqkm/(l.total_area_sq_mi2.59)) AS RANKING
FROM forest_area f
JOIN land_area l
ON f.country_code = l.country_code
AND f.year = l.year
WHERE f.year = 2016
使用 ntile ,是平均分布,不能达到指定区段的分布
使用 DENSE_RANK 来分布可能更好一些
另外,在你这个查询外,套一层查询,这个指令作为子查询,很容易得到结果
select sum(case when PERCENT<=25 then 1 else 0 end) as [0-25]
,sum(case when PERCENT<=50 and PERCENT>25 then 1 else 0 end) as [26-50]
,sum(case when PERCENT<=75 and PERCENT>50 then 1 else 0 end) as [51-75]
,sum(case when PERCENT>75 then 1 else 0 end) as [76-100]
from (
SELECT f.country_code, f.country_name, f.forest_area_sqkm,
l.total_area_sq_mi * 2.59 AS total_area_sq_km,
f.forest_area_sqkm/(l.total_area_sq_mi2.59) AS percent_of_forest_area,
CAST(f.forest_area_sqkm/(l.total_area_sq_mi2.59)100 AS DECIMAL(18,2)) AS PERCENT,
NTILE(4) OVER (ORDER BY f.forest_area_sqkm/(l.total_area_sq_mi2.59)) AS RANKING
FROM forest_area f
JOIN land_area l
ON f.country_code = l.country_code
AND f.year = l.year
WHERE f.year = 2016
) a
可以考虑使用group by having实现,再用order by排序。