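SQL: distinct on multiple fields, and using a window function to deduplicate multiple fields
Problem
Deduplicate on several fields with distinct, while the other fields are not deduplicated and are output alongside them.
For example:
select AA, BB, CC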
from tableName;
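The requirement is to deduplicate on the two fields AA and BB; CC does not need to be deduplicated, and all three are output together.
Solution
Everything below is implemented in Hive/Spark SQL, because versions such as MySQL 5.7 have no window functions.
On the question of deduplicating multiple fields with distinct, I hope the example below gives you some ideas; let's discuss it together.
To deduplicate on multiple fields, consider the form row_number() over(partition by field1[, field2] order by date):
Original data:
-- data in the original table:
select id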
,substring(recode_date, 1, 10) as create_date
,device_id
from dwd.dwd_ce_ur_log
where substring(recode_date, 1, 10) = '2020-08-24'
and device_id = '00de6cbcf0bbc87f';
id create_date device_id
76052968 2020-08-24 00de6cbcf0bbc87f
76052973 2020-08-24 00de6cbcf0bbc87f
76061303 2020-08-24 00de6cbcf0bbc87f
76061309 2020-08-24 00de6cbcf0bbc87f
76090080 2020-08-24 00de6cbcf0bbc87f
76129054 2020-08-24 00de6cbcf0bbc87f
76139129 2020-08-24 00de6cbcf0bbc87f
76039128 2020-08-24 00de6cbcf0bbc87f
76039129 2020-08-24 00de6cbcf0bbc87f
76042362 2020-08-24 00de6cbcf0bbc87f
76042363 2020-08-24 00de6cbcf0bbc87f
Deduplicating multiple fields with distinct
-- deduplicate multiple fields with distinct
select distinct substring(recode_date, 1, 10) as create_date
,device_id
from dwd.dwd_ce_ur_log
where substring(recode_date, 1, 10) = '2020-08-24'
and device_id = '00de6cbcf0bbc87f';
create_date device_id
2020-08-24 00de6cbcf0bbc87f
Using row_number() over(partition by field1[, field2] order by field)
-- use row_number() over(partition by field1[, field2] order by field)
select substring(recode_date, 1, 10) as create_date
,device_id
from (
select *, row_number() over(partition by device_id, substring(recode_date,1,10) order by recode_date desc) as rank from dwd.dwd_ce_ur_log
) tmp
where tmp.rank =1
and substring(recode_date, 1, 10) = '2020-08-24'
and device_id = '00de6cbcf0bbc87f';
create_date device_id
2020-08-24 00de6cbcf0bbc87f
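
The two queries above output only the deduplicated columns, so they do not show the case from the problem statement where CC is carried along. A minimal sketch of that case follows, under a few assumptions: the rows are made-up sample values standing in for tableName, the alias rn is an arbitrary name, and "keep the row with the largest CC" is only one possible rule for choosing which row survives in each (AA, BB) group.

```sql
-- Hypothetical sample rows standing in for tableName (values are made up).
with tableName as (
    select 'a1' as AA, 'b1' as BB, 10 as CC
    union all select 'a1', 'b1', 20
    union all select 'a1', 'b2', 30
    union all select 'a2', 'b1', 40
    union all select 'a2', 'b1', 50
)
select AA, BB, CC
from (
    -- Number the rows inside each (AA, BB) group; the order by decides
    -- which row gets rn = 1 (here, assumed: the one with the largest CC).
    select AA, BB, CC,
           row_number() over (partition by AA, BB order by CC desc) as rn
    from tableName
) tmp
-- Keep exactly one row per (AA, BB) pair; CC comes along unchanged.
where rn = 1;
```

With this sample data the query keeps one row for each of the three distinct (AA, BB) pairs. Changing the order by clause, for example to a date column as in the queries above, changes which row is kept but not how many.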