How do I add a column to a dataset that numbers the rows within each user?
For example:
uid movieid time
1 1009 20220101
1 1002 20220104
2 1019 20220203
2 1012 20220209
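Requirement: within each user (uid), sort by time in descending order and add a column holding the sequence number.
After adding the column: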
uid movieid time order
1 1009 20220101 1
1 1002 20220104 2
2 1019 20220203 1
2 1012 20220209 2
There are a great many uids, so enumerating them (writing out each one by hand) is not an option.
Reference: "spark基本数据处理之推荐数据movielens" (basic Spark data processing with the MovieLens recommendation data), 小李飞刀李寻欢's blog on CSDN
spark-sql> with t1 as (
> select 1 uid, 1009 movieid, 20220101 time union
> select 1 uid, 1002 movieid, 20220104 time union
> select 2 uid, 1012 movieid, 20220209 time union
> select 2 uid, 1019 movieid, 20220103 time )
> select *, row_number() over (partition by uid order by time)
> from t1;
1 1009 20220101 1
1 1002 20220104 2
2 1019 20220103 1
2 1012 20220209 2
Time taken: 15.135 seconds, Fetched 4 row(s)
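Note that the question asks for descending order, while the expected output and the query above number the rows by ascending time. If the numbering really should start from the latest time, adding desc to the window's order by should be enough. A minimal sketch, assuming the data sits in a table or view called ratings (a hypothetical name, as is the rn alias; order is a SQL keyword, so using it as the column name may need backticks):

```sql
-- Hypothetical table `ratings` with columns uid, movieid, time.
select uid,
       movieid,
       time,
       -- numbering restarts at 1 for each uid; the latest time gets 1
       row_number() over (partition by uid order by time desc) as rn
from ratings;
```

Because the window is partitioned by uid, the same query covers any number of users without listing them.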