Could a code expert please help me solve these problems? I really can't figure them out.

There is a Hive table records that stores vehicle pass-through records:

create table records(
  id string,        -- record ID
  indate string,    -- time of the pass-through record
  plate_no string,  -- vehicle license plate number
  device_id int     -- ID of the device the vehicle passed
)
partitioned by (month string, day string)
row format delimited fields terminated by '\t' stored as ORC;

1. Using HQL, get the Top 10 license plates with the most records at night (22:00 to 06:00) within the last month, together with their counts.
2. Using a Spark RDD, deduplicate the table above so that rows with identical indate, plate_no and device_id values keep only one record.
3. Given String A and String B, each made up of 4096 random 0 or 1 characters, for example String A = "01010111001....110010" and String B = "10101110....100101", and the formula double C = (sum of the bitwise AND of A and B) / (count of 1s in A * count of 1s in B), implement a method that computes this formula in Java or Scala.

  1. HQL for the Top 10 license plates (and their counts) seen most often at night (22:00-06:00) within the last month. Note the parentheses: AND binds tighter than OR, so the original unparenthesized condition would filter incorrectly. A variant that filters on indate directly is sketched after the query.

    SELECT plate_no, COUNT(*) AS cnt
    FROM records
    WHERE (hour(indate) >= 22 OR hour(indate) < 6)
      AND month = '<most recent month partition>'
    GROUP BY plate_no
    ORDER BY cnt DESC
    LIMIT 10;
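  If the format of the month partition values is not known, a minimal sketch that filters on indate itself, assuming indate is a 'yyyy-MM-dd HH:mm:ss' string and interpreting "the last month" as the last 30 days:

    SELECT plate_no, COUNT(*) AS cnt
    FROM records
    WHERE (hour(indate) >= 22 OR hour(indate) < 6)
      AND to_date(indate) >= date_sub(current_date, 30)
    GROUP BY plate_no
    ORDER BY cnt DESC
    LIMIT 10;

  Filtering on the month/day partitions is still preferable in practice, since it lets Hive prune partitions instead of scanning the whole table.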
    
  2. Spark RDD deduplication that keeps only one row for each (indate, plate_no, device_id) combination (reading the Hive table directly is sketched below):

// The table is stored as ORC, so sc.textFile only works on a tab-delimited
// text export of the table's data.
val records = sc.textFile("path_to_records_file")
val result = records
  .map { line =>
    val fields = line.split("\t")
    // key on (indate, plate_no, device_id); duplicate rows share the same key
    ((fields(1), fields(2), fields(3)), line)
  }
  .reduceByKey((a, b) => a) // keep an arbitrary one of the duplicates
  .map(_._2)
result.saveAsTextFile("path_to_output_file")
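Because records is an ORC-backed Hive table, it is usually easier to read it through Spark SQL and then drop down to the RDD API. A minimal sketch, assuming a Hive-enabled SparkSession and a hypothetical output path:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("DedupRecords")
  .enableHiveSupport() // needed to read the Hive table `records`
  .getOrCreate()

// RDD-based deduplication on (indate, plate_no, device_id), as the question asks
val dedupedRdd = spark.table("records").rdd
  .keyBy(row => (row.getAs[String]("indate"),
                 row.getAs[String]("plate_no"),
                 row.getAs[Int]("device_id")))
  .reduceByKey((a, b) => a) // keep one row per key
  .values

// Equivalent DataFrame one-liner, if an RDD is not strictly required
val dedupedDf = spark.table("records").dropDuplicates("indate", "plate_no", "device_id")

dedupedDf.write.orc("path_to_output_dir") // hypothetical output location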
  3. A Java method that implements the formula (a Scala version is sketched after it):
public double calculateC(String a, String b) {
  int aCount = 0;   // number of '1' characters in A
  int bCount = 0;   // number of '1' characters in B
  int andCount = 0; // sum of the bitwise AND of A and B
  for (int i = 0; i < a.length(); i++) { // both strings are 4096 characters long
    if (a.charAt(i) == '1') aCount++;
    if (b.charAt(i) == '1') bCount++;
    if (a.charAt(i) == '1' && b.charAt(i) == '1') andCount++;
  }
  // avoid division by zero when either string contains no '1'
  if (aCount == 0 || bCount == 0) return 0.0;
  return (double) andCount / (aCount * bCount);
}
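Since the question allows Scala as well, an equivalent Scala sketch of the same formula (the method name calculateC is simply carried over from the Java version):

def calculateC(a: String, b: String): Double = {
  require(a.length == b.length, "A and B must be the same length")

  val aCount = a.count(_ == '1') // number of 1s in A
  val bCount = b.count(_ == '1') // number of 1s in B
  // sum of the bitwise AND: positions where both A and B are '1'
  val andCount = a.zip(b).count { case (x, y) => x == '1' && y == '1' }

  if (aCount == 0 || bCount == 0) 0.0 // avoid division by zero
  else andCount.toDouble / (aCount.toLong * bCount)
}

// Example with short strings: calculateC("0101", "1101") == 2.0 / (2 * 3)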