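How do I use Hive on Hadoop to compute word frequencies for a text file? I need concrete steps, since I have a hands-on exam tomorrow. The environment is already installed.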
-- Create an external table over the input directory; each row holds one line of text.
CREATE EXTERNAL TABLE input_text (
  line STRING
)
LOCATION '/path/to/input/directory';
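If the text file is not yet in that directory, one way to load it from the local file system and check the table is sketched below. The local path is a placeholder and is not part of the original answer.

```sql
-- Assumption: '/path/to/local/input.txt' is a hypothetical local file; substitute your own.
LOAD DATA LOCAL INPATH '/path/to/local/input.txt' INTO TABLE input_text;

-- Sanity check: each returned row should be one line of the text.
SELECT line FROM input_text LIMIT 5;
```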
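-- Split each line on runs of whitespace, emit one row per word, then count rows per word.
SELECT word, COUNT(*) AS count
FROM (
  SELECT explode(split(line, '\\s+')) AS word
  FROM input_text
) t
WHERE word != ''  -- drop empty tokens produced by leading whitespace
GROUP BY word;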
-- Create a table to hold the results, stored as tab-separated text.
CREATE TABLE word_count (
  word STRING,
  count BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
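-- Run the word count again and store the result in word_count.
INSERT INTO TABLE word_count
SELECT word, COUNT(*) AS count
FROM (
  SELECT explode(split(line, '\\s+')) AS word
  FROM input_text
) t
WHERE word != ''  -- drop empty tokens produced by leading whitespace
GROUP BY word;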
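To confirm the insert worked, the stored counts can be read back. This is a minimal sketch assuming the word_count table defined above; the limit of 10 is an arbitrary choice.

```sql
-- Show the ten most frequent words; backticks quote the column named count.
SELECT word, `count`
FROM word_count
ORDER BY `count` DESC
LIMIT 10;
```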