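Use Java to read a text file containing an article of roughly 1,000 words, count how many distinct words appear (ignoring case), and output the words in descending order of frequency.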
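The simplest approach: read the file as a stream, split each line on spaces and on punctuation that may appear, such as "," and ".", and put the words into a map whose key is the word and whose value is the number of occurrences. Finally, sort the entries by count.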
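import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public List<Map.Entry<String, Integer>> readFile(String path) throws Exception {
    Map<String, Integer> map = new HashMap<>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(new FileInputStream(path)))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // Split on runs of whitespace, commas, and periods
            for (String token : line.split("[\\s,.]+")) {
                if (token.isEmpty()) {
                    continue; // a leading separator or blank line yields an empty token
                }
                // Lowercase so that "The" and "the" count as the same word
                map.merge(token.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
    }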
    List<Map.Entry<String, Integer>> result = new ArrayList<>(map.entrySet());
    // Sort by count, highest first. Integer.compare avoids comparing boxed
    // Integer objects with ==, which tests identity rather than value.
    result.sort((o1, o2) -> Integer.compare(o2.getValue(), o1.getValue()));
    // result.size() is the number of distinct words
    return result;
}
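
For comparison, here is a minimal self-contained sketch of the same idea that can be compiled and run on its own. The class name WordFrequency and its methods countWords and countFile are hypothetical names chosen for illustration. It assumes that a word is any run of letters or digits (so "don't" is split into "don" and "t"), that the file is UTF-8, and that ties in frequency should be broken alphabetically; none of these choices come from the original question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; names are illustrative only.
class WordFrequency {

    // Counts words case-insensitively and returns the entries sorted by
    // descending count, with ties broken alphabetically (an assumption).
    static List<Map.Entry<String, Integer>> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Assumption: a word is a run of letters or digits; everything else separates words.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> result = new ArrayList<>(counts.entrySet());
        result.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return result;
    }

    // Reads the whole file into memory (fine for a ~1,000-word article), assuming UTF-8.
    static List<Map.Entry<String, Integer>> countFile(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return countWords(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // With no argument, fall back to a small sample sentence.
        List<Map.Entry<String, Integer>> result = args.length > 0
                ? countFile(args[0])
                : countWords("The cat and the dog. THE end");
        System.out.println("Distinct words: " + result.size());
        for (Map.Entry<String, Integer> entry : result) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

Passing a file path as the first argument counts that file; the number of distinct words is simply the size of the returned list. Reading the whole file at once keeps the sketch short, while the line-by-line reader above is the better fit for large inputs.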