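Count the number of lines, the number of English words, and the number of occurrences of each distinct word in the body of Covid-19.txt, and write the results to homework.txt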
import re

read_file = "Covid-19.txt"
out_file = "homework.txt"
rows = 0
cnt = 0
word_freq_dict = {}
pattern = r"^[a-zA-Z]+$"

with open(read_file, "r", encoding="utf-8") as fr:
    data = fr.readlines()
    rows = len(data)
    for line in data:
        line = line.split()
        for word in line:
            # Use a regular expression to keep only tokens made up entirely of English letters
            # (a token with attached punctuation, such as "virus,", is not counted)
            if re.match(pattern, word):
                cnt += 1
                if word not in word_freq_dict:
                    word_freq_dict[word] = 1
                else:
                    word_freq_dict[word] += 1

with open(out_file, 'w', encoding='utf-8') as fw:
    fw.write(f'Number of lines: {rows}\n')
    fw.write(f'Number of English words: {cnt}\n')
    # This is the total word count; to report the number of distinct words, use the line below instead
    # fw.write(f'Number of English words: {len(word_freq_dict)}\n')
    fw.write('Word occurrences:\n\n')
    for key, value in word_freq_dict.items():
        fw.write(f'{key} : {value} times\n')
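One thing to watch with the split-and-match approach above: a token such as "virus," or "virus." fails the letters-only pattern, so it is left out of the count. A minimal sketch of one way around this, assuming it is acceptable to strip leading and trailing punctuation before matching (the function name count_english_words and the sample lines are made up for illustration, not part of the original answer):

```python
import re
import string

# Letters-only pattern; fullmatch() makes the ^ and $ anchors unnecessary.
LETTERS_ONLY = re.compile(r"[a-zA-Z]+")

def count_english_words(lines):
    """Return (total_count, frequency_dict) for letters-only words in lines."""
    total = 0
    freq = {}
    for line in lines:
        for token in line.split():
            # Assumption: punctuation at either end of a token is not part of the word.
            word = token.strip(string.punctuation)
            if LETTERS_ONLY.fullmatch(word):
                total += 1
                freq[word] = freq.get(word, 0) + 1
    return total, freq

# Hypothetical sample input, not taken from Covid-19.txt.
total, freq = count_english_words(["Covid-19 is a virus,", "a virus."])
print(total, freq)
```

Tokens with digits or punctuation in the middle, such as "Covid-19", are still skipped here; whether they should count as English words depends on the assignment.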
[The following answer was generated by GPT]
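To implement this, read the file, count its lines, count the words and how often each one occurs, and then write the results to the output file. The Python code below does this: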
import re

# Open Covid-19.txt and read its contents
with open('Covid-19.txt', 'r', encoding='utf-8') as file:
    content = file.readlines()

# Count the lines
line_count = len(content)

# Count the English words and the occurrences of each distinct word
word_count = 0
word_dict = {}
for line in content:
    words = re.findall(r'\b\w+\b', line)
    word_count += len(words)
    for word in words:
        if word in word_dict:
            word_dict[word] += 1
        else:
            word_dict[word] = 1

# Write the results to homework.txt
with open('homework.txt', 'w', encoding='utf-8') as file:
    file.write(f"Number of lines: {line_count}\n")
    file.write(f"Number of English words: {word_count}\n")
    file.write("Occurrences of each distinct word:\n")
    for word, count in word_dict.items():
        file.write(f"{word}: {count}\n")
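Note that this code uses the regular expression r'\b\w+\b' to match English words. In Python 3, \w also matches digits, underscores, and non-ASCII letters, so for stricter matching you can adjust the pattern to your needs, for example r'[A-Za-z]+'. The file paths in the code can also be changed to match where your files are located. If a path is wrong or the input file does not exist, open() raises an exception such as FileNotFoundError.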
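To make the point about matching accuracy concrete, here is a small sketch comparing r'\b\w+\b' with a letters-only pattern on a made-up sample string. The names LOOSE, STRICT, and word_frequencies are illustrative, and lowercasing before counting is an extra assumption that the original task does not ask for:

```python
import re
from collections import Counter

LOOSE = re.compile(r"\b\w+\b")    # the pattern used in the answer above
STRICT = re.compile(r"[A-Za-z]+")  # ASCII letters only

def word_frequencies(text, pattern=STRICT):
    """Count words found by pattern, treating upper and lower case as the same word."""
    # Assumption: "Virus" and "virus" should be counted together.
    return Counter(word.lower() for word in pattern.findall(text))

# Hypothetical sample text, not taken from Covid-19.txt.
sample = "Covid_19 spread in 2020 病毒 Virus virus"
print(LOOSE.findall(sample))
print(STRICT.findall(sample))
print(word_frequencies(sample))
```

The loose pattern keeps "Covid_19", "2020", and the Chinese word as matches, while the strict pattern returns only runs of English letters, so "Covid_19" is reduced to "Covid".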