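Count how many times each word occurs in the string str below and store the result in a dict, with the word as key and the count as value,
then sort by value from highest to lowest and print the dict.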
str = """The Zen of Python, by Tim Peters
Beautiful is better thanugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparseisbetterthandense.
Readability counts.
Specialcasesaren'tspecialenoughtobreaktherules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
Thereshouldbeone--andpreferablyonlyone --obviouswayto do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespacesareonehonkinggreatidea--let'sdomoreofthose!"""
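I tried solving this with a regular expression, but words cannot be extracted from the sentences that have no spaces. For example, "Specialcasesaren'tspecialenoughtobreaktherules" is treated as a single word, and aren't is not recognized as one word either.
How can the words in the sentences without spaces, and contractions such as aren't, be told apart and extracted?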
Note: str and dict are both built-in types; try not to use them as variable names.
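To illustrate why shadowing matters, here is a minimal sketch; the function name shadowed_demo is made up for this example. Once a variable named dict exists in a scope, the built-in type is no longer reachable under that name there:

```python
def shadowed_demo():
    """Show what happens when a local variable is named after a built-in."""
    dict = {"a": 1}          # this local name now hides the built-in dict type
    try:
        dict(b=2)            # tries to call the dictionary object, not the type
    except TypeError as exc:
        return str(exc)      # the error message explains the problem
    return "no error"

print(shadowed_demo())
```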
zen = """The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!"""
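# Replace punctuation with spaces (apostrophes are kept, so "aren't" stays one word)
punc = [',','.','-','!','*']
for p in punc:
    zen = zen.replace(p,' ')
lst = zen.lower().split()
# Count each word
dic = {}
for i in lst:
    dic[i] = dic.get(i,0) + 1
# Print from the most frequent word to the least frequent
for key,value in sorted(dic.items(), key=lambda x:x[1], reverse=True):
    print(f'{key:>15}:{value}')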
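The same counting step can also be sketched with re.findall and collections.Counter. This is only one possible variant, not the answer's own method; the function name count_words_regex and the sample sentence are chosen for illustration. The pattern treats a word as a run of letters that may contain apostrophes between letter groups, so contractions such as "aren't" stay whole without listing the punctuation to strip:

```python
import re
from collections import Counter

def count_words_regex(text):
    """Count lowercase words, keeping contractions such as "aren't" intact."""
    # A word: letters, optionally followed by one or more 'letters groups.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # most_common() orders by count, highest first; dict preserves that order.
    return dict(Counter(words).most_common())

sample = "Special cases aren't special enough to break the rules."
for word, count in count_words_regex(sample).items():
    print(f'{word:>15}:{count}')
```

This only helps when the words are already separated by spaces or punctuation; it cannot split run-together text.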
from collections import Counter
s = '...'  # text omitted here
# Replace punctuation with spaces so it does not stick to the words
for ch in ',":!.*-':
    s = s.replace(ch, ' ')
# Expand contractions; note that "aren't" contains "n't", not "'nt"
s = s.replace("n't", ' not')
s = s.replace("'m", ' am')
s = s.replace("'s", ' is')  # rough: this also turns "let's" into "let is"
dic = Counter(s.split())
# Sort by count, highest first
print({i:j for i,j in sorted(dic.items(),key=lambda x:-x[1])})
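As for the sentences without spaces, to be honest I don't even know what would count as a word there, let alone how to extract one, so I can't help with that part.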
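What counts as a word in a line without spaces has to come from somewhere, typically a word list. The following is a minimal sketch of dictionary-based segmentation using dynamic programming. The function name segment and the tiny VOCAB set are assumptions made up for this example; a real solution would need a much larger vocabulary, and ambiguous splits would still be possible.

```python
def segment(chunk, vocab):
    """Split a run-together string into words from vocab, or return None."""
    n = len(chunk)
    # best[i] is a list of words covering chunk[:i], or None if no split exists.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            word = chunk[start:end]
            if best[start] is not None and word in vocab:
                candidate = best[start] + [word]
                # Prefer the split that uses the fewest words.
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]

# Hypothetical vocabulary, just large enough for this demo line.
VOCAB = {"special", "cases", "aren't", "enough", "to", "break",
         "the", "rules", "are", "a"}

line = "Specialcasesaren'tspecialenoughtobreaktherules"
print(segment(line.lower(), VOCAB))
```

Because "aren't" is itself an entry in the vocabulary, the contraction is recovered as one word. A chunk that cannot be fully covered by vocabulary words yields None, which signals that the word list is too small for that line.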
from collections import Counter
import re
str1 = """The Zen of Python, by Tim Peters
Beautiful is better thanugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparseisbetterthandense.
Readability counts.
Specialcasesaren'tspecialenoughtobreaktherules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
Thereshouldbeone--andpreferablyonlyone --obviouswayto do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespacesareonehonkinggreatidea--let'sdomoreofthose!"""
res = [i.strip() for i in re.split(r"[^a-zA-Z]", str1) if i ]
res = Counter(res)
res = dict(sorted(res.items(), key = lambda x: x[1], reverse = True))
print(res)
"""--result
{'is': 9, 'better': 7, 'than': 6, 'the': 4, 'Although': 3, 'never': 3, 'to': 3, 'it': 3, 'of': 2, 'may': 2, 'be': 2, 'If': 2, 'implementation': 2, 'explain': 2, 'a': 2, 'idea': 2, 'The': 1, 'Zen': 1, 'Python': 1, 'by': 1, 'Tim': 1, 'Peters': 1, 'Beautiful': 1, 'thanugly': 1, 'Explicit': 1, 'implicit': 1, 'Simple': 1, 'complex': 1, 'Complex': 1, 'complicated': 1, 'Flat': 1, 'nested': 1, 'Sparseisbetterthandense': 1, 'Readability': 1, 'counts': 1, 'Specialcasesaren': 1, 'tspecialenoughtobreaktherules': 1, 'practicality': 1, 'beats': 1, 'purity': 1, 'Errors': 1, 'should': 1, 'pass': 1, 'silently': 1, 'Unless': 1, 'explicitly': 1, 'silenced': 1, 'In': 1, 'face': 1, 'ambiguity': 1, 'refuse': 1, 'temptation': 1, 'guess': 1, 'Thereshouldbeone': 1, 'andpreferablyonlyone': 1, 'obviouswayto': 1, 'do': 1, 'that': 1, 'way': 1, 'not': 1, 'obvious': 1, 'at': 1, 'first': 1, 'unless': 1, 'you': 1, 're': 1, 'Dutch': 1, 'Now': 1, 'often': 1, 'right': 1, 'now': 1, 'hard': 1, 's': 1, 'bad': 1, 'easy': 1,
'good': 1, 'Namespacesareonehonkinggreatidea': 1, 'let': 1, 'sdomoreofthose': 1}
"""