'<a '
'href="https://m.weibo.cn/search?containerid=231522type%3D1%26t%3D10%26q%3D%23%E6%B7%B1%E5%9C%B3%E5%8F%91%E7%8E%B01%E4%BE%8B%E6%96%B0%E5%86%A0%E8%82%BA%E7%82%8E%E9%98%B3%E6%80%A7%23&extparam=%23%E6%B7%B1%E5%9C%B3%E5%8F%91%E7%8E%B01%E4%BE%8B%E6%96%B0%E5%86%A0%E8%82%BA%E7%82%8E%E9%98%B3%E6%80%A7%23&luicode=10000011&lfid=100103type%3D1%26t%3D10%26q%3D%23%E6%B7%B1%E5%9C%B3%E5%8F%91%E7%8E%B01%E4%BE%8B%E6%96%B0%E5%86%A0%E8%82%BA%E7%82%8E%E9%98%B3%E6%80%A7%23" '
'data-hide=""><span class="surl-text">发先</span></a><br '
'/>希望大家也能带话题<a '
'href="https://m.weibo.cn/search?containerid=231522type%3D1%26t%3D10%26q%3D%23%E9%80%9A%E5%8C%96%E7%96%AB%E6%83%85%23&extparam=%23%E9%80%9A%E5%8C%96%E7%96%AB%E6%83%85%23&luicode=10000011&lfid=100103type%3D1%26t%3D10%26q%3D%23%E6%B7%B1%E5%9C%B3%E5%8F%91%E7%8E%B01%E4%BE%8B%E6%96%B0%E5%86%A0%E8%82%BA%E7%82%8E%E9%98%B3%E6%80%A7%23" '
'data-hide=""><span class="surl-text">#通化#</span></a> <br '
'/>让更多人关注到🙏<br />都要过个好年🙏 ',
For example, I want to get this text from the snippet above: '让更多人关注到🙏都要过个好年🙏'
Parse it with bs4 (BeautifulSoup) and just read `.text` from whatever element you get back.
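A minimal sketch of the bs4 approach, assuming the `beautifulsoup4` package is installed. The `html` string is a shortened stand-in for the snippet in the question (the long `href` values are abbreviated), and the variable names are only illustrative:

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the "text" field above; hrefs are abbreviated.
html = (
    '<a href="https://m.weibo.cn/search" data-hide="">'
    '<span class="surl-text">发先</span></a><br />'
    '希望大家也能带话题<a href="https://m.weibo.cn/search" data-hide="">'
    '<span class="surl-text">#通化#</span></a> <br />'
    '让更多人关注到🙏<br />都要过个好年🙏 '
)

soup = BeautifulSoup(html, "html.parser")

# .get_text() (what .text returns) concatenates every text node,
# including the link text inside the <a> tags.
full_text = soup.get_text()

# .stripped_strings yields each text node separately, whitespace-trimmed,
# which makes it easy to pick individual pieces.
pieces = list(soup.stripped_strings)

# The last two pieces are the sentences asked for in the question.
wanted = "".join(pieces[-2:])
print(wanted)
```

Note that `.text` alone returns everything, hashtag links included, so picking "the last two pieces" here is a choice tied to this one post rather than a general rule.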
What counts as important information? "希望大家也能带话题"? "通化"? Unless you define a rule for what is important, there is nothing to extract against.
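To make the point about rules concrete, here is a sketch of two possible rules, using only the standard library's `html.parser`. Both rules, the class name `NonLinkText`, the function `extract_non_link_text`, and the `after_last_link` flag are hypothetical choices for illustration, not something stated in the thread. Rule 1 drops any text inside an `<a>` tag (the hashtag links); rule 2 additionally keeps only what follows the last link.

```python
from html.parser import HTMLParser

# Shortened stand-in for the "text" field above; hrefs are abbreviated.
sample_html = (
    '<a href="https://m.weibo.cn/search" data-hide="">'
    '<span class="surl-text">发先</span></a><br />'
    '希望大家也能带话题<a href="https://m.weibo.cn/search" data-hide="">'
    '<span class="surl-text">#通化#</span></a> <br />'
    '让更多人关注到🙏<br />都要过个好年🙏 '
)


class NonLinkText(HTMLParser):
    """Collect text nodes that are not inside an <a> element."""

    def __init__(self, after_last_link=False):
        super().__init__()
        self.after_last_link = after_last_link
        self.link_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1
            # Rule 2: forget everything collected before this link closed.
            if self.after_last_link and self.link_depth == 0:
                self.parts.clear()

    def handle_data(self, data):
        text = data.strip()
        if text and self.link_depth == 0:
            self.parts.append(text)


def extract_non_link_text(html, after_last_link=False):
    parser = NonLinkText(after_last_link=after_last_link)
    parser.feed(html)
    parser.close()
    return parser.parts


print(extract_non_link_text(sample_html))
print("".join(extract_non_link_text(sample_html, after_last_link=True)))
```

Rule 1 still keeps "希望大家也能带话题", while rule 2 returns only the two closing sentences; which one is "right" depends entirely on how important information is defined.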
Reference: https://blog.csdn.net/PY0312/article/details/93999895
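This is actually the content of a `text` field in the JSON you get when scraping Weibo's hot search. That field holds a user's post, and the post itself is what needs to be extracted.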