A Python question about sentiment analysis of song lyrics

I am doing a sentiment analysis of Chinese pop song lyrics from the past twenty years. Here is the complete code:


import pandas as pd
import jieba
from textblob import TextBlob
import matplotlib.pyplot as plt
from collections import Counter
from wordcloud import WordCloud, STOPWORDS

# Load the Chinese stopword list
stopwords = []
with open('D:/桌面/停用词表.txt', 'rb') as f:
    for line in f:
        stopwords.append(line.strip().decode('utf-8'))

# Load the Chinese sentiment lexicons; both lists were compiled by sorting the word list my teacher provided
positive_words = set()
negative_words = set()
with open('D:/桌面/积极情感词表.txt', 'rb') as f:
    for line in f:
        positive_words.add(line.strip().decode('utf-8'))
with open('D:/桌面/消极情感词表.txt', 'rb') as f:
    for line in f:
        negative_words.add(line.strip().decode('utf-8'))

# Read the data from the Excel file, then clean and preprocess it
lyrics_data = pd.read_excel('D:/桌面/近十年中文流行歌曲歌词数据集(1).xlsx')
lyrics_data = lyrics_data.dropna(subset=['lyrics'])  # drop rows with missing lyrics

# Run sentiment analysis on the lyrics, grouped by year, to build a time series
lyrics_dict = {}
for _, data in lyrics_data.iterrows():
    release_year = int(data['year'])
    if release_year in lyrics_dict:
        lyrics_dict[release_year][0].append(data['lyrics'])
    else:
        lyrics_dict[release_year] = [[data['lyrics']], []]

for year in lyrics_dict:
    for lyrics in lyrics_dict[year][0]:
        words = jieba.lcut(lyrics)
        print("Tokens:", words)
        words = [word for word in words if word not in stopwords]   # filter out stopwords
        print("After filtering:", words)
        emotion_words = []
        for word in words:
            if word in positive_words or word in negative_words:
                emotion_words.append(word)  # keep only sentiment-bearing words
        blob = TextBlob(" ".join(words))
        sentiment_score = blob.sentiment.polarity
        lyrics_dict[year][1].append(sentiment_score)

sentiment_data = []
for year in lyrics_dict:
    mean_score = sum(lyrics_dict[year][1]) / len(lyrics_dict[year][1])
    sentiment_data.append((year, mean_score))

# Convert the year to datetime, then sort by date
sentiment_df = pd.DataFrame(sentiment_data, columns=['year', 'sentiment_score'])
sentiment_df['year'] = pd.to_datetime(sentiment_df['year'], format='%Y')
sentiment_df = sentiment_df.sort_values('year')

# Plot the time series
x = [str(pair[0]) for pair in sentiment_data]
y = [pair[1] for pair in sentiment_data]
plt.plot(x, y)

# Add the chart labels
plt.title('Sentiment Analysis of Chinese Pop Music Lyrics Over Time')
plt.xlabel('Year')
plt.ylabel('Sentiment Score')

plt.show()

But the plot the code produces looks like this:

[image: output plot of sentiment score by year]

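This output seems very strange to me: given the lyrics for each year in the original Excel file, the sentiment scores cannot possibly all be 0, yet the plot shows a positive score only for the lyrics from 2004. Is there something wrong with my code? If so, please point it out. Thank you all!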

The line blob = TextBlob(" ".join(words)) is wrong; it should use only emotion_words:


blob = TextBlob(" ".join(emotion_words))
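
One caveat worth checking: TextBlob's default sentiment analyzer is built on an English lexicon, so Chinese tokens are generally not recognized and the polarity may stay at 0.0 even when only emotion_words is passed in. Below is a minimal sketch of a lexicon-based alternative that scores each song directly from the two word lists. The function names lexicon_score and yearly_means, the (pos - neg) / (pos + neg) formula, and the toy data are all assumptions for illustration, not part of the original code.

```python
from collections import defaultdict


def lexicon_score(words, positive_words, negative_words):
    """Score one tokenized song in [-1, 1]; 0.0 if no sentiment word is found.

    Hypothetical helper: (pos - neg) / (pos + neg) is one simple choice,
    not the only possible scoring formula.
    """
    pos = sum(1 for w in words if w in positive_words)
    neg = sum(1 for w in words if w in negative_words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def yearly_means(scored_songs):
    """Average (year, score) pairs per year and return them sorted by year."""
    by_year = defaultdict(list)
    for year, score in scored_songs:
        by_year[year].append(score)
    return [(year, sum(s) / len(s)) for year, s in sorted(by_year.items())]


# Toy stand-ins for the two word lists and for jieba.lcut output (assumed data).
positive_words = {"快乐", "喜欢"}
negative_words = {"难过", "孤单"}
songs = [
    (2005, ["我", "喜欢", "你", "快乐", "孤单"]),
    (2004, ["难过", "的", "夜"]),
    (2004, ["快乐", "的", "歌"]),
]

scored = [
    (year, lexicon_score(words, positive_words, negative_words))
    for year, words in songs
]
means = yearly_means(scored)
print(means)
```

In the original script, lexicon_score(words, positive_words, negative_words) could take the place of blob.sentiment.polarity inside the loop, and the sorted list of yearly means can be plotted once after all years have been processed.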
