I asked a friend to write a program that analyzes the sentiment of a set of texts (the texts are already collected; the goal is to compare how sentiment around a keyword changed between two time periods). It ended up exporting a few line charts (shown below; the exact values are not readable from them), but what I want is a table with the concrete numbers. Could someone explain how this sentiment analysis actually works (the more specific the better), and how I should change the code so that it exports a table? The notebook is pasted below:
"### 情感模型加载"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "ea08be3b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BertForSequenceClassification(\n",
" (bert): BertModel(\n",
" (embeddings): BertEmbeddings(\n",
" (word_embeddings): Embedding(21128, 768, padding_idx=0)\n",
" (position_embeddings): Embedding(512, 768)\n",
" (token_type_embeddings): Embedding(2, 768)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (encoder): BertEncoder(\n",
" (layer): ModuleList(\n",
" (0): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (1): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (2): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (3): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (4): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (5): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (6): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (7): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (8): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (9): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (10): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (11): BertLayer(\n",
" (attention): BertAttention(\n",
" (self): BertSelfAttention(\n",
" (query): Linear(in_features=768, out_features=768, bias=True)\n",
" (key): Linear(in_features=768, out_features=768, bias=True)\n",
" (value): Linear(in_features=768, out_features=768, bias=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" (output): BertSelfOutput(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" (intermediate): BertIntermediate(\n",
" (dense): Linear(in_features=768, out_features=3072, bias=True)\n",
" (intermediate_act_fn): GELUActivation()\n",
" )\n",
" (output): BertOutput(\n",
" (dense): Linear(in_features=3072, out_features=768, bias=True)\n",
" (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" )\n",
" )\n",
" )\n",
" )\n",
" (pooler): BertPooler(\n",
" (dense): Linear(in_features=768, out_features=768, bias=True)\n",
" (activation): Tanh()\n",
" )\n",
" )\n",
" (dropout): Dropout(p=0.1, inplace=False)\n",
" (classifier): Linear(in_features=768, out_features=2, bias=True)\n",
")"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 加载保存好的tokenizer和模型\n",
"device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device(\n",
" 'cpu') # 使用cpu或者gpu\n",
"tokenizer = BertTokenizer.from_pretrained(\"model_best\")\n",
"model = BertForSequenceClassification.from_pretrained(\n",
" \"model_best\", num_labels=len(label2id))\n",
"model.to(device)"
```python
@torch.no_grad()
def predict(file,
            model=model,
            tokenizer=tokenizer,
            batch_size=32):
    # Read the spreadsheet and tokenize the post texts ('发布内容' = post content)
    data = pd.read_excel(file)
    texts = list(data['发布内容'])
    text_encodings = tokenizer(texts,
                               truncation=True,
                               padding="max_length",
                               max_length=128)
    # Dummy labels: the Dataset class expects labels, but they are unused at inference
    text_dataset = CuDataset(text_encodings, [list(label2id.keys())[0]] * len(texts))
    text_loader = DataLoader(text_dataset,
                             batch_size=batch_size,
                             shuffle=False)
    result = []  # predicted label ids
    preds = []   # probability of the positive class
    for idx, batch in tqdm(enumerate(text_loader),
                           total=(len(texts) + batch_size - 1) // batch_size,  # ceil: number of batches
                           desc=f"Predict {file}"):
        input_ids = batch['input_ids'].to(device)
        attention_mask = batch['attention_mask'].to(device)
        outputs = model(input_ids, attention_mask=attention_mask)
        # Softmax over the two logits gives the class probabilities
        pred = torch.softmax(outputs[0], dim=-1)
        result.extend(torch.argmax(pred, dim=-1).cpu().numpy())  # predicted label id
        preds.extend(pred[:, 1].cpu().numpy())  # sentiment score = P(class 1)
    result = [id2label[i] for i in result]
    # Append the predictions to the original table and write it out as CSV
    data['label'] = result
    data['score'] = preds
    data.to_csv(f'{file}_res.csv', index=False, encoding='utf-8-sig')
    return data
```
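To see what `predict()` does to a single text, here is a minimal sketch of one forward pass. It assumes the positive class has index 1 in `label2id` (which is what `pred[:, 1]` above implies); the example sentence is arbitrary:

```python
import torch

text = '这个产品真不错'  # any example sentence
enc = tokenizer(text, truncation=True, padding='max_length',
                max_length=128, return_tensors='pt').to(device)
with torch.no_grad():
    logits = model(enc['input_ids'], attention_mask=enc['attention_mask']).logits
probs = torch.softmax(logits, dim=-1)        # e.g. tensor([[P(neg), P(pos)]])
label = id2label[int(probs.argmax(dim=-1))]  # predicted class name
score = float(probs[0, 1])                   # the 'score' column: P(positive)
print(label, score)
```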
```python
# Walk the current directory and score every .xlsx file,
# skipping names containing 'res' (already-exported results)
for root, dirs, files in os.walk('.', topdown=True):
    for file_name in files:
        if 'xlsx' not in file_name:
            continue
        if 'res' in file_name:
            continue
        predict(file=os.path.join(root, file_name), batch_size=8)
```

Running this cell shows one tqdm progress bar per file (e.g. `Predict .\微博爬取内容.xlsx`), and each input spreadsheet gets a matching `<name>.xlsx_res.csv` containing the per-text `label` and `score` columns.
### Plotting by year and computing monthly sentiment values
"def plot_count(month_count_21, month_count_22, flag='ainlp'):\n",
" for k,v in month_count_21.items():\n",
" month_count_21[k] = np.mean(v)\n",
" for k,v in month_count_22.items():\n",
" month_count_22[k] = np.mean(v)\n",
" sorted_count_21 = sorted(month_count_21.items(), key=lambda x: int(x[0].split('月')[0]))\n",
" sorted_count_22 = sorted(month_count_22.items(), key=lambda x: int(x[0].split('月')[0]))\n",
" x = [i for i, j in sorted_count_21]\n",
" y1 = [j for i, j in sorted_count_21]\n",
" y2 = [j for i, j in sorted_count_22]\n",
" plt.plot(x, y1, clip_on=False, markevery=1, label='2021')\n",
" plt.plot(x, y2, clip_on=False, markevery=1, label='2022')\n",
" plt.xticks(x, fontproperties=my_font, rotation=60)\n",
" plt.title(f'情感变化-{flag}', fontproperties=my_font)\n",
" plt.xlabel('月份', fontproperties=my_font)\n",
" plt.ylabel('情感值', fontproperties=my_font)\n",
" # plt.rcParams[\"figure.dpi\"] = 140\n",
" # plt.figure(figsize=(5, 5))\n",
" plt.legend(loc=(0.80, 0.85))\n",
" plt.savefig(f'freqs_{flag}.png')\n",
" plt.show()"
]
},
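To get the table of concrete values the question asks for, the same monthly aggregates can be written out with pandas instead of only being plotted. A minimal sketch; the helper name `export_count` and its column names are my own suggestion, and it must be called *before* `plot_count`, because `plot_count` overwrites each month's list of scores with its mean:

```python
import numpy as np
import pandas as pd

def export_count(month_count_21, month_count_22, flag='ainlp'):
    # Same inputs as plot_count: dicts mapping '1月'..'12月' to lists of per-post scores
    months = sorted(set(month_count_21) | set(month_count_22),
                    key=lambda m: int(m.split('月')[0]))
    rows = []
    for m in months:
        s21, s22 = month_count_21.get(m, []), month_count_22.get(m, [])
        rows.append({
            '月份': m,                                  # month
            '2021均值': np.mean(s21) if s21 else None,  # 2021 mean score
            '2021条数': len(s21),                       # 2021 post count
            '2022均值': np.mean(s22) if s22 else None,  # 2022 mean score
            '2022条数': len(s22),                       # 2022 post count
        })
    df = pd.DataFrame(rows)
    # utf-8-sig so Excel displays the Chinese headers correctly
    df.to_csv(f'sentiment_{flag}.csv', index=False, encoding='utf-8-sig')
    return df
```

In the per-keyword loop below, adding `export_count(month_count_2021, month_count_2022, flag)` on the line before `plot_count(...)` yields one `sentiment_<keyword>.csv` table per chart.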
```python
# Split the scored posts by keyword and by month, then plot each keyword
for flag in keywords:
    month_count_2021 = defaultdict(list)
    month_count_2022 = defaultdict(list)
    temp_data = data[data['关键词'] == flag]  # filter by keyword column

    for i, row in tqdm(temp_data.iterrows(), total=len(temp_data)):
        line = row['发布时间'].split(' ')  # timestamp, e.g. '2021年3月5日 12:00'
        year = line[0]
        if not year.startswith('2021'):
            # 2022 posts were crawled without a year prefix, so add one
            year = '2022年' + year
            year = year.split('月')[0]                    # e.g. '2022年3'
            month = str(int(year.split('年')[1])) + '月'  # e.g. '3月'
            month_count_2022[month].append(row['score'])
        else:
            year = year.split('月')[0]
            month = str(int(year.split('年')[1])) + '月'
            month_count_2021[month].append(row['score'])
    plot_count(month_count_2021, month_count_2022, flag)
```

In the pasted run this last cell was executed in a fresh kernel, so it failed with `NameError: name 'keywords' is not defined`; the `keywords` list and the combined `data` frame are defined in earlier cells that are not included in the excerpt.
For reference:
《python实现中文情感分析与可视化》
《python实现情感分析》
The following answer was composed with ChatGPT by GISer Liu:
I don't know whether you have already solved this; if not: sentiment analysis is usually built on natural language processing (NLP). The text is analyzed semantically to judge the sentiment it expresses (positive, negative, neutral, and so on). The common approaches are dictionary-based methods, classical machine-learning methods, and deep-learning methods.
Concretely: the dictionary-based method matches the words in a text against a predefined sentiment lexicon and computes a sentiment score from the matches (a toy sketch follows below); the machine-learning method trains a classifier on a labeled dataset to learn the text-to-sentiment mapping; the deep-learning method uses a neural network to learn text features and the sentiment classification jointly. Your friend's notebook takes the third route: it loads a fine-tuned Chinese BERT (`BertForSequenceClassification`) and, for each post, applies softmax to the two output logits, so the `score` column is the model's probability that the post is positive.
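To make the dictionary-based method concrete, here is a toy sketch; the word lists are invented for illustration, and real lexicons (HowNet, BosonNLP, etc.) are far larger:

```python
# Toy dictionary-based scorer; the POS/NEG word lists are made up for illustration.
POS = {'不错', '满意', '喜欢', '好'}
NEG = {'差', '失望', '讨厌', '糟糕'}

def dict_score(tokens):
    """(#positive words - #negative words) / #tokens, in [-1, 1]."""
    pos = sum(t in POS for t in tokens)
    neg = sum(t in NEG for t in tokens)
    return (pos - neg) / max(len(tokens), 1)

print(dict_score(['这个', '产品', '不错', '很', '满意']))  # 0.4 -> positive
```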
On the code side, you can use open-source Python sentiment-analysis tooling (NLTK, TextBlob, or a PyTorch model as in the notebook above); these provide ready-made models and algorithms. As for exporting concrete numbers, save the sentiment scores to a CSV or Excel file with pandas, which also makes downstream processing and analysis easier. Note that the `predict()` function above already writes a per-text table, `<file>_res.csv`, next to each input spreadsheet.
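For example, the per-text scores can be re-read from one of those result files and also saved as a real Excel table. The filename here is assumed from the progress bar in the run above and may need adjusting, and pandas' `to_excel` requires the `openpyxl` package:

```python
import pandas as pd

# Read back one of the result tables written by predict()
# (filename assumed from the progress bar shown in the run above)
df = pd.read_csv('微博爬取内容.xlsx_res.csv')
print(df[['发布内容', 'label', 'score']].head())

# Save it as an .xlsx table for Excel (requires openpyxl)
df.to_excel('微博爬取内容_res.xlsx', index=False)
```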
More generally, NLP techniques can extract sentiment features from text automatically, and machine-learning models can be trained on top of those features to classify sentiment more accurately.
A worked reference example: http://t.csdn.cn/sNi2T