Python statannotations: help needed with the details of adding significance markers

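What I ultimately want: add significance markers to the figure below. The statistical method is repeated-measures ANOVA. After the pairwise comparisons, a P value greater than 0.05 should be marked ns, less than 0.05 *, less than 0.01 **, and less than 0.001 ***.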

[Image: the bar chart to be annotated]


The current code is:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.font_manager as fm
data = pd.read_csv(r'D:\pythonprogram\data\emojidata\data01.csv',encoding='gbk')
df = pd.DataFrame({
    'Positive emoji':(data['zza1'].mean(),data['zzha1'].mean(),data['zfa1'].mean()),
    'Negative emoji':(data['fza1'].mean(),data['fzha1'].mean(),data['ffa1'].mean())
})
df_std = pd.DataFrame({
    'Positive emoji':(data['zza1'].std(),data['zzha1'].std(),data['zfa1'].std()),
    'Negative emoji':(data['fza1'].std(),data['fzha1'].std(),data['ffa1'].std())
})

ax = df.plot(kind = 'bar',color=['pink','lime'],width=0.5,hatch='',yerr=df_std*0.5,capsize=10,ecolor='grey')
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.xticks([0,1,2],
           ['Positive','Neutral','Negative'],
           fontproperties='stkaiti',
           rotation = 0
           )
plt.yticks(np.arange(0,1.1,0.1),fontproperties='stkaiti')
plt.ylabel('Correct rate',fontproperties='stkaiti',fontsize=14)
plt.xlabel('Sentence',fontproperties='stkaiti',fontsize=14)
font = fm.FontProperties(fname=r'C:\Windows\Fonts\AdobeSongStd-Light.otf')
plt.legend(prop=font,bbox_to_anchor=(1.12, 1.12))
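# Save before showing: after plt.show() returns, the figure is gone and savefig would write a blank image
plt.savefig('no1emoji.png', dpi=600, bbox_inches='tight')
plt.show()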

This is the example from the statannotations author on GitHub:

[Image: the author's example plot]

The code is as follows:

df = sns.load_dataset('diamonds')
df = df[df['color'].map(lambda x: x in 'EIJ')]
# Modifying data to yield unequal boxes in the hue value
df.loc[df['cut'] == 'Ideal', 'price'] = df.loc[df['cut'] == 'Ideal', 'price'].map(lambda x: min(x, 5000))
df.loc[df['cut'] == 'Premium', 'price'] = df.loc[df['cut'] == 'Premium', 'price'].map(lambda x: min(x, 7500))
df.loc[df['cut'] == 'Good', 'price'] = df.loc[df['cut'] == 'Good', 'price'].map(lambda x: min(x, 15000))
df.loc[df['cut'] == 'Very Good', 'price'] = df.loc[df['cut'] == 'Very Good', 'price'].map(lambda x: min(x, 3000))
df.head()
x = "color"
y = "price"
hue = "cut"
hue_order=['Ideal', 'Premium', 'Good', 'Very Good', 'Fair']
order = ["E", "I", "J"]
pairs=[
    (("E", "Ideal"), ("E", "Very Good")),
    (("E", "Ideal"), ("E", "Premium")),
    (("E", "Ideal"), ("E", "Good")),
    (("I", "Ideal"), ("I", "Premium")),
    (("I", "Ideal"), ("I", "Good")),
    (("J", "Ideal"), ("J", "Premium")),
    (("J", "Ideal"), ("J", "Good")),
    (("E", "Good"), ("I", "Ideal")),
    (("I", "Premium"), ("J", "Ideal")),
    ]
ax = sns.boxplot(data=df, x=x, y=y, order=order, hue=hue, hue_order=hue_order)
annot.new_plot(ax, pairs, data=df, x=x, y=y, order=order, hue=hue, hue_order=hue_order)
annot.configure(test='Mann-Whitney', verbose=2)
annot.apply_test()
annot.annotate()
plt.legend(loc='upper left', bbox_to_anchor=(1.03, 1))
plt.savefig('example_hue_layout.png', dpi=300, bbox_inches='tight')

Running it directly fails because annot is not defined. So I went to example.ipynb under https://github.com/trevismd/statannotations/tree/master/usage, and at the very top of it found that the pairs are not for this dataset at all. After that it throws errors like crazy:

Traceback (most recent call last):
  File "D:\python\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "D:\pycharm\PyCharm 2022.2.3\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "D:\pycharm\PyCharm 2022.2.3\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:\Users\20893\PycharmProjects\pythonProject\paperwrite\journal\230430.py", line 32, in <module>
    annot = Annotator(ax, pairs, data=df, x=x, y=y, order=order)
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\Annotator.py", line 106, in __init__
    self._plotter = self._get_plotter(engine, ax, pairs, plot, data,
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\Annotator.py", line 776, in _get_plotter
    return engine_plotter(*args, **kwargs)
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\_Plotter.py", line 82, in __init__
    _Plotter.__init__(self, ax, pairs, data, x, y, hue, order, hue_order,
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\_Plotter.py", line 28, in __init__
    check_pairs_in_data(pairs, data, group_coord, hue, hue_order)
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\utils.py", line 130, in check_pairs_in_data
    _check_pairs_in_data_no_hue(pairs, data, coord)
  File "C:\Users\20893\AppData\Roaming\Python\Python310\site-packages\statannotations\utils.py", line 77, in _check_pairs_in_data_no_hue
    raise ValueError(f"Missing x value(s) "
ValueError: Missing x value(s) `"('J', 'Ideal')", "('J', 'Good')", "('I', 'Good')", "('I', 'Ideal')", "('J', 'Premium')", "('E', 'Ideal')", "('E', 'Very Good')", "('E', 'Good')", "('I', 'Premium')", "('E', 'Premium')"` in color (specified in `pairs`) in data


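  • The following answer was generated by ChatGPT, summarizing related blog posts:

    For the problem of annot being undefined when adding significance markers with Python's statannotations library, I can offer the following help. Note that annot is not a function provided by the library: it is a variable that holds an Annotator instance, so it has to be created before annot.configure(...) can be called.

    1. Confirm that the statannotations library is installed. If it is not, install it with pip or conda, for example: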

    pip install statannotations

    2. Confirm that the Annotator class is imported correctly. The class must be imported before the library's significance annotation features can be used:

    from statannotations.Annotator import Annotator

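    3. Confirm that the Annotator is created and configured correctly. Its parameters have to match the specific plot, for example the test to use and the marker style. Because the pairs in the question are (x, hue) tuples, hue and hue_order must be passed as well; the call in the traceback omits them, which is why every pair is reported as a missing x value:
    annotator = Annotator(ax, pairs, data=df, x=x, y=y, order=order, hue=hue, hue_order=hue_order)
    annotator.configure(test='Mann-Whitney', text_format='star', line_height=0.03, line_width=1)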
    

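    Here ax is the Axes being annotated, pairs lists the pairs of groups to compare, and data, x, y and order (plus hue and hue_order for grouped plots) are the data source and plotting parameters. test selects the statistical test, text_format selects the marker style, and line_height and line_width set the bracket height (as a fraction of the axes height) and the line width.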

    4. Confirm that the test is applied and the markers are drawn. Once the Annotator is configured, call apply_and_annotate() to run the test and draw the markers:

    annotator.apply_and_annotate()

    This method runs the configured test on each pair and draws the corresponding significance markers.

    If all four steps check out and annot is still reported as undefined, the cause may be the library version or something else. In that case, try upgrading statannotations or ask its maintainers.
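Before upgrading, it can help to check which version, if any, the interpreter actually sees. A minimal sketch using only the standard library; the helper name installed_version is made up for this example. Run it with the same interpreter that PyCharm uses for the script.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


found = installed_version("statannotations")
if found is None:
    print("statannotations is not installed in this interpreter")
else:
    print(f"statannotations {found} is installed")
```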

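    As for turning the P values from the pairwise comparisons after a repeated-measures ANOVA into ns, *, ** or ***, the marker style is controlled by the text_format parameter of Annotator.configure. text_format can be 'star', 'simple' or 'full': 'star' draws ns, *, ** and so on according to the P value, 'simple' draws a short P-value text, and 'full' draws the test name together with the P value. By default 'star' also uses **** for P values of at most 0.0001; the cut-offs can be changed with the pvalue_thresholds option. For example: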

    annotator.configure(test='Mann-Whitney', text_format='star', line_height=0.03, line_width=1)

    The code above selects the Mann-Whitney test, star-style markers, a bracket height of 0.03 and a line width of 1. Adjust the parameters to your own case.
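The mapping asked for in the question (ns, *, **, ***) can also be written out directly, which is useful for checking what a given P value should show. A minimal sketch in plain Python; the names THRESHOLDS and p_to_marker and the sample P values are made up for illustration. One assumption: a P value exactly equal to a cut-off counts as significant at that level, since the question does not say how that case should be handled.

```python
# Cut-offs from the question, strictest first: (upper bound on p, marker).
THRESHOLDS = [(0.001, "***"), (0.01, "**"), (0.05, "*"), (1.0, "ns")]


def p_to_marker(p):
    """Return the marker of the first cut-off that p does not exceed."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"p-value must be between 0 and 1, got {p}")
    for upper_bound, marker in THRESHOLDS:
        if p <= upper_bound:
            return marker
    return "ns"  # not reached for valid input, kept as a safe default


# Hypothetical P values, for illustration only.
example_pvalues = [0.20, 0.03, 0.004, 0.0002]
print([p_to_marker(p) for p in example_pvalues])  # ['ns', '*', '**', '***']
```

When statannotations draws the markers, the same cut-offs can be passed to annotator.configure as pvalue_thresholds=[[0.001, '***'], [0.01, '**'], [0.05, '*'], [1, 'ns']]. P values from a repeated-measures analysis done elsewhere can be supplied with annotator.set_pvalues_and_annotate(pvalues), in the same order as pairs, instead of calling apply_and_annotate(). Check both against the documentation for the installed version.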