Trouble deciding whether a model is underfitting

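I'm reading "Python Machine Learning", and the book says the model is underfitting (Figure 1). I don't quite understand this: its performance on the training set looks decent. So I plotted a learning curve (Figure 2), and it doesn't look much like the typical underfitting picture either. Is the book right or wrong here? (I'm a machine learning beginner, so please bear with me if the question is too simple.)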

Figure 1

(image)

Figure 2

(image)

Code:

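```python
# Data: the Wine dataset. X_train, X_test, y_train, y_test are the
# train/test split prepared earlier (not shown here).
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(criterion='entropy',
                              random_state=1,
                              max_depth=1)
# Ensemble of 500 decision stumps (defined here, not fitted in this snippet).
# scikit-learn >= 1.2 names this parameter `estimator`; older releases
# call it `base_estimator`.
ada = AdaBoostClassifier(estimator=tree,
                         n_estimators=500,
                         learning_rate=0.1,
                         random_state=1)

# Fit the single stump and score it on both splits.
tree.fit(X_train, y_train)
y_train_pred = tree.predict(X_train)
y_test_pred = tree.predict(X_test)

tree_train = accuracy_score(y_train, y_train_pred)
tree_test = accuracy_score(y_test, y_test_pred)

print("Decision Tree train/test accuracy: %0.3f/%0.3f" % (tree_train, tree_test))
```

Decision Tree train/test accuracy: 0.916/0.875

```python
# Use a learning curve to check for underfitting vs. overfitting
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import learning_curve

train_sizes, train_scores, test_scores = learning_curve(
    estimator=tree,
    X=X_train,
    y=y_train,
    train_sizes=np.linspace(0.1, 1, 10),
    cv=10,
    n_jobs=1)

# Average the per-fold scores at each training-set size.
train_mean = np.mean(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)

plt.plot(train_sizes, train_mean, color='g', marker='o',
         label='training accuracy')
plt.plot(train_sizes, test_mean, color='r', marker='^',
         label='validation accuracy')

plt.xlabel('number of training samples')
plt.ylabel('accuracy')
plt.legend(loc='best')
plt.show()
```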

As a rough first check, look at the tree depth and the amount of data to get an idea of what the problem is.
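One way to make the "look at the tree depth" check concrete is a validation curve over `max_depth`. This is only a minimal sketch, not the asker's setup: it assumes scikit-learn's bundled `load_wine` data with all features and a 70/30 split (the exact features and split used in the question are not shown), and the depth range 1 to 5 is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, validation_curve
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wine data stands in for the asker's data.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Hypothetical range of depths to compare; 1 is the stump from the question.
depths = np.arange(1, 6)

# 10-fold cross-validated accuracy for each candidate depth.
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(criterion='entropy', random_state=1),
    X_train, y_train,
    param_name='max_depth',
    param_range=depths,
    cv=10)

# Average over the folds: one value per depth.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

for depth, tr, va in zip(depths, train_mean, val_mean):
    print("max_depth=%d  train=%.3f  validation=%.3f" % (depth, tr, va))
```

If both training and validation accuracy are low at `max_depth=1` and both rise as the depth grows, that pattern points toward the shallow tree being too simple for the data.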

Underfitting means the model has not fully learned the features of the training set during training, so the fit is poor.

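Underfitting here should mean that your train and val curves have not flattened out and are still fluctuating, i.e. the model has not converged. If the model had converged, the tail of each curve should be close to a flat line, but yours shows no sign of leveling off. I drew a picture; take a look and it should be clear.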

(image)
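To read a learning curve without relying on the plot alone, the same quantities can be printed as numbers. The sketch below is illustrative only: it assumes scikit-learn's `load_wine` data in place of the asker's split, and the two summary values `final_gap` and `last_step_change` are hypothetical names chosen here, not a standard diagnostic.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import learning_curve, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Wine data stands in for the asker's data.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Same kind of model as in the question: a depth-1 decision stump.
stump = DecisionTreeClassifier(criterion='entropy', random_state=1, max_depth=1)

train_sizes, train_scores, val_scores = learning_curve(
    stump, X_train, y_train,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=10)

# Mean over the 10 folds at each training-set size, plus the fold spread.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
val_std = val_scores.std(axis=1)

# Gap between the two curves at the largest training size.
final_gap = train_mean[-1] - val_mean[-1]
# How much validation accuracy still moved over the last step.
last_step_change = val_mean[-1] - val_mean[-2]

print("train accuracy at full size:      %.3f" % train_mean[-1])
print("validation accuracy at full size: %.3f (+/- %.3f)" % (val_mean[-1], val_std[-1]))
print("final gap:                        %.3f" % final_gap)
print("change over last step:            %.3f" % last_step_change)
```

In the usual bias/variance reading, low accuracy on both curves with a small `final_gap` suggests underfitting, while a large gap suggests overfitting; a validation curve that is still moving at the last step may simply need more data. With a small dataset the fold spread can be large, so these numbers are a rough guide at best.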

Causes of underfitting and overfitting, and how to address them
https://blog.csdn.net/LOVEzhang666/article/details/122335925