Figure 1
Figure 2
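The code:
```python
# Data: the wine dataset. X_train, X_test, y_train and y_test are assumed to
# come from an earlier train/test split that is not shown here.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

# Base learner: a depth-1 decision tree (a stump)
tree = DecisionTreeClassifier(criterion='entropy',
                              random_state=1,
                              max_depth=1)
# AdaBoost ensemble of 500 decision trees
ada = AdaBoostClassifier(estimator=tree,  # named base_estimator before scikit-learn 1.2
                         n_estimators=500,
                         learning_rate=0.1,
                         random_state=1)

# Baseline: fit the single tree and score it on both splits
tree.fit(X_train, y_train)
y_train_pred = tree.predict(X_train)
y_test_pred = tree.predict(X_test)
tree_train = accuracy_score(y_train, y_train_pred)
tree_test = accuracy_score(y_test, y_test_pred)
print("Decision Tree train/test accuracy: %0.3f/%0.3f" % (tree_train, tree_test))
```
Output: Decision Tree train/test accuracy: 0.916/0.875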
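The snippet above builds `ada` but only ever fits and scores the single tree. A minimal sketch of the missing comparison step follows. It assumes scikit-learn's bundled `load_wine` data and an 80/20 stratified split as stand-ins, since the post does not show how `X_train` and `X_test` were prepared, so the numbers it prints will not match the ones above.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled wine data and this split stand in for the post's own data
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y)

# Same base learner and ensemble settings as in the post
stump = DecisionTreeClassifier(criterion='entropy', random_state=1, max_depth=1)
# The base estimator is passed positionally, which works whether the keyword
# is called `estimator` (newer scikit-learn) or `base_estimator` (older)
ada = AdaBoostClassifier(stump, n_estimators=500, learning_rate=0.1, random_state=1)

# Fit the ensemble and score it on both splits
ada.fit(X_train, y_train)
ada_train = accuracy_score(y_train, ada.predict(X_train))
ada_test = accuracy_score(y_test, ada.predict(X_test))
print("AdaBoost train/test accuracy: %0.3f/%0.3f" % (ada_train, ada_test))
```

Putting these two numbers next to the single-tree result shows whether boosting actually helps on the data at hand.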
```python
# Use a learning curve to judge underfitting vs. overfitting
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import learning_curve

train_sizes, train_scores, test_scores = learning_curve(estimator=tree,
                                                        X=X_train,
                                                        y=y_train,
                                                        train_sizes=np.linspace(0.1, 1, 10),
                                                        cv=10,
                                                        n_jobs=1)
# Average the scores over the 10 cross-validation folds
train_mean = np.mean(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
plt.plot(train_sizes, train_mean, color='g', marker='o',
         label='training accuracy')
plt.plot(train_sizes, test_mean, color='r', marker='^',
         label='validation accuracy')
plt.xlabel('number of training samples')
plt.ylabel('accuracy')
plt.legend(loc='best')
plt.show()
```
As a rule, first look at the tree depth and the amount of data to get a rough idea of what is going wrong.
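One way to act on the tree-depth tip is to sweep `max_depth` and compare training and validation accuracy at each setting. The sketch below is only an illustration. The bundled `load_wine` data, the depth range 1 to 5, and 5-fold cross-validation are all assumptions, not choices taken from the post.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled wine data stands in for the post's data
X, y = load_wine(return_X_y=True)

depths = [1, 2, 3, 4, 5]  # illustrative range, not from the post
train_scores, val_scores = validation_curve(
    DecisionTreeClassifier(criterion='entropy', random_state=1),
    X, y,
    param_name='max_depth',
    param_range=depths,
    cv=5)

# Average over the cross-validation folds (one row per depth)
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for d, tr, va in zip(depths, train_mean, val_mean):
    print("max_depth=%d  train=%.3f  validation=%.3f" % (d, tr, va))
```

If both scores are low at a shallow depth and rise together as the depth grows, the shallow tree was probably too simple. If training accuracy keeps climbing while validation accuracy stalls, extra depth is mostly memorizing the training set.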
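Underfitting means the model has not yet fully learned the features of the training set during training, so the curves fit poorly.
Here, underfitting shows up as train and val curves that have not flattened out and are still bouncing up and down, which means the model has not converged. If it had converged, the tail of each curve would be close to a flat line, but yours shows no sign of leveling off. Plot it and you will see.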
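To put numbers on what the plot shows, one option is to read two quantities off the learning curve: the gap between training and validation accuracy at the largest training size, and how much the validation curve still moves over its last few points. This is a rough sketch under stated assumptions. It uses the bundled `load_wine` data with shuffling, and "last three points" is an arbitrary window chosen for illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled wine data stands in for the post's data
X, y = load_wine(return_X_y=True)
stump = DecisionTreeClassifier(criterion='entropy', random_state=1, max_depth=1)

# shuffle=True because the bundled data is ordered by class
train_sizes, train_scores, val_scores = learning_curve(
    stump, X, y,
    train_sizes=np.linspace(0.1, 1.0, 10),
    cv=10,
    shuffle=True,
    random_state=1)

train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)

# Gap between the two curves at the largest training size
gap = train_mean[-1] - val_mean[-1]
# Change in validation accuracy over the last three points (arbitrary window)
tail_change = abs(val_mean[-1] - val_mean[-3])

print("final train accuracy:      %.3f" % train_mean[-1])
print("final validation accuracy: %.3f" % val_mean[-1])
print("train/validation gap:      %.3f" % gap)
print("validation tail change:    %.3f" % tail_change)
```

Low accuracy on both curves with a small gap points toward underfitting, a large gap points toward overfitting, and a large tail change suggests the curve has not leveled off yet and more data may still help.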
Causes of underfitting and overfitting, and how to fix them:
https://blog.csdn.net/LOVEzhang666/article/details/122335925