Problems outputting ACC, MCC, and other metrics when using GradientBoostingRegressor

The problem: I am trying to use the GradientBoostingRegressor algorithm and then compute metrics such as ACC, MCC, and Precision for it, but it fails.
The main error I get is this:

raise ValueError("Classification metrics can't handle a mix of {0} "
ValueError: Classification metrics can't handle a mix of binary and continuous targets

The code is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef, auc, roc_curve, roc_auc_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import cohen_kappa_score
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from pathlib import Path
from sklearn import ensemble
from sklearn import datasets
from sklearn.utils import shuffle
from sklearn.metrics import mean_squared_error
# all the modules imported for this script, accumulated as I went


df1=pd.read_csv('/root/data.csv',skip_blank_lines=True)
df1.dropna(inplace=True)
X=df1.drop(columns=["id","label"],axis=1)
Y=df1["label"]
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=88)
# randomly split the dataset into training and test sets

params = {'n_estimators': 500, 'max_depth': 4, 'min_samples_split': 2,'learning_rate': 0.01, 'loss': 'ls'}
clf = ensemble.GradientBoostingRegressor(**params)
clf.fit(X_train, Y_train)
mse = mean_squared_error(Y_test, clf.predict(X_test))
# everything up to this point works; this mse prints normally

############################ I suspect the problem is in the block below
y_pred=clf.predict(X_test)
y_ture=Y_test
ACC=accuracy_score(y_ture,y_pred)
# the error is raised on this ACC line
############################

Precision=precision_score(y_ture,y_pred)
recall=recall_score(y_ture,y_pred)
F1_score=f1_score(y_ture,y_pred)
mcc=matthews_corrcoef(y_ture,y_pred)

print("ACC:",ACC)
print("MCC:",mcc)
print("Precision:",Precision)
print("f1_score:",F1_score)
print("recall:",recall)
print("MSE: %.4f" % mse)
# final output section
Here is the output and the detailed error message:

Traceback (most recent call last):
  File "/root/two.py", line 42, in <module>
    ACC=accuracy_score(y_ture,y_pred)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py", line 202, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/usr/local/lib/python3.10/dist-packages/sklearn/metrics/_classification.py", line 92, in _check_targets
    raise ValueError("Classification metrics can't handle a mix of {0} "
ValueError: Classification metrics can't handle a mix of binary and continuous targets

I first tried the three methods from the author of this link:

https://blog.csdn.net/qq_24211837/article/details/121012374?ops_request_misc=&request_id=&biz_id=102&utm_term=Classification%20metrics%20can%27t%20h&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-0-121012374.142^v71^js_top,201^v4^add_ask&spm=1018.2226.3001.4187

but all three still had bugs that I could not fix.

What I want is to be able to print ACC and the rest of those metrics normally.

Appendix: here is the data.
X_train, X_test, Y_train, Y_test are as follows:
X train
protected followers_count ... notifications translator_type
8769 0 13 ... 0 0
8823 0 11 ... 0 0
7062 0 1643 ... 0 1
5628 0 23 ... 0 0
2493 0 15 ... 0 0
... ... ... ... ... ...
8555 0 10 ... 0 0
2482 0 4198 ... 0 0
4048 0 946 ... 0 0
6433 0 10 ... 0 0
10201 0 22 ... 0 0

[7449 rows x 19 columns]

X test
protected followers_count ... notifications translator_type
6541 0 588 ... 0 0
8253 0 32 ... 0 0
8534 0 1582 ... 0 0
8580 0 15 ... 0 0
10039 0 48727 ... 0 0
... ... ... ... ... ...
510 0 2 ... 0 0
4572 0 34 ... 0 0
814 0 10 ... 0 0
8699 0 54 ... 0 0
3469 0 18 ... 0 0

[3193 rows x 19 columns]

Y train
8769 1.0
8823 1.0
7062 0.0
5628 0.0
2493 1.0
...
8555 1.0
2482 1.0
4048 0.0
6433 1.0
10201 1.0
Name: label, Length: 7449, dtype: float64

Y test
6541 1.0
8253 1.0
8534 0.0
8580 1.0
10039 0.0
...
510 1.0
4572 1.0
814 0.0
8699 0.0
3469 1.0
Name: label, Length: 3193, dtype: float64

I have already checked this data with other code: there are no missing values or infinities. Could it be that the shape of this data (rows or columns) does not match what these metric functions expect?

When using the GradientBoostingRegressor algorithm, the error "Classification metrics can't handle a mix of binary and continuous targets" appears because it is a regression algorithm, not a classification algorithm. A regression algorithm predicts continuous values, while a classification algorithm predicts categories.

The cause of the error is that you are using classification metrics (such as ACC, MCC, and Precision) to evaluate a regression model, while these metrics are designed for classification models, not regression models.

One solution is to switch to regression metrics to evaluate the model, such as R-squared, mean squared error, or mean absolute error.
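For example, here is a minimal sketch of scoring the same fitted regressor with regression metrics, reusing the clf, X_test, and Y_test from your code (r2_score and mean_absolute_error are additional imports from sklearn.metrics):

from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

y_pred = clf.predict(X_test)  # continuous predictions from the regressor

# regression metrics accept continuous predictions directly
print("R2 :", r2_score(Y_test, y_pred))
print("MAE:", mean_absolute_error(Y_test, y_pred))
print("MSE:", mean_squared_error(Y_test, y_pred))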

Alternatively, check whether your target is really continuous. In your data the label column only takes the values 0.0 and 1.0, i.e. it is binary, so you can instead treat the task as classification, as sketched below.
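A minimal sketch of that switch, assuming the same X_train/X_test/Y_train/Y_test split as in your code; note that GradientBoostingClassifier does not accept the regression-only 'loss': 'ls' setting, so it is dropped here:

from sklearn import ensemble
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, matthews_corrcoef

# same hyperparameters as in the question, minus the regression-only loss
params = {'n_estimators': 500, 'max_depth': 4, 'min_samples_split': 2, 'learning_rate': 0.01}
clf = ensemble.GradientBoostingClassifier(**params)
clf.fit(X_train, Y_train)

y_pred = clf.predict(X_test)  # discrete class labels: 0.0 or 1.0
print("ACC:", accuracy_score(Y_test, y_pred))
print("MCC:", matthews_corrcoef(Y_test, y_pred))
print("Precision:", precision_score(Y_test, y_pred))
print("recall:", recall_score(Y_test, y_pred))
print("f1_score:", f1_score(Y_test, y_pred))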

It looks like a data problem; please check whether the data format meets the requirements, or step through the code in a debugger to locate the exact line and analyze it. Hope you get it solved soon!

For reference, you can pass values into the confusion matrix like this, with a train/test split added:


import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

# Assuming your target column is y, otherwise use the appropriate column name
X = df.drop(['y'], axis=1).values.astype('float')
y = df['y'].values.astype('float') # assuming you have label encoded your target variable

X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=23, stratify=y)

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(y_test, y_pred)
print(cm)
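The reason this works is that KNeighborsClassifier.predict returns discrete class labels, which confusion_matrix (and accuracy_score, etc.) can consume directly; a regressor's predict returns continuous values, which is exactly what triggers the error above.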

The error message points at line 42 (which corresponds to line 35 in the actual program):

File "/root/two.py", line 42, in
ACC=accuracy_score(y_ture,y_pred)

You can first check the shape and contents of these two variables, for example:

print(y_ture.shape, y_pred.shape)
print(y_ture.max(), y_pred.max(), y_ture.min(), y_pred.min())  # note the parentheses, otherwise the bound methods are printed
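If those prints show that y_pred contains continuous values (for example 0.83 or 0.12) while y_ture only contains 0.0 and 1.0, and you want to keep the regressor, one option is to threshold the predictions before calling the classification metrics. The cutoff of 0.5 below is just an assumed example, not something scikit-learn prescribes:

from sklearn.metrics import accuracy_score, matthews_corrcoef

y_pred = clf.predict(X_test)                    # continuous regressor output
y_pred_binary = (y_pred >= 0.5).astype(float)   # assumed cutoff of 0.5 to map onto 0.0/1.0

print("ACC:", accuracy_score(y_ture, y_pred_binary))
print("MCC:", matthews_corrcoef(y_ture, y_pred_binary))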