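I'm a complete beginner working through the Titanic dataset and ran into a problem. Could someone help me out?
The code is as follows: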
import pandas
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
# Load the data
titanic = pandas.read_csv("all/train.csv")
#print(titanic.head(3))
#print(titanic.describe())
# Handle missing data
titanic["Age"] = titanic["Age"].fillna(titanic["Age"].median())
#print(titanic.describe())
titanic.loc[titanic["Sex"]=="male","Sex"]=0
titanic.loc[titanic["Sex"]=="female","Sex"]=1
titanic["Embarked"] = titanic["Embarked"].fillna('S')
titanic.loc[titanic["Embarked"]=="S","Embarked"]=0
titanic.loc[titanic["Embarked"]=="C","Embarked"]=1
titanic.loc[titanic["Embarked"]=="Q","Embarked"]=2
#print(titanic["Sex"].unique())
#print(titanic["Embarked"].unique())
#KFold
predictors = ["Pclass","Sex","SibSp","Parch","Fare","Embarked"]
alg = LinearRegression()
kf = KFold(titanic.shape[0],n_folds=3,random_state=1)
predictions = []
for train, test in kf:
    train_predictors = (titanic[predictors].iloc[train,:])
    train_target = titanic["Survived"].iloc[train]
    alg.fit(train_predictors,train_target)
    test_predictions = alg.predict(titanic[predictors].iloc[test,:])
    predictions.append(test_predictions)
The error is as follows:
Traceback (most recent call last):
  File "F:/python项目/titanic.py", line 20, in <module>
    kf = KFold(titanic.shape[0],n_folds=3,random_state=1)
TypeError: __init__() got an unexpected keyword argument 'n_folds'
Process finished with exit code 1
Thanks a lot!
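This problem is caused by sklearn moving and updating the module: KFold moved from sklearn.cross_validation to sklearn.model_selection, and the new constructor no longer takes the number of samples or an n_folds argument (the fold count is now n_splits).
Solution 1: sidestep the version difference (this only works on older scikit-learn releases that still ship sklearn.cross_validation; the module was removed in version 0.20)
Simply change
from sklearn.model_selection import KFold
to
from sklearn.cross_validation import KFold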
Solution 2:
Run:
from sklearn.model_selection import KFold
help(KFold)
to view the help documentation.
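sklearn.model_selection.KFold() takes three parameters:
1. n_splits : int, default=3 (5 since scikit-learn 0.22). The number of cross-validation folds; must be at least 2.
2. shuffle : boolean, optional, default=False. Whether to shuffle the data before splitting it.
3. random_state : int, optional, default=None. The random seed; it only has an effect when shuffle=True.
Use the help documentation to learn how the new KFold works, then update your code accordingly. Note that the data is no longer passed to the constructor; you iterate over kf.split(...) instead.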
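To make the three parameters concrete, here is a minimal sketch on a tiny made-up array (X and the variable names are assumptions for illustration, not part of the original code). It shows that the unshuffled split hands out contiguous blocks of rows, and that random_state makes a shuffled split repeatable.

```python
import numpy as np
from sklearn.model_selection import KFold

# Six made-up samples with two features each (illustrative data only).
X = np.arange(12).reshape(6, 2)

# Without shuffling, the test folds are contiguous blocks in row order.
kf_plain = KFold(n_splits=3)
plain_folds = [test.tolist() for _, test in kf_plain.split(X)]
print(plain_folds)  # [[0, 1], [2, 3], [4, 5]]

# With shuffle=True, the same random_state reproduces the same folds.
kf_shuffled = KFold(n_splits=3, shuffle=True, random_state=1)
shuffled_folds = [test.tolist() for _, test in kf_shuffled.split(X)]

kf_repeat = KFold(n_splits=3, shuffle=True, random_state=1)
shuffled_again = [test.tolist() for _, test in kf_repeat.split(X)]
print(shuffled_folds == shuffled_again)  # True
```

Each call to split() yields a pair of integer index arrays (train, test), which is why the loop uses .iloc to pick rows.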
__init__ TypeError
I ran into the same problem. Changing the code like this gets rid of the error:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
predictors = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"]
alg = LinearRegression()
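# random_state only applies when shuffle=True; recent sklearn versions
# raise an error if it is set without shuffling, so it is dropped here.
kf = KFold(n_splits=3, shuffle=False)
predictions = []
# titanic is the DataFrame loaded and cleaned in the question above
for train, test in kf.split(titanic["Survived"]):
    train_predictors = titanic[predictors].iloc[train,:]
    train_target = titanic["Survived"].iloc[train]
    alg.fit(train_predictors, train_target)
    test_predictions = alg.predict(titanic[predictors].iloc[test,:])
    predictions.append(test_predictions)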
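If you want to try the loop without train.csv, the following is a rough sketch under stated assumptions: the DataFrame toy, its column values, and the reduced predictor list are all invented for illustration and are not the real Titanic data. It runs the same KFold loop and then joins the per-fold predictions into one array.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Made-up stand-in for the Titanic training data (values are illustrative only).
rng = np.random.default_rng(0)
n_rows = 30
toy = pd.DataFrame({
    "Pclass": rng.integers(1, 4, n_rows),
    "Sex": rng.integers(0, 2, n_rows),
    "Age": rng.uniform(1, 70, n_rows),
    "Fare": rng.uniform(5, 100, n_rows),
    "Survived": rng.integers(0, 2, n_rows),
})
predictors = ["Pclass", "Sex", "Age", "Fare"]

alg = LinearRegression()
kf = KFold(n_splits=3)  # no shuffling, so the test folds follow row order

fold_predictions = []
for train, test in kf.split(toy):
    # Fit on the training rows of this fold, predict on the held-out rows.
    alg.fit(toy[predictors].iloc[train, :], toy["Survived"].iloc[train])
    fold_predictions.append(alg.predict(toy[predictors].iloc[test, :]))

# Joining the folds gives one out-of-fold prediction per row.
predictions = np.concatenate(fold_predictions, axis=0)
print(predictions.shape)  # (30,)
```

Because the folds are not shuffled, the concatenated array lines up with the original row order, so it can be compared directly against the Survived column.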