R: forecasting with regression trees, bagged trees, and random forests

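Given the dataset data (link: https://pan.baidu.com/s/1589RdaTGZaTlQAAOzfO58A, extraction code: 65bb).
Treat the first 90 observations as the training set and the remaining 4 observations as the evaluation set. Compare the forecasting performance of the regression tree, the bagged tree, and the random forest using the MSFE (mean squared forecast error).
How should the code for this be written?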

For reference:

# Load the data
data <- read.csv("data.csv")

# Assign the first 90 observations to the training set and the remaining 4 to the evaluation set
trainIndex <- 1:90
testIndex <- (90 + 1):nrow(data)
trainData <- data[trainIndex, ]
testData <- data[testIndex, ]

# Fix the seed so the bootstrap-based models are reproducible
set.seed(123)

# Fit the regression tree
# (Budget is the response used here; substitute the actual target column if it differs)
library(rpart)
regressionTree <- rpart(Budget ~ ., data = trainData)

# Fit the bagged tree
library(ipred)
baggingTree <- bagging(Budget ~ ., data = trainData)

# Fit the random forest
# (stored as rfModel so the object does not shadow the randomForest() function)
library(randomForest)
rfModel <- randomForest(Budget ~ ., data = trainData)

# Compare the forecasting performance of the three models by MSFE
msfeRegressionTree <- mean((predict(regressionTree, newdata = testData) - testData$Budget)^2)
msfeBaggingTree <- mean((predict(baggingTree, newdata = testData) - testData$Budget)^2)
msfeRandomForest <- mean((predict(rfModel, newdata = testData) - testData$Budget)^2)

cat("MSFE of regression tree:", msfeRegressionTree, "\n")
cat("MSFE of bagging tree:", msfeBaggingTree, "\n")
cat("MSFE of random forest:", msfeRandomForest, "\n")
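The MSFE used above is the average squared difference between the forecasts and the actual values over the evaluation observations: MSFE = (1/m) * sum((forecast_i - actual_i)^2). As a minimal sketch, the same calculation can be wrapped in a small helper and applied to all three models at once. The function name msfe and the numbers below are hypothetical and only illustrate the calculation; they are not taken from the dataset.

```r
# Mean squared forecast error: average squared gap between forecasts and actuals
msfe <- function(pred, actual) {
  stopifnot(length(pred) == length(actual))
  mean((pred - actual)^2)
}

# Hypothetical forecasts from three models for 4 evaluation observations
actual <- c(10, 12, 9, 15)
preds <- list(
  regression_tree = c(11, 11, 10, 13),
  bagged_tree     = c(10, 13, 9, 14),
  random_forest   = c(10, 12, 10, 14)
)

# One MSFE per model, printed with the smallest (best) first
results <- sapply(preds, msfe, actual = actual)
print(sort(results))
```

With real models, each entry of preds would be the output of predict(model, newdata = testData) and actual would be the response column of testData.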


This answer quotes ChatGPT.
Please see the solution below. If it works, please click Accept. Thanks!

The code is as follows; please test it:

# Load the data
movies <- read.csv("movies.csv")

# Split the data into training and evaluation sets
training_set <- movies[1:90, ]
evaluation_set <- movies[91:94, ]

# Shared model formula: OpenBox regressed on the 18 predictors
model_formula <- OpenBox ~ Action + Adventure + Animation + Comedy + Crime + Drama + Family + Fantasy + Mystery + Romance + SciFi + Thriller + PG + PG13 + R + Budget + Weeks + Screens

# Fix the seed so the bootstrap-based models are reproducible
set.seed(123)

# Fit the regression tree model
library(rpart)
reg_tree <- rpart(model_formula, data = training_set)

# Fit the bagging tree model
# Bagging considers every predictor at each split, so mtry equals the number of predictors (18)
library(randomForest)
bagging_tree <- randomForest(model_formula, data = training_set, ntree = 500, mtry = 18)

# Fit the random forest model (only mtry = 2 randomly chosen predictors are tried at each split)
random_forest <- randomForest(model_formula, data = training_set, ntree = 500, mtry = 2)

# Predict using the regression tree model
reg_tree_prediction <- predict(reg_tree, newdata = evaluation_set)

# Predict using the bagging tree model
bagging_tree_prediction <- predict(bagging_tree, newdata = evaluation_set)

# Predict using the random forest model
random_forest_prediction <- predict(random_forest, newdata = evaluation_set)

# Calculate the mean squared forecast error (MSFE) for each model
msfe_reg_tree <- mean((reg_tree_prediction - evaluation_set$OpenBox)^2)
msfe_bagging_tree <- mean((bagging_tree_prediction - evaluation_set$OpenBox)^2)
msfe_random_forest <- mean((random_forest_prediction - evaluation_set$OpenBox)^2)

# Compare the performance of each model
cat("Mean Square Error for Regression Tree:", msfe_reg_tree, "\n")
cat("Mean Square Error for Bagging Tree:", msfe_bagging_tree, "\n")
cat("Mean Square Error for Random Forest:", msfe_random_forest, "\n")
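A note on telling bagging apart from a random forest when both are fitted with randomForest(): bagging corresponds to trying all predictors at every split (mtry equal to the number of predictors), while a random forest tries only a random subset. If both calls use the same mtry, they fit the same kind of model. The following is a minimal sketch on synthetic data, assuming the randomForest package is installed; the data frame toy and its columns are hypothetical and stand in for the real dataset.

```r
library(randomForest)

set.seed(1)
# Hypothetical toy data: 94 rows, 5 predictors (X1..X5), numeric response y
n <- 94
toy <- data.frame(matrix(rnorm(n * 5), ncol = 5))
toy$y <- toy$X1 + 0.5 * toy$X2 + rnorm(n)

train <- toy[1:90, ]
test <- toy[91:n, ]
p <- ncol(train) - 1  # number of predictors

# Bagging: all p predictors are candidates at every split
bag <- randomForest(y ~ ., data = train, ntree = 500, mtry = p)

# Random forest: package default for regression, max(floor(p / 3), 1)
rf <- randomForest(y ~ ., data = train, ntree = 500)

msfe <- function(pred, actual) mean((pred - actual)^2)
print(c(
  bagging       = msfe(predict(bag, newdata = test), test$y),
  random_forest = msfe(predict(rf, newdata = test), test$y)
))
```

With only 4 evaluation observations, the MSFE ranking can change with the random seed, so it may be worth checking a few seeds before drawing a conclusion.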


  • On this question, I found a very good blog post that may be helpful. Link: R语言 随机森林 ("R: Random Forest")