vignettes/tutorial/out_of_bag_predictions.Rmd
Some learners, such as random forests, use bagging. Bagging means that the learner is an ensemble of several base learners, each trained on a different random subsample or bootstrap sample of the observations. A prediction for an observation in the original data set that uses only the base learners not trained on that observation is called an out-of-bag (OOB) prediction. These predictions are not prone to overfitting, because each one is made only by base learners that did not see the observation during training.
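To make the idea concrete, here is a minimal sketch of bagging with OOB predictions in base R. It is not how any of the packages below implement it: the base learner (a linear model on mtcars), the number of base learners, and all variable names are assumptions chosen purely for illustration.

```r
# Illustrative bagging sketch in base R; all names are hypothetical.
set.seed(1)
n = nrow(mtcars)
n_models = 200

# Per observation: running sum of OOB predictions and number of contributing models
pred_sum = numeric(n)
pred_cnt = integer(n)

for (b in seq_len(n_models)) {
  in_bag = sample(n, n, replace = TRUE)     # bootstrap sample used for training
  out_bag = setdiff(seq_len(n), in_bag)     # observations this base learner never saw
  fit = lm(mpg ~ wt + hp, data = mtcars[in_bag, ])
  p = predict(fit, newdata = mtcars[out_bag, ])
  pred_sum[out_bag] = pred_sum[out_bag] + p
  pred_cnt[out_bag] = pred_cnt[out_bag] + 1L
}

# OOB prediction: average over the base learners that did not train on the observation
oob_pred = pred_sum / pred_cnt

# OOB estimate of the root mean squared error
oob_rmse = sqrt(mean((mtcars$mpg - oob_pred)^2))
```

With bootstrap sampling, each observation is left out of roughly a third of the base learners, so every observation usually receives an OOB prediction once the ensemble is reasonably large.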
To get a list of learners that provide OOB predictions, call listLearners(obj = NA, properties = "oobpreds").
listLearners(obj = NA, properties = "oobpreds")[c("class", "package")]
##                     class                  package
## 1    classif.randomForest             randomForest
## 2 classif.randomForestSRC          randomForestSRC
## 3          classif.ranger                   ranger
## 4          classif.rFerns                   rFerns
## 5       regr.randomForest             randomForest
## 6    regr.randomForestSRC          randomForestSRC
## 7             regr.ranger                   ranger
## 8    surv.randomForestSRC survival,randomForestSRC
In mlr, the function getOOBPreds() extracts these predictions from a trained model. The predictions can then be used to evaluate the performance of a learner, as in the following example.
lrn = makeLearner("classif.ranger", predict.type = "prob", predict.threshold = 0.6)
mod = train(lrn, sonar.task)
oob = getOOBPreds(mod, sonar.task)
oob
## Prediction: 208 observations
## predict.type: prob
## threshold: M=0.60,R=0.40
## time: NA
##   id truth    prob.M    prob.R response
## 1  1     R 0.5771858 0.4228142        R
## 2  2     R 0.5845517 0.4154483        R
## 3  3     R 0.5839262 0.4160738        R
## 4  4     R 0.4512921 0.5487079        R
## 5  5     R 0.4909478 0.5090522        R
## 6  6     R 0.4311339 0.5688661        R
## ... (#rows: 208, #cols: 5)
performance(oob, measures = list(auc, mmce))
## auc mmce
## 0.9383301 0.1442308
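The mmce value reported by performance() can be cross-checked by hand, since it is simply the share of observations whose predicted class differs from the true class. The following sketch assumes that mlr and ranger are installed. The seed and the names truth, response and manual.mmce are illustrative additions, and the exact numbers will vary from run to run.

```r
library(mlr)

set.seed(123)  # illustrative seed; not part of the original example
lrn = makeLearner("classif.ranger", predict.type = "prob", predict.threshold = 0.6)
mod = train(lrn, sonar.task)
oob = getOOBPreds(mod, sonar.task)

# Extract true and predicted classes from the Prediction object
truth = as.character(getPredictionTruth(oob))
response = as.character(getPredictionResponse(oob))

# Mean misclassification error computed manually
manual.mmce = mean(response != truth)
manual.mmce

# The same quantity as reported by mlr
performance(oob, measures = mmce)
```

Note that the predicted class already reflects the threshold set in the learner: an observation is labelled M only if prob.M reaches 0.6, which is why the first rows in the output above are classified as R although prob.M exceeds 0.5.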
Because the predictions are out-of-bag, this evaluation strategy is very similar to common resampling strategies such as 10-fold cross-validation, but much faster, because the model is trained only once.
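One way to see the similarity is to run both strategies side by side. This is a rough sketch, again assuming mlr and ranger are installed. The seed and the names oob.perf, rdesc and res are illustrative, and the two estimates should be expected to be close rather than identical.

```r
library(mlr)

set.seed(123)  # illustrative seed
lrn = makeLearner("classif.ranger", predict.type = "prob", predict.threshold = 0.6)

# OOB estimate: one model fit on the full task
mod = train(lrn, sonar.task)
oob.perf = performance(getOOBPreds(mod, sonar.task), measures = list(auc, mmce))

# 10-fold cross-validation: ten model fits on nine tenths of the data each
rdesc = makeResampleDesc("CV", iters = 10)
res = resample(lrn, sonar.task, rdesc, measures = list(auc, mmce), show.info = FALSE)

# Compare both estimates (columns: auc, mmce)
rbind(oob = oob.perf, cv = res$aggr)
```

The cross-validated values are averages over the ten test folds, whereas the OOB values are computed once on all observations, so small differences between the two rows are expected.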