Estimates the relative overfitting of a model as the ratio of two differences: test performance minus train performance, divided by no-information test performance minus train performance. In the no-information case, the features carry no information about the target. This is simulated by permuting features and predictions.
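The ratio described above can be sketched in base R. This is a minimal illustration, not the package's implementation: `noInfoError` and `relativeOverfit` are hypothetical helper names, the loss is assumed to be the misclassification rate, and the no-information case is approximated by pairing every observed label with every prediction, following the no-information error rate in Efron and Tibshirani (1997).

```r
# Hypothetical helpers for illustration only; they are not part of mlr.

# No-information error: pair every true label with every prediction,
# which removes any link between the features and the outcome.
noInfoError = function(truth, response) {
  pairs = expand.grid(truth = truth, response = response,
    stringsAsFactors = FALSE)
  mean(pairs$truth != pairs$response)
}

# Relative overfitting: (test - train) / (no-information - train).
relativeOverfit = function(perf.test, perf.train, perf.noinfo) {
  (perf.test - perf.train) / (perf.noinfo - perf.train)
}

# Toy labels and predictions (made up for this sketch)
truth = c("a", "a", "b", "b")
response = c("a", "b", "b", "b")

perf.noinfo = noInfoError(truth, response)         # 8 of 16 pairs disagree: 0.5
ratio = relativeOverfit(0.2, 0.1, perf.noinfo)     # (0.2 - 0.1) / (0.5 - 0.1)
print(ratio)
```

A value near 0 suggests that test and train performance are close, while a value near 1 suggests that test performance is about as poor as in the no-information case.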

estimateRelativeOverfitting(
  predish,
  measures,
  task,
  learner = NULL,
  pred.train = NULL,
  iter = 1
)

Arguments

predish

(ResampleDesc | ResamplePrediction | Prediction)
Resampling strategy or resampling prediction or test predictions.

measures

(Measure | list of Measure)
Performance measure(s) to evaluate. Default is the default measure for the task, see getDefaultMeasure.

task

(Task)
The task.

learner

(Learner | character(1))
The learner. If you pass a string, the learner will be created via makeLearner.

pred.train

(Prediction)
Training predictions. Only needed if test predictions are passed.

iter

(integer)
Iteration number. Default is 1; you usually don't need to specify this. Only needed if test predictions are passed.
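When test predictions are passed directly, the matching training predictions go in `pred.train`. The following is a hedged sketch of that call path, assuming mlr is installed along with MASS (which `classif.lda` needs). The odd/even row split and the variable names are made up for illustration.

```r
library(mlr)

task = makeClassifTask(data = iris, target = "Species")
learner = makeLearner("classif.lda")

# Hypothetical split: odd rows for training, even rows for testing
train.inds = seq(1, nrow(iris), by = 2)
test.inds = seq(2, nrow(iris), by = 2)

# Fit on the training rows, then predict on both subsets
model = train(learner, task, subset = train.inds)
pred.train = predict(model, task, subset = train.inds)
pred.test = predict(model, task, subset = test.inds)

# Test predictions go in as `predish`, training predictions as `pred.train`
ro = estimateRelativeOverfitting(pred.test, acc, task, pred.train = pred.train)
print(ro)
```

Since only one train/test split is involved, `iter` can stay at its default.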

Value

(data.frame). Relative overfitting estimate(s), named by measure(s), for each resampling iteration.

Details

Currently, only classification and regression tasks are supported.

References

Bradley Efron and Robert Tibshirani. Improvements on Cross-Validation: The .632+ Bootstrap Method. Journal of the American Statistical Association, Vol. 92, No. 438 (Jun. 1997), pp. 548-560.

Examples

task = makeClassifTask(data = iris, target = "Species")
rdesc = makeResampleDesc("CV", iters = 2)
estimateRelativeOverfitting(rdesc, acc, task, makeLearner("classif.knn"))
#>    iter relative.overfit.acc
#> 1:    1          -0.02173913
#> 2:    2           0.02127660
estimateRelativeOverfitting(rdesc, acc, task, makeLearner("classif.lda"))
#>    iter relative.overfit.acc
#> 1:    1          -0.06382979
#> 2:    2           0.06000000
rpred = resample("classif.knn", task, rdesc)$pred
#> Resampling: cross-validation
#> Measures: mmce
#> [Resample] iter 1: 0.0400000
#> [Resample] iter 2: 0.0266667
#>
#> Aggregated Result: mmce.test.mean=0.0333333
#>
estimateRelativeOverfitting(rpred, acc, task)
#>    iter relative.overfit.acc
#> 1:    1           0.02083333
#> 2:    2          -0.02127660