
Over- or undersample a binary classification task to handle class imbalance.
Source: R/OverUnderSampling.R
oversample.Rd
Oversampling: For a given class (usually the smaller one), all existing observations are kept and extra observations are added by randomly sampling with replacement from this class.
Undersampling: For a given class (usually the larger one) the number of observations is reduced (downsampled) by randomly sampling without replacement from this class.
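The two procedures described above can be sketched in base R on a plain label vector. This is only an illustration of the sampling idea, not the package's internal code; the toy data, the variable names and the use of `round()` to turn the rate into a target size are assumptions made for the example.

```r
set.seed(1)

# Toy label vector (made up): 20 observations of "A", 100 of "B"
y <- factor(rep(c("A", "B"), times = c(20, 100)))

idx_a <- which(y == "A")  # row indices of the smaller class
idx_b <- which(y == "B")  # row indices of the larger class

# Oversampling "A" with rate 3: keep all 20 originals and
# draw the 40 extra indices with replacement from the same class
rate_over <- 3
n_extra <- round(length(idx_a) * rate_over) - length(idx_a)
over_a <- c(idx_a, idx_a[sample.int(length(idx_a), n_extra, replace = TRUE)])

# Undersampling "B" with rate 0.25: draw 25 of the 100 indices
# without replacement, so no observation appears twice
rate_under <- 0.25
n_keep <- round(length(idx_b) * rate_under)
under_b <- idx_b[sample.int(length(idx_b), n_keep, replace = FALSE)]

table(y[c(over_a, idx_b)])   # class sizes after oversampling "A"
table(y[c(idx_a, under_b)])  # class sizes after undersampling "B"
```

Indexing through `sample.int()` rather than calling `sample()` on the index vector avoids the base R pitfall where `sample(x, ...)` treats a length-one numeric `x` as `1:x`.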
Arguments
- task
(Task)
The task.
- rate
(numeric(1))
Factor to upsample or downsample a class. For undersampling: Must be between 0 and 1, where 1 means no downsampling, 0.5 implies reduction to 50 percent and 0 would imply reduction to 0 observations. For oversampling: Must be between 1 and Inf, where 1 means no oversampling and 2 would mean doubling the class size.
- cl
(character(1))
Which class should be over- or undersampled. If NULL, oversample will select the smaller and undersample the larger class.
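A short usage sketch of how these arguments fit together, assuming the mlr package is installed. The synthetic data frame and the object names (`df`, `task_over`, and so on) are made up for illustration.

```r
library(mlr)
set.seed(1)

# Synthetic imbalanced data (illustrative only): 100 x "A", 5000 x "B"
df <- data.frame(
  x = c(rnorm(100, mean = 1), rnorm(5000, mean = 2)),
  class = factor(rep(c("A", "B"), times = c(100, 5000)))
)
task <- makeClassifTask(data = df, target = "class")

# cl = NULL (the default): oversample() picks the smaller class "A"
task_over <- oversample(task, rate = 8)

# cl = NULL (the default): undersample() picks the larger class "B"
task_under <- undersample(task, rate = 1 / 8)

# Naming the class explicitly overrides the default choice
task_over_b <- oversample(task, rate = 2, cl = "B")

# Compare class sizes before and after resampling
table(getTaskTargets(task))
table(getTaskTargets(task_over))
table(getTaskTargets(task_under))
```

With these rates, class "A" should grow from 100 to 800 observations in `task_over`, and class "B" should shrink from 5000 to 625 in `task_under`, while the other class is left unchanged in each case.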
Value
Task.
See also
Other imbalancy:
makeOverBaggingWrapper(),
makeUndersampleWrapper(),
smote()