Remove constant features from a data set.
Source: R/removeConstantFeatures.R
removeConstantFeatures.Rd
Constant features can cause errors in some models and carry no information in the training set that a learner could use. The argument perc additionally removes features for which less than a proportion perc of the observations differ from the mode value.
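The criterion described above can be sketched in base R. This is only an illustration, not mlr's implementation: the helper `frac_differs_from_mode` and the example data are hypothetical, and the threshold is treated as inclusive here so that a threshold of 0 still drops columns whose ratio is exactly 0. Whether mlr handles the boundary the same way is an assumption.

```r
# Hypothetical helper, not part of mlr: proportion of values that differ
# from the most frequent value (the mode) of a vector without NAs.
frac_differs_from_mode <- function(x) {
  counts <- table(x)
  mode_value <- names(counts)[which.max(counts)]  # first mode on ties
  mean(as.character(x) != mode_value)
}

df <- data.frame(
  id    = 1:10,              # every value is distinct
  const = rep("a", 10),      # exactly one observed level
  rare  = c(rep(0, 9), 1)    # one of ten values differs from the mode
)

# One ratio per column
ratios <- vapply(df, frac_differs_from_mode, numeric(1))

# Assumed inclusive threshold: a threshold of 0 drops only "const",
# a threshold of 0.2 drops "rare" as well.
dropped_at_0   <- names(ratios)[ratios <= 0]
dropped_at_0.2 <- names(ratios)[ratios <= 0.2]
```

With a threshold of 0 only the truly constant column qualifies for removal; raising the threshold also catches near-constant columns such as `rare`.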
Usage
removeConstantFeatures(
  obj,
  perc = 0,
  dont.rm = character(0L),
  na.ignore = FALSE,
  wrap.tol = .Machine$double.eps^0.5,
  show.info = getMlrOption("show.info"),
  ...
)
Arguments
- obj
  (data.frame | Task)
  Input data.
- perc
  (numeric(1))
  The proportion of a feature's values, in [0, 1), that must differ from the mode value. Default is 0, which means only constant features with exactly one observed level are removed.
- dont.rm
  (character)
  Names of the columns which must not be deleted. Default is no columns.
- na.ignore
  (logical(1))
  Should NAs be ignored in the percentage calculation? (Or should they be treated as a single, extra level in the percentage calculation?) Note that if the feature has only missing values, it is always removed. Default is FALSE.
- wrap.tol
  (numeric(1))
  Numerical tolerance to treat two numbers as equal. Variables stored as double will get rounded accordingly before computing the mode. Default is sqrt(.Machine$double.eps).
- show.info
  (logical(1))
  Print verbose output on console? Default is set via configureMlr.
- ...
  Accepted to ensure backward compatibility with the old argument tol.
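The effect of the tolerance and of the NA handling can be illustrated with a short base-R sketch. It does not call mlr. Converting the tolerance into a number of decimal digits for `round()` is an assumption about how "rounded accordingly" might be realised, and the variable names are hypothetical.

```r
# Illustration only, not mlr's code.
tol <- sqrt(.Machine$double.eps)

# Assumption: the tolerance is mapped to a number of decimal digits.
digits <- ceiling(log10(1 / tol))

# Two values that differ only by a tiny numerical error
x <- c(1, 1 + 1e-12, 1)
n_unique_raw     <- length(unique(x))                 # 2 distinct doubles
n_unique_rounded <- length(unique(round(x, digits)))  # 1 after rounding

# NA handling: a column with one observed value and several NAs
y <- c(1, 1, NA, NA, NA)
n_levels_na_as_level <- length(table(y, useNA = "ifany"))  # NA counts as a level
n_levels_na_ignored  <- length(table(y, useNA = "no"))     # only the value 1 remains
```

After rounding, `x` has a single distinct value and so looks constant. For `y`, counting NA as an extra level gives two levels, whereas ignoring NAs leaves one level, which makes the column look constant.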
Value
data.frame | Task. Same type as obj.
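A minimal usage sketch on a small, made-up data frame follows. It assumes the mlr package is installed and is guarded so that it does nothing otherwise. The column names and values are hypothetical and chosen so that no column sits exactly on the perc threshold.

```r
if (requireNamespace("mlr", quietly = TRUE)) {
  df <- data.frame(
    id    = 1:10,             # all values distinct
    const = rep(1, 10),       # constant column
    rare  = c(rep(0, 9), 1)   # 10% of the values differ from the mode
  )

  # Default perc = 0: only the constant column is a removal candidate
  res_default <- mlr::removeConstantFeatures(df, show.info = FALSE)

  # perc = 0.2: near-constant columns are candidates as well
  res_perc <- mlr::removeConstantFeatures(df, perc = 0.2, show.info = FALSE)

  # Protect a column from removal by name
  res_keep <- mlr::removeConstantFeatures(df, dont.rm = "const", show.info = FALSE)

  names(res_default)
  names(res_perc)
  names(res_keep)
}
```

Because the input is a data.frame, each result is a data.frame as well.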
See also
Other eda_and_preprocess: capLargeValues(), createDummyFeatures(), dropFeatures(), mergeSmallFactorLevels(), normalizeFeatures(), summarizeColumns(), summarizeLevels()