Normalize features using one of several methods. Internally, BBmisc::normalize is applied to every feature column. Non-numerical features are left untouched and passed through to the result. Most methods fail for constant features, so special behaviour is implemented for this case.
Arguments
- obj
(data.frame | Task)
Input data.
- target
(character(1) | character(2) | character(n.classes))
Name(s) of the target variable(s). Only used when obj is a data.frame, otherwise ignored. For survival analysis, these are the names of the survival time and event columns, so it has length 2. For multilabel classification, these are the names of the logical columns that indicate whether a class label is present, and the number of target variables corresponds to the number of classes.
- method
(character(1))
Normalizing method. Available are:
“center”: Subtract mean.
“scale”: Divide by standard deviation.
“standardize”: Center and scale.
“range”: Scale to a given range.
- cols
(character)
Columns to normalize. Default is to use all numeric columns.
- range
(numeric(2))
Range for method “range”. Default is c(0,1).
- on.constant
(character(1))
How should constant vectors be treated? Only used if method != “center”, since that method does not fail for constant vectors. Possible actions are:
“quiet”: Depending on the method, treat them quietly:
“scale”: No division by standard deviation is done; the input values are returned untouched.
“standardize”: Only the mean is subtracted; no division is done.
“range”: All values are mapped to the mean of the given range.
“warn”: Same behaviour as “quiet”, but print a warning message.
“stop”: Stop with an error.
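The methods and the constant-vector handling described above can be illustrated on a single numeric vector. The following is a minimal base R sketch under stated assumptions, not the code used by BBmisc or mlr: `normalize_sketch` is a hypothetical helper written only for this illustration, and the wording of its warning and error message is invented.

```r
# Hypothetical helper (not part of mlr or BBmisc): applies one of the
# documented methods to a single numeric vector.
normalize_sketch <- function(x, method = "standardize", range = c(0, 1),
                             on.constant = "quiet") {
  # A vector counts as constant if its non-missing values are all identical.
  is_constant <- length(unique(x[!is.na(x)])) == 1L
  if (is_constant && method != "center") {
    msg <- "Constant vector encountered."  # illustrative message text
    if (on.constant == "warn") warning(msg)
    if (on.constant == "stop") stop(msg)
    # "quiet" and "warn" fall back to the documented per-method behaviour.
    return(switch(method,
      scale = x,
      standardize = x - mean(x, na.rm = TRUE),
      range = ifelse(is.na(x), NA_real_, mean(range))
    ))
  }
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  switch(method,
    center = x - m,
    scale = x / s,
    standardize = (x - m) / s,
    # Map [lo, hi] linearly onto [range[1], range[2]].
    range = (x - lo) / (hi - lo) * (range[2L] - range[1L]) + range[1L]
  )
}

x <- c(1, 2, 3, 4, 5)
normalize_sketch(x, "center")                       # -2 -1 0 1 2
normalize_sketch(x, "range")                        # 0 0.25 0.5 0.75 1
normalize_sketch(c(3, 3, 3), "range", c(0, 10))     # 5 5 5
normalize_sketch(c(3, 3, 3), "scale")               # 3 3 3
```

With `on.constant = "stop"`, the same constant input raises an error instead of returning a value.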
Value
data.frame | Task. Same type as obj.
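A short usage sketch on a data.frame may help here. It assumes the mlr package is installed and uses the built-in `iris` data set purely as an illustration; the variable names `iris_std` and `iris_rng` are made up for this example.

```r
# Runs only if mlr is available; otherwise the block is skipped.
if (requireNamespace("mlr", quietly = TRUE)) {
  # Standardize every numeric column; the factor target "Species" is
  # non-numerical and is passed through unchanged.
  iris_std <- mlr::normalizeFeatures(iris, target = "Species",
    method = "standardize")

  # Rescale only two selected columns to the interval [-1, 1].
  iris_rng <- mlr::normalizeFeatures(iris, target = "Species",
    method = "range", cols = c("Sepal.Length", "Petal.Length"),
    range = c(-1, 1))

  print(class(iris_std))             # same type as the input
  print(range(iris_rng$Sepal.Length))
}
```

Columns not listed in `cols` are returned as they were, so `iris_rng$Sepal.Width` keeps its original values.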
See also
Other eda_and_preprocess: capLargeValues(), createDummyFeatures(), dropFeatures(), mergeSmallFactorLevels(), removeConstantFeatures(), summarizeColumns(), summarizeLevels()