Optimizes the feature set for a classification or regression problem using a wrapper approach to variable selection. Different optimization methods are available, such as forward search or a genetic algorithm. You select an algorithm (and its settings) by passing the corresponding control object. For a complete list of implemented algorithms, see the subclasses of FeatSelControl.
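As a rough illustration, the control objects below would select two different search strategies. The constructors are exported by mlr; the iteration budget of 10 is an arbitrary value chosen for this sketch.

```r
library(mlr)

# Sequential backward search: start from all features and drop them one by one
ctrl.sbs = makeFeatSelControlSequential(method = "sbs")

# Random search: evaluate 10 randomly drawn feature sets (arbitrary budget)
ctrl.random = makeFeatSelControlRandom(maxit = 10L)

# Both objects inherit from FeatSelControl and can be passed as `control`
class(ctrl.sbs)
class(ctrl.random)
```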
All algorithms operate on a 0-1-bit encoding of candidate solutions. By default a single bit corresponds to a single feature, but you can change this with the arguments bit.names and bits.to.features, which allows you to switch on whole groups of features with a single bit.
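The grouping idea can be sketched in base R. The group names `sepal` and `petal` and the lookup table `feature.groups` are hypothetical, chosen here for the iris features; only the `function(x, task)` signature comes from the argument description.

```r
# Hypothetical grouping: one bit per flower part of the iris data
bit.names = c("sepal", "petal")

# Hypothetical lookup table from bit name to the features it switches on
feature.groups = list(
  sepal = c("Sepal.Length", "Sepal.Width"),
  petal = c("Petal.Length", "Petal.Width")
)

# x is a 0-1 vector aligned with bit.names; task is unused in this sketch
bits.to.features = function(x, task) {
  as.character(unlist(feature.groups[x == 1], use.names = FALSE))
}

# Only the "petal" bit is set, so both petal features are selected
bits.to.features(c(0, 1), task = NULL)
```

Both objects would then be passed together, as `bit.names = bit.names` and `bits.to.features = bits.to.features`, in the call to selectFeatures.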
Usage
selectFeatures(
learner,
task,
resampling,
measures,
bit.names,
bits.to.features,
control,
show.info = getMlrOption("show.info")
)
Arguments
- learner
(Learner | character(1))
The learner. If you pass a string, the learner will be created via makeLearner.
- task
(Task)
The task.
- resampling
(ResampleInstance | ResampleDesc)
Resampling strategy for feature selection. If you pass a description, it is instantiated once at the beginning by default, so all points are evaluated on the same training/test sets. If you want to change that behavior, look at FeatSelControl.
- measures
(list of Measure | Measure)
Performance measures to evaluate. The first measure, aggregated by its first aggregation function, is optimized; the others are only evaluated. Default is the default measure for the task, see getDefaultMeasure.
- bit.names
(character)
Names of bits encoding the solutions. Also defines the total number of bits in the encoding. By default these are the feature names of the task. Has to be used together with bits.to.features.
- bits.to.features
(function(x, task))
Function which transforms an integer 0-1 vector into a character vector of selected features. By default a value of 1 in the ith bit selects the ith feature to be in the candidate solution. The vector x corresponds to bit.names and has to be of the same length.
- control
(see FeatSelControl)
Control object for the search method. Also selects the optimization algorithm for feature selection.
- show.info
(logical(1))
Print verbose output on console? Default is set via configureMlr.
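A minimal sketch of how these arguments fit together, assuming a 3-fold cross-validation, a random search with an arbitrary budget of 10 candidates, and accuracy as a second, purely reported measure. These settings are illustrative choices and not recommended defaults.

```r
library(mlr)

set.seed(1)  # random search and CV splits are stochastic

# 3-fold cross-validation; the description is instantiated once by default
rdesc = makeResampleDesc("CV", iters = 3)

# Random search over 10 candidate feature sets (arbitrary illustrative budget)
ctrl = makeFeatSelControlRandom(maxit = 10L)

# mmce comes first and is optimized; acc is only evaluated alongside it
res = selectFeatures(
  learner = "classif.rpart",
  task = iris.task,
  resampling = rdesc,
  measures = list(mmce, acc),
  control = ctrl,
  show.info = FALSE
)

res$x  # names of the selected features
res$y  # aggregated performance values of the selected feature set
```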
See also
Other featsel: FeatSelControl, analyzeFeatSelResult(), getFeatSelResult(), makeFeatSelWrapper()
Examples
# \donttest{
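# Evaluate every candidate feature set on a single holdout split
rdesc = makeResampleDesc("Holdout")
# Sequential forward search ("sfs") without an explicit iteration limit
ctrl = makeFeatSelControlSequential(method = "sfs", maxit = NA)
res = selectFeatures("classif.rpart", iris.task, rdesc, control = ctrl)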
#> [FeatSel] Started selecting features for learner 'classif.rpart'
#> With control class: FeatSelControlSequential
#> Imputation value: 1
#> [FeatSel-x] 1: 0000 (0 bits)
#> [FeatSel-y] 1: mmce.test.mean=0.6800000; time: 0.0 min
#> [FeatSel-x] 2: 1000 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.2800000; time: 0.0 min
#> [FeatSel-x] 2: 0100 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.4600000; time: 0.0 min
#> [FeatSel-x] 2: 0010 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 2: 0001 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 1010 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 0110 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 0011 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel] Result: Petal.Length (1 bits)
analyzeFeatSelResult(res)
#> Features : 1
#> Performance : mmce.test.mean=0.0400000
#> Petal.Length
#>
#> Path to optimum:
#> - Features: 0 Init : Perf = 0.68 Diff: NA *
#> - Features: 1 Add : Petal.Length Perf = 0.04 Diff: 0.64 *
#>
#> Stopped, because no improving feature was found.
# }