
Optimizes the feature set for a classification or regression problem using a wrapper approach to variable selection. Different optimization methods are available, such as forward search or a genetic algorithm. You select the algorithm (and its settings) by passing a corresponding control object. For a complete list of implemented algorithms, see the subclasses of FeatSelControl.

All algorithms operate on a 0-1 bit encoding of candidate solutions. By default a single bit corresponds to a single feature, but you can change this with the arguments bit.names and bits.to.features, which lets you switch on whole groups of features with a single bit.
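As a minimal sketch of the grouped encoding, the snippet below maps two bits onto the four iris features. The grouping in `feature.groups` is a hypothetical choice made for illustration and is not part of mlr. The mapping function itself is plain R; the call to selectFeatures only runs if mlr is installed.

```r
# Hypothetical grouping (an assumption for this sketch): one bit per flower part.
feature.groups = list(
  Sepal = c("Sepal.Length", "Sepal.Width"),
  Petal = c("Petal.Length", "Petal.Width")
)

# One bit name per group; this also fixes the encoding length at 2 bits.
bit.names = names(feature.groups)

# Map a 0-1 vector (same length and order as bit.names) to feature names.
# as.character() ensures an all-zero vector yields character(0), not NULL.
bits.to.features = function(x, task) {
  as.character(unlist(feature.groups[as.logical(x)], use.names = FALSE))
}

bits.to.features(c(0L, 1L), task = NULL)  # "Petal.Length" "Petal.Width"
bits.to.features(c(0L, 0L), task = NULL)  # character(0)

# Run the search over the 2-bit encoding only if mlr is available.
if (requireNamespace("mlr", quietly = TRUE)) {
  library(mlr)
  rdesc = makeResampleDesc("Holdout")
  ctrl = makeFeatSelControlExhaustive()
  res = selectFeatures("classif.rpart", iris.task, rdesc,
    bit.names = bit.names, bits.to.features = bits.to.features,
    control = ctrl, show.info = FALSE)
  print(res$x)  # selected feature names
}
```

With only two bits, an exhaustive search is cheap; for larger encodings a sequential or random control object is likely the more practical choice.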

Usage

selectFeatures(
  learner,
  task,
  resampling,
  measures,
  bit.names,
  bits.to.features,
  control,
  show.info = getMlrOption("show.info")
)

Arguments

learner

(Learner | character(1))
The learner. If you pass a string the learner will be created via makeLearner.

task

(Task)
The task.

resampling

(ResampleInstance | ResampleDesc)
Resampling strategy for feature selection. If you pass a description, it is instantiated once at the beginning by default, so all points are evaluated on the same training/test sets. If you want to change that behavior, look at FeatSelControl.

measures

(list of Measure | Measure)
Performance measures to evaluate. The first measure, aggregated by its first aggregation function, is optimized; the others are only evaluated. Default is the default measure for the task, see getDefaultMeasure.

bit.names

(character)
Names of bits encoding the solutions. Also defines the total number of bits in the encoding. By default these are the feature names of the task. Has to be used together with bits.to.features.

bits.to.features

(function(x, task))
Function which transforms an integer 0-1 vector into a character vector of selected features. By default a value of 1 in the ith bit selects the ith feature to be in the candidate solution. The vector x corresponds to bit.names and has to be of the same length.

control

(see FeatSelControl)
Control object for search method. Also selects the optimization algorithm for feature selection.

show.info

(logical(1))
Print verbose output on console? Default is set via configureMlr.

Value

(FeatSelResult).

Examples

# \donttest{
rdesc = makeResampleDesc("Holdout")
ctrl = makeFeatSelControlSequential(method = "sfs", maxit = NA)
res = selectFeatures("classif.rpart", iris.task, rdesc, control = ctrl)
#> [FeatSel] Started selecting features for learner 'classif.rpart'
#> With control class: FeatSelControlSequential
#> Imputation value: 1
#> [FeatSel-x] 1: 0000 (0 bits)
#> [FeatSel-y] 1: mmce.test.mean=0.6800000; time: 0.0 min
#> [FeatSel-x] 2: 1000 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.2800000; time: 0.0 min
#> [FeatSel-x] 2: 0100 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.4600000; time: 0.0 min
#> [FeatSel-x] 2: 0010 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 2: 0001 (1 bits)
#> [FeatSel-y] 2: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 1010 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 0110 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel-x] 3: 0011 (2 bits)
#> [FeatSel-y] 3: mmce.test.mean=0.0400000; time: 0.0 min
#> [FeatSel] Result: Petal.Length (1 bits)
analyzeFeatSelResult(res)
#> Features         : 1
#> Performance      : mmce.test.mean=0.0400000
#> Petal.Length
#> 
#> Path to optimum:
#> - Features:    0  Init   :                       Perf = 0.68  Diff: NA  *
#> - Features:    1  Add    : Petal.Length          Perf = 0.04  Diff: 0.64  *
#> 
#> Stopped, because no improving feature was found.
# }
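
The resampling and measures arguments can also be set explicitly. The following is a sketch under stated assumptions, not output from the package documentation: it passes a pre-built ResampleInstance, so the splits are fixed up front, and a list of two measures, of which only the first (mmce) is optimized. The seed and the 3-fold CV are arbitrary choices for illustration, and the code only runs if mlr is installed.

```r
if (requireNamespace("mlr", quietly = TRUE)) {
  library(mlr)
  set.seed(1)  # arbitrary seed, only to make the splits reproducible

  # Instantiate the resampling once, so every candidate sees the same folds.
  rin = makeResampleInstance(makeResampleDesc("CV", iters = 3), task = iris.task)

  ctrl = makeFeatSelControlSequential(method = "sfs", maxit = NA)

  # mmce comes first and is optimized; acc is only evaluated alongside it.
  res = selectFeatures("classif.rpart", iris.task, rin,
    measures = list(mmce, acc), control = ctrl, show.info = FALSE)

  print(res$x)  # selected feature names
  print(res$y)  # aggregated values of both measures for the selected set
}
```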