First, calls generateFilterValuesData to compute the filter values. Features are then selected via perc, abs, or threshold.

filterFeatures(task, method = "randomForestSRC_importance",
  fval = NULL, perc = NULL, abs = NULL, threshold = NULL,
  mandatory.feat = NULL, cache = FALSE, ...)
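A minimal usage sketch, assuming the example task iris.task that ships with mlr and assuming that the "variance" filter appears in listFilterMethods() for your mlr version (it is used here only because it should need no additional packages):

```r
library(mlr)

# iris.task is an example classification task shipped with mlr
# (4 numeric features, target "Species").
task = iris.task

# Keep the 2 top-scoring features according to the chosen filter.
# Assumption: "variance" is an available filter method in this mlr version.
filtered = filterFeatures(task, method = "variance", abs = 2)

# The result is again a Task, restricted to the selected features.
getTaskFeatureNames(filtered)
```

Only one of perc, abs, and threshold is set in the call, since the three arguments are mutually exclusive.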

Arguments

task

(Task)
The task.

method

(character(1))
See listFilterMethods. Default is “randomForestSRC_importance”.

fval

(FilterValues)
Result of generateFilterValuesData. If you pass this, the filter values in the object are used for feature filtering; method and ... are then ignored. Default is NULL, in which case the filter values are computed from method.

perc

(numeric(1))
If set, select the top perc*100 percent of features by score. perc = 1 means to select all features. Mutually exclusive with arguments abs and threshold.

abs

(numeric(1))
If set, select abs top scoring features. Mutually exclusive with arguments perc and threshold.

threshold

(numeric(1))
If set, select features whose score exceeds threshold. Mutually exclusive with arguments perc and abs.

mandatory.feat

(character)
Mandatory features which are always included regardless of their scores.

cache

(character(1) | logical)
Whether to use caching during filter value creation. See section Caching.

...

(any)
Passed down to selected filter method.
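The arguments above can be combined as in the following sketch. It assumes iris.task from mlr and the "variance" filter method (an assumption about what listFilterMethods() offers in your version); the variable names are illustrative only. The filter values are computed once and then reused through fval with different selection criteria.

```r
library(mlr)

# Compute the filter values once ...
fv = generateFilterValuesData(iris.task, method = "variance")

# ... and reuse them. When fval is given, method and ... are ignored.
top_half = filterFeatures(iris.task, fval = fv, perc = 0.5)
above    = filterFeatures(iris.task, fval = fv, threshold = 0.5)

# Sepal.Width has the lowest variance of the four iris features, so it
# would not be chosen on score alone; mandatory.feat forces it in.
with_mandatory = filterFeatures(iris.task, fval = fv, abs = 2,
  mandatory.feat = "Sepal.Width")

getTaskFeatureNames(top_half)
getTaskFeatureNames(above)
getTaskFeatureNames(with_mandatory)
```

Reusing fval avoids recomputing scores when several selection criteria are compared on the same task.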

Value

Task.

Caching

If cache = TRUE, the default mlr cache directory is used to cache filter values. The directory is operating system dependent and can be checked with getCacheDir().
The default cache can be cleared with deleteCacheDir(). Alternatively, a custom directory can be passed to store the cache.

Note that caching is not thread safe. It will work for parallel computation on many systems, but there is no guarantee.
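A short sketch of both caching modes, assuming iris.task and the "variance" filter as illustrative choices. It also assumes that any packages mlr requires for caching are installed, and the custom directory below is a hypothetical path chosen for the example:

```r
library(mlr)

# Location of the default, operating system dependent cache directory.
getCacheDir()

# Cache filter values in the default directory.
filtered = filterFeatures(iris.task, method = "variance", abs = 2,
  cache = TRUE)

# Alternatively, pass a custom directory as a character string
# (hypothetical path inside the session's temporary directory).
custom.dir = file.path(tempdir(), "mlr-filter-cache")
filtered.custom = filterFeatures(iris.task, method = "variance", abs = 2,
  cache = custom.dir)

# Clear the default cache when it is no longer needed:
# deleteCacheDir()
```

Because caching is not guaranteed to be thread safe, a sketch like this is best run sequentially rather than inside a parallelized loop.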

See also