Estimate how important individual features or groups of features are by contrasting prediction performances. For method `"permutation.importance"`, the values of a feature (or a group of features) are permuted, and the resulting performance is compared with the performance of predictions made on the unpermuted data.
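
The idea can be illustrated without mlr. The following is a minimal base-R sketch, not mlr's internal implementation: the linear model, the `mse()` helper, and the `permutation_importance()` function are hypothetical names chosen for illustration, and mean squared error stands in for the performance measure.

```r
# Minimal base-R sketch of permutation importance (not mlr's internal code).
set.seed(1)

fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Hypothetical performance measure: mean squared error (lower is better).
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# Performance on the unpermuted data.
baseline <- mse(fit, mtcars)

# Hypothetical helper: average change in MSE after shuffling one feature.
permutation_importance <- function(feature, nmc = 50L) {
  diffs <- vapply(seq_len(nmc), function(i) {
    permuted <- mtcars
    # Shuffle one column so that its link to the target is broken.
    permuted[[feature]] <- sample(permuted[[feature]])
    # Contrast: performance on permuted data minus performance on original data.
    mse(fit, permuted) - baseline
  }, numeric(1))
  # Aggregate the contrasts over the Monte-Carlo iterations.
  mean(diffs)
}

importance <- vapply(c("wt", "hp", "qsec"), permutation_importance, numeric(1))
print(importance)
```

With an error measure such as MSE, a larger positive value suggests that the model relies more heavily on that feature.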

```r
generateFeatureImportanceData(
  task,
  method = "permutation.importance",
  learner,
  features = getTaskFeatureNames(task),
  interaction = FALSE,
  measure,
  contrast = function(x, y) x - y,
  aggregation = mean,
  nmc = 50L,
  replace = TRUE,
  local = FALSE,
  show.info = FALSE
)
```

## Arguments

| Argument | Description |
|---|---|
| `task` | (`Task`) The task. |
| `method` | (`character(1)`) The method used to compute the feature importance. The only available method, and the default, is `"permutation.importance"`. |
| `learner` | (`Learner` or `character(1)`) The learner. If you pass a string, the learner is created via `makeLearner`. |
| `features` | (`character`) The features whose importance is computed. The default is all features contained in the `Task`. |
| `interaction` | (`logical(1)`) Whether to compute the importance of the `features` argument jointly. For `method = "permutation.importance"` this entails permuting the values of all `features` together and then contrasting the resulting performance with the performance obtained without permuting the features. The default is `FALSE`. |
| `measure` | (`Measure`) Performance measure. Default is the first measure used in the benchmark experiment. |
| `contrast` | (`function`) A difference function that takes two numeric vectors of equal length (the performance with and without permutation) and returns a numeric vector of the same length. The default is the element-wise difference between the vectors. |
| `aggregation` | (`function`) A function which aggregates the differences. It must take a numeric vector and return a numeric vector of length 1. The default is `mean`. |
| `nmc` | (`integer(1)`) The number of Monte-Carlo iterations used to compute the feature importance. If `nmc == -1` and `method = "permutation.importance"`, all permutations of the `features` are used. The default is 50. |
| `replace` | (`logical(1)`) Whether to sample the feature values with replacement. The default is `TRUE`. |
| `local` | (`logical(1)`) Whether to compute the per-observation importance. The default is `FALSE`. |
| `show.info` | (`logical(1)`) Whether progress output (feature name, time elapsed) should be displayed. The default is `FALSE`. |
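
To make the roles of `interaction`, `replace`, `contrast`, `aggregation`, and `local` more concrete, here is a rough base-R sketch. It is an illustration under stated assumptions, not mlr's code: joint permutation is assumed to mean resampling the rows of all listed features with one shared index vector, a per-observation squared error stands in for the measure, and `sq_err()` and `permute_once()` are hypothetical helpers.

```r
# Sketch of how the arguments could fit together (base R only, not mlr internals).
set.seed(2)

fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nrow(mtcars)

# Hypothetical per-observation loss, so that a "local" importance can be formed.
sq_err <- function(data) (data$mpg - predict(fit, newdata = data))^2

contrast <- function(x, y) x - y  # same shape as the documented default
aggregation <- mean               # same as the documented default

# Hypothetical helper; assumption: one shared row index for all features.
permute_once <- function(features, replace = TRUE) {
  idx <- sample.int(n, n, replace = replace)
  permuted <- mtcars
  # All listed features are resampled together, which keeps their
  # relationship to each other but breaks their link to the target.
  permuted[features] <- mtcars[idx, features, drop = FALSE]
  permuted
}

nmc <- 20L
baseline <- sq_err(mtcars)

# Analogue of interaction = TRUE: wt and hp are permuted jointly.
# The result is an n x nmc matrix, one column per Monte-Carlo iteration.
joint <- replicate(nmc, contrast(sq_err(permute_once(c("wt", "hp"))), baseline))

# Analogue of local = FALSE: a single number for the feature group.
global_importance <- aggregation(colMeans(joint))

# Analogue of local = TRUE: one number per observation.
local_importance <- apply(joint, 1, aggregation)
```

In this sketch the global value is simply the mean of the per-observation values, because both the loss and the aggregation are averages; other measures or aggregation functions would not necessarily have that property.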

## Value

(`FeatureImportance`). A named list which contains the computed feature importance and the input arguments.

Object members:

res(data.frame)

Has columns for each feature or combination of features (colon separated) for which the importance is computed.
A row corresponds to the importance of the feature specified in the column for the target.

interaction(`logical(1)`)

Whether or not the importance of the `features` was computed jointly rather than individually.

measure(Measure)

The measure used to compute performance.

contrast(`function`)

The function used to compare the performance of predictions.

aggregation(`function`)

The function which is used to aggregate the contrast between the performance of predictions across Monte-Carlo iterations.

replace(`logical(1)`)

Whether or not, when `method = "permutation.importance"`, the feature values are sampled with replacement.

nmc(`integer(1)`)

The number of Monte-Carlo iterations used to compute the feature importance.
When `nmc == -1` and `method = "permutation.importance"`, all permutations are used.

local(`logical(1)`)

Whether observation-specific importance is computed for the `features`.

## References

Jerome Friedman; Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, Vol. 29, No. 5 (Oct., 2001), pp. 1189-1232.

## Examples
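
A minimal usage sketch, assuming the mlr package and rpart (needed by the `"classif.rpart"` learner) are installed. It uses the `iris.task` example task and the `mmce` measure shipped with mlr. The choice of learner, features, and `nmc` is arbitrary, and the guard around the call is only there so that the snippet does nothing when the packages are missing.

```r
# Assumes mlr and rpart are installed; otherwise the block is skipped.
if (requireNamespace("mlr", quietly = TRUE) &&
    requireNamespace("rpart", quietly = TRUE)) {
  library(mlr)

  # A classification tree learner on the built-in iris task.
  lrn <- makeLearner("classif.rpart")

  # Permutation importance of two features, measured by the change in
  # mean misclassification error over 10 Monte-Carlo iterations.
  imp <- generateFeatureImportanceData(
    task = iris.task,
    method = "permutation.importance",
    learner = lrn,
    features = c("Petal.Length", "Petal.Width"),
    measure = mmce,
    nmc = 10L
  )

  # The computed importance is stored in the `res` member.
  print(imp$res)
}
```

Setting `interaction = TRUE` in the same call would request a single joint importance for both features, and `local = TRUE` would request per-observation values.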