Estimate how important individual features or groups of features are by contrasting prediction performances. For method "permutation.importance", the change in performance is computed by permuting the values of a feature (or a group of features) and contrasting the resulting predictions with the predictions made on the unpermuted data.
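A minimal base-R sketch of this idea (a standalone illustration, not mlr's implementation; the `lm` model, the `mse` helper, and the `mtcars` data are assumptions made for the example):

```r
# Fit a model once, then measure how much performance degrades when one
# feature's values are permuted while everything else is held fixed.
set.seed(1)
fit <- lm(mpg ~ wt + hp, data = mtcars)
mse <- function(y, yhat) mean((y - yhat)^2)

perf.orig <- mse(mtcars$mpg, predict(fit, newdata = mtcars))

permuted <- mtcars
permuted$wt <- sample(permuted$wt)  # permute a single feature
perf.perm <- mse(mtcars$mpg, predict(fit, newdata = permuted))

# Contrast permuted vs. unpermuted performance (default contrast: difference)
importance <- perf.perm - perf.orig
```

A large positive value indicates that the model relies heavily on the permuted feature.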
```r
generateFeatureImportanceData(
  task,
  method = "permutation.importance",
  learner,
  features = getTaskFeatureNames(task),
  interaction = FALSE,
  measure,
  contrast = function(x, y) x - y,
  aggregation = mean,
  nmc = 50L,
  replace = TRUE,
  local = FALSE,
  show.info = FALSE
)
```
| Argument | Description |
|---|---|
| task | (Task) The task. |
| method | (character(1)) The method used to compute the feature importance; currently "permutation.importance". Default is "permutation.importance". |
| learner | (Learner \| character(1)) The learner. If a string is passed, a learner is created via makeLearner. |
| features | (character) The features for which to compute importance. Default is all of the task's features. |
| interaction | (logical(1)) Whether the importance of the features should be computed jointly rather than individually. Default is FALSE. |
| measure | (Measure) The performance measure used to contrast the predictions. |
| contrast | (function) A function that accepts the performance on the permuted data and on the unpermuted data and returns their contrast. Default is the difference, function(x, y) x - y. |
| aggregation | (function) A function used to aggregate the contrasts across the Monte-Carlo iterations. Default is mean. |
| nmc | (integer(1)) The number of Monte-Carlo iterations. nmc == -1 uses all permutations. Default is 50. |
| replace | (logical(1)) Whether the feature values should be sampled with replacement. Default is TRUE. |
| local | (logical(1)) Whether observation-specific (local) importance should be computed. Default is FALSE. |
| show.info | (logical(1)) Whether progress output should be shown. Default is FALSE. |
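To make the roles of contrast and aggregation concrete, here is a schematic with made-up performance numbers (the values are purely illustrative, not real results):

```r
contrast <- function(x, y) x - y  # the default: difference in performance
aggregation <- mean               # the default: average over iterations

perf.permuted <- c(0.30, 0.28, 0.33)  # hypothetical errors from nmc = 3 permutations
perf.unpermuted <- 0.25               # hypothetical error on the unpermuted data

imp <- aggregation(contrast(perf.permuted, perf.unpermuted))
imp  # about 0.053
```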
(FeatureImportance). A named list which contains the computed feature importance and the input arguments.
Object members:
res (data.frame)
Has a column for each feature or combination of features (colon separated) for which the importance is computed.
A row corresponds to the importance of the feature specified in the column for the target.
interaction (logical(1))
Whether or not the importance of the features was computed jointly rather than individually.
measure (Measure)
The measure used to compute performance.
contrast (function)
The function used to compare the performance of predictions.
aggregation (function)
The function used to aggregate the contrast between the performance of predictions across Monte-Carlo iterations.
replace (logical(1))
Whether or not, when method = "permutation.importance", the feature values were sampled with replacement.
nmc (integer(1))
The number of Monte-Carlo iterations used to compute the feature importance.
When nmc == -1 and method = "permutation.importance", all permutations are used.
local (logical(1))
Whether observation-specific importance was computed for the features.
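The effect of local importance can be sketched in base R: instead of aggregating the contrast over all observations, a per-observation contrast is kept (again a standalone illustration with an assumed `lm` model and `mtcars` data, not mlr's implementation):

```r
set.seed(1)
fit <- lm(mpg ~ wt, data = mtcars)

permuted <- mtcars
permuted$wt <- sample(permuted$wt)

# Per-observation squared errors, with and without permuting wt
err.perm <- (mtcars$mpg - predict(fit, newdata = permuted))^2
err.orig <- (mtcars$mpg - predict(fit, newdata = mtcars))^2

local.imp <- err.perm - err.orig  # one importance value per observation
```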
Jerome Friedman; Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, Vol. 29, No. 5 (Oct., 2001), pp. 1189-1232.
Other generate_plot_data:
generateCalibrationData(),
generateCritDifferencesData(),
generateFilterValuesData(),
generateLearningCurveData(),
generatePartialDependenceData(),
generateThreshVsPerfData(),
plotFilterValues()
```r
lrn = makeLearner("classif.rpart", predict.type = "prob")
fit = train(lrn, iris.task)
# Permutation importance of Petal.Width, with per-observation (local) values
imp = generateFeatureImportanceData(iris.task, "permutation.importance", lrn,
  "Petal.Width", nmc = 10L, local = TRUE)
```