Estimate the importance of individual features or groups of features by contrasting prediction performance. For method "permutation.importance", the change in performance is computed by permuting the values of a feature (or a group of features) and comparing the resulting predictions to those made on the unpermuted data.
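The idea behind the default settings can be sketched in base R alone (independent of mlr; the model, feature names, and error function here are illustrative, not part of this API): permute one feature, re-evaluate the performance measure, contrast it with the unpermuted baseline via `x - y`, and average over Monte-Carlo iterations.

```r
# Minimal sketch of permutation importance, assuming base R only.
set.seed(1)
d <- iris
# a simple illustrative model: logistic regression for one class
fit <- glm(I(Species == "virginica") ~ Petal.Width + Sepal.Length,
           data = d, family = binomial)
# misclassification error of the fitted model on given data
err <- function(data) {
  p <- predict(fit, newdata = data, type = "response") > 0.5
  mean(p != (data$Species == "virginica"))
}
baseline <- err(d)  # performance on the unpermuted data
# permute Petal.Width over nmc Monte-Carlo iterations and
# contrast permuted performance with the baseline
nmc <- 10
contrasts <- replicate(nmc, {
  perm <- d
  # replace = TRUE mirrors the function's default of sampling
  # feature values with replacement
  perm$Petal.Width <- sample(perm$Petal.Width, replace = TRUE)
  err(perm) - baseline           # contrast = function(x, y) x - y
})
importance <- mean(contrasts)    # aggregation = mean
```

A large positive value means permuting the feature degrades performance substantially, i.e. the model relies on it.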
generateFeatureImportanceData(
  task,
  method = "permutation.importance",
  learner,
  features = getTaskFeatureNames(task),
  interaction = FALSE,
  measure,
  contrast = function(x, y) x - y,
  aggregation = mean,
  nmc = 50L,
  replace = TRUE,
  local = FALSE,
  show.info = FALSE
)
task | (Task) | The task.
---|---|---
method | (character(1)) | The method used to compute the feature importance. Currently only "permutation.importance" is supported.
learner | (Learner or character(1)) | The learner. If a string is passed, it is converted to a Learner via makeLearner.
features | (character) | The features for which to compute importance. Default is all the features contained in the task.
interaction | (logical(1)) | Whether to compute the importance of the features jointly rather than individually. Default is FALSE.
measure | (Measure) | The performance measure used to evaluate the predictions.
contrast | (function) | A function to contrast the performance on the permuted and unpermuted data. Default is function(x, y) x - y.
aggregation | (function) | A function to aggregate the contrasts across the Monte-Carlo iterations. Default is mean.
nmc | (integer(1)) | The number of Monte-Carlo iterations. If nmc == -1, all permutations are used. Default is 50L.
replace | (logical(1)) | Whether the feature values are sampled with replacement when permuting. Default is TRUE.
local | (logical(1)) | Whether observation-specific importance is computed for the features. Default is FALSE.
show.info | (logical(1)) | Whether progress output is shown. Default is FALSE.
(FeatureImportance). A named list which contains the computed feature importance and the input arguments.

Object members:

- (data.frame) Has a column for each feature or combination of features (colon separated) for which the importance is computed. A row corresponds to the importance of the feature(s) named in the column for the target.
- (logical(1)) Whether or not the importance of the features was computed jointly rather than individually.
- (Measure) The performance measure used.
- (function) The function used to contrast the performance of predictions.
- (function) The function used to aggregate the contrast between the performance of predictions across the Monte-Carlo iterations.
- (logical(1)) Whether or not, when method = "permutation.importance", the feature values are sampled with replacement.
- (integer(1)) The number of Monte-Carlo iterations used to compute the feature importance. When nmc == -1 and method = "permutation.importance", all permutations are used.
- (logical(1)) Whether observation-specific importance is computed for the features.
Jerome Friedman; Greedy Function Approximation: A Gradient Boosting Machine, Annals of Statistics, Vol. 29, No. 5 (Oct., 2001), pp. 1189-1232.
Other generate_plot_data: generateCalibrationData(), generateCritDifferencesData(), generateFilterValuesData(), generateLearningCurveData(), generatePartialDependenceData(), generateThreshVsPerfData(), plotFilterValues()
library(mlr)

# fit a classification tree and compute the local (per-observation)
# permutation importance of Petal.Width over 10 Monte-Carlo iterations
lrn = makeLearner("classif.rpart", predict.type = "prob")
fit = train(lrn, iris.task)
imp = generateFeatureImportanceData(iris.task, "permutation.importance", lrn,
  "Petal.Width", nmc = 10L, local = TRUE)
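As a further sketch (assuming the mlr package is installed), the joint importance of a group of features can be requested with interaction = TRUE, and the default aggregation can be swapped, e.g. for median:

```r
# Joint importance of two features with a custom aggregation.
library(mlr)

lrn = makeLearner("classif.rpart", predict.type = "prob")
# interaction = TRUE permutes both features together, yielding one
# colon-separated column for the group in the result
imp2 = generateFeatureImportanceData(iris.task, "permutation.importance", lrn,
  c("Petal.Width", "Petal.Length"), interaction = TRUE,
  aggregation = median, nmc = 10L)
print(imp2)
```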