For some learners it is possible to calculate a feature importance measure.
getFeatureImportance extracts those values from trained models.
See below for a list of supported learners.
(FeatureImportance) An object containing a data.frame of the variable importances and further information.
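A minimal usage sketch (assuming mlr's bundled iris.task; the learner name "classif.randomForest" and its importance parameter are standard mlr identifiers):

  library(mlr)
  # Construct a learner that supports feature importance; for randomForest,
  # importance = TRUE is needed for the permutation-based measure.
  lrn = makeLearner("classif.randomForest", importance = TRUE)
  mod = train(lrn, iris.task)
  # Extract the importance values; the res slot holds the data.frame.
  imp = getFeatureImportance(mod)
  print(imp$res)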
boosting: A measure that accounts for the gain of the Gini index contributed by a feature in each tree, weighted by that tree's weight.
cforest: Permutation importance following the 'mean decrease in accuracy' principle of randomForest. If auc = TRUE (only for binary classification), the area under the curve is used as the measure. The algorithm used for the survival learner is 'extremely slow and experimental; use at your own risk'. See party::varimp() for details and further parameters.
gbm: Estimation of relative influence for each feature. See gbm::relative.influence() for details and further parameters.
h2o: Relative feature importances as returned by h2o::h2o.varimp().
randomForest: For type = 2 (the default) the 'MeanDecreaseGini' is measured, which is based on the Gini impurity index used for the calculation of the nodes. Alternatively, you can set type to 1, in which case the measure is the mean decrease in accuracy calculated on OOB data. Note that in this case the learner's importance parameter needs to be set in order to compute feature importance values (see the example after this list). See randomForest::importance() for details.
RRF: This is identical to randomForest.
randomForestSRC: This method can calculate feature importance for various measures. By default the Breiman-Cutler permutation method is used. See randomForestSRC::vimp() for details.
ranger: Supports both measures mentioned above for the randomForest learner. Note that you need to explicitly set the learner's importance parameter to be able to compute feature importance measures (see the example after this list). See ranger::ranger() for details.
rpart: Sum of the decrease in impurity for each of the surrogate variables at each node.
xgboost: The value gives the relative contribution of the corresponding feature to the model, calculated by aggregating each feature's contribution over all trees in the model. The exact computation of the importance in xgboost is undocumented.
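A sketch of the learner-specific settings noted above for randomForest and ranger (parameter names follow the underlying packages; treat this as illustrative rather than exhaustive):

  library(mlr)
  # randomForest: type = 1 (mean decrease in accuracy on OOB data) requires
  # the learner to be constructed with importance = TRUE.
  lrn.rf = makeLearner("classif.randomForest", importance = TRUE)
  mod.rf = train(lrn.rf, iris.task)
  imp.rf = getFeatureImportance(mod.rf, type = 1)

  # ranger: select the importance mode explicitly, e.g. permutation importance.
  lrn.ranger = makeLearner("classif.ranger", importance = "permutation")
  mod.ranger = train(lrn.ranger, iris.task)
  imp.ranger = getFeatureImportance(mod.ranger)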