A performance measure is evaluated after a single train/predict step and returns a single number assessing the quality of the prediction (or, in some cases, only of the model; think AIC). Each measure knows whether it is to be minimized or maximized and to which task types it applies.
All supported measures can be listed with listMeasures() or viewed as a table in the tutorial appendix: https://mlr.mlr-org.com/articles/tutorial/measures.html.
If you need a measure for a misclassification cost matrix, see makeCostMeasure(); to implement your own measure, see makeMeasure().
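For instance, listMeasures() can be queried with a task type or a concrete task; the following sketch assumes the mlr package is attached and uses its built-in iris.task:

```r
library(mlr)

# All measures applicable to classification tasks in general
listMeasures("classif")

# Only the measures applicable to the built-in iris.task
listMeasures(iris.task)
```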
Most measures can be accessed directly via functions following the naming scheme measureX (e.g., measureSSE()), as in the sketch below.
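A minimal sketch of calling such a function on raw vectors (the values are made up for illustration):

```r
library(mlr)

truth = c(1.0, 2.0, 3.0)     # true target values
response = c(1.1, 1.9, 3.3)  # predicted values

measureSSE(truth, response)  # sum of squared errors: 0.11
```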
For clustering measures, the predicted cluster IDs are compacted so that they form a continuous series starting at 1; if the IDs do not form such a series, some of the measures will generate warnings.
Some measures have parameters. Their defaults are set in the constructor makeMeasure() and can be overwritten with setMeasurePars(), as sketched below.
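As a sketch of how this fits together, here is a hypothetical custom regression measure whose parameter p gets its default via the extra.args argument of makeMeasure() and is then overwritten with setMeasurePars() (the measure id, fun, and parameter p are made up for illustration):

```r
library(mlr)

# Hypothetical custom measure: mean absolute error raised to a
# configurable power p, whose default lives in extra.args.
pow.err = makeMeasure(
  id = "pow.err", name = "Powered absolute error",
  minimize = TRUE, best = 0, worst = Inf,
  properties = c("regr", "req.pred", "req.truth"),
  fun = function(task, model, pred, feats, extra.args) {
    mean(abs(pred$data$truth - pred$data$response)^extra.args$p)
  },
  extra.args = list(p = 2)
)

# Overwrite the default p = 2 with p = 3
pow.err = setMeasurePars(pow.err, p = 3)
```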
```r
measureSSE(truth, response)
measureMSE(truth, response)
measureRMSE(truth, response)
measureMEDSE(truth, response)
measureSAE(truth, response)
measureMAE(truth, response)
measureMEDAE(truth, response)
measureRSQ(truth, response)
measureEXPVAR(truth, response)
measureRRSE(truth, response)
measureRAE(truth, response)
measureMAPE(truth, response)
measureMSLE(truth, response)
measureRMSLE(truth, response)
measureKendallTau(truth, response)
measureSpearmanRho(truth, response)
measureMMCE(truth, response)
measureACC(truth, response)
measureBER(truth, response)
measureAUNU(probabilities, truth)
measureAUNP(probabilities, truth)
measureAU1U(probabilities, truth)
measureAU1P(probabilities, truth)
measureMulticlassBrier(probabilities, truth)
measureLogloss(probabilities, truth)
measureSSR(probabilities, truth)
measureQSR(probabilities, truth)
measureLSR(probabilities, truth)
measureKAPPA(truth, response)
measureWKAPPA(truth, response)
measureAUC(probabilities, truth, negative, positive)
measureBrier(probabilities, truth, negative, positive)
measureBrierScaled(probabilities, truth, negative, positive)
measureBAC(truth, response)
measureTP(truth, response, positive)
measureTN(truth, response, negative)
measureFP(truth, response, positive)
measureFN(truth, response, negative)
measureTPR(truth, response, positive)
measureTNR(truth, response, negative)
measureFPR(truth, response, negative, positive)
measureFNR(truth, response, negative, positive)
measurePPV(truth, response, positive, probabilities = NULL)
measureNPV(truth, response, negative)
measureFDR(truth, response, positive)
measureMCC(truth, response, negative, positive)
measureF1(truth, response, positive)
measureGMEAN(truth, response, negative, positive)
measureGPR(truth, response, positive)
measureMultilabelHamloss(truth, response)
measureMultilabelSubset01(truth, response)
measureMultilabelF1(truth, response)
measureMultilabelACC(truth, response)
measureMultilabelPPV(truth, response)
measureMultilabelTPR(truth, response)
```
| Argument | Type | Description |
|---|---|---|
| truth | factor | Vector of the true class labels (numeric for regression measures). |
| response | factor | Vector of the predicted class labels (numeric for regression measures). |
| probabilities | numeric or matrix | Predicted probabilities: a vector of positive-class probabilities for binary tasks, or a matrix with one column per class for multiclass tasks. |
| negative | character(1) | Name of the negative class (binary classification only). |
| positive | character(1) | Name of the positive class (binary classification only). |
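For example, the binary classification measures take the class names explicitly. A small sketch with made-up data, treating "a" as the positive class:

```r
library(mlr)

truth = factor(c("a", "a", "b", "b"))
prob = c(0.9, 0.6, 0.4, 0.2)  # predicted probability of the positive class "a"

# The positive class is ranked perfectly above the negative class, so AUC = 1
measureAUC(probabilities = prob, truth = truth, negative = "b", positive = "a")
```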
He, H. & Garcia, E. A. (2009). Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263-1284.
Uno, H. et al. (2011). On the C-statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data. Statistics in Medicine, 30(10), 1105-1117. https://doi.org/10.1002/sim.4154.
Uno, H. et al. (2007). Evaluating Prediction Rules for t-Year Survivors with Censored Regression Models. Journal of the American Statistical Association, 102(478), 527-537.
Other performance: ConfusionMatrix, calculateConfusionMatrix(), calculateROCMeasures(), estimateRelativeOverfitting(), makeCostMeasure(), makeCustomResampledMeasure(), makeMeasure(), performance(), setAggregation(), setMeasurePars()