mlr 2.2: 2014-10-29

  • The web tutorial was MUCH improved!
  • more example tasks and data sets
  • Learners and tasks now support ordered factors as features. The task description records whether ordered factors are present, and mlr checks whether the learner supports them. We have set the property ‘ordered’ very conservatively, so only the few learners for which we are sure that ordered inputs are handled correctly during training have it. If you know of more models that support this, please let us know.
  • basic R learners now have new slots: name (a descriptive name of the algorithm), short.name (an abbreviation that can be used in plots and tables) and note (remarks on slight changes made for the mlr integration of the learner)
  • makeLearner now supports options for learner error handling and output that previously could only be set globally via configureMlr (see the sketch after this list)
  • Additional arguments for imputation functions allow more fine-grained control of dummy column creation
  • imputeMin and imputeMax now subtract from the minimum, or add to the maximum, a multiple of the range of the data (see the sketch after this list).
  • cluster methods now have the property ‘prob’ when they support fuzzy cluster membership probabilities, and in that case also support predict.type = ‘prob’. Everything works essentially the same way as for posterior probabilities in classif.* methods (a short sketch follows this list).
  • predict preserves the rownames of the input in its output
  • fixed a bug in createDummyFeatures that caused an error when the data contained missing values.
  • plotLearnerPrediction works for clustering and allows greyscale plots (for printing or articles)
  • the whole object-oriented structure behind feature filtering was much improved; minor changes to the signatures of makeFilterWrapper and filterFeatures were necessary as a result.
  • fixed a bug in filter methods of the FSelector package that caused an error when variable names contained accented letters
  • filterFeatures can now also be applied to the result of getFilterValues (see the example after the list of new filters below)
  • For consistency, we dropped the data.frame versions of some preprocessing operations such as mergeFactorLevelsBySize, joinClassLevels and removeConstantFeatures. These now always require tasks as input.
  • We now support a fairly generic framework for stacking / super learning; see makeStackedLearner (a sketch follows the list of new functions below)
  • imbalance correction + smote:
      • fixed a bug in “smote” when only factor features are present
      • change to oversampling: sample new observations only (with replacement)
      • extension to the smote algorithm (sampling): minority class observations in binary classification are either chosen via sampling or, alternatively, each minority class observation is used an equal number of times
  • made the getters for BenchmarkResult more consistent. These are now: getBMRTaskIds, getBMRLearnerIds, getBMRPredictions, getBMRPerformances, getBMRAggrPerformances, getBMRTuneResults, getFeatSelResults, getBMRFilteredFeatures (see the example after this list). The following methods no longer work for BenchmarkResult: getTuneResult, getFeatSelResult.
  • Removed getFilterResult, because it does the same as getFilteredFeatures
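
For illustration, per-learner error handling might look roughly like this (passing the settings through a config list in makeLearner is an assumption; on.learner.error and show.learner.output are the usual configureMlr options):

    library(mlr)
    # Turn learner errors into warnings and silence learner output for this learner only;
    # the 'config' argument is assumed here, configureMlr sets the same options globally.
    lrn = makeLearner("classif.rpart",
      config = list(on.learner.error = "warn", show.learner.output = FALSE))
    mod = train(lrn, iris.task)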
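
A small sketch of the new multiplier behaviour (assuming imputeMin takes the multiplier as its first argument, as in later mlr releases):

    library(mlr)
    d = data.frame(x = c(1, 2, NA, 10), y = c("a", "b", "a", "b"))
    # Replace the NA in x by min(x) - 2 * diff(range(x)) = 1 - 2 * 9 = -17
    imp = impute(d, target = "y", cols = list(x = imputeMin(multiplier = 2)))
    imp$data$x  # assuming impute returns a list with the imputed data in $data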
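
Fuzzy cluster memberships can then be obtained roughly like this (cluster.cmeans is listed under the new learners below; the prob.* column layout of the prediction data is an assumption):

    library(mlr)
    task = makeClusterTask(data = iris[, 1:4])
    lrn = makeLearner("cluster.cmeans", predict.type = "prob")
    mod = train(lrn, task)
    pred = predict(mod, task = task)
    head(pred$data)  # cluster response plus prob.* membership columns (assumed layout)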
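
The renamed getters operate directly on a benchmark result, e.g. (the as.df argument follows the current mlr documentation):

    library(mlr)
    lrns = list(makeLearner("classif.rpart"), makeLearner("classif.lda"))
    rdesc = makeResampleDesc("CV", iters = 3)
    bmr = benchmark(lrns, iris.task, rdesc, measures = mmce)
    getBMRLearnerIds(bmr)
    getBMRAggrPerformances(bmr, as.df = TRUE)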

new learners:

  • classif.bartMachine
  • classif.lqa
  • classif.randomForestSRC
  • classif.sda
  • regr.ctree
  • regr.plsr
  • regr.randomForestSRC
  • cluster.cmeans
  • cluster.DBScan
  • cluster.kmeans
  • cluster.FarthestFirst
  • surv.cvglmnet
  • surv.optimCoxBoostPenalty

new filters:

  • variance
  • univariate
  • carscore
  • rf.importance, rf.min.depth
  • anova.test, kruskal.test
  • mrmr
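
The new filters are addressed by these names in getFilterValues and filterFeatures; reusing precomputed filter values might look roughly like this (the fval and abs argument names follow later mlr releases and are an assumption for 2.2):

    library(mlr)
    fv = getFilterValues(iris.task, method = "variance")
    # Keep the 2 highest-scoring features, reusing the precomputed values in fv
    filtered = filterFeatures(iris.task, fval = fv, abs = 2)
    getTaskFeatureNames(filtered)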

new functions:

  • makeMulticlassWrapper
  • makeStackedLearner, getStackedBaseLearnerPredictions
  • joinClassLevels
  • summarizeColumns, summarizeLevels
  • capLargeValues, mergeFactorLevelsBySize
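
A short sketch of the stacking framework via makeStackedLearner (argument names and the method values follow later mlr documentation and may have differed slightly in 2.2):

    library(mlr)
    # Cross-validated predictions of two base learners feed a super learner
    base = list(makeLearner("classif.rpart"), makeLearner("classif.lda"))
    stk = makeStackedLearner(base.learners = base, super.learner = "classif.rpart",
      method = "stack.cv")
    mod = train(stk, iris.task)
    preds = getStackedBaseLearnerPredictions(mod)  # per-base-learner predictions
    str(preds, max.level = 1)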