mlr 2.14.0 2019-04-25

general

functions - new

functions - general

filter - general

  • Caching is now available when generating filter values. Filter values are computed only once for a specific setting, and the stored cache is reused in subsequent iterations. This yields a significant speed-up when tuning fw.perc, fw.abs or fw.threshold. Caching can be enabled with the new cache argument of makeFilterWrapper() or filterFeatures() (@pat-s, #2463).
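
A minimal sketch of how the cache argument might be used while tuning fw.perc. The learner, task, filter method and tuning grid are illustrative assumptions, not taken from this changelog. The filter assumes the FSelectorRcpp package is installed, and caching may need further suggested packages.

```r
library(mlr)

# Filter wrapper around a decision tree. cache = TRUE asks mlr to store
# the computed filter values so they can be reused across tuning iterations.
lrn = makeFilterWrapper(
  learner = "classif.rpart",
  fw.method = "FSelectorRcpp_information.gain",
  cache = TRUE
)

# Illustrative search space: share of features to keep.
ps = makeParamSet(
  makeDiscreteParam("fw.perc", values = c(0.25, 0.5, 0.75))
)

ctrl = makeTuneControlGrid()
rdesc = makeResampleDesc("CV", iters = 3L)

# The filter values depend only on the training data of each fold,
# not on fw.perc, so they can be reused for every grid point.
res = tuneParams(lrn, task = iris.task, resampling = rdesc,
  par.set = ps, control = ctrl, show.info = FALSE)

res$x
```

The same argument can be passed to filterFeatures() when filtering a task directly.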

filter - new

  • praznik_JMI
  • praznik_DISR
  • praznik_JMIM
  • praznik_MIM
  • praznik_NJMIM
  • praznik_MRMR
  • praznik_CMIM
  • FSelectorRcpp_gain.ratio
  • FSelectorRcpp_information.gain
  • FSelectorRcpp_symuncert

Additionally, filter names have been harmonized using the following scheme: <pkgname>_<filtername>. Exceptions are filters included in base R packages. In this case, the package name is omitted.

filter - general

  • Added filters FSelectorRcpp_gain.ratio, FSelectorRcpp_information.gain and FSelectorRcpp_symuncert from package FSelectorRcpp. These filters are about 100 times faster than the implementations in the FSelector package. Please note that the two implementations work slightly differently internally, so the FSelectorRcpp methods should not be seen as a direct replacement for those of the FSelector package.

  • Filter names have been harmonized using the following scheme: <pkgname>_<filtername>. (@pat-s, #2533)

    • information.gain -> FSelector_information.gain
    • gain.ratio -> FSelector_gain.ratio
    • symmetrical.uncertainty -> FSelector_symmetrical.uncertainty
    • chi.squared -> FSelector_chi.squared
    • relief -> FSelector_relief
    • oneR -> FSelector_oneR
    • randomForestSRC.rfsrc -> randomForestSRC_importance
    • randomForestSRC.var.select -> randomForestSRC_var.select
    • randomForest.importance -> randomForest_importance
  • Fixed a bug related to the loading of namespaces for required filter packages (@pat-s, #2483)
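
Existing scripts that use the old filter names need to be updated. The renames listed above can be captured in a small lookup table. This is a sketch in base R only: filter.renames and updateFilterName() are hypothetical helper names and are not part of mlr.

```r
# Hypothetical lookup table: old filter name -> harmonized name,
# copied from the rename list in this changelog.
filter.renames = c(
  "information.gain" = "FSelector_information.gain",
  "gain.ratio" = "FSelector_gain.ratio",
  "symmetrical.uncertainty" = "FSelector_symmetrical.uncertainty",
  "chi.squared" = "FSelector_chi.squared",
  "relief" = "FSelector_relief",
  "oneR" = "FSelector_oneR",
  "randomForestSRC.rfsrc" = "randomForestSRC_importance",
  "randomForestSRC.var.select" = "randomForestSRC_var.select",
  "randomForest.importance" = "randomForest_importance"
)

# Hypothetical helper: translate old names and leave all other
# names (for example filters from base R packages) untouched.
updateFilterName = function(x) {
  hit = x %in% names(filter.renames)
  x[hit] = unname(filter.renames[x[hit]])
  x
}

updateFilterName(c("chi.squared", "variance"))
# [1] "FSelector_chi.squared" "variance"
```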

learners - new

learners - general

  • regr.h2o.gbm: Various parameters added, "h2o.use.data.table" = TRUE is now the default (@j-hartshorn, #2508)
  • h2o learners now support getting feature importance (@markusdumke, #2434)

learners - fixes

  • In some cases, the optimized hyperparameters were not applied at the performance level of a nested CV (@berndbischl, #2479)

featSel - general

  • The FeatSelResult object now contains an additional slot x.bit.names that stores the optimal bits
  • The slot x now always contains the real feature names and not the bit.names
  • This fixes a bug and makes makeFeatSelWrapper usable with custom bit.names.
  • Fixed a bug due to which sffs crashed in some cases (@bmihaljevic, #2486)
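
A short sketch of where the two slots appear after a feature selection run. The learner, task and control settings are illustrative assumptions. With the default bit names (one bit per feature), the two slots would be expected to hold the same names; they differ once custom bit.names are supplied.

```r
library(mlr)

# Sequential forward selection on the built-in iris task.
ctrl = makeFeatSelControlSequential(method = "sfs", alpha = 0.01)
rdesc = makeResampleDesc("CV", iters = 3L)

res = selectFeatures("classif.rpart", iris.task, resampling = rdesc,
  control = ctrl, show.info = FALSE)

# Real feature names of the selected subset.
res$x

# Names of the selected bits (new slot).
res$x.bit.names
```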