Conceptualisation, evaluation and statistical testing of an ensemble of model trees (MTF).

WernerVdM97/Masters-Thesis

A Genetic Algorithm Based Model Tree Forest

This thesis presents an ensemble approach that reduces the high variance error exhibited by model trees that comprise multivariate non-linear models, and increases their overall robustness. The ensemble approach is conceptualised, tuned, and evaluated against competing regression models on ten separate benchmarking datasets. The ensemble, referred to as the model tree forest (MTF), incorporates a hybrid genetic algorithm approach to construct structurally optimal polynomial expressions (GASOPE) within the leaf nodes of greedily induced model trees, which form the base learners of the ensemble. Bootstrap aggregation, together with randomisation of the splitting feature space during tree induction, sufficiently decorrelates the base learners within the ensemble. This reduces the variance error of MTF compared to that of a single model tree whilst retaining the favourable low bias error that model trees exhibit. The multivariate non-linear models that predict the output enable MTF to produce approximations of highly non-linear data. The addition of ensembling methods passively combats the overfitting brought about by the increased model complexity, in contrast to a previous implementation of GASOPE within a tree structure, which is shown to overfit in specific cases. MTF performed similarly to a feed-forward artificial neural network and outperformed the M5 model tree, an ensemble of M5 model trees, and support vector regression.
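The two decorrelation mechanisms described above, bootstrap aggregation and randomised splitting features, can be illustrated with a minimal sketch. This is not the code in this repo: `ToyModelTree`, `ToyForest` and `make_toy_data` are hypothetical names, each tree makes a single split at the median of a feature, and an ordinary least-squares linear model stands in for the GASOPE polynomial in each leaf. It only assumes NumPy.

```python
import numpy as np


def _design(X):
    # Prepend an intercept column to the feature matrix.
    return np.column_stack([np.ones(len(X)), X])


def _fit_leaf(X, y):
    # Least-squares linear model; a simple stand-in for a GASOPE leaf model.
    coef, *_ = np.linalg.lstsq(_design(X), y, rcond=None)
    return coef


def _sse(X, y, coef):
    return float(np.sum((_design(X) @ coef - y) ** 2))


class ToyModelTree:
    """Hypothetical depth-1 model tree: one greedy split, one model per leaf."""

    def __init__(self, n_split_features, rng):
        self.n_split_features = n_split_features
        self.rng = rng

    def fit(self, X, y):
        d = X.shape[1]
        # Randomised splitting features: only a random subset is considered.
        k = min(self.n_split_features, d)
        candidates = self.rng.choice(d, size=k, replace=False)
        self.split = None
        self.leaves = (_fit_leaf(X, y),)
        best = _sse(X, y, self.leaves[0])
        for j in candidates:
            t = np.median(X[:, j])
            left = X[:, j] <= t
            if left.sum() < 5 or (~left).sum() < 5:
                continue  # skip splits that leave a leaf nearly empty
            cl = _fit_leaf(X[left], y[left])
            cr = _fit_leaf(X[~left], y[~left])
            sse = _sse(X[left], y[left], cl) + _sse(X[~left], y[~left], cr)
            if sse < best:  # greedy choice: keep the lowest training error
                best, self.split, self.leaves = sse, (j, t), (cl, cr)
        return self

    def predict(self, X):
        if self.split is None:
            return _design(X) @ self.leaves[0]
        j, t = self.split
        A = _design(X)
        return np.where(X[:, j] <= t, A @ self.leaves[0], A @ self.leaves[1])


class ToyForest:
    """Hypothetical bagged ensemble of ToyModelTree base learners."""

    def __init__(self, n_trees=20, n_split_features=2, seed=0):
        self.n_trees = n_trees
        self.n_split_features = n_split_features
        self.seed = seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        n = len(X)
        self.trees = []
        for _ in range(self.n_trees):
            idx = rng.integers(0, n, size=n)  # bootstrap sample with replacement
            tree = ToyModelTree(self.n_split_features, rng)
            self.trees.append(tree.fit(X[idx], y[idx]))
        return self

    def predict(self, X):
        # Aggregate by averaging the base learners' predictions.
        return np.mean([t.predict(X) for t in self.trees], axis=0)


def make_toy_data(n, seed):
    # Piecewise-linear target whose slope depends on the sign of feature 0.
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, 3))
    y = np.where(X[:, 0] > 0, 2.0 * X[:, 1], -3.0 * X[:, 1])
    return X, y + 0.1 * rng.standard_normal(n)
```

A forest would then be fitted with `ToyForest(n_trees=20).fit(X, y)` and queried with `.predict(X)`. Because every tree sees a different bootstrap sample and a different subset of candidate splitting features, the trees make partly different errors, which is what averaging exploits.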

This repo contains:

  • The full project report, containing:
    • A study on relevant literature,
    • A redesign of GASOPE,
    • Conceptualisation of Model Tree Forest (MTF),
    • Visualisation and evaluation of MTF tuning, and
    • Statistical tests to compare MTF against other state-of-the-art regression techniques.
  • All classes necessary to implement MTF, see rsc/PyGasope/PyForest.py for the main class.
  • Raw results of comparing GASOPE, MT and MTF against other state-of-the-art regression techniques.