Adaboost and its application using classification trees

Show simple item record

dc.contributor.advisor Kanfer, F.H.J. (Frans)
dc.contributor.coadvisor Millard, Sollie M.
dc.contributor.postgraduate Vithal, Nishay
dc.date.accessioned 2021-04-06T07:22:10Z
dc.date.available 2021-04-06T07:22:10Z
dc.date.created 2017/02/10
dc.date.issued 2013
dc.description Dissertation (MSc)--University of Pretoria, 2013.
dc.description.abstract This mini-dissertation seeks to provide the reader with an understanding of one of the most popular boosting methods in use today, called Adaboost, and its first extension, Adaboost.M1. Boosting, as the name suggests, is an ensemble machine learning method created to improve or "boost" prediction accuracy via repeated Monte Carlo-type simulations. Because the method can be applied over any learning algorithm, in this dissertation we use decision trees, or more specifically classification trees constructed by the CART method, as the base predictor. The reasons for boosting classification trees include the learning algorithm's lack of accuracy when applied on a stand-alone basis in many settings, its practical real-world application, and the ability of classification trees to perform natural internal feature selection. The core topics covered include where the Adaboost method arose from, how and why it works, possible issues with the method, and examples using classification trees as the base predictor to demonstrate and assess the method's performance. Although no formal mathematical derivation was provided at the time the method was created, a statistical justification was put forward several years later which explained Adaboost in terms of well-known additive modelling minimizing a specific exponential loss function or criterion. This justification is provided along with real and simulated examples demonstrating Adaboost's performance using two types of classification trees, i.e. stumps (classification trees with two terminal nodes) and optimized or pruned full trees. What is shown empirically is that, when boosting tree stumps, the performance enhancements achieved by Adaboost in many cases meet or exceed those of single or boosted larger tree structures. This finding has benefits such as simplified model structures and lower computational time.
Lastly, we provide a cursory review of new developments within the field of boosting, such as margin theory, which seeks to explain the method's seemingly mysterious test- and training-error behaviour; optimized tree-boosting procedures such as gradient boosting; and combined ensemble methods using both bagging and boosting.
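As a rough illustration of the procedure the abstract describes (a sketch, not code from the dissertation itself), the two-class Adaboost algorithm with CART stumps as the base predictor can be written in a few lines. The synthetic dataset and the use of scikit-learn's `DecisionTreeClassifier` are assumptions made purely for demonstration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical two-class dataset; labels recoded to {-1, +1}.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y = np.where(y == 0, -1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

M = 100                      # number of boosting rounds
n = len(y_tr)
w = np.full(n, 1.0 / n)      # uniform initial observation weights
stumps, alphas = [], []

for m in range(M):
    # Fit a stump (a classification tree with two terminal nodes)
    # to the current weighted version of the training sample.
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X_tr, y_tr, sample_weight=w)
    pred = stump.predict(X_tr)

    # Weighted misclassification rate of this stump.
    err = np.sum(w * (pred != y_tr)) / np.sum(w)
    if err >= 0.5:           # no better than random: stop boosting
        break
    err = max(err, 1e-10)    # avoid log(0) for a perfect stump

    # Vote weight: accurate stumps get a larger say in the ensemble.
    alpha = 0.5 * np.log((1 - err) / err)
    stumps.append(stump)
    alphas.append(alpha)

    # Up-weight misclassified observations, down-weight correct ones,
    # then renormalize so the weights again sum to one.
    w *= np.exp(-alpha * y_tr * pred)
    w /= w.sum()

# Final classifier: sign of the weighted vote over all stumps.
def boosted_predict(X):
    votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(votes)

acc = np.mean(boosted_predict(X_te) == y_te)
```

The additive-modelling view mentioned in the abstract is visible here: each round adds one weighted stump to a growing additive expansion, and the weight-update rule is exactly stagewise minimization of the exponential loss.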
dc.description.availability Unrestricted
dc.description.degree MSc
dc.description.department Statistics
dc.identifier.citation Vithal, N 2013, Adaboost and its application using classification trees, MSc Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79206>
dc.identifier.other E14/4/555
dc.identifier.uri http://hdl.handle.net/2263/79206
dc.language.iso en
dc.publisher University of Pretoria
dc.rights © 2020 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject UCTD
dc.title Adaboost and its application using classification trees
dc.type Dissertation

