Adaboost and its application using classification trees

dc.contributor.advisor: Kanfer, F.H.J. (Frans)
dc.contributor.coadvisor: Millard, Sollie M.
dc.contributor.email: nvithal@iburst.co.za
dc.contributor.postgraduate: Vithal, Nishay
dc.date.accessioned: 2021-04-06T07:22:10Z
dc.date.available: 2021-04-06T07:22:10Z
dc.date.created: 2017/02/10
dc.date.issued: 2013
dc.description: Dissertation (MSc)--University of Pretoria, 2013.
dc.description.abstract: This mini-dissertation seeks to provide the reader with an understanding of one of the most popular boosting methods in use today, Adaboost, and of its first extension, Adaboost.M1. Boosting, as the name suggests, is an ensemble machine learning method created to improve or "boost" prediction accuracy via repeated Monte Carlo-type simulations. Because the method can be applied over any learning algorithm, this dissertation uses decision trees, more specifically classification trees constructed by the CART method, as the base predictor. The reasons for boosting classification trees include the learning algorithm's lack of accuracy when applied on a stand-alone basis in many settings, its practical real-world application, and the ability of classification trees to perform natural internal feature selection. The core topics covered include where the Adaboost method arose from, how and why it works, possible issues with the method, and examples using classification trees as the base predictor to demonstrate and assess the method's performance. Although no formal mathematical derivation of the method was provided at the time it was created, a statistical justification was put forward several years later, explaining Adaboost in terms of well-known additive modelling minimizing a specific exponential loss function or criterion. This justification is provided along with real and simulated examples demonstrating Adaboost's performance using two types of classification trees, namely stumps (classification trees with two terminal nodes) and optimized or pruned full trees. What is shown empirically is that, when boosting tree stumps, the performance enhancement achieved by Adaboost in many cases meets or exceeds that of single or boosted larger tree structures, a finding with benefits such as simpler model structures and lower computational time. Lastly, we provide a cursory review of newer developments within the field of boosting, such as margin theory, which seeks to explain the method's seemingly mysterious test- and training-error behaviour; optimized tree-boosting procedures such as gradient boosting; and combined ensemble methods using both bagging and boosting. (A minimal code sketch of the boosting-stumps procedure appears below, after the metadata record.)
dc.description.availability: Unrestricted
dc.description.degree: MSc
dc.description.department: Statistics
dc.identifier.citation: Vithal, N 2013, Adaboost and its application using classification trees, MSc Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79206>
dc.identifier.other: E14/4/555
dc.identifier.uri: http://hdl.handle.net/2263/79206
dc.language.iso: en
dc.publisher: University of Pretoria
dc.rights: © 2020 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject: UCTD
dc.title: Adaboost and its application using classification trees
dc.type: Dissertation
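
As a hedged illustration of the procedure the abstract describes, the following is a minimal sketch of discrete (two-class) Adaboost boosting tree stumps, assuming Python with numpy and scikit-learn; the synthetic dataset, the 100 boosting rounds, and every parameter value are illustrative assumptions, not the dissertation's actual experiments. The stump weight alpha = 0.5 * ln((1 - err) / err) is the quantity that the later statistical justification derives by stagewise minimization of the exponential loss exp(-y * F(x)) in an additive model.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic two-class data; Adaboost is formulated on labels in {-1, +1}
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
y = np.where(y == 1, 1, -1)

n = len(y)
w = np.full(n, 1.0 / n)          # start with uniform observation weights
stumps, alphas = [], []

for m in range(100):             # M = 100 boosting rounds (illustrative)
    stump = DecisionTreeClassifier(max_depth=1)   # a two-terminal-node tree
    stump.fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = np.sum(w * (pred != y)) / np.sum(w)     # weighted training error
    if err <= 0.0 or err >= 0.5:  # stop on a perfect stump or one no better than guessing
        break
    alpha = 0.5 * np.log((1 - err) / err)         # confidence weight of this stump
    w = w * np.exp(-alpha * y * pred)             # up-weight misclassified observations
    w = w / w.sum()                               # renormalize to a distribution
    stumps.append(stump)
    alphas.append(alpha)

# Final classifier: sign of the alpha-weighted vote over all stumps
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("boosted training error:", np.mean(np.sign(F) != y))

Each round the weights of misclassified observations grow, so the next stump concentrates on the cases the current ensemble gets wrong; this reweighting is what lets a committee of two-node trees match or exceed a single large tree, the empirical finding the abstract reports.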

Files

Original bundle

Name: Vithal_Adaboost_2013.pdf
Size: 4.2 MB
Format: Adobe Portable Document Format