Adaboost and its application using classification trees

Show simple item record

dc.contributor.advisor Kanfer, F.H.J. (Frans)
dc.contributor.coadvisor Millard, Sollie M.
dc.contributor.postgraduate Vithal, Nishay
dc.date.accessioned 2021-04-06T07:22:10Z
dc.date.available 2021-04-06T07:22:10Z
dc.date.created 2017/02/10
dc.date.issued 2013
dc.description Dissertation (MSc)--University of Pretoria, 2013.
dc.description.abstract This mini-dissertation seeks to provide the reader with an understanding of one of the most popular boosting methods in use today, called Adaboost, and its first extension, Adaboost.M1. Boosting, as the name suggests, is an ensemble machine learning method created to improve or "boost" prediction accuracy via repeated Monte Carlo-type simulations. Because the method can be applied over any learning algorithm, in this dissertation we use decision trees, or more specifically classification trees constructed by the CART method, as the base predictor. The reasons for boosting classification trees include the learning algorithm's lack of accuracy when applied on a stand-alone basis in many settings, its practical real-world application, and the ability of classification trees to perform natural internal feature selection. The core topics covered include where the Adaboost method arose from, how and why it works, possible issues with the method, and examples using classification trees as the base predictor to demonstrate and assess the method's performance. Although no formal mathematical derivation was provided at the time the method was created, a statistical justification was put forward several years later which explained Adaboost in terms of well-known additive modelling minimizing a specific exponential loss function or criterion. This justification is provided along with real and simulated examples demonstrating Adaboost's performance using two types of classification trees, i.e. stumps (classification trees with two terminal nodes) and optimized or pruned full trees. What is shown empirically is that, when boosting tree stumps, the performance enhancements achieved by Adaboost in many cases meet or exceed those of single or boosted larger tree structures. This finding has benefits such as simplified model structures and lower computational time.
Lastly, we provide a cursory review of new developments within the field of boosting, such as margin theory, which seeks to explain the method's seemingly mysterious test- and training-error behaviour; optimized tree-boosting procedures such as gradient boosting; and combined ensemble methods using both bagging and boosting.
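As a rough illustration of the procedure the abstract describes (a sketch, not code from the dissertation itself), the two-class Adaboost algorithm with CART stumps as the base predictor can be written in a few lines. The synthetic dataset and the use of scikit-learn's `DecisionTreeClassifier` are assumptions made purely for demonstration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical two-class dataset; labels recoded to {-1, +1}.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
y = np.where(y == 0, -1, 1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

M = 100                      # number of boosting rounds
n = len(y_tr)
w = np.full(n, 1.0 / n)      # uniform initial observation weights
stumps, alphas = [], []

for m in range(M):
    # Fit a stump (a classification tree with two terminal nodes)
    # to the current weighted version of the training sample.
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X_tr, y_tr, sample_weight=w)
    pred = stump.predict(X_tr)

    # Weighted misclassification rate of this stump.
    err = np.sum(w * (pred != y_tr)) / np.sum(w)
    if err >= 0.5:           # no better than random: stop boosting
        break
    err = max(err, 1e-10)    # avoid log(0) for a perfect stump

    # Vote weight: accurate stumps get a larger say in the ensemble.
    alpha = 0.5 * np.log((1 - err) / err)
    stumps.append(stump)
    alphas.append(alpha)

    # Up-weight misclassified observations, down-weight correct ones,
    # then renormalize so the weights again sum to one.
    w *= np.exp(-alpha * y_tr * pred)
    w /= w.sum()

# Final classifier: sign of the weighted vote over all stumps.
def boosted_predict(X):
    votes = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(votes)

acc = np.mean(boosted_predict(X_te) == y_te)
```

The additive-modelling view mentioned in the abstract is visible here: each round adds one weighted stump to a growing additive expansion, and the weight-update rule is exactly stagewise minimization of the exponential loss.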
dc.description.availability Unrestricted
dc.description.degree MSc
dc.description.department Statistics
dc.identifier.citation Vithal, N 2013, Adaboost and its application using classification trees, MSc Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/79206>
dc.identifier.other E14/4/555
dc.identifier.uri http://hdl.handle.net/2263/79206
dc.language.iso en
dc.publisher University of Pretoria
dc.rights © 2020 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject UCTD
dc.title Adaboost and its application using classification trees
dc.type Dissertation

