Abstract:
This mini-dissertation seeks to provide the reader with an understanding of one
of the most popular boosting methods in use today, Adaboost, and of its first extension,
Adaboost.M1. Boosting, as the name suggests, is an ensemble machine learning
method created to improve or "boost" prediction accuracy via repeated Monte-Carlo-type
simulations. Because the method is flexible enough to be applied over any learning
algorithm, in this dissertation we use decision trees, or more
specifically classification trees constructed by the CART method, as the base predictor.
The reasons for boosting classification trees include the learning algorithm's lack of accuracy
when applied on a stand-alone basis in many settings, its wide practical real-world
application, and its ability to perform natural internal feature
selection. The core topics covered include the origins of the Adaboost method,
how and why it works, possible issues with the method, and examples using classification
trees as the base predictor to demonstrate and assess the method's performance.
Although no formal mathematical derivation of the method was provided at the time
of its creation, a statistical justification was put forward several years later
which explained Adaboost in terms of well-known additive modelling minimizing
a specific exponential loss function or criterion. This justification is provided along with real and simulated examples demonstrating Adaboost's performance using two
types of classification trees, namely stumps (classification trees with two terminal nodes)
and optimized or pruned full trees.
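For reference, the criterion in that justification is commonly written, using the standard notation with class labels $y \in \{-1, +1\}$ and an additive classifier score $f(x)$ (notation assumed here, as it is not defined elsewhere in this abstract), as the exponential loss
$$L\big(y, f(x)\big) = e^{-y\, f(x)},$$
which Adaboost can be viewed as minimizing in a stagewise additive fashion.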
What is shown empirically is that when boosting
tree stumps, the performance enhancements achieved by Adaboost in many cases
meet or exceed those of single or boosted larger tree structures. This finding has benefits
such as simpler model structures and lower computational time. Lastly, we provide
a cursory review of new developments within the field of boosting, such as margin theory,
which seeks to explain the method's seemingly mysterious test
and training error performance; optimized tree boosting procedures such as gradient
boosted methods; and ensemble methods that combine bagging and boosting.