Determining the correct sample size is of central importance in study design. Large samples yield classifiers or parameter estimates with greater precision; conversely, samples that are too small yield unreliable results. Fixed sample size methods, in which the sample size is determined by a specified level of error between the estimated parameter and the population value, or by a confidence level associated with the estimate, have been developed and are widely available. These methods are useful when gathering the data involves little or no cost, whether financial or in time. Sequential sampling procedures, by contrast, were developed specifically to obtain a classifier or parameter estimate that is as accurate as the researcher deems necessary, while sampling the fewest observations required to reach that level of accuracy.
This dissertation discusses a sequential procedure, derived using Martingale Limit Theory, that was developed to train a classifier with the minimum number of observations needed to ensure, with sufficiently high probability, that the next observation sampled has a sufficiently low probability of being misclassified. Several classification methods are examined under multiple combinations of parameter settings, and the sequential procedure is also applied to microarray data. The advantages and shortcomings of the procedure are identified and discussed.
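To give a flavour of the idea, the following is a minimal sketch of a generic sequential stopping rule, not the dissertation's martingale-based procedure: observations are sampled one at a time, a simple nearest-centroid classifier is updated online, and sampling stops once a Hoeffding confidence bound guarantees, with probability at least 1 - delta, that the misclassification rate is below a target eps. The two-Gaussian data generator, the classifier, and the values of eps and delta are all illustrative assumptions.

```python
import random
import math

random.seed(0)

def draw():
    # Hypothetical data source: two 1-D Gaussian classes (illustration only)
    label = random.random() < 0.5
    x = random.gauss(2.0 if label else -2.0, 1.0)
    return x, label

# Online nearest-centroid classifier state
n_pos = n_neg = 0
sum_pos = sum_neg = 0.0
errors = 0   # prequential (predict-then-update) error count
n = 0
eps = 0.1    # target misclassification rate (assumed)
delta = 0.05 # allowed probability of exceeding eps (assumed)

while True:
    x, y = draw()
    n += 1
    # Predict with the current centroids before updating them;
    # skipped until both classes have been observed at least once.
    if n_pos and n_neg:
        pred = abs(x - sum_pos / n_pos) < abs(x - sum_neg / n_neg)
        errors += (pred != y)
    if y:
        n_pos += 1
        sum_pos += x
    else:
        n_neg += 1
        sum_neg += x
    # Hoeffding upper confidence bound on the true error rate
    if n >= 10:
        bound = errors / n + math.sqrt(math.log(1 / delta) / (2 * n))
        if bound <= eps:
            break

print(f"stopped after n = {n} observations, observed error rate = {errors / n:.3f}")
```

The stopping rule trades sample size against accuracy directly: tightening eps or delta forces the loop to run longer, which mirrors the way a sequential procedure lets the researcher dictate the accuracy of the trained classifier.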
This dissertation also proposes a new sequential procedure that trains the classifier until the Bayes error can be estimated accurately with high probability. The new procedure retains all of the advantages of the earlier method while addressing its most serious shortcoming. Ultimately, it enables the researcher to specify how accurate the classifier should be, providing greater control over the trained classifier.
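For reference, the Bayes error targeted here is the minimum achievable misclassification probability over all classifiers. For a binary problem it can be written in the standard form (notation assumed, not taken from the dissertation):

```latex
L^{*} \;=\; \mathbb{E}\bigl[\min\{\eta(X),\, 1 - \eta(X)\}\bigr],
\qquad \eta(x) \;=\; P(Y = 1 \mid X = x),
```

so a procedure that estimates \(L^{*}\) accurately gives the researcher a benchmark for how close the trained classifier is to the best possible performance.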