Minimum sample size for estimating the Bayes error at a predetermined level

Potgieter, Ryno

dc.contributor.advisor	Kanfer, F.H.J. (Frans)
dc.contributor.coadvisor	Millard, Sollie M.
dc.contributor.postgraduate	Potgieter, Ryno
dc.date.accessioned	2014-02-13T13:06:50Z
dc.date.available	2014-02-13T13:06:50Z
dc.date.created	2014
dc.date.issued	2013
dc.description	Dissertation (MSc)--University of Pretoria, 2013.	en_US
dc.description.abstract	Determining the correct sample size is of utmost importance in study design. Large samples yield classifiers or parameter estimates with more precision; conversely, samples that are too small yield unreliable results. Fixed sample size methods, in which the sample size is determined by a specified level of error between the estimate and the population value, or by a confidence level associated with the estimate, have been developed and are available. These methods are extremely useful when gathering the data involves little or no cost (consequences of action), whether financial or in time. Alternatively, sequential sampling procedures have been developed specifically to obtain a classifier or parameter estimate that is as accurate as the researcher deems necessary, while sampling the smallest number of observations required to reach the specified level of accuracy. This dissertation discusses a sequential procedure, derived using martingale limit theory, that was developed to train a classifier with the minimum number of observations needed to ensure, with sufficiently high probability, that the next observation sampled has a sufficiently low probability of being misclassified. Various classification methods are discussed and tested under multiple combinations of parameters. Additionally, the sequential procedure is tested on microarray data. Various advantages and shortcomings of the sequential procedure are pointed out and discussed. This dissertation also proposes a new sequential procedure that trains the classifier to the extent needed to estimate the Bayes error accurately with high probability. The new sequential procedure retains all of the advantages of the previous method while addressing its most serious shortcoming. Ultimately, the sequential procedure developed enables the researcher to dictate how accurate the classifier should be and provides more control over the trained classifier.	en_US
dc.description.availability	Unrestricted	en_US
dc.description.department	Statistics	en_US
dc.identifier.citation	Potgieter, R 2013, Minimum sample size for estimating the Bayes error at a predetermined level, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/33479>
dc.identifier.other	C14/4/166/gm
dc.identifier.uri	http://hdl.handle.net/2263/33479
dc.language.iso	en	en_US
dc.publisher	University of Pretoria	en_ZA
dc.rights	© 2013 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.	en_US
dc.subject	Sequential Analysis	en_US
dc.subject	UCTD	en_US
dc.title	Minimum sample size for estimating the Bayes error at a predetermined level	en_US
dc.type	Dissertation	en_US