Minimum sample size for estimating the Bayes error at a predetermined level

dc.contributor.advisor: Kanfer, F.H.J. (Frans)
dc.contributor.coadvisor: Millard, Sollie M.
dc.contributor.postgraduate: Potgieter, Ryno
dc.date.accessioned: 2014-02-13T13:06:50Z
dc.date.available: 2014-02-13T13:06:50Z
dc.date.created: 2014
dc.date.issued: 2013
dc.description: Dissertation (MSc)--University of Pretoria, 2013.
dc.description.abstract: Determining the correct sample size is of utmost importance in study design. Large samples yield classifiers or parameter estimates with greater precision; conversely, samples that are too small yield unreliable results. Fixed sample size methods, determined by a specified level of error between the estimate and the population value, or by a confidence level associated with the estimate, have been developed and are available. These methods are extremely useful when there is little or no cost, whether financial or in time, involved in gathering the data. Alternatively, sequential sampling procedures have been developed specifically to obtain a classifier or parameter estimate that is as accurate as the researcher deems necessary, while sampling the smallest number of observations required to reach the specified level of accuracy. This dissertation discusses a sequential procedure, derived using Martingale Limit Theory, which was developed to train a classifier with the minimum number of observations needed to ensure, with a high enough probability, that the next observation sampled has a low enough probability of being misclassified. Various classification methods are discussed and tested over multiple combinations of parameters. Additionally, the sequential procedure is tested on microarray data, and its advantages and shortcomings are pointed out and discussed. This dissertation also proposes a new sequential procedure that trains the classifier far enough to estimate the Bayes error accurately with high probability. The new procedure retains all of the advantages of the previous method while addressing its most serious shortcoming. Ultimately, the procedure developed enables the researcher to dictate how accurate the classifier should be and provides more control over the trained classifier.
dc.description.availability: Unrestricted
dc.description.department: Statistics
dc.identifier.citation: Potgieter, R 2013, Minimum sample size for estimating the Bayes error at a predetermined level, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/33479>
dc.identifier.other: C14/4/166/gm
dc.identifier.uri: http://hdl.handle.net/2263/33479
dc.language.iso: en
dc.publisher: University of Pretoria
dc.rights: © 2013 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject: Sequential Analysis
dc.subject: UCTD
dc.title: Minimum sample size for estimating the Bayes error at a predetermined level
dc.type: Dissertation

Files

Original bundle
Name: Potgieter_Minimum_2013.pdf
Size: 5.53 MB
Format: Adobe Portable Document Format
Description: Dissertation

License bundle
Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed upon to submission