A comparative study of sample selection methods for classification

dc.contributor.author: Lutu, Patricia Elizabeth Nalwoga
dc.contributor.author: Engelbrecht, Andries P.
dc.date.accessioned: 2008-04-08T12:45:29Z
dc.date.available: 2008-04-08T12:45:29Z
dc.date.issued: 2006-06
dc.description.abstract: Sampling of large datasets for data mining is important for at least two reasons. The processing of large amounts of data results in increased computational complexity. The cost of this additional complexity may not be justifiable. On the other hand, the use of small samples results in fast and efficient computation for data mining algorithms. Statistical methods for obtaining sufficient samples from datasets for classification problems are discussed in this paper. Results are presented for an empirical study based on the use of sequential random sampling and sample evaluation using univariate hypothesis testing and an information theoretic measure. Comparisons are made between theoretical and empirical estimates. [en]
dc.format.extent: 342371 bytes
dc.format.mimetype: application/pdf
dc.identifier.citation: Lutu, PEN & Engelbrecht, AP 2006, 'A comparative study of sample selection methods for classification', South African Computer Journal, issue 36, pp. 69-85, [http://www.journals.co.za/ej/ejour_comp.html] [en]
dc.identifier.issn: 1015-7999
dc.identifier.uri: http://hdl.handle.net/2263/4904
dc.language.iso: en [en]
dc.publisher: Computer Society of South Africa [en]
dc.rights: Computer Society of South Africa [en]
dc.subject: Dataset sampling [en]
dc.subject: Data analysis [en]
dc.subject: Machine learning [en]
dc.subject: Classification [en]
dc.subject: Information measures [en]
dc.subject.lcsh: Sampling
dc.subject.lcsh: Information measurement
dc.subject.lcsh: Machine learning
dc.title: A comparative study of sample selection methods for classification [en]
dc.type: Article [en]

Files

Original bundle

Name: Lutu_Comparative(2006).pdf
Size: 334.35 KB
Format: Adobe Portable Document Format
