Comparing logistic regression methods for completely separated and quasi-separated data
Publisher
University of Pretoria
Abstract
Separation is sometimes observed in the data when a model is fitted to a dichotomous dependent variable. Separation occurs when one or more of the independent variables can perfectly predict the binary outcome, and it arises primarily in small samples. The data from a logistic regression can be classified into three mutually exclusive and exhaustive classes: complete separation, quasi-complete separation and overlap. Separation (either complete or quasi-complete) gives rise to a number of problems, since it implies infinite or zero maximum likelihood estimates, which are idealistic and do not occur in practice. Part I of this dissertation discusses the theory behind the logistic regression model, the definition of separation and different methods for dealing with separation. The methods focused on are exact logistic regression, Firth's method, which penalises the likelihood function, and hidden logistic regression. In Part II, the three aforementioned methods are compared with one another by applying each method to data sets that exhibit either complete or quasi-complete separation, for different sample sizes and different covariate types.
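The three classes named in the abstract can be illustrated with a minimal sketch for the simplest case of a single covariate plus an intercept. This is not code from the dissertation: the function names `classify_separation` and `log_likelihood` and the toy data are hypothetical, and the threshold rule assumes one non-constant covariate. The second function shows why separation is a problem: under complete separation the log-likelihood keeps increasing as the slope grows, so no finite maximum likelihood estimate exists.

```python
import math


def classify_separation(x, y):
    """Classify single-covariate binary data as 'complete',
    'quasi-complete' or 'overlap' (illustrative helper, assumes x is not constant)."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    # Complete separation: some threshold strictly splits the two classes.
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    # Quasi-complete separation: the classes meet only at a shared boundary value.
    if max(x0) == min(x1) or max(x1) == min(x0):
        return "quasi-complete"
    return "overlap"


def log_likelihood(b0, b1, x, y):
    """Logistic regression log-likelihood for intercept b0 and slope b1."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        # log(p) = -log(1 + exp(-eta)); log(1 - p) = -log(1 + exp(eta))
        total -= math.log1p(math.exp(-eta if yi == 1 else eta))
    return total


# Toy data (assumed for illustration): every y = 1 lies above x = 3.5.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 0, 1, 1, 1]
print(classify_separation(x, y))

# Scale the slope while keeping the decision boundary at x = 3.5;
# the log-likelihood rises towards 0 without ever attaining a maximum.
for slope in (1, 10, 100):
    print(slope, log_likelihood(-3.5 * slope, slope, x, y))
```

Methods such as Firth's penalised likelihood avoid this behaviour by modifying the function being maximised so that a finite maximiser exists; the sketch only demonstrates the problem, not any of the remedies compared in the dissertation.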
Description
Dissertation (MSc)--University of Pretoria, 2013.
Keywords
Regression methods, UCTD
Citation
Botes, M 2013, Comparing logistic regression methods for completely separated and quasi-separated data, MSc dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/33314>