Abstract:
The most common analysis for binary data is the generalized linear model (GLM) with
a binomial or Bernoulli distribution and a logit, probit, complementary log-log,
or other link function. However, such analyses violate the independence assumption
if the binary data are measured repeatedly over time on the same subject or at the same site.
Failure to take the correlation into account can lead to incorrect estimation of the regression
parameters and to less efficient estimates, particularly when the correlations are large. Therefore,
to obtain the most efficient estimates that are also unbiased, methods that incorporate
correlations (McCullagh and Nelder, 1989) should be used. Two of the statistical methodologies
that can be used to account for this correlation in longitudinal data are
generalized linear mixed models (GLMMs) and generalized estimating equations (GEE).
The GLMM method is based on extending the fixed effects GLM to include random effects
and covariance patterns. Unlike the GLM and GLMM methods, the GEE method is based
on quasi-likelihood theory and makes no assumption about the distribution of the response
observations (Liang and Zeger, 1986). The main objective of the study is to investigate the
statistical properties and limitations of these three approaches (GLM, GLMM and GEE)
for analyzing longitudinal data, using binary data from an entomology study. The
results reaffirm the point made by Liang and Zeger (1986) that misspecification of the working
correlation in the GEE approach still gives consistent regression parameter estimates. Further, the
results of this study suggest that, even with small correlation, ignoring random effects in
a binary model can lead to inconsistent estimation.