Contaminated models of reparameterised versions of the Dirichlet-multinomial distribution

Publisher

University of Pretoria

Abstract

The Dirichlet-multinomial (DM) distribution is widely used to model multivariate count data and has been applied in diverse areas such as microbiome studies, genetics, and ecological analysis. Despite its wide use, the distribution lacks easily interpretable parameters and cannot account for outliers. In this study, we propose a new perspective on the DM distribution: we reparameterise it and use the reparameterised forms to develop contaminated versions. Two reparameterisations are considered: the first in terms of the mode and a parameter referred to as the pseudo-variance, and the second in terms of the mean and another pseudo-variance parameter. These reparameterisations improve interpretability and allow the construction of contaminated models that are robust to outliers. We derive properties of the proposed models, including their probability mass functions and moments. Simulation studies evaluate the models under varying scenarios, comparing estimation accuracy, bias, and computational performance. The relevance of the proposed models is illustrated through an application to microbiome data. The developments from this study enhance the flexibility of the DM distribution and reinforce its usefulness for analysing modern complex datasets in the biological and statistical sciences.
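
As a rough illustration of the ideas in the abstract, the Python sketch below evaluates the standard DM probability mass function, rewrites it in terms of a mean proportion vector, and mixes two such components into a contaminated model. The names `pi`, `s`, `delta`, and `eta` are assumptions introduced here. In particular, `s` is the usual concentration (the sum of the Dirichlet parameters) standing in for a dispersion parameter; it is not the pseudo-variance defined in the dissertation. The two-component mixture, in which a small proportion `delta` of observations comes from a component with the same mean but a smaller concentration `s / eta` (with `eta > 1`), is one common way to build contaminated models and may differ from the construction used in this work.

```python
from math import exp, lgamma


def dm_logpmf(x, alpha):
    """Log-pmf of the Dirichlet-multinomial for counts x and parameters alpha."""
    n = sum(x)
    a0 = sum(alpha)
    # log of n! * Gamma(a0) / Gamma(n + a0)
    out = lgamma(n + 1) + lgamma(a0) - lgamma(n + a0)
    # log of the product over categories of Gamma(x_j + a_j) / (x_j! * Gamma(a_j))
    for xj, aj in zip(x, alpha):
        out += lgamma(xj + aj) - lgamma(xj + 1) - lgamma(aj)
    return out


def dm_logpmf_mean(x, pi, s):
    """Mean-based form (assumed): alpha_j = pi_j * s, so that E[X_j] = n * pi_j."""
    return dm_logpmf(x, [p * s for p in pi])


def contaminated_dm_pmf(x, pi, s, delta, eta):
    """Assumed contaminated model: two DM components sharing the mean pi.

    With probability 1 - delta the counts come from the reference component
    (concentration s); with probability delta they come from a more dispersed
    component (concentration s / eta, eta > 1) intended to absorb outliers.
    """
    good = exp(dm_logpmf_mean(x, pi, s))
    bad = exp(dm_logpmf_mean(x, pi, s / eta))
    return (1 - delta) * good + delta * bad


# Hypothetical example: three taxa, 20 reads, 5% contamination.
p = contaminated_dm_pmf([12, 5, 3], [0.5, 0.3, 0.2], s=10.0, delta=0.05, eta=5.0)
print(p)
```

Working on the log scale with `lgamma` avoids overflow in the gamma functions for large counts; the mixture is formed only after exponentiating each component.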

Description

Mini Dissertation

Keywords

Contaminated Models, Dirichlet-multinomial, Outliers, Overdispersion, Reparameterisation

Sustainable Development Goals

None
