Abstract:
INTRODUCTION : The identification of classes of nutritionally similar food items is
important for creating food exchange lists to meet health requirements and for
informing nutrition guidelines and campaigns. Cluster analysis methods can
assign food items to classes based on the similarity of their nutrient contents.
Finite mixture models use probabilistic classification with the advantage of taking
into account the uncertainty of class thresholds.
METHODS : This paper uses univariate Gaussian mixture models to determine the
probabilistic classification of food items in the South African Food Composition
Database (SAFCDB) based on nutrient content.
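As an illustration of the probabilistic classification described above, the following is a minimal sketch of a univariate Gaussian mixture model fitted by expectation-maximisation. It is not the authors' implementation: the function names (fit_gmm_1d, posterior), the quantile-based initialisation, the fixed number of classes k and the synthetic "nutrient content" values are all assumptions made for the example, and no SAFCDB data are used.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, k, n_iter=200, min_var=1e-6):
    """Fit a k-class univariate Gaussian mixture by EM (illustrative sketch)."""
    xs = sorted(values)
    n = len(xs)
    # Assumed initialisation: means at evenly spaced quantiles,
    # a shared overall variance, and equal class weights.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in xs) / n, min_var)
    variances = [overall_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior class probabilities for every item.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and variance of each class.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty class unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                min_var,
            )
    return weights, means, variances


def posterior(x, weights, means, variances):
    """Probability that an item with content x belongs to each class."""
    dens = [w * normal_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    total = sum(dens) or 1e-300
    return [d / total for d in dens]


# Synthetic contents (arbitrary units per 100 g): a "low" and a "high" group.
random.seed(0)
contents = ([random.gauss(5, 1) for _ in range(200)]
            + [random.gauss(50, 5) for _ in range(200)])

weights, means, variances = fit_gmm_1d(contents, k=2)
print("class means:", [round(m, 2) for m in means])
print("P(class | x = 20):", [round(p, 3) for p in posterior(20, weights, means, variances)])
```

Rather than assigning each item to a single class by a hard threshold, posterior returns a probability for every class, so an item lying between two class means receives a graded membership. In this sketch the number of classes is fixed by hand.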
RESULTS : Classifying food items by animal protein, fatty acid, available carbohydrate,
total fibre, sodium, iron, vitamin A, thiamin and riboflavin contents produced
data-driven classes that differed in their means and variability and that could
be clearly ranked on a scale from low to high nutrient content. Classifying food items
by their sodium content resulted in five classes with the class means ranging
from 1.57 to 706.27 mg per 100 g. Four classes were identified based on available
carbohydrate content with the highest carbohydrate class having a mean content
of 59.15 g per 100 g. Food items clustered into two classes based on their
fatty acid content. The high-iron class, one of three classes identified for iron,
had a mean content of 1.46 mg per 100 g. Classes containing nutrient-rich
food items that exhibited extreme nutrient values were also identified for
several vitamins and minerals.
DISCUSSION : Overlap between classes was evident, which supports the use of
probabilistic classification methods. Food items in each of the identified classes
were comparable to allowed food lists developed for therapeutic diets. This data-driven
ranking of nutritionally similar classes could be considered when planning diets for medical conditions and for individuals with dietary restrictions.