Prediction of recent HIV-1 infections using Shannon entropy analysis of HIV-1 group-specific antigen protein sequence

Abstract

BACKGROUND : Avidity assays often misclassify chronic HIV-1 infection as recent HIV-1 infection, a problem quantified by the false recency rate (FRR), especially in participants on antiretroviral therapy. The aim of this study was to use Shannon entropy to evaluate HIV-1 group-specific antigen (Gag) sequence diversity for the prediction of recent HIV-1 infections.
METHODS : This was a retrospective study that characterised the complete HIV-1 Gag using Sanger sequences obtained from participants with confirmed recent or chronic HIV-1 infection. Shannon entropy was calculated for the entire HIV-1 Gag amino acid (aa) sequence (501 aa), and sliding window analysis was performed at intervals of 100 aa. The sequences were then searched for aa sites at which the distribution of mutations differed between recent and chronic HIV-1 infection. Reference sequences were obtained from GenBank and the Los Alamos HIV database to verify the findings obtained from the study sequences.
RESULTS : Forty-seven participants with a mean age of 28.7 years (18–44) were enrolled, and fourteen (30%) of them had recent HIV-1 infection. Shannon entropy analysis showed significantly higher aa diversity in chronic HIV-1 infection than in recent HIV-1 infection (p = 0.0003). Sliding window analysis identified four aa positions (S54, E55, I256 and S451) whose distribution patterns differed between recent and chronic HIV-1 infection; however, the difference was statistically significant for only three of them (p = 0.094, 0.027, 0.027 and 0.045, respectively). The performance of these informative sites for detection of recent HIV-1 infection in the study sequences ranged from 71% to 86%, but they had a high FRR, ranging from 39% to 52%. Similar performance was observed in the reference sequences. Combining some of the informative aa sites reduced the FRR in the study sequences to below 24%.
CONCLUSIONS : Our data show that a Gag-based molecular strategy can be used to detect recent HIV-1 infections where Gag sequences are available. However, the results would have to be interpreted with caution due to an association with a high FRR. Further studies are needed to develop a molecular-based strategy with better performance for detection of recent HIV-1 infections.
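
The entropy calculation described in the methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, the use of base-2 logarithms, the treatment of gap characters as ordinary symbols, and the reading of "intervals of 100 aa" as consecutive non-overlapping windows are all assumptions made for illustration.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (bits) of the residues observed at one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    # H = sum over residues of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(seqs):
    """Per-site entropy for a list of aligned, equal-length protein sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [site_entropy(col) for col in zip(*seqs)]


def window_means(entropies, size=100):
    """Mean entropy over consecutive, non-overlapping windows of `size` sites.

    Assumption: windows do not overlap; the last window may be shorter.
    """
    means = []
    for start in range(0, len(entropies), size):
        chunk = entropies[start:start + size]
        means.append(sum(chunk) / len(chunk))
    return means


# Hypothetical toy alignments (not study data): one conserved, one variable.
recent = ["MGARA", "MGARA", "MGARA"]
chronic = ["MGARA", "MGSRA", "MGTKA"]

recent_h = alignment_entropy(recent)
chronic_h = alignment_entropy(chronic)
```

In this toy example every column of `recent` is conserved, so its per-site entropy is zero throughout, while the variable third and fourth columns of `chronic` carry positive entropy. Under the same assumptions, a 501-site Gag alignment passed to `window_means` with `size=100` would yield five full windows and a final window covering the single remaining site.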

Description

DATA AVAILABILITY : All data generated or analysed during this study are included in this manuscript.

Keywords

Human immunodeficiency virus (HIV), HIV-1 group-specific antigen (Gag), Shannon entropy, Recent HIV, HIV-1 infection, False recency rate, Chronic HIV-1 infection, Gag diversity

Sustainable Development Goals

SDG-03: Good health and well-being

Citation

ortuin, T.L., Nkone, P., Loubser, S. et al. Prediction of recent HIV-1 infections using Shannon entropy analysis of HIV-1 group-specific antigen protein sequence. Virology Journal 23, 61: 1-11 (2026). https://doi.org/10.1186/s12985-026-03080-x.