Challenges encountered and lessons learned when using a novel anonymised linked dataset of health and social care records for public health intelligence : the Sussex integrated dataset

dc.contributor.author Ford, Elizabeth
dc.contributor.author Tyler, Richard
dc.contributor.author Johnston, Natalie
dc.contributor.author Spencer-Hughes, Vicki
dc.contributor.author Evans, Graham
dc.contributor.author Elsom, Jon
dc.contributor.author Madzvamuse, Anotida
dc.contributor.author Clay, Jacqueline
dc.contributor.author Gilchrist, Kate
dc.contributor.author Rees-Roberts, Melanie
dc.date.accessioned 2024-07-31T05:38:08Z
dc.date.available 2024-07-31T05:38:08Z
dc.date.issued 2023-02-08
dc.description DATA AVAILABILITY STATEMENT : The data in the Sussex Integrated Dataset are currently only available for analysis to employees of joint data controller organisations. These are limited to the member organisations of the Sussex Integrated Care Partnership called “NHS Sussex.” en_US
dc.description.abstract BACKGROUND : In the United Kingdom National Health Service (NHS), digital transformation programmes have resulted in the creation of pseudonymised linked datasets of patient-level medical records across all NHS and social care services. In the Southeast England counties of East and West Sussex, public health intelligence analysts based in local authorities (LAs) aimed to use the newly created “Sussex Integrated Dataset” (SID) for identifying cohorts of patients who are at risk of early onset multiple long-term conditions (MLTCs). Analysts from the LAs were among the first to have access to this new dataset. METHODS : Data access was assured as the analysts were employed within joint data controller organisations and logged into the data via virtual machines following approval of a data access request. Analysts examined the demographics and medical history of patients against multiple external sources, identifying data quality issues and developing methods to establish true values for cases with multiple conflicting entries. Service use was plotted over timelines for individual patients. RESULTS : Early evaluation of the data revealed multiple conflicting within-patient values for age, sex, ethnicity and date of death. This was partially resolved by creating a “demographic milestones” table, capturing demographic details for each patient for each year of the data available in the SID. Older data ( 5 y) was found to be sparse in events and diagnoses. Open-source code lists for defining long-term conditions were poor at identifying the expected number of patients, and bespoke code lists were developed by hand and validated against other sources of data. At the start, the age and sex distributions of patients submitted by GP practices were substantially different from those published by NHS Digital, and errors in data processing were identified and rectified.
CONCLUSIONS : While new NHS linked datasets appear a promising resource for tracking multi-service use, MLTCs and health inequalities, substantial investment in data analysis and data architect time is necessary to ensure high enough quality data for meaningful analysis. Our team made conceptual progress in identifying the skills needed for programming analyses and understanding the types of questions which can be asked and answered reliably in these datasets. en_US
dc.description.department Mathematics and Applied Mathematics en_US
dc.description.librarian am2024 en_US
dc.description.sdg SDG-03: Good health and well-being en_US
dc.description.sponsorship The National Institute of Health Research Public Health Research Programme. en_US
dc.description.uri https://www.mdpi.com/journal/information en_US
dc.identifier.citation Ford, E.; Tyler, R.; Johnston, N.; Spencer-Hughes, V.; Evans, G.; Elsom, J.; Madzvamuse, A.; Clay, J.; Gilchrist, K.; Rees-Roberts, M. Challenges Encountered and Lessons Learned When Using a Novel Anonymised Linked Dataset of Health and Social Care Records for Public Health Intelligence: The Sussex Integrated Dataset. Information 2023, 14, 106. https://DOI.org/10.3390/info14020106. en_US
dc.identifier.issn 2078-2489 (online)
dc.identifier.other 10.3390/info14020106
dc.identifier.uri http://hdl.handle.net/2263/97343
dc.language.iso en en_US
dc.publisher MDPI en_US
dc.rights © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license. en_US
dc.subject Health data en_US
dc.subject Electronic health records en_US
dc.subject Data linkage en_US
dc.subject Data quality en_US
dc.subject Public health en_US
dc.subject SDG-03: Good health and well-being en_US
dc.subject Multiple long-term conditions (MLTCs) en_US
dc.title Challenges encountered and lessons learned when using a novel anonymised linked dataset of health and social care records for public health intelligence : the Sussex integrated dataset en_US
dc.type Article en_US

Files

Original bundle

Name:
Ford_Challenges_2023.pdf
Size:
1.58 MB
Format:
Adobe Portable Document Format
Description:
Article
