Improving probabilistic record linkage with a single-layer neural network

dc.contributor.author Hamersma, Kris A.
dc.date.accessioned 2019-02-04T13:10:23Z
dc.date.available 2019-02-04T13:10:23Z
dc.date.created 2017
dc.date.issued 2017
dc.description Mini Dissertation (B Eng. (Industrial and Systems Engineering))--University of Pretoria, 2017. en_ZA
dc.description.abstract Data analysis requires data to be of a high quality. Unfortunately this is not always the case, especially when data is extracted from different data sources. Where there is no unique identifier to match data records from multiple data sources, alternative methods need to be developed to match the records. Record linkage attempts to do this primarily with deterministic and probabilistic approaches. Deterministic models require certain corresponding fields of a record pair to be identical for the pair to be matched. Probabilistic methods use a set of equations called the Fellegi-Sunter formulae to calculate decision-making weights, which are used to score a record pair on how well it matches. If the matching score is above a certain threshold, the record pair is considered to be a match. This project investigates whether the development of a learning algorithm that refines the weights will improve the probabilistic model's matching accuracy. The dataset that was used to train and test the record linkage models was a set of 92650 record pairs, some of which were matches and some of which were non-matches. It was found that a learning algorithm did improve the matching accuracy of the probabilistic model, although it is likely that an increase in the number of input features will improve the matching performance even more. en_ZA
dc.format.medium PDF en_ZA
dc.identifier.uri http://hdl.handle.net/2263/68389
dc.language en
dc.language.iso en en_ZA
dc.publisher University of Pretoria. Faculty of Engineering, Built Environment and Information Technology. Dept. of Industrial and Systems Engineering en_ZA
dc.rights © 2017 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. en_ZA
dc.subject Mini-dissertations (Industrial and Systems Engineering) en_ZA
dc.title Improving probabilistic record linkage with a single-layer neural network en_ZA
dc.type Mini Dissertation en_ZA
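
The scoring idea summarised in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the m- and u-probabilities, the zero threshold, the toy record pairs and the function names (fs_weights, fs_score, classify, refine) are all assumptions made for illustration. The sketch computes the standard Fellegi-Sunter agreement and disagreement weights, log2(m/u) and log2((1-m)/(1-u)), sums them into a matching score, and then treats that score as a single logistic unit whose weights are adjusted by gradient descent on labelled pairs.

```python
import math

def fs_weights(m, u):
    """Return per-field (agreement, disagreement) log2 weights.

    m[i] = P(field i agrees | pair is a match)
    u[i] = P(field i agrees | pair is a non-match)
    """
    agree = [math.log2(mi / ui) for mi, ui in zip(m, u)]
    disagree = [math.log2((1 - mi) / (1 - ui)) for mi, ui in zip(m, u)]
    return agree, disagree

def fs_score(gamma, agree, disagree):
    """Sum the weight picked by each field's agreement indicator (1 or 0)."""
    return sum(a if g else d for g, a, d in zip(gamma, agree, disagree))

def classify(gamma, agree, disagree, threshold=0.0):
    """Declare a match when the score exceeds the (assumed) threshold."""
    return fs_score(gamma, agree, disagree) > threshold

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def refine(pairs, labels, w, b, lr=0.1, epochs=50):
    """Adjust weights with stochastic gradient descent on the logistic loss.

    The score is rewritten as b + sum(w[i] * gamma[i]), so the starting
    point reproduces the Fellegi-Sunter score exactly.
    """
    w = list(w)
    for _ in range(epochs):
        for gamma, y in zip(pairs, labels):
            p = sigmoid(b + sum(wi * g for wi, g in zip(w, gamma)))
            err = p - y
            w = [wi - lr * err * g for wi, g in zip(w, gamma)]
            b -= lr * err
    return w, b

# Hypothetical probabilities for three comparison fields (assumed values).
m = [0.9, 0.8, 0.95]
u = [0.1, 0.2, 0.05]
agree, disagree = fs_weights(m, u)

# Linear form of the same score: weight = agree - disagree, bias = sum(disagree).
w0 = [a - d for a, d in zip(agree, disagree)]
b0 = sum(disagree)

# Toy labelled comparison vectors (1 = match, 0 = non-match), assumed data.
pairs = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 1, 0]]
labels = [1, 1, 0, 0]
w1, b1 = refine(pairs, labels, w0, b0)

predictions = [int(b1 + sum(wi * g for wi, g in zip(w1, gamma)) > 0)
               for gamma in pairs]
print(predictions)
```

Initialising the logistic unit from the Fellegi-Sunter weights keeps the probabilistic model as the starting point, so any change in accuracy can be attributed to the learning step alone. That is one plausible reading of "refines the weights", chosen here for clarity.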

