Record linkage (also known as data linkage or data matching) refers to methods for bringing together information from different sources that relates to the same person or other named entity (e.g., a place or event). The techniques of probabilistic record linkage (‘fuzzy matching’) are those most familiar to historians. Historical record linkage can be, and often is, carried out manually, though the Digital Panopticon project is particularly concerned with large-scale automated linking. Historical records often present particular challenges for linkage (e.g., spelling variation, lack of unique identifiers or imprecise dating), but both the techniques and the challenges of record linkage are shared across a number of fields, including health sciences (the original source for many techniques), social sciences, computer science and cultural institutions.
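To make the idea of ‘fuzzy matching’ concrete, the following is a minimal sketch in Python of how two records lacking a shared identifier might be compared. It is not the Digital Panopticon project's method, nor a full probabilistic model of the kind described in the literature listed below; it is a simple weighted similarity score. The field names (`forename`, `surname`, `birth_year`), the weights, the two-year date tolerance and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two names, ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a: dict, rec_b: dict, year_tolerance: int = 2) -> float:
    """Combine name similarity and date agreement into a single 0..1 score."""
    surname = name_similarity(rec_a["surname"], rec_b["surname"])
    forename = name_similarity(rec_a["forename"], rec_b["forename"])
    # Treat years within the tolerance as agreeing, to allow for imprecise dating.
    year_gap = abs(rec_a["birth_year"] - rec_b["birth_year"])
    year = 1.0 if year_gap <= year_tolerance else 0.0
    # Illustrative weights only (assumption); real projects estimate these from data.
    return 0.5 * surname + 0.3 * forename + 0.2 * year


def link(records_a: list, records_b: list, threshold: float = 0.85) -> list:
    """Return (index_a, index_b, score) for every pair scoring at or above threshold."""
    links = []
    for i, a in enumerate(records_a):
        for j, b in enumerate(records_b):
            score = match_score(a, b)
            if score >= threshold:
                links.append((i, j, score))
    return links


# Hypothetical records from two sources, with a spelling variant and a one-year date gap.
trial = [{"forename": "Elizabeth", "surname": "Smyth", "birth_year": 1801}]
convict = [
    {"forename": "Elizabeth", "surname": "Smith", "birth_year": 1802},
    {"forename": "Thomas", "surname": "Jones", "birth_year": 1790},
]
links = link(trial, convict)
```

Here the ‘Smyth’ and ‘Smith’ records are linked despite the spelling difference, while the unrelated record falls well below the threshold. Comparing every pair, as this sketch does, quickly becomes impractical at scale, which is one reason large automated projects use more elaborate techniques.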
Introductory/General
- Wikipedia on Record Linkage
- What is record linkage?
- An introduction to probabilistic record linkage (pdf)
- Record linkage basics
- Deterministic and probabilistic record linkage (pdf)
Further Reading
- Longitudinal Analysis, Historical Sources and Generational Change
- RIDDLE Repository of Information on Duplicate Detection, Record Linkage and Identity Uncertainty
- Duplicate Record Detection (Wikiversity)
- Febrl – Freely extensible biomedical record linkage
- ANU Data Mining Group Record Linkage project
- German RLC Bibliography
Projects and Groups
- Irish Record Linkage 1864-1913
- Pauper Lives in Georgian London and Manchester
- Cambridge Group for History of Population and Social Structure (CAMPOP)
- LINKS – Linking System for Historical Family Reconstruction
- Connecting Shakespeare
- RecordLink (Canada)
- Swiss National Cohort
- German Record Linkage Center
Linked Open Data
- Improving Record Matching across Disparate Historical Records
- Linking History in Place
- LODLAM – Linked Open Data in Libraries, Archives and Museums
- Linked Data: Evolving the Web into a Global Data Space
- Europeana Linked Open Data
Select Bibliography
- Gerrit Bloothooft, ‘Assessment of Systems for Nominal Retrieval and Historical Record Linkage’, Computers and the Humanities 32, no. 1 (January 1, 1998): 39–56, http://www.jstor.org/stable/30200450.
- Gérard Bouchard and Christian Pouyez, ‘Name variations and computerized record linkage’, Historical Methods: A Journal of Quantitative and Interdisciplinary History 13, no. 2 (1980): 119–125, http://www.tandfonline.com/doi/pdf/10.1080/01615440.1980.10594037.
- Peter Christen, Mac Boot, and Vassilios S. Verykios, ‘Advanced record linkage methods and privacy aspects for population reconstruction’ (presented at Population Reconstruction, Amsterdam, 2014), http://socialhistory.org/sites/default/files/docs/christen_-_advanced_record_linkage_methods.pdf.
- H. Rhodri Davies, ‘Automated Record Linkage of Census Enumerators Books and Registration Data: Obstacles, Challenges and Solutions’, History and Computing 4, no. 1 (1992): 16–26.
- I. P. Fellegi and A. B. Sunter, ‘A Theory for Record Linkage’, Journal of the American Statistical Association 64 (1969): 1183–1210.
- Dallan Quass and Paul Starkey, ‘Record linkage for genealogical databases’, in KDD 2003 Workshop on Data Cleaning, Record Linkage and Object Consolidation, 2003, http://geddiff.googlecode.com/svn/branches/c-plus-plus/docs/quass-starkey.pdf.
- Ian Winchester, ‘The linkage of historical records by man and computer: Techniques and problems’, The Journal of Interdisciplinary History 1, no. 1 (1970): 107–124, http://www.jstor.org/stable/202411.
- Ian Winchester, ‘What Every Historian Needs to Know about Record Linkage for the Microcomputer Era’, Historical Methods: A Journal of Quantitative and Interdisciplinary History 25, no. 4 (October 1992): 149–165, doi:10.1080/01615440.1992.10112722, http://www.tandfonline.com/doi/abs/10.1080/01615440.1992.10112722.
- Zhichun Fu et al., ‘Automatic Record Linkage of Individuals and Households in Historical Census Data’, International Journal of Humanities and Arts Computing 8, no. 2 (October 1, 2014): 204–225, doi:10.3366/ijhac.2014.0130, http://www.euppublishing.com/doi/abs/10.3366/ijhac.2014.0130.