Record Linkage

Record linkage (also known as data linkage or data matching) refers to methods for bringing together information that relates to the same person or other named entity (eg, a place or event) from different sources. The techniques of probabilistic record linkage (‘fuzzy matching’) are those most familiar to historians. Historical record linkage can be, and often is, carried out manually, though the Digital Panopticon project is particularly concerned with large-scale automated linking. Historical records often present particular challenges for linkage (eg, spelling variation, lack of unique identifiers or imprecise dating), but both the techniques and the challenges of record linkage are shared across a number of different fields, including health sciences (the original source for many techniques), social sciences, computer science and cultural institutions.
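To make the idea of ‘fuzzy matching’ concrete, the following is a minimal sketch in Python, not the Digital Panopticon project's own method. It is a simple similarity-score comparison rather than a full probabilistic model: two records are treated as a candidate link when their names are spelled similarly and their birth years are close. The record fields (`name`, `birth_year`), the sample records, the function names and the thresholds (0.8 and 2 years) are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_candidate_link(rec_a: dict, rec_b: dict,
                      name_threshold: float = 0.8,
                      year_tolerance: int = 2) -> bool:
    """Flag two records as a possible match.

    Both thresholds are illustrative assumptions: names must be
    sufficiently similar AND birth years must fall within the tolerance.
    """
    close_name = name_similarity(rec_a["name"], rec_b["name"]) >= name_threshold
    close_year = abs(rec_a["birth_year"] - rec_b["birth_year"]) <= year_tolerance
    return close_name and close_year


# Hypothetical records showing spelling variation and imprecise dating.
trial_record = {"name": "Catherine Smyth", "birth_year": 1801}
convict_record = {"name": "Catharine Smith", "birth_year": 1802}
unrelated_record = {"name": "Thomas Jones", "birth_year": 1801}

print(is_candidate_link(trial_record, convict_record))    # True
print(is_candidate_link(trial_record, unrelated_record))  # False
```

In practice, a probabilistic approach would weight each field by how strongly agreement or disagreement on it indicates a true match, rather than applying fixed cut-offs as this sketch does.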

Introductory/General

Further Reading

Projects and Groups

Linked Open Data

Select Bibliography

  • Gerrit Bloothooft, ‘Assessment of Systems for Nominal Retrieval and Historical Record Linkage’, Computers and the Humanities 32, no. 1 (1998): 39–56, http://www.jstor.org/stable/30200450.
  • Gérard Bouchard and Christian Pouyez, ‘Name variations and computerized record linkage’, Historical Methods: A Journal of Quantitative and Interdisciplinary History 13, no. 2 (1980): 119–125, http://www.tandfonline.com/doi/pdf/10.1080/01615440.1980.10594037.
  • Peter Christen, Mac Boot, and Vassilios S. Verykios, ‘Advanced record linkage methods and privacy aspects for population reconstruction’ (presented at Population Reconstruction, Amsterdam, 2014), http://socialhistory.org/sites/default/files/docs/christen_-_advanced_record_linkage_methods.pdf.
  • H. Rhodri Davies, ‘Automated Record Linkage of Census Enumerators Books and Registration Data: Obstacles, Challenges and Solutions’, History and Computing 4, no. 1 (1992): 16–26.
  • I. P. Fellegi and A. B. Sunter, ‘A Theory for Record Linkage’, Journal of the American Statistical Association 64 (1969): 1183–1210.
  • Dallan Quass and Paul Starkey, ‘Record linkage for genealogical databases’, in KDD 2003 Workshop on Data Cleaning, Record Linkage and Object Consolidation, 2003, http://geddiff.googlecode.com/svn/branches/c-plus-plus/docs/quass-starkey.pdf.
  • Ian Winchester, ‘The linkage of historical records by man and computer: Techniques and problems’, The Journal of Interdisciplinary History 1, no. 1 (1970): 107–124, http://www.jstor.org/stable/202411.
  • Ian Winchester, ‘What Every Historian Needs to Know about Record Linkage for the Microcomputer Era’, Historical Methods: A Journal of Quantitative and Interdisciplinary History 25, no. 4 (1992): 149–165, doi:10.1080/01615440.1992.10112722, http://www.tandfonline.com/doi/abs/10.1080/01615440.1992.10112722.
  • Zhichun Fu et al., ‘Automatic Record Linkage of Individuals and Households in Historical Census Data’, International Journal of Humanities and Arts Computing 8, no. 2 (2014): 204–225, doi:10.3366/ijhac.2014.0130, http://www.euppublishing.com/doi/abs/10.3366/ijhac.2014.0130.