Bibliography
- http://research.microsoft.com/apps/pubs/default.aspx?id=153478
- http://cs.anu.edu.au/~Peter.Christen/data-matching-book-2012.html
- http://www.umiacs.umd.edu/~getoor/Tutorials/ER_VLDB2012.pdf
New School
- Steorts, Rebecca C., Rob Hall, and Stephen Fienberg. “A Bayesian Approach to Record Linkage and De-duplication.” December 2013. http://arxiv.org/abs/1312.4645
Very beautiful work. Records are matched to latent individuals. Running time is O(N). The method is unsupervised, but everything hinges on tuning the hyperparameters, and the work only considers categorical variables.
To Read
- Parag and Pedro Domingos. “Multi-Relational Record Linkage.” http://homes.cs.washington.edu/~pedrod/papers/mrdm04.pdf
- “An Entity Based Model for Coreference Resolution.” http://people.cs.umass.edu/~mwick/MikeWeb/Publications_files/wick09entity.pdf