Survey on Entity Resolution and Similarity Joins

The paper “Blocking and Filtering Techniques for Entity Resolution: A Survey”, by George Papadakis, Dimitrios Skoutas, Emmanouil Thanos and Themis Palpanas, has been published in ACM Computing Surveys.

This survey reviews a large number of works under two different but related frameworks: Blocking and Filtering. The former restricts comparisons to entity pairs that are more likely to match, while the latter quickly identifies entity pairs that are likely to satisfy predetermined similarity thresholds. The survey also elaborates on hybrid approaches that combine characteristics of both frameworks. For each framework, it provides a comprehensive list of the relevant works and discusses them in a broader context. It concludes with the most promising directions for future work in the field.
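
To make the distinction between the two frameworks concrete, here is a minimal sketch in Python. It is not taken from the survey. It assumes token blocking as a simple representative of Blocking, and a length filter for a Jaccard similarity join as a simple representative of Filtering. All function names and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Return the lowercased whitespace tokens of a record as a set."""
    return set(text.lower().split())


def token_blocking(records):
    """Blocking sketch: only records sharing at least one token become candidates."""
    blocks = defaultdict(set)
    for rid, text in records.items():
        for tok in tokens(text):
            blocks[tok].add(rid)
    candidates = set()
    for ids in blocks.values():
        # Every pair of records inside the same block is a candidate pair.
        candidates.update(combinations(sorted(ids), 2))
    return candidates


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def length_filtered_join(records, threshold):
    """Filtering sketch: prune pairs whose set sizes alone rule out the threshold."""
    sets = {rid: tokens(text) for rid, text in records.items()}
    result = set()
    for r1, r2 in combinations(sorted(sets), 2):
        small, large = sorted((len(sets[r1]), len(sets[r2])))
        # Jaccard(A, B) <= min(|A|, |B|) / max(|A|, |B|), so such pairs cannot qualify.
        if large == 0 or small / large < threshold:
            continue
        if jaccard(sets[r1], sets[r2]) >= threshold:
            result.add((r1, r2))
    return result


# Hypothetical sample records, keyed by an arbitrary integer id.
records = {
    1: "Apple iPhone 12 64GB",
    2: "iphone 12 apple 64gb black",
    3: "Samsung Galaxy S21",
}

print(token_blocking(records))
print(length_filtered_join(records, 0.6))
```

In this sketch, blocking decides which pairs are worth comparing at all, without reference to any threshold, whereas the filter uses a cheap bound to discard pairs that cannot reach a given threshold before the exact similarity is computed.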