Transformation-based Framework for Record Matching

Bibliographic Details
Published in: 2008 IEEE 24th International Conference on Data Engineering, pp. 40-49
Main Authors: Arasu, A., Chaudhuri, S., Kaushik, R.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2008
Summary: Today's record matching infrastructure offers no flexible way to account for synonyms such as "Robert" and "Bob", which refer to the same name, or for more general string transformations such as abbreviations. We propose a programmatic framework for record matching that takes such user-defined string transformations as input. To the best of our knowledge, this is the first proposal for such a framework. This transformational framework, while expressive, poses significant computational challenges, which we address. We empirically evaluate our techniques over real data.
ISBN: 9781424418367, 1424418364
ISSN: 1063-6382, 2375-026X
DOI: 10.1109/ICDE.2008.4497412