PARALLEL DOCUMENT MINING

Bibliographic Details
Main Authors: POPAT ASHOK C, DUBINER MOSHE, PONTE JAY M, USZKOREIT JAKOB
Format: Patent
Language: English
Published: 23.02.2012
Summary: A technique includes providing a collection of documents in multiple languages, identifying, from the collection of documents, a group of candidate documents, where each candidate document in the group shares multiple corresponding rare features, evaluating pairs of candidate documents in the group using multiple common features present in the collection of documents, and determining, based on evaluating the pairs of candidate documents, whether each pair of candidate documents corresponds to a translated pair of documents.
Bibliography: Application Number: US201113214941
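
The two-stage flow in the summary (group candidates by shared rare features, then evaluate pairs using common features) can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything specific in it is an assumption made for illustration: the function name mine_parallel_pairs, the document-frequency cutoff max_rare_df that separates "rare" from "common" features, the min_shared_rare and threshold parameters, the use of cosine similarity as the pair score, and the premise that features from different languages have already been mapped into one shared, comparable feature space.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def mine_parallel_pairs(docs, max_rare_df=2, min_shared_rare=2, threshold=0.5):
    """Hypothetical sketch. docs maps doc_id -> (language, list_of_features).

    Features are assumed to be comparable across languages already.
    Returns a sorted list of (doc_a, doc_b, score) judged to be translations.
    """
    # Document frequency of every feature across the whole collection.
    df = Counter()
    for _, feats in docs.values():
        df.update(set(feats))

    # Stage 1: index documents by their rare features (low document frequency).
    rare_index = defaultdict(set)
    for doc_id, (_, feats) in docs.items():
        for f in set(feats):
            if df[f] <= max_rare_df:
                rare_index[f].add(doc_id)

    # Count the rare features shared by each cross-language pair.
    shared = Counter()
    for ids in rare_index.values():
        for a, b in combinations(sorted(ids), 2):
            if docs[a][0] != docs[b][0]:
                shared[(a, b)] += 1
    candidates = [pair for pair, n in shared.items() if n >= min_shared_rare]

    # Stage 2: score each candidate pair using only the common features.
    def common_vector(doc_id):
        return Counter(f for f in docs[doc_id][1] if df[f] > max_rare_df)

    def cosine(u, v):
        dot = sum(u[f] * v[f] for f in u.keys() & v.keys())
        norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
            sum(x * x for x in v.values())
        )
        return dot / norm if norm else 0.0

    # Final decision: keep pairs whose score reaches the threshold.
    results = []
    for a, b in candidates:
        score = cosine(common_vector(a), common_vector(b))
        if score >= threshold:
            results.append((a, b, score))
    return sorted(results)
```

In this sketch the rare-feature index keeps the number of compared pairs small, since only documents that co-occur under several rare features are ever scored; the common-feature score then serves as the check on each surviving pair. Both the cutoff values and the scoring function would need tuning on real data.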