A Domain-Adaptive Sentence Alignment System Based on a Self-Guided Approach

Bibliographic Details
Format: Patent
Language: Chinese
Published: 15.02.2017

Summary: A domain-adaptive sentence alignment system based on a self-guided approach, comprising a web page processing module, a Chinese text processing module, an English text processing module, and a bilingual text processing module. First, corpus material is extracted from the different web pages and pre-processed accordingly. A sentence alignment algorithm that is self-guided and fuses multiple features then aligns the Chinese and English texts at the sentence level. At the same time, mutually translated word pairs that may reflect relevant domain and topic information are extracted. The invention improves sentence alignment quality and has the advantage of strong domain adaptability.
Bibliography: Application Number: CN201310659722
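
The summary names the idea (self-guided alignment that fuses several features and extracts translation word pairs along the way) but gives no algorithmic detail. The following is only a minimal sketch of how such a loop could look, not the patented method. Everything in it is an assumption made for illustration: the function names (`pair_score`, `align`, `extract_word_pairs`, `bootstrap_align`), the two features (length ratio and seed-lexicon coverage), the weights and thresholds, the restriction to 1-1 alignments, and the Dice-coefficient filter. Sentences are assumed to be already segmented and tokenized.

```python
from collections import Counter

W_LEN, W_LEX = 0.3, 0.7   # assumed feature weights, not taken from the patent
SKIP_PENALTY = 0.2        # assumed cost of leaving a sentence unaligned


def pair_score(zh, en, lexicon):
    """Fuse a length-ratio feature and a lexicon-coverage feature."""
    if not zh or not en:
        return 0.0
    len_feat = min(len(zh), len(en)) / max(len(zh), len(en))
    hits = sum(1 for z in zh if any((z, e) in lexicon for e in en))
    return W_LEN * len_feat + W_LEX * hits / len(zh)


def align(zh_sents, en_sents, lexicon):
    """Monotonic DP over 1-1 matches and skips; returns (i, j, score) triples."""
    n, m = len(zh_sents), len(en_sents)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            cands = []
            if i and j:
                s = pair_score(zh_sents[i - 1], en_sents[j - 1], lexicon)
                cands.append((best[i - 1][j - 1] + s, (i - 1, j - 1)))
            if i:
                cands.append((best[i - 1][j] - SKIP_PENALTY, (i - 1, j)))
            if j:
                cands.append((best[i][j - 1] - SKIP_PENALTY, (i, j - 1)))
            if cands:
                best[i][j], back[i][j] = max(cands)
    # Trace back from the end, keeping only the 1-1 match steps.
    pairs, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if (pi, pj) == (i - 1, j - 1):
            pairs.append((pi, pj, pair_score(zh_sents[pi], en_sents[pj], lexicon)))
        i, j = pi, pj
    return pairs[::-1]


def extract_word_pairs(zh_sents, en_sents, aligned, min_score=0.4,
                       min_count=2, min_dice=0.8):
    """Collect word pairs that co-occur consistently in confident alignments."""
    co, zh_n, en_n = Counter(), Counter(), Counter()
    for i, j, s in aligned:
        if s < min_score:
            continue
        zs, es = set(zh_sents[i]), set(en_sents[j])
        zh_n.update(zs)
        en_n.update(es)
        co.update((z, e) for z in zs for e in es)
    return {
        (z, e) for (z, e), c in co.items()
        if c >= min_count and 2 * c / (zh_n[z] + en_n[e]) >= min_dice
    }


def bootstrap_align(zh_sents, en_sents, seed_lexicon, rounds=3):
    """Alternate alignment and lexicon growth until nothing new is learned."""
    lexicon = set(seed_lexicon)
    for _ in range(rounds):
        aligned = align(zh_sents, en_sents, lexicon)
        learned = extract_word_pairs(zh_sents, en_sents, aligned)
        if learned <= lexicon:
            break
        lexicon |= learned
    return align(zh_sents, en_sents, lexicon), lexicon
```

Calling `bootstrap_align(zh_sents, en_sents, seed_lexicon)` with two lists of token lists and a small set of `(chinese_word, english_word)` tuples returns the aligned index pairs with their scores, plus the enlarged lexicon. The co-occurrence filter here is deliberately crude and will admit spurious pairs on small inputs; it is meant only to show how word pairs harvested from the current corpus can feed back into the next alignment pass, which is one plausible reading of the domain adaptation the summary describes.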