CONVERSION PROCESSING DEVICE, INFORMATION PROCESSING APPARATUS WITH THE SAME, PROGRAM, AND RECORDING MEDIUM


Bibliographic Details
Main Authors TAKASHIMA MASAHIKO, MATSUOKA TERUHIKO, HAMADA KAZUYUKI
Format Patent
Language English, Japanese
Published 18.05.2017
Summary:
PROBLEM TO BE SOLVED: To correctly extract and convert an object other than a character, such as an image, that is present in a cell of a table, without erroneously extracting it as a character, and to place the object correctly in the table.
SOLUTION: A character region present in document image information is extracted, and line segments present in the document image information are extracted. A table region is extracted using the line segment information. A predetermined local region is set in the document image information, and luminance change information for the local region is obtained by creating a luminance histogram of the local region. The luminance change information, the character region information, the line segment information, and the table region information are used to extract an image object region, containing a drawing or a photograph, that is present outside or inside the table region. The table structure is analyzed based on the character regions, line segments, and image object regions in the table region, and table structure information for reconstructing the table is acquired.
SELECTED DRAWING: Figure 7
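
The luminance-histogram step in the summary can be illustrated with a minimal Python sketch. The abstract does not say how the luminance change information is derived from the histogram, so everything specific here is an assumption made for illustration: the tile size, the bin count, the thresholds, and the measure itself (the number of histogram bins that hold a meaningful share of the pixels). All function names are hypothetical. The sketch also uses the histogram alone, whereas the summary combines it with the character region, line segment, and table region information.

```python
from typing import Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # rows of 8-bit luminance values (0..255)


def luminance_histogram(block: Image, bins: int = 16) -> List[int]:
    """Count the luminance values of a local region into equal-width bins."""
    hist = [0] * bins
    width = 256 // bins
    for row in block:
        for value in row:
            # Clamp so that bin counts that do not divide 256 still work.
            hist[min(value // width, bins - 1)] += 1
    return hist


def luminance_change(hist: Sequence[int], min_fraction: float = 0.05) -> int:
    """Assumed measure: number of bins holding at least min_fraction of the pixels.

    Text on a plain background tends to occupy about two bins, while a
    photograph tends to spread over many.
    """
    total = sum(hist)
    if total == 0:
        return 0
    return sum(1 for count in hist if count / total >= min_fraction)


def local_regions(image: Image, size: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, block) for non-overlapping size x size tiles."""
    height = len(image)
    width = len(image[0]) if image else 0
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield top, left, [row[left:left + size] for row in image[top:top + size]]


def candidate_image_blocks(image: Image, size: int = 8, bins: int = 16,
                           min_bins: int = 4) -> List[Tuple[int, int]]:
    """Return the (top, left) corners of tiles whose luminance is widely spread."""
    return [
        (top, left)
        for top, left, block in local_regions(image, size)
        if luminance_change(luminance_histogram(block, bins)) >= min_bins
    ]
```

In a fuller pipeline, tiles flagged this way would still need to be checked against the extracted character regions, line segments, and table region before being merged into image object regions, as the summary describes.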
Bibliography: Application Number: JP20150210168