METHOD AND ELECTRONIC DEVICE FOR RECOGNIZING TEXT IN IMAGE

Bibliographic Details
Main Authors	KIM YOUNG UK, KIM KYUNG SU, LEE HYUNG MIN, KWON OH JOON, KIM YE HOON, KIM HYUN HAN, KIM HYO SANG
Format	Patent
Language	English, Korean
Published	05.07.2023
Subjects	CALCULATING; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS; COMPUTING; COUNTING; PHYSICS
Summary:	An electronic device for recognizing text and a method for operating the device are provided. The method may comprise the steps of: detecting the positions of text fragments that constitute text in an image; generating cropped images by cropping the regions of the image that correspond to the text fragments; recognizing the characters in the text fragments on the basis of the cropped images; generating a sentence by inputting the positions of the text fragments and the characters in the text fragments into a multimodal language model, wherein the multimodal language model is an artificial intelligence model that infers the original sentence from the text; and displaying the sentence. The objective of the invention is to accurately infer the original sentence of text in an image using the multimodal language model.
Bibliography:	Application Number: KR20220022452
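
The steps listed in the summary (detect, crop, recognize, infer, display) can be pictured as a small pipeline. The following is only an illustrative sketch, not the patented implementation: the record does not describe any concrete detector, recognizer, or model, so every name here (`Fragment`, `crop`, `recognize_text`, `fake_detect`, `fake_recognize`, `reading_order_join`) is hypothetical. In particular, `reading_order_join` is a trivial stand-in that merely sorts fragments by position; it is not a multimodal language model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values (assumption)
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


@dataclass
class Fragment:
    """A text fragment: where it sits in the image and what characters it holds."""
    box: Box
    text: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of `image`."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def recognize_text(
    image: Image,
    detect: Callable[[Image], Sequence[Box]],
    recognize: Callable[[Image], str],
    infer_sentence: Callable[[Sequence[Fragment]], str],
) -> str:
    """Run the four processing steps and return the inferred sentence."""
    boxes = detect(image)                      # 1. positions of text fragments
    crops = [crop(image, b) for b in boxes]    # 2. one cropped image per fragment
    texts = [recognize(c) for c in crops]      # 3. characters in each fragment
    fragments = [Fragment(b, t) for b, t in zip(boxes, texts)]
    return infer_sentence(fragments)           # 4. positions + characters -> sentence


# --- Hypothetical stand-ins so the sketch can run without any real model ---

def fake_detect(image: Image) -> Sequence[Box]:
    # Pretends to find two fragments, deliberately returned out of reading order.
    return [(4, 0, 8, 2), (0, 0, 4, 2)]


def fake_recognize(cropped: Image) -> str:
    # Pretends to read characters: bright crops say "world", dark ones "hello".
    return "world" if cropped[0][0] else "hello"


def reading_order_join(fragments: Sequence[Fragment]) -> str:
    # Placeholder for the multimodal language model: sort by (top, left) and join.
    ordered = sorted(fragments, key=lambda f: (f.box[1], f.box[0]))
    return " ".join(f.text for f in ordered)


# A 2x8 demo image: left half dark (0), right half bright (1).
DEMO_IMAGE: Image = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(2)]

if __name__ == "__main__":
    # 5. display the sentence
    print(recognize_text(DEMO_IMAGE, fake_detect, fake_recognize, reading_order_join))
```

Passing the detector, recognizer, and sentence model in as callables keeps the sketch independent of any particular library. A real system would replace the three stand-ins with trained models, and the point of the language model in the summary is that it receives both the positions and the characters of the fragments rather than the characters alone.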