Multilingual Text Enhancer (MLTE) - A LLaMA2 based Model for Prompt Generation

Bibliographic Details
Published in 2024 3rd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), pp. 895-900
Main Authors Teja, Nv Sai; Kumar, Kuldeep; Malarvel, Muthukumaran
Format Conference Proceeding
Language English
Published IEEE 05.06.2024

Summary: This research work introduces MLTE (Multilingual Text Enhancer), a text enhancement model developed primarily to enhance input text for text-to-image generation models. The text encoders in existing image generation models often have limited capabilities, and the quality of the generated image depends on the prompt. If the prompt includes misspelled words, the encoder may produce an irrelevant image. Most encoders are also based primarily on English. MLTE handles multilingual prompts, misspelled words, and overly verbose prompts, and creatively enhances the prompt to achieve improved results. MLTE employs natural language processing techniques to link unprocessed textual input to the production of accurate visual content. MLTE is based on LLaMA2, which can handle numerous languages and makes it simple to incorporate content from diverse linguistic origins. Additionally, its spellchecking and correction functions ensure the quality and coherence of the prompt. Moreover, MLTE's scene and text augmentation features strengthen the visual richness and coherence of generated images, thereby enhancing their overall quality and realism. Its summarization capability condenses large paragraphs into concise yet informative summaries, which assists the image creation process by delivering more focused inputs. MLTE can be used with any text-to-image generation model. Through empirical evaluation, this paper demonstrates the effectiveness of MLTE in boosting text quality for text-to-image synthesis tasks, leading to significantly improved image generation results.
DOI: 10.1109/ICAAIC60222.2024.10575122