LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education

Bibliographic Details
Published in	arXiv.org
Main Authors	Lee, Unggi; Jeon, Minji; Lee, Yunseo; Byun, Gyuri; Son, Yoorim; Shin, Jaeyoon; Ko, Hongkyu; Kim, Hyeoncheol
Format	Paper; Journal Article
Language	English
Published	Ithaca: Cornell University Library, arXiv.org, 18.09.2024
Subjects	Chatbots; Computer Science - Artificial Intelligence; Computer Science - Computation and Language; Computer Science - Social and Information Networks; Datasets; Large language models; Literature reviews; Technical education
Summary:	Despite the development of AI systems to support learning in many domains, AI assistance for art appreciation education has not been extensively explored. Art appreciation, which most students perceive as unfamiliar and challenging, can become more accessible with a generative-AI-enabled conversation partner that asks tailored questions and encourages the audience to appreciate artwork deeply. This study explores the application of multimodal large language models (MLLMs) in art appreciation education, focusing on the development of LLaVA-Docent, a model designed to serve as a personal tutor for art appreciation. Our approach followed design and development research, iteratively refining the application to produce a functional MLLM-enabled chatbot along with a data design framework for art appreciation education. To that end, we built a virtual dialogue dataset generated by GPT-4, which was instrumental in training our MLLM, LLaVA-Docent. We evaluated LLaVA-Docent by benchmarking it against alternative settings, which revealed its distinct strengths and weaknesses. Our findings highlight the efficacy of the MLLM-based personalized art appreciation chatbot and demonstrate its applicability to a novel approach to teaching and experiencing art appreciation.
ISSN:	2331-8422
DOI:	10.48550/arxiv.2402.06264