Jiayi Kuang, Jiarui Ouyang, Ying Shen. Explore the Textual Perception Ability on the Images for Multimodal Large Language Models. In Derek F. Wong, Zhongyu Wei, Muyun Yang, editors, Natural Language Processing and Chinese Computing - 13th National CCF Conference, NLPCC 2024, Hangzhou, China, November 1-3, 2024, Proceedings, Part V. Volume 15363 of Lecture Notes in Computer Science, pages 300-311, Springer, 2024.