享乐主义者 posted on 2025-3-26 21:36:54

http://reply.papertrans.cn/99/9838/983777/983777_31.png

HAWK posted on 2025-3-27 01:45:10

Referring Expression Comprehension: ... describe this task and then introduce the prevalent datasets proposed for REC, such as RefCOCO, RefCOCO+ and RefCOCOg. Finally, we classify REC methods into three main categories: two-stage models, one-stage models and reasoning process comprehension.

影响带来 posted on 2025-3-27 05:17:26

Question Answering (QA) Basics: ... we discuss the QA task from the following aspects: rule-based methods, information retrieval-based methods, neural semantic parsing-based methods and approaches that take a knowledge base into account.

土坯 posted on 2025-3-27 09:51:29

http://reply.papertrans.cn/99/9838/983777/983777_34.png

描绘 posted on 2025-3-27 14:39:06

Text-Based VQA: ... TextVQA, ST-VQA and OCR-VQA. Subsequently, we describe an important tool, OCR, which is a prerequisite for the reasoning process because text must first be recognized. Next, we select three representative and effective models that address this problem and describe them in turn.

Invertebrate posted on 2025-3-27 19:57:00

http://reply.papertrans.cn/99/9838/983777/983777_36.png

Tonometry posted on 2025-3-28 00:10:51

http://reply.papertrans.cn/99/9838/983777/983777_37.png

西瓜 posted on 2025-3-28 03:43:27

http://reply.papertrans.cn/99/9838/983777/983777_38.png

金桌活画面 posted on 2025-3-28 09:50:24

http://reply.papertrans.cn/99/9838/983777/983777_39.png

伪善 posted on 2025-3-28 13:47:05

http://reply.papertrans.cn/99/9838/983777/983777_40.png
View full version: Titlebook: Visual Question Answering; From Theory to Appli Qi Wu, Peng Wang, Wenwu Zhu Book 2022 The Editor(s) (if applicable) and The Author(s), under