obsession 发表于 2025-3-23 10:30:12

Distinctive-Attribute Extraction for Image Captioning
…n open issue. In previous works, a caption involving a semantic description can be generated by feeding additional information into the RNNs. In this work, we propose a distinctive-attribute extraction (DaE) method that extracts attributes which explicitly encourage RNNs to generate an accurate c…

dyspareunia 发表于 2025-3-23 19:00:12

Knowing When to Look for What and Where: Evaluating Generation of Spatial Descriptions with Adaptive…
…s in end-to-end neural networks, in particular how adaptive attention is informative for generating spatial relations. We show that the model generates spatial relations more on the basis of textual rather than visual features, and therefore confirm the previous observations that the learned visual f…
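For readers unfamiliar with the mechanism this abstract refers to: adaptive attention adds a learned "visual sentinel" so that, at each decoding step, the model can place attention mass on its own language-model state instead of on image regions. A minimal NumPy sketch of one such step, with illustrative names and shapes (not the authors' code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adaptive_attention_step(region_feats, sentinel, region_logits, sentinel_logit):
    # Attend jointly over the k image regions and the sentinel vector.
    alpha = softmax(np.append(region_logits, sentinel_logit))
    beta = alpha[-1]  # mass placed on the sentinel, i.e. NOT on the image
    context = (alpha[:-1, None] * region_feats).sum(axis=0) + beta * sentinel
    return context, beta

# Toy step: 3 regions, feature dim 2, all logits equal -> uniform attention.
region_feats = np.ones((3, 2))
sentinel = np.zeros(2)
context, beta = adaptive_attention_step(region_feats, sentinel, np.zeros(3), 0.0)
```

A large `beta` means the word is generated mostly from textual context, which is exactly the behavior the paper probes for spatial relations.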

Vulvodynia 发表于 2025-3-23 23:59:02

How Clever Is the FiLM Model, and How Clever Can it Be?
…vely simple and easily transferable architecture. In this paper, we investigate in more detail the ability of FiLM to learn various linguistic constructions. Our results indicate that (a) FiLM is not able to learn relational statements straight away except for very simple instances, (b) training on…
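FiLM itself is a simple mechanism: a conditioning network (e.g. over the question text) predicts a per-channel scale γ and shift β that modulate a convolutional feature map. A minimal sketch of the modulation step, with illustrative shapes (not the paper's code):

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise Linear Modulation: per-channel scale and shift,
    broadcast over the spatial dimensions."""
    # features: (channels, H, W); gamma, beta: (channels,)
    return gamma[:, None, None] * features + beta[:, None, None]

x = np.ones((4, 2, 2))                  # 4 channels of 2x2 features
gamma = np.array([2.0, 1.0, 0.5, 0.0])  # predicted by the conditioning net
beta = np.array([0.0, 1.0, 0.0, 3.0])
y = film(x, gamma, beta)
```

Note that γ = 0 lets the conditioning input switch a channel off entirely, which is part of what makes the architecture easy to transfer yet hard to interpret.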

epicondylitis 发表于 2025-3-24 05:12:01

Image-Sensitive Language Modeling for Automatic Speech Recognition
This paper explores the benefits of introducing the visual modality as context information to automatic speech recognition. We use neural multimodal language models to rescore the recognition results of utterances that describe visual scenes. We provide a comprehensive survey of how much the language…
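The rescoring setup the abstract describes can be sketched as follows: the recognizer emits an n-best list, and an external (here image-conditioned) language model re-ranks it. Function and variable names below are illustrative assumptions, not the authors' implementation:

```python
def rescore_nbest(nbest, lm_score, lm_weight=0.5):
    # Combine the recognizer's score with an external LM score and
    # return the highest-scoring hypothesis.
    return max(nbest, key=lambda h: h["asr_score"] + lm_weight * lm_score(h["text"]))

# Toy image-conditioned LM: reward hypotheses that mention an object
# actually visible in the scene (hard-coded here for illustration).
def toy_lm_score(text, visible=frozenset({"dog"})):
    return 1.0 if any(w in visible for w in text.split()) else 0.0

nbest = [{"text": "a dock on the lawn", "asr_score": 0.0},
         {"text": "a dog on the lawn", "asr_score": -0.2}]
best = rescore_nbest(nbest, toy_lm_score)
```

The visual evidence promotes the acoustically weaker but semantically plausible hypothesis, which is the intuition behind image-sensitive rescoring.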

crumble 发表于 2025-3-24 18:25:26

H. Bjørke, O. Dragesund, Ø. Ulltang
…ons. It is not only applicable to human skeletons but also to other kinematic chains, for instance animals or industrial robots. We achieve state-of-the-art results on different benchmark databases and real-world scenes.

jeopardize 发表于 2025-3-25 01:22:12

Video Object Segmentation with Referring Expressions
…and ., with language descriptions of target objects. We show that our approach performs on par with the methods which have access to the object mask on . and is competitive to methods using scribbles on challenging ..
View full version: Titlebook: Computer Vision – ECCV 2018 Workshops; Munich, Germany, Sep Laura Leal-Taixé, Stefan Roth Conference proceedings 2019 Springer Nature Switze