漂浮 posted on 2025-3-28 16:46:54

http://reply.papertrans.cn/17/1677/167619/167619_41.png

捕鲸鱼叉 posted on 2025-3-28 19:57:57

http://reply.papertrans.cn/17/1677/167619/167619_42.png

Overdose posted on 2025-3-28 23:54:12

http://reply.papertrans.cn/17/1677/167619/167619_43.png

Robust posted on 2025-3-29 03:22:40

http://reply.papertrans.cn/17/1677/167619/167619_44.png

尖叫 posted on 2025-3-29 11:04:45

http://reply.papertrans.cn/17/1677/167619/167619_45.png

加入 posted on 2025-3-29 11:35:26

Cross-Modal Attention Alignment Network with Auxiliary Text Description for Zero-Shot Sketch-Based Image Retrieval: ...an LLM with several interrogative sentences, (ii) a Feature Extraction Module that includes two ViTs for the sketch and image data and a transformer for extracting tokens from the sentences of each training category, and finally (iii) a Cross-modal Alignment Module that exchanges the token features of both text-sketch...
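This excerpt outlines a three-part design: LLM-generated auxiliary text, ViT/transformer feature extractors, and a cross-modal alignment module that exchanges token features. Purely as an illustration of the token-exchange idea, here is a minimal PyTorch sketch using symmetric cross-attention; the class name CrossModalAlignment, the dimensions, and the residual/LayerNorm details are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of a cross-modal token-exchange step (not the paper's code).
# Assumes sketch/image tokens come from a ViT and text tokens from a text transformer.
import torch
import torch.nn as nn


class CrossModalAlignment(nn.Module):
    """Exchange information between text tokens and sketch tokens
    via symmetric multi-head cross-attention (illustrative only)."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.attend_text = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.attend_sketch = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_sketch = nn.LayerNorm(dim)
        self.norm_text = nn.LayerNorm(dim)

    def forward(self, sketch_tokens: torch.Tensor, text_tokens: torch.Tensor):
        # Sketch tokens attend to text tokens (queries from sketch, keys/values from text).
        s_attn, _ = self.attend_text(sketch_tokens, text_tokens, text_tokens)
        # Text tokens attend to sketch tokens (queries from text, keys/values from sketch).
        t_attn, _ = self.attend_sketch(text_tokens, sketch_tokens, sketch_tokens)
        # Residual connections keep the original modality-specific features.
        return self.norm_sketch(sketch_tokens + s_attn), self.norm_text(text_tokens + t_attn)


# Toy usage: 16 batch items, 196 sketch patches, 20 text tokens, 768-dim features.
sketch = torch.randn(16, 196, 768)
text = torch.randn(16, 20, 768)
aligned_sketch, aligned_text = CrossModalAlignment()(sketch, text)
```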

忙碌 posted on 2025-3-29 18:11:17

Exploring Interpretable Semantic Alignment for Multimodal Machine Translation: ...analysis of the results demonstrates the effectiveness and interpretability of our model, which is highly competitive compared to the baseline. Further exploration of extractors in MMT shows that a large multimodal pre-trained model can provide more fine-grained semantic alignment, thus giving it an a...

GEN posted on 2025-3-29 22:35:09

Modal Fusion-Enhanced Two-Stream Hashing Network for Cross Modal Retrieval: ...on matrices between modalities. Subsequently, by adjusting the similarity weights of the fusion matrix between modalities, we shorten the distances between the most similar instance pairs and increase the distances between the most dissimilar instance pairs, thereby generating hash codes with higher...
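The excerpt describes reweighting an inter-modal fusion similarity matrix so that highly similar pairs are pulled closer and highly dissimilar pairs pushed apart before hash codes are produced. Below is a minimal sketch of one plausible weighted pairwise objective, assuming continuous (tanh) codes and a similarity matrix in [0, 1]; the function weighted_pairwise_hash_loss and its weighting scheme are illustrative, not the paper's formulation.

```python
# Hypothetical weighted pairwise similarity loss for cross-modal hashing
# (an illustrative reading of the excerpt, not the paper's exact method).
import torch
import torch.nn.functional as F


def weighted_pairwise_hash_loss(img_codes, txt_codes, sim, margin: float = 0.5):
    """img_codes, txt_codes: (N, K) continuous codes (tanh outputs before binarization).
    sim: (N, N) fusion similarity matrix in [0, 1]; 1 = same semantics, 0 = different."""
    # Cosine similarity between every image code and every text code.
    pred = F.normalize(img_codes, dim=1) @ F.normalize(txt_codes, dim=1).t()

    # Emphasize the extremes: clearly similar or clearly dissimilar pairs
    # receive larger weights than ambiguous, mid-similarity pairs.
    weights = (sim - 0.5).abs() * 2.0

    # Pull similar pairs together (high pred), push dissimilar pairs below the margin.
    pos_loss = sim * (1.0 - pred)
    neg_loss = (1.0 - sim) * F.relu(pred - margin)
    return (weights * (pos_loss + neg_loss)).mean()


# Toy usage: 8 items, 64-bit codes, diagonal "same-instance" similarity.
img = torch.tanh(torch.randn(8, 64))
txt = torch.tanh(torch.randn(8, 64))
sim = (torch.arange(8).unsqueeze(0) == torch.arange(8).unsqueeze(1)).float()
loss = weighted_pairwise_hash_loss(img, txt, sim)
```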

sigmoid-colon posted on 2025-3-30 01:26:51

Text Visual Question Answering Based on Interactive Learning and Relationship Modeling: ...(RPRET) layer is introduced to model the relative position relationships between different modalities in the image, thereby improving performance on questions involving spatial position relationships. The proposed method outperforms various state-of-the-art models on two public datasets...
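The excerpt's RPRET layer models relative position relationships between modalities in the image (for example, between OCR-token regions and detected objects). The sketch below computes log-scaled relative-position features from two sets of bounding boxes, a common building block for this kind of spatial relation modeling; the helper relative_position_features is hypothetical and is not the paper's layer.

```python
# Hypothetical relative-position features between two sets of boxes
# (e.g., OCR-token boxes vs. object boxes); not the paper's RPRET layer.
import torch


def relative_position_features(boxes_a: torch.Tensor, boxes_b: torch.Tensor):
    """boxes_*: (N, 4) / (M, 4) in (x1, y1, x2, y2) format. Returns (N, M, 4)
    features: log-scaled center offsets and width/height ratios."""
    def centers_wh(b):
        cx = (b[:, 0] + b[:, 2]) / 2
        cy = (b[:, 1] + b[:, 3]) / 2
        w = (b[:, 2] - b[:, 0]).clamp(min=1e-3)
        h = (b[:, 3] - b[:, 1]).clamp(min=1e-3)
        return cx, cy, w, h

    cxa, cya, wa, ha = centers_wh(boxes_a)
    cxb, cyb, wb, hb = centers_wh(boxes_b)

    # Pairwise offsets are normalized by the source box size and log-scaled
    # so that the features are roughly scale-invariant.
    dx = torch.log(((cxa[:, None] - cxb[None, :]).abs() / wa[:, None]).clamp(min=1e-3))
    dy = torch.log(((cya[:, None] - cyb[None, :]).abs() / ha[:, None]).clamp(min=1e-3))
    dw = torch.log(wb[None, :] / wa[:, None])
    dh = torch.log(hb[None, :] / ha[:, None])
    return torch.stack([dx, dy, dw, dh], dim=-1)


# Toy usage: 3 OCR-token boxes vs. 5 object boxes.
feats = relative_position_features(torch.rand(3, 4), torch.rand(5, 4))
print(feats.shape)  # torch.Size([3, 5, 4])
```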

insurrection posted on 2025-3-30 05:47:34

http://reply.papertrans.cn/17/1677/167619/167619_50.png
Pages: 1 2 3 4 [5] 6
View full version: Titlebook: Artificial Neural Networks and Machine Learning – ICANN 2024; 33rd International Conference; Michael Wand, Kristína Malinovská, Igor V. Tetko; Conference proceedings