affluent posted on 2025-3-30 09:24:47

C. Toker, B. Uzun, F. O. Ceylan, C. Iktennabled through a bidirectional cross-attention mechanism. The approach offers several advantages: (a) it is easy to implement on standard ML accelerators (GPUs/TPUs) using standard high-level operators; (b) it applies to the standard ViT and its variants, and thus generalizes to various tasks; (c) it can handle diff

Adulate posted on 2025-3-30 13:40:48

Learning Pseudo 3D Guidance for View-Consistent Texturing with 2D Diffusion: … on learned Pseudo 3D Guidance. The key idea of P3G is to first learn a coarse but consistent texture that serves as global semantic guidance, encouraging consistency between images generated from different views. To this end, we incorporate pre-trained text-to-image diffusion models and multi

Subdue posted on 2025-3-30 17:41:31

http://reply.papertrans.cn/25/2424/242301/242301_53.png

Ventilator posted on 2025-3-30 22:06:58

SparseRadNet: Sparse Perception Neural Network on Subsampled Radar Data: …o combine features from both branches. Experiments on the RADIal dataset show that our SparseRadNet exceeds state-of-the-art (SOTA) performance in object detection and achieves close to SOTA accuracy in freespace segmentation, while using sparse subsampled input data.

endoscopy posted on 2025-3-31 04:19:30

http://reply.papertrans.cn/25/2424/242301/242301_55.png

血统 posted on 2025-3-31 08:56:53

http://reply.papertrans.cn/25/2424/242301/242301_56.png

ALTER posted on 2025-3-31 13:00:09

Explain via Any Concept: Concept Bottleneck Model with Open Vocabulary Concepts: …ifier on the downstream dataset; (3) Reconstructing the trained classification head via any set of user-desired textual concepts encoded by CLIP’s text encoder. To reveal potentially missing concepts from users, we further propose to iteratively find the closest concept embedding to the residual par
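
The excerpt above is cut off, so the paper's exact procedure is not visible here. One plausible reading of "iteratively find the closest concept embedding to the residual" is a greedy, matching-pursuit-style loop, sketched below. Everything in it is an assumption for illustration: the function name `greedy_concept_pursuit`, the use of absolute cosine similarity as "closest", and the toy 3-D vectors that stand in for CLIP text embeddings.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm of a vector."""
    return math.sqrt(dot(a, a))


def greedy_concept_pursuit(weight, concepts, steps=2, tol=1e-9):
    """Hypothetical sketch: explain one class-weight vector with concepts.

    At each step, pick the concept embedding with the largest absolute
    cosine similarity to the current residual, then subtract the
    residual's projection onto that embedding.
    """
    residual = list(weight)
    chosen = []
    for _ in range(steps):
        res_norm = norm(residual)
        if res_norm <= tol:
            break  # nothing left to explain
        best_name, best_score = None, 0.0
        for name, emb in concepts.items():
            emb_norm = norm(emb)
            if emb_norm <= tol:
                continue  # skip degenerate embeddings
            score = abs(dot(residual, emb)) / (res_norm * emb_norm)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is None:
            break  # residual is orthogonal to every concept
        emb = concepts[best_name]
        coef = dot(residual, emb) / dot(emb, emb)
        residual = [r - coef * e for r, e in zip(residual, emb)]
        chosen.append((best_name, coef))
    return chosen, residual


# Toy stand-ins for text embeddings (assumed, not real CLIP outputs).
toy_concepts = {
    "striped": [1.0, 0.0, 0.0],
    "furry": [0.0, 1.0, 0.0],
    "winged": [0.0, 0.0, 1.0],
}
toy_weight = [2.0, 0.5, 0.0]
picked, leftover = greedy_concept_pursuit(toy_weight, toy_concepts, steps=2)
```

In this toy setup the concepts are orthogonal, so two steps explain the weight vector completely. With real, correlated embeddings a residual would usually remain, and the concept closest to it is a candidate for a concept the user's list is missing.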

使闭塞 posted on 2025-3-31 15:31:35

http://reply.papertrans.cn/25/2424/242301/242301_58.png

捐助 posted on 2025-3-31 19:57:09

Missing Modality Prediction for Unpaired Multimodal Learning via Joint Embedding of Unimodal Models: …cts the missing embedding through prompt tuning, leveraging information from available modalities. We evaluate our approach on several multimodal benchmark datasets and demonstrate its effectiveness and robustness across various scenarios of missing modalities.

metropolitan posted on 2025-4-1 00:13:11

Improving Diffusion Models for Authentic Virtual Try-on in the Wild: … layer. In addition, we provide detailed textual prompts for both garment and person images to enhance the authenticity of the generated visuals. Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity. Our experimental
View full version: Titlebook: Computer Vision – ECCV 2024; 18th European Confer Aleš Leonardis, Elisa Ricci, Gül Varol Conference proceedings 2025 The Editor(s) (if applic