Sedative posted on 2025-3-30 11:36:06
[image reply]

CRUMB posted on 2025-3-30 14:49:23
[image reply]

无能性 posted on 2025-3-30 19:12:08
[image reply]

invert posted on 2025-3-30 23:45:28
[image reply]

accomplishment posted on 2025-3-31 03:17:34
[image reply]

CUR posted on 2025-3-31 07:00:33
VideoMamba: State Space Model for Efficient Video Understanding
…o domain. The proposed VideoMamba overcomes the limitations of existing 3D convolutional neural networks (CNNs) and video transformers. Its linear-complexity operator enables efficient long-term modeling, which is crucial for high-resolution, long-video understanding. Extensive evaluations reveal …

Videoconference posted on 2025-3-31 10:23:45
SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
…methods have achieved great success by either following the alignment-and-fusion pipeline or utilizing attention mechanisms. However, their large computation cost and inference delay hinder deployment on resource-limited devices. In this paper, to achieve better efficiency, a novel Selective Al…

栖息地 posted on 2025-3-31 14:39:27
Heterogeneous Graph Learning for Scene Graph Prediction in 3D Point Clouds
…either exploit context information or emphasize knowledge priors to model the scene graph in a fully-connected homogeneous graph framework. However, these methods may lead to indiscriminate message passing among graph nodes (i.e., objects), resulting in sub-optimal performance. In this paper, we propose …

DEMN posted on 2025-3-31 17:44:30
Reason2Drive: Towards Interpretable and Chain-Based Reasoning for Autonomous Driving
…ning tasks essential for highly autonomous vehicle behavior. Despite their potential, research in autonomous systems is hindered by the lack of datasets with annotated reasoning chains that explain the decision-making processes in driving. To bridge this gap, we present Reason2Drive, a benchmark dataset …

CLEFT posted on 2025-4-1 01:10:27
Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
…ness to distribution shifts of 2D images. However, their robustness under 3D viewpoint variations is still limited, which can hinder development for real-world applications. This paper successfully addresses this concern while preserving VLPs' original performance by breaking through two primary obs…
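A side note on the VideoMamba post above: its efficiency claim rests on the linear-time recurrence of state-space models (SSMs), which process a sequence in a single pass instead of computing all pairwise interactions as self-attention does. Below is a minimal sketch of a generic discretized SSM scan; this is not VideoMamba's actual selective operator, and all names and matrix choices here are illustrative.

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Generic discretized state-space recurrence:
        x_t = A @ x_{t-1} + B @ u_t
        y_t = C @ x_t
    One pass over the sequence, so cost grows linearly with length T,
    unlike self-attention's quadratic T^2 pairwise interactions.
    """
    d_state = A.shape[0]
    x = np.zeros(d_state)          # hidden state carried across time steps
    ys = []
    for u_t in u:                  # single O(T) sweep
        x = A @ x + B @ u_t        # state update
        ys.append(C @ x)           # readout
    return np.array(ys)

# Toy usage: T=1000 scalar inputs, 4-dimensional hidden state.
T = 1000
u = np.random.randn(T, 1)
A = 0.9 * np.eye(4)                # stable decay dynamics (illustrative)
B = np.ones((4, 1))
C = np.ones((1, 4))
y = ssm_scan(u, A, B, C)
print(y.shape)                     # one output per input step
```

The point of the sketch is the loop shape: the state `x` is a fixed-size summary of everything seen so far, which is what lets SSM-style operators handle long, high-resolution video token sequences without the quadratic memory of attention.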