征税 发表于 2025-4-1 03:16:56
,BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos, allows the model to focus on desirable regions, enabling precise refinement of moment predictions. Further, we propose a quality-based ranking method, ensuring that proposals with high localization qualities are prioritized over incomplete ones. Experiments on three benchmarks validate the effectivCounteract 发表于 2025-4-1 07:56:09
http://reply.papertrans.cn/25/2424/242319/242319_62.png