MemDreamer: Decoupling Perception and Reasoning for Long Video Understanding via Hierarchical Graph Memory and Agentic Retrieval Mechanism Paper • 2606.07512 • Published 13 days ago • 38
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration? Paper • 2606.01247 • Published 18 days ago • 31
Eliciting Complex Spatial Reasoning in MLLMs through Wide-Baseline Matching Paper • 2606.03577 • Published 16 days ago • 16
AGILE: Hand-Object Interaction Reconstruction from Video via Agentic Generation Paper • 2602.04672 • Published Feb 4 • 1