MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models Paper • 2506.04688 • Published Jun 5, 2025 • 3
Does Audio Matter for Modern Video-LLMs and Their Benchmarks? Paper • 2509.17901 • Published Sep 22, 2025
MambaMia: A State-Space-Model-Based Compression for Efficient Video Understanding in Large Multimodal Models Paper • 2506.13564 • Published Jun 16, 2025
K-BrowseComp: A Web Browsing Agent Benchmark Grounded in Korean Contexts Paper • 2606.02404 • Published 10 days ago • 55
Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging Paper • 2606.01717 • Published 10 days ago • 21
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published Mar 16 • 154