DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation Paper • 2601.09688 • Published 4 days ago • 109
The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning Paper • 2601.06002 • Published 9 days ago • 47
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis Paper • 2601.05808 • Published 9 days ago • 35
Agentic Rubrics as Contextual Verifiers for SWE Agents Paper • 2601.04171 • Published 11 days ago • 10
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 20 days ago • 94
Scaling Laws for Code: Every Programming Language Matters Paper • 2512.13472 • Published Dec 15, 2025 • 10
GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators Paper • 2512.19682 • Published 27 days ago • 15
DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published Dec 3, 2025 • 154
PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing Paper • 2512.02589 • Published Dec 2, 2025 • 69
Revisiting the Necessity of Lengthy Chain-of-Thought in Vision-centric Reasoning Generalization Paper • 2511.22586 • Published Nov 27, 2025 • 6
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 291
REASONEDIT: Towards Reasoning-Enhanced Image Editing Models Paper • 2511.22625 • Published Nov 27, 2025 • 46
PRInTS: Reward Modeling for Long-Horizon Information Seeking Paper • 2511.19314 • Published Nov 24, 2025 • 6
Budget-Aware Tool-Use Enables Effective Agent Scaling Paper • 2511.17006 • Published Nov 21, 2025 • 30
M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark Paper • 2511.17729 • Published Nov 21, 2025 • 16
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published Nov 19, 2025 • 95
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe Paper • 2511.16334 • Published Nov 20, 2025 • 92
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 108