Tencent-IMO
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning
upvoted a paper 4 months ago
VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning
upvoted a paper 4 months ago
CLUE: Non-parametric Verification from Experience via Hidden-State Clustering
Organizations
None yet