Hugging Face
sian cao
sonald
1 follower · 3 following
AI & ML interests
AI, big data, OS
Recent Activity
upvoted a paper 1 day ago
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
upvoted an article 14 days ago
Deriving the DPO Loss from First Principles
upvoted an article 16 days ago
Deriving the PPO Loss from First Principles
Organizations
spaces
2
Sleeping
Calculator Tool
Answer questions using a simple tool
Runtime error
Chatdemo
models: 0 (none public yet)
datasets: 0 (none public yet)