VibeThinker: a 3B-parameter model that beats Opus 4.5 on reasoning with novel SFT+GRPO

⚓ IT    📅 2026-06-23    👤 surdeus
