Info: This post is auto-generated from the Hacker News RSS feed. Source: "VibeThinker: 3B param model that beats Opus 4.5 on reasoning with novel SFT+GRPO"