Study identifies weaknesses in how AI systems are evaluated
📅 2025-11-08 ⚓ Hacker News