Super Mario AI Showdown: Claude 3.5 Crushes the Competition
It turns out Super Mario Bros. is the ultimate challenge for artificial intelligence. Researchers recently pitted top AI models against each other in the Mushroom Kingdom, and the results were eye-opening.
Claude 3.5 emerged as the clear winner, outperforming Google’s Gemini and OpenAI’s GPT-4 in both score and gameplay smoothness.
How the AI Played Mario
The AI models weren’t playing the classic Mario game we grew up with. Instead, they used the “Gaming agent” emulator, a system that allows AI to control Mario in real-time through Python commands based on live screenshots.
Why is Super Mario Bros. so Hard for AI?
Real-time reactions are crucial. Strong reasoning-based models like GPT-4 were too slow to be truly effective. A delay of even a second could send Mario plummeting to his doom. This shows that speed is as important as intelligence in real-time gameplay.
What This Means for the Future of AI
This new benchmark is reshaping our understanding of AI’s capabilities. Unlike turn-based games like Pokémon, watching AI play Mario is more than just entertainment. It’s a fascinating glimpse into how AI can handle complex, split-second decisions in real-world scenarios.
Watch this video on Youtube