Anthropic used Pokémon to benchmark its newest AI model

Last updated: February 24, 2025 7:47 pm

By MT HANNACH



Anthropic used Pokémon to benchmark its new AI model. Yes, really.

In a blog post published Monday, Anthropic said that it tested its latest model, Claude 3.7 Sonnet, on the Game Boy classic Pokémon Red. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and navigate around the screen, allowing it to play Pokémon continuously.
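To make that setup concrete, here is a rough sketch of what such a loop might look like: read the screen, pick a button, press it, and remember recent moves. This is not Anthropic's harness, whose details the company has not published here. Every name in it (`FakeEmulator`, `Agent`, `toy_policy`, the button list) is hypothetical, the "screen" is a 3x3 grid rather than real Game Boy pixels, and a hard-coded rule stands in for the model call.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List

# Hypothetical button set; the real tool schema is an assumption.
BUTTONS = ("up", "down", "left", "right", "a", "b", "start", "select")

# How each directional button moves the player on the toy grid.
MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, -1), "down": (0, 1)}


class FakeEmulator:
    """Stand-in for a Game Boy emulator: tracks a player on a 3x3 grid."""

    def __init__(self) -> None:
        self.x, self.y = 0, 0

    def screenshot(self) -> List[List[int]]:
        # The "pixels": a grid of zeros with the player's cell set to 1.
        grid = [[0] * 3 for _ in range(3)]
        grid[self.y][self.x] = 1
        return grid

    def press(self, button: str) -> None:
        if button not in BUTTONS:
            raise ValueError(f"unknown button: {button}")
        dx, dy = MOVES.get(button, (0, 0))
        # Clamp to the grid so the player cannot walk off-screen.
        self.x = min(2, max(0, self.x + dx))
        self.y = min(2, max(0, self.y + dy))


@dataclass
class Agent:
    # choose_button plays the role of the model: (pixels, memory) -> button.
    choose_button: Callable[[List[List[int]], List[str]], str]
    # "Basic memory": only the five most recent button presses are kept.
    memory: Deque[str] = field(default_factory=lambda: deque(maxlen=5))
    actions_taken: int = 0

    def step(self, emulator: FakeEmulator) -> str:
        pixels = emulator.screenshot()
        button = self.choose_button(pixels, list(self.memory))
        emulator.press(button)  # the "function call" that presses a button
        self.memory.append(button)
        self.actions_taken += 1
        return button


def toy_policy(pixels: List[List[int]], memory: List[str]) -> str:
    # Placeholder for a model call: walk right to the edge, then walk down.
    if any(row[-1] == 1 for row in pixels):
        return "down"
    return "right"


if __name__ == "__main__":
    emulator = FakeEmulator()
    agent = Agent(choose_button=toy_policy)
    for _ in range(4):
        agent.step(emulator)
    print(agent.actions_taken, list(agent.memory), (emulator.x, emulator.y))
```

In a real harness the policy would presumably send the screenshot and memory to the model and parse a button press out of its response; the counter mirrors the kind of action tally Anthropic reported.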

A unique feature of Claude 3.7 Sonnet is its ability to engage in “extended thinking.” Like OpenAI's o3-mini and DeepSeek's R1, Claude 3.7 Sonnet can “reason” through difficult problems by applying more computing and taking more time.

That came in handy in Pokémon Red, apparently.

Compared to a previous version of Claude, Claude 3.0 Sonnet, which failed to leave the house in Pallet Town where the story begins, Claude 3.7 Sonnet successfully battled three Pokémon gym leaders and won their badges.

Anthropic Pokémon Red. **Image credits:** Anthropic

Now, it is not clear how much computing was required for Claude 3.7 Sonnet to reach those milestones, or how long each took. Anthropic only said that the model performed 35,000 actions to reach the last gym leader, Surge.

It surely won't be long before some enterprising developer finds out.

Pokémon Red is more of a toy benchmark than anything. However, there is a long history of games being used for AI benchmarking purposes. In the past few months alone, a number of new apps and platforms have cropped up to test models' game-playing abilities on titles ranging from Street Fighter to Pictionary.
