Debates over AI benchmarking have reached Pokémon

  • staff
  • AI
  • April 14, 2025

Not even Pokémon is safe from AI benchmarking controversy. Last week, a post on X went viral, claiming that Google’s latest Gemini model surpassed Anthropic’s flagship Claude model in the original Pokémon video game trilogy. Reportedly, Gemini had reached Lavender Town in a developer’s Twitch stream; Claude was stuck at Mount Moon as of late […]


