Meta’s benchmarks for its new AI models are a bit misleading

  • staff
  • AI
  • April 6, 2025

One of the new flagship AI models Meta released on Saturday, Maverick, ranks second on LM Arena, a test that has human raters compare the outputs of models and choose which they prefer. But it seems the version of Maverick that Meta deployed to LM Arena differs from the version that’s widely available to developers. […]


