Even some of the best AI can’t beat this new benchmark

  • staff
  • AI
  • January 24, 2025

The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides data labeling and AI development services, have released a challenging new benchmark for frontier AI systems. The benchmark, called Humanity’s Last Exam, includes thousands of crowdsourced questions touching on subjects like mathematics, the humanities, and the natural sciences. To make […]

© 2024 TechCrunch. All rights reserved. For personal use only.
