Crowdsourced AI benchmarks have serious flaws, some experts say

  • staff
  • AI
  • April 22, 2025

AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models. But some experts say that there are serious problems with this approach from an ethical and academic perspective. Over the past few years, labs including OpenAI, Google, and Meta have turned to […]
