The Laude Institute just dropped its first batch of Slingshots grants, targeting one of AI's thorniest problems: how to actually measure what these systems can do. As AI capabilities spread across every sector, the industry is still wrestling with that fundamental question, and the accelerator program is betting that better evaluations are part of the answer.
The institute announced 15 projects on Thursday, each tackling a different piece of the AI evaluation puzzle. Unlike traditional academic grants that leave researchers scrambling for compute, Slingshots offers the full package: funding, serious compute power, and dedicated engineering support that most university labs can only dream of.
The catch? Recipients need to deliver something concrete, whether that's a startup, open-source code, or another tangible artifact. It's a hybrid model that bridges the gap between academic research and Silicon Valley's move-fast mentality.
Several projects in the cohort should ring bells for anyone following AI development. Terminal Bench is back with its command-line coding benchmark, while the ARC-AGI project continues its long-running quest to create meaningful AGI tests.
But the really interesting action is happening with the newer approaches. Formula Code, a collaboration between Caltech and UT Austin researchers, is building evaluations specifically for AI agents' code optimization skills. Meanwhile, Columbia's BizBench wants to create comprehensive benchmarks for "white-collar AI agents" - the kind that might soon be handling your expense reports or client emails.
The star power extends beyond just the projects. SWE-Bench co-founder John Boda Yang is leading CodeClash, a dynamic competition-based framework that builds on his previous success in AI code evaluation. Yang's worried about something that should keep the entire industry up at night: benchmarks becoming proprietary company tools rather than shared scientific standards.
"I do think people continuing to evaluate on core third-party benchmarks drives progress," Yang told TechCrunch. "I'm a little bit worried about a future where benchmarks just become specific to companies."








