OpenAI just dropped a bombshell benchmark showing GPT-5 matches or beats human experts on 40.6% of the professional tasks it was tested on across nine key industries. The GDPval test, spanning healthcare to finance, marks the closest AI has come to matching human economic output - a milestone on the road to artificial general intelligence that could reshape how millions of professionals work.
OpenAI just fired the latest shot in the AI arms race, and this one hits close to home for working professionals everywhere. The company's new GDPval benchmark reveals that GPT-5 now performs at or above human expert levels in over 40% of professional tasks - a dramatic leap that brings artificial general intelligence uncomfortably close to reality.
The test results landed Thursday with carefully aimed force. OpenAI pitted its GPT-5-high model against seasoned professionals across 44 occupations, from investment bankers crafting competitor analyses to nurses documenting patient care. The AI didn't just participate - it won or tied with human experts 40.6% of the time.
That number becomes even more striking when you consider the trajectory. GPT-4o, released just 15 months ago, managed only a 13.7% success rate. "The rate of progress is really encouraging," OpenAI evaluations lead Tejal Patwardhan told TechCrunch. The nearly threefold improvement, from 13.7% to 40.6%, suggests we're not looking at incremental gains but at rapid strides toward human-level AI.
Anthropic inadvertently stole some thunder here - their Claude Opus 4.1 actually outscored GPT-5 with a 49% success rate. But OpenAI quickly threw shade, suggesting Claude's advantage comes from "pleasing graphics" rather than raw analytical capability. It's the kind of diplomatic slight that reveals how intensely these companies are watching each other's benchmarks.
The GDPval test itself represents a clever approach to measuring AI's economic impact. Rather than abstract academic problems, OpenAI focused on the nine industries that contribute most to America's GDP - healthcare, finance, manufacturing, government, and others. They asked experienced professionals to blindly compare AI-generated reports with human work, then vote for the winner.
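To make the headline metric concrete, here is a minimal sketch of how a "won or tied" rate could be tallied from blind votes like the ones described above. The vote labels (`"ai"`, `"human"`, `"tie"`), the function name, and the sample data are all hypothetical; this is an illustration of the arithmetic, not OpenAI's actual grading code or GDPval data.

```python
from collections import Counter


def win_or_tie_rate(votes):
    """Return the share of blind comparisons the AI output won or tied.

    `votes` is a list of hypothetical labels, one per graded task:
    "ai" (grader preferred the AI output), "human", or "tie".
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    return (counts["ai"] + counts["tie"]) / len(votes)


# Made-up votes for illustration only; these are not GDPval results.
sample_votes = ["ai", "human", "tie", "human", "human",
                "ai", "human", "human", "human", "human"]

rate = win_or_tie_rate(sample_votes)
print(f"{rate:.1%}")  # 2 wins + 1 tie out of 10 votes -> 30.0%
```

Counting ties alongside wins is what turns "at or above human expert level" into a single percentage, which is why the figure should not be read as the share of tasks where the AI was strictly better.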
"People in those jobs can now use the model to offload some of their work and do potentially higher value things," OpenAI chief economist Dr. Aaron Chatterji explained in an interview. It's the optimistic spin on what could be an uncomfortable reality for millions of knowledge workers.