The Pentagon is now testing OpenAI's first open-weight models in years on sensitive military computers, marking a significant shift for the ChatGPT maker. The gpt-oss-120b and gpt-oss-20b models can run locally without internet connections, opening doors to air-gapped defense applications. But early results from military contractors suggest OpenAI still lags behind competitors in key capabilities, even as defense officials welcome having options from the industry leader.
When OpenAI quietly released its gpt-oss-120b and gpt-oss-20b models in August, the tech world took notice. But perhaps no sector was more excited than the US military, which suddenly had access to cutting-edge AI that could run on classified systems without internet connections. The gpt-oss models represent OpenAI's first open-weight release since GPT-2, breaking years of closed-source development that locked out government agencies requiring air-gapped operations.
Doug Matty, chief digital and AI officer for what the Trump administration now calls the Department of War, tells WIRED the Pentagon plans to integrate generative AI into battlefield systems and back-office functions. The shift couldn't come at a better time for OpenAI, which reversed its military ban last year and signed a $200 million Pentagon deal alongside Elon Musk's xAI, Anthropic, and Google.
Early military testing reveals mixed results. Lilt, an AI translation company working with US intelligence, found OpenAI's models underperform in multilingual tasks and struggle with limited computing power. "They process only text, and the military needs to sort through images and audio," CEO Spence Green explains to WIRED. The company still relies on Meta's Llama and Google's Gemma for core operations.
But other defense contractors are seeing promise. EdgeRunner AI successfully modified gpt-oss models by feeding them military documents, achieving sufficient performance for a virtual personal assistant that doesn't need cloud connectivity. The US Army and Air Force will begin testing EdgeRunner's modified model this month, according to CEO Tyler Saltsman's October research paper.
The performance gap reflects broader industry dynamics. Nicolas Chaillan, former chief software officer for the US Air Force and founder of Ask Sage, argues open-source models "hallucinate and make incorrect predictions more often than the best commercial models."
He describes the quality difference as "going from PhD level to a monkey" in a conversation with WIRED.
Yet demand for independence is driving adoption. Pete Warden, who runs transcription developer Moonshine, says defense contacts have grown wary of big tech dependence after watching how Musk leveraged Starlink to influence government decisions. "Independence from suppliers is key," Warden tells WIRED.
The military's AI platform Ask Sage now offers access to 125 open-source models alongside 25 closed options, giving defense agencies unprecedented choice. Kyle Miller from Georgetown's Center for Security and Emerging Technology notes open models provide "accessibility, control, customizability, and privacy that is simply not available with closed models," which is crucial for drone and satellite applications where internet connectivity isn't guaranteed.
For OpenAI, the military pivot serves multiple strategic purposes. Open models can cultivate expertise in its technology while allowing secretive operations that avoid public scrutiny. The company declined to comment on defense applications, but the timing aligns with its broader push into enterprise and government markets.
The competitive landscape remains fluid. Until the gpt-oss launch, only one major AI company was offering cutting-edge open models to military partners. Now other major AI companies are developing specialized cloud networks for sensitive government tasks, while the gpt-oss release bets on local deployment capabilities.












