UC San Diego's Hao AI Lab just got its hands on one of NVIDIA's most powerful systems, and it's already reshaping how researchers think about serving large language models in production. The lab, which has quietly influenced how companies like NVIDIA architect their AI infrastructure, is now using the DGX B200 to push the boundaries of fast, low-latency AI responses. The research coming out of UC San Diego doesn't stay in the lab; it's already powering real-world systems.
The new DGX B200 is housed at the university's San Diego Supercomputer Center, giving researchers immediate access to enterprise-grade computing power that most academic labs can only dream about.
Here's what makes this significant: the Hao AI Lab isn't just consuming AI infrastructure, it's designing how that infrastructure should work. The lab's research on DistServe, a novel approach to serving large language models, directly influenced the architecture of NVIDIA Dynamo, an open-source inference framework now deployed in production systems worldwide. The new DGX B200 gives the team the hardware to push that research even further.
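To make the DistServe idea concrete: its core insight is disaggregation, splitting LLM inference into a prefill phase (processing the prompt) and a decode phase (generating tokens one at a time), each running on its own pool of GPUs so the two can be scaled and scheduled independently. The toy sketch below illustrates that hand-off with plain queues; every name in it is illustrative, not DistServe's or Dynamo's actual API.

```python
# Toy illustration of disaggregated LLM serving: prefill and decode run in
# separate worker pools and hand off a KV cache between them. Purely
# illustrative; not DistServe's or NVIDIA Dynamo's real interface.
from dataclasses import dataclass, field
from queue import Queue

@dataclass
class Request:
    prompt: str
    kv_cache: list = field(default_factory=list)  # filled by prefill
    output: list = field(default_factory=list)    # filled by decode

def prefill_worker(inbox: Queue, decode_inbox: Queue):
    """Process the whole prompt once, then hand the KV cache to decode."""
    while not inbox.empty():
        req = inbox.get()
        req.kv_cache = [f"kv({tok})" for tok in req.prompt.split()]
        decode_inbox.put(req)

def decode_worker(inbox: Queue, max_new_tokens: int = 3):
    """Generate tokens one at a time using the transferred KV cache."""
    done = []
    while not inbox.empty():
        req = inbox.get()
        for i in range(max_new_tokens):
            req.output.append(f"tok{i}")  # stand-in for real sampling
        done.append(req)
    return done

prefill_q, decode_q = Queue(), Queue()
prefill_q.put(Request(prompt="serve models fast"))
prefill_worker(prefill_q, decode_q)
finished = decode_worker(decode_q)
```

The design point is that prefill is compute-bound and decode is memory-bound, so giving each phase its own workers lets operators provision for each bottleneck separately instead of sharing one GPU pool.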
"DGX B200 is one of the most powerful AI systems from NVIDIA to date, which means that its performance is among the best in the world," said Hao Zhang, an assistant professor at UC San Diego's Halıcıoğlu Data Science Institute. "It enables us to prototype and experiment much faster than using previous-generation hardware." Translation: the team can now test more ideas, faster, with more compute, which is exactly what drives breakthrough research.
Two projects are already moving forward at full speed. FastVideo focuses on training video generation models that can produce a five-second video from a text prompt in roughly five seconds. That's the kind of real-time capability that changes what's possible in production systems. The team's also tapping NVIDIA H200 GPUs alongside the DGX B200 for this research phase, effectively throwing serious horsepower at the problem.
The second project, Lmgame-Bench, sounds deceptively playful: it's a benchmarking suite that tests large language models using popular video games like Tetris and Super Mario Bros. But there's serious research underneath. By measuring how different LLMs handle game-playing tasks, the team can compare model performance in real-time gameplay scenarios. Users can test a single model or pit two against each other, giving researchers a new dimension for understanding model capabilities beyond traditional benchmarks.
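To see how a head-to-head game benchmark can stay fair, here's a minimal sketch of the kind of harness the description implies: both models play episodes seeded identically, so score differences reflect the policy, not luck. All names below are hypothetical; this is not Lmgame-Bench's actual interface.

```python
# Illustrative head-to-head game benchmark in the spirit of Lmgame-Bench.
# Hypothetical harness; not the project's real API.
import random

def play_episode(model, seed: int) -> int:
    """Run one game episode and return the score the model achieved."""
    rng = random.Random(seed)      # seeded so rival models see the same game
    score = 0
    for _ in range(10):            # ten moves per episode
        move = model(rng.random()) # model maps a game state to a move
        score += 1 if move == "good" else 0
    return score

def compare(model_a, model_b, episodes: int = 5):
    """Score two models on identical seeds so their episodes are comparable."""
    totals = {"a": 0, "b": 0}
    for seed in range(episodes):
        totals["a"] += play_episode(model_a, seed)
        totals["b"] += play_episode(model_b, seed)
    return totals

always_good = lambda state: "good"                     # stand-in for a strong policy
coin_flip = lambda state: "good" if state > 0.5 else "bad"
result = compare(always_good, coin_flip)
```

Seeding each episode identically for both contestants is the key fairness choice; it turns a noisy game into a paired comparison, which is what makes game scores usable as a benchmark signal.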