A new AI infrastructure player just emerged at NVIDIA GTC 2026 with a bold pitch: give robots and wearables the ability to remember what they see. Memories.ai is building what it calls a large visual memory model, designed to index and retrieve video-recorded memories for physical AI systems. The announcement positions the startup at the intersection of two hot markets - AI infrastructure and the exploding physical AI sector that includes everything from humanoid robots to AI-powered glasses.
Memories.ai is tackling one of physical AI's thorniest problems: how do you give a robot or wearable device the ability to remember and retrieve what it has seen? The startup's answer, unveiled at NVIDIA GTC 2026, is a large visual memory model that treats video recordings like a searchable database of experiences.
The challenge is more complex than it sounds. While large language models have conquered text and modern computer vision systems can identify objects in real-time, building a system that can efficiently store, index, and retrieve visual memories across hours or days of continuous recording remains largely unsolved. That's the gap Memories.ai is betting on.
According to TechCrunch, the platform is specifically designed for physical AI applications - think AI-powered glasses that need to remember where you left your keys, or warehouse robots that must recall the layout of inventory from previous shifts. The infrastructure layer sits between the camera sensor and the AI application, handling the heavy lifting of turning continuous video streams into queryable memory.
The timing couldn't be better. The physical AI market is exploding as companies race to embed intelligence into everything from factory robots to consumer wearables. Meta's Ray-Ban smart glasses already let users ask questions about what they're looking at, while startups like Humane and Rabbit have launched AI-powered devices that record and process the world around them. But most of these systems lack persistent, searchable visual memory - they process what they see in the moment, then forget it.












