Google just opened the doors to Project Genie, its experimental AI tool that turns text prompts and photos into explorable game worlds. Starting Thursday, Google AI Ultra subscribers in the U.S. get early access to the prototype, which fuses Google DeepMind's latest Genie 3 world model with image generator Nano Banana Pro and Gemini. The move signals Google's aggressive push into the world model race as competitors like World Labs and Runway accelerate their own launches.
Google DeepMind is betting that the future of AI isn't just about generating images or video - it's about creating entire worlds you can step into and explore. Project Genie, now available to Google AI Ultra subscribers in the U.S., represents the company's first major consumer play in the emerging world model wars.
The experimental prototype combines three of Google's AI technologies: the Genie 3 world model unveiled last August, the Nano Banana Pro image generator, and Gemini. Together, they transform simple text descriptions or uploaded photos into interactive 3D environments that users can navigate in first or third person. Want to explore a marshmallow castle floating in the clouds? Done. A claymation wonderland with chocolate rivers? Project Genie will generate it in seconds.
"I think it's exciting to be in a place where we can have more people access it and give us feedback," Shlomi Fruchter, research director at DeepMind, told TechCrunch in an interview, barely containing his enthusiasm about the launch.
The timing isn't accidental. Five months after Genie 3's research preview, Google is racing to gather user feedback and training data while competitors circle. Fei-Fei Li's World Labs released its first commercial product, Marble, late last year. AI video startup Runway launched its own world model in December. And AMI Labs, the new venture from former Meta chief AI scientist Yann LeCun, is building world models from the ground up.
World models - AI systems that create internal representations of environments and predict future states - have become the industry's latest obsession. Many researchers, including DeepMind's leadership, view them as critical stepping stones toward artificial general intelligence. But the near-term business case is clearer: video games, entertainment experiences, and eventually training robots in simulation before they touch the real world.
Project Genie's interface is straightforward. You start with a "world sketch" by typing prompts for both the environment and a main character. Nano Banana Pro generates an initial image that you can tweak - though the modifications don't always stick. Ask for green hair and you might get purple. Feed the system a photo of your office and it'll rearrange the furniture in ways that feel sterile and video-game-like rather than photorealistic.
Once you approve the image, Genie 3 takes over, generating an explorable world in seconds. You navigate with the WASD and arrow keys and jump with the spacebar. You get 60 seconds of world generation and exploration - a hard limit driven by compute costs.
"The reason we limit it to 60 seconds is because we wanted to bring it to more users," Fruchter explained. "Basically when you're using it, there's a chip somewhere that's only yours and it's being dedicated to your session."
Because Genie 3 uses an auto-regressive architecture - generating each frame based on previous ones - it demands dedicated compute resources that don't come cheap. Extending sessions beyond a minute would either blow DeepMind's budget or drastically limit how many people could access the tool.
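To make the idea concrete, here is a minimal toy sketch of an autoregressive loop with a hard session cap. It is not Genie 3's implementation: the `next_frame` function, the `ACTIONS` key mapping, the context window size, and the 24 frames-per-second default are all illustrative assumptions. A "frame" here is just a character position rather than an image. The point is only the shape of the loop: each new frame depends on earlier frames plus the latest key press, so frames cannot be produced in parallel or ahead of time, and the session simply stops once the frame budget runs out.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Frame = Tuple[int, int]  # toy "frame": only the character's (x, y) position

# Hypothetical key-to-movement mapping, for illustration only.
ACTIONS = {"w": (0, 1), "s": (0, -1), "a": (-1, 0), "d": (1, 0)}


def next_frame(history: Deque[Frame], action: str) -> Frame:
    """Toy stand-in for a world model: build the next frame from prior frames.

    A real model would condition on the whole context window; this toy
    version only reads the most recent frame and the current action.
    """
    x, y = history[-1]
    dx, dy = ACTIONS.get(action, (0, 0))  # unknown keys leave the position unchanged
    return (x + dx, y + dy)


def run_session(
    actions: Iterable[str],
    fps: int = 24,          # assumed frame rate
    max_seconds: int = 60,  # session cap described in the article
    context: int = 8,       # assumed number of past frames kept as context
) -> List[Frame]:
    """Generate frames one at a time until the actions or the budget run out."""
    budget = fps * max_seconds
    history: Deque[Frame] = deque([(0, 0)], maxlen=context)  # starting frame
    frames: List[Frame] = []
    for action in actions:
        if len(frames) >= budget:
            break  # hard stop: the session's compute budget is spent
        frame = next_frame(history, action)
        history.append(frame)  # the new frame becomes context for the next step
        frames.append(frame)
    return frames
```

Under these assumed numbers, a 60-second session amounts to 1,440 sequential generation steps, each waiting on the one before it, which is one way to see why every active user ties up hardware for the full length of a session.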
The experience reveals both promise and rough edges. Project Genie shines with whimsical, artistic prompts. Claymation castles with puffy marshmallow towers and chocolate moats materialize beautifully. Anime-style forests and watercolor landscapes look genuinely impressive. But photorealistic worlds consistently fall flat, coming out more like dated video game graphics than cinematic experiences.
Physics and interactions present another challenge. Characters routinely walk through walls and solid objects. Navigation feels clunky, with keys often unresponsive or sending you careening in unintended directions. One attempt to walk across a room devolved into chaotic zigzagging, like steering a shopping cart with a broken wheel.
Safety guardrails are firmly in place, following the cease-and-desist letter Disney sent Google in December accusing the company's AI models of copyright infringement. Project Genie blocks anything resembling Disney IP - even innocent prompts for mermaids or ice queens get rejected. Nudity and other sensitive content are similarly off-limits.
Still, the model shows intriguing capabilities. When fed a photo of a desk with a stuffed toy, Project Genie animated the toy moving through the space, with other objects occasionally reacting as it passed. The system's memory generally holds up too - returning to previously generated areas usually shows the same layout, though occasionally a duplicate coffee mug or chair sneaks in.
DeepMind researchers were refreshingly candid about the prototype's limitations. Fruchter acknowledged that improved realism, better physical interactions, and more user control over actions and environments are all on the roadmap.
"We don't think about [Project Genie] as an end-to-end product that people can go back to everyday, but we think there is already a glimpse of something that's interesting and unique and can't be done in another way," he said.
The move to open Project Genie to paying subscribers - even in its rough state - reflects the pressure building in the world model space. World Labs raised significant funding with its vision of generating 3D worlds from 2D images. Runway's world model added native audio and expanded creative possibilities. With AMI Labs now in the mix, the race to define this category is heating up fast.
For Google, Project Genie represents both a research experiment and a strategic probe. Every marshmallow castle and claymation forest that users generate feeds data back into DeepMind's training pipelines. Every clunky navigation sequence and wall-clipping glitch reveals where the technology still needs work. And every minute of user engagement tests whether there's genuine consumer appetite for AI-generated worlds beyond the initial novelty.
The 60-second limit might feel restrictive, but it's revealing. If users keep coming back despite the constraints, DeepMind will know it's onto something. If engagement drops off after the first few whimsical experiments, that's a valuable signal too. Either way, Google has moved from research preview to real-world testing faster than most expected - and that urgency says a lot about where the company thinks this technology is headed.
Project Genie's launch marks the move of Google DeepMind's world model work from research-lab curiosity to consumer-facing experiment, warts and all. The 60-second sessions and rough edges reveal a technology still finding its footing, but the underlying ambition is clear. As world models evolve from generating whimsical marshmallow castles to training autonomous robots, Google is betting that early user feedback will give it an edge over fast-moving competitors. For now, Project Genie offers a fascinating glimpse into a future where AI doesn't just create content - it creates entire realities you can step into, explore, and maybe eventually shape in ways we haven't imagined yet.