OpenAI is quietly building a generative music tool that creates songs from text and audio prompts, according to sources familiar with the project. The move puts the ChatGPT maker on a collision course with established players like Suno and Google in the rapidly expanding AI music space, potentially adding another dimension to OpenAI's multimodal AI ambitions.
OpenAI is making its next big bet on generative AI, and this time it's targeting your playlist. The company behind ChatGPT is developing a music generation tool that could transform how creators add soundtracks to videos and build musical compositions, sources tell The Information.
The tool works by taking text descriptions and audio prompts to generate original music - think "create upbeat jazz guitar for a cooking video" or "add string accompaniment to this existing vocal track." It's the kind of functionality that could make Suno, the current leader in AI music generation, very nervous about its market position.
What's particularly interesting is OpenAI's approach to training data. The company has partnered with students from the prestigious Juilliard School to annotate musical scores, creating what could be one of the most sophisticated training datasets in the AI music space. This mirrors OpenAI's strategy with other modalities - invest heavily in high-quality training data to create superior outputs.
The timing makes sense given OpenAI's broader multimodal push. The company recently launched Sora, its text-to-video generator, and having native music generation would create a powerful content creation suite. Imagine prompting: "Create a 30-second product demo video with upbeat background music" and getting both visuals and audio in one go.
But OpenAI isn't starting from scratch here. The company built generative music models years ago, before ChatGPT made OpenAI a household name. Those early experiments have been overshadowed by the text and image generation breakthroughs, but they provided valuable groundwork. More recently, OpenAI has been developing sophisticated audio models focused on speech synthesis and recognition.
The competitive landscape is heating up fast. Google has its own music generation experiments, while companies like Suno have built entire businesses around AI-generated music. Suno can already create full songs with lyrics from simple text prompts, and it has attracted millions of users creating everything from birthday songs to viral TikTok tracks.