What Is OpenAI’s Jukebox and What Can You Do With It?

Generative AI is steadily spreading to ever more disciplines in the creative industry. It kicked off with AI art generators and then spread to writing with AI-generated text. Now, we can add music to that list.

In the near future, AI-generated music, spawned from scratch, will become a reality. In fact, it’s already a possibility with Jukebox, OpenAI’s music-making AI model. It’s not yet available in an easy-to-use application, and it doesn’t sound good enough yet, but the algorithmic bones are there.


Here is what you need to know about OpenAI’s Jukebox and what you can do with it.

Jukebox: AI That Generates Music as Raw Audio

Jukebox is a neural net that can generate music in raw audio form when you give it input like genre, artist, or lyrics. It was released in April 2020 by OpenAI, the same company that brought us the AI art generator DALL-E and the AI chatbot ChatGPT.

Unlike DALL-E, which spread rapidly across the world and made AI a fevered topic of news and media, Jukebox didn’t attract much interest following its release. One reason for this is that it doesn’t have a user-friendly web application, at least not yet.


You can find the code on the OpenAI website, alongside an in-depth explanation of how the encoding and decoding process works.

Another likely reason is that generating music with Jukebox takes an enormous amount of time and computing power. To give you an idea, just one minute’s worth of audio can take 9 hours to render. You will need a willingness to explore the model in its code form, plus a lot of patience, if you want to see what an AI model can do to generate music.
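To put that figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes render time scales linearly with clip length at the 9-hours-per-minute rate quoted above, which is a simplification, since real times depend on hardware and settings. The function name is made up for illustration and is not part of Jukebox’s code.

```python
# Rough estimate only: assumes render time grows linearly with clip length.
HOURS_PER_AUDIO_MINUTE = 9  # figure quoted in this article


def estimated_render_hours(audio_seconds, hours_per_minute=HOURS_PER_AUDIO_MINUTE):
    """Return the approximate hours needed to render a clip of the given length."""
    return audio_seconds / 60 * hours_per_minute


if __name__ == "__main__":
    # Hypothetical clip lengths: a short sample, one minute, and a full song.
    for seconds in (20, 60, 180):
        hours = estimated_render_hours(seconds)
        print(f"{seconds:>4} s of audio -> about {hours:.1f} hours of rendering")
```

By this estimate, a typical three-minute song would keep a machine busy for more than a full day.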


Or, you can skip straight to the Jukebox Sample Explorer. This is where OpenAI has posted the results of its experiments, including songs generated in the likeness of Ella Fitzgerald or 2Pac.

To be clear, other AI music tools exist to help you generate a song, but they don’t generate audio from scratch. Instead, they either combine pre-recorded samples or create MIDI information that is put through a digital synthesizer.
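The gap between those two approaches is easier to appreciate with some rough numbers. The Python sketch below compares how many values describe one minute of music as raw audio versus as MIDI-style note events. Both inputs are assumptions for illustration: a CD-quality sample rate of 44,100 samples per second, and a made-up density of eight notes per second. Neither figure comes from this article.

```python
# Illustrative comparison only; both constants below are assumptions.
SAMPLE_RATE_HZ = 44_100        # assumption: CD-quality mono audio
NOTES_PER_SECOND = 8           # assumption: hypothetical, fairly busy melody


def raw_audio_samples(seconds, sample_rate=SAMPLE_RATE_HZ):
    """Number of individual amplitude values in a mono raw-audio clip."""
    return int(seconds * sample_rate)


def midi_note_events(seconds, notes_per_second=NOTES_PER_SECOND):
    """Number of MIDI-style events, counting a note-on and a note-off per note."""
    return int(seconds * notes_per_second) * 2


if __name__ == "__main__":
    clip_seconds = 60
    samples = raw_audio_samples(clip_seconds)
    events = midi_note_events(clip_seconds)
    print(f"Raw audio samples in one minute: {samples:,}")
    print(f"MIDI-style events in one minute: {events:,}")
    print(f"Raw audio needs roughly {samples // events:,}x more values")
```

Under these assumptions, a model working in raw audio has to predict millions of values per minute, while a MIDI-based tool only has to decide on a few hundred notes and leave the actual sound to a synthesizer.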


What Does Jukebox Sound Like?

The results of Jukebox are recognizable but strange. It’s not difficult to understand the shape of the song and the genre it belongs to, but the quality of the results makes it sound as if you’re listening to some of the earliest recorded music: that is, muffled with plenty of noise.

It’s safe to say, Jukebox doesn’t produce the kind of high-fidelity sound you would hear from a pair of good headphones. It’s more akin to hearing music from a radio station that isn’t fully tuned to the right frequency. Some songs are re-renditions while others are continuations of existing songs. There’s also a category for novel artists and styles, and unseen lyrics.


Despite the poor sound quality, early experimenters describe being awed by the eerie beauty and bizarre nature of the music created by Jukebox. “Like a soundtrack to documentation about an unknown country with an unknown culture,” writes Merzmench on Medium.

Currently, the results are far from good enough to copy, or even replace, music created by humans, but the technology is moving rapidly and, soon enough, models like Jukebox will be able to accomplish those feats too.

How OpenAI’s Jukebox Was Trained

Jukebox can create music that has never existed before partly because it was trained on the music of real musicians. OpenAI explains that:

“To train this model, we crawled the web to curate a new dataset of 1.2 million songs (600,000 of which are in English), paired with the corresponding lyrics and metadata from LyricWiki.”

Crawling for data is a practice used by some AI companies to build a dataset that an AI model can learn from and draw on when generating an image, text, or in this case, music. Datasets created by crawling are controversial because consent isn’t obtained from the owners of the data in the first place, although some platforms allow you to opt your content out of datasets.

You might think that 1.2 million songs is a lot, but by comparison, DALL-E 2 was trained on hundreds of millions of image-text pairs from the internet. With that in mind, Jukebox has its limitations.

Its relatively small training pool can’t capture the wealth and diversity of human music. OpenAI has stated that it’s largely trained on Western music, representing a clear bias in what music it’s capable of generating.

What Can You Do With Jukebox?

So, with its limitations in mind, what can you do with Jukebox? A quick way to answer that question is to say what you can’t do with Jukebox.

Because it takes close to half a day to render one minute of music, it’s not very useful for producing music. At least, not in the traditional sense. Normally, musicians move back and forth between playing around on an instrument (improvising) and planning the structure of a song. The same sort of experimenting isn’t possible with Jukebox.

Since it’s not easy to craft a song with Jukebox at this stage, you can think of it more as a novel way to generate music samples. Once you’ve generated audio that you like, you can use it in your creative projects as you might normally do.

The video below is the result of someone using music created with Jukebox to underscore a short montage video.

Artificial intelligence has a wide range of uses outside creative applications as well, which is why it’s worth understanding what AI is and the dangers it poses.

Are You Moved by AI Music?

The music generated by Jukebox isn’t easy to dismiss, and for all its strangeness and eerie, human-machine quality, it does, in the end, sound like music. While the music industry has been using AI tools for some time now, the ability to generate music as raw audio has only recently become a reality.

But while models like Jukebox exist, they have yet to be packaged into a commercial tool and still fall short of the capabilities of human musicians.

