
OpenAI's artificial intelligence has learned to write songs with vocals

OpenAI has unveiled Jukebox, an open-source artificial intelligence system that can generate complete songs with music, meaningful lyrics and vocals.

Researchers have trained Jukebox on 1.2 million pieces of music in almost all genres (for example, ristenk). Now the AI can create its own songs, which are often similar to the works of the artists it was trained on. Jukebox is able to mimic a particular genre of music and can recreate the style of a particular artist.


"Our models can create songs from a wide variety of musical genres, such as rock, hip-hop and jazz. They can mimic the melody, rhythm and sound of a wide variety of instruments, as well as vocals that will go along with the music."

OpenAI has been working on creating music for several years. The company's previous development, MuseNet, was capable of creating MIDI tracks, but before Jukebox there was no AI that could write songs with full vocal parts in different genres.

According to The Next Web, despite Jukebox's superiority over other music neural networks, the project is far from perfect. The AI still lacks the skill to reproduce a standard song with choruses and repeating motifs. In addition, Jukebox requires huge computational resources. Because of this, using OpenAI's new development in a home or studio setting is not yet possible.

"We have shared Jukebox with several musicians, and these musicians are not yet able to apply it to their creative process," the company points out.

Some musicians also point out that Jukebox could cause copyright issues.

"The new OpenAI tool, which automatically generates songs in the style of world-famous celebrities and includes reproducing their voices, is not only a technologically impressive and very exciting project, but also a terrifying phenomenon in terms of copyright law. Have Kanye West, Katy Perry, Aretha Franklin, Elvis Presley and other artists given OpenAI permission to use their audio recordings as training material for this algorithm? I don't think so," musician Cherie Hu tweeted.

Nevertheless, the developers hope that in the future Jukebox will be able not only to imitate compositions, but also to create entirely new tracks that will be indistinguishable from the work of real musicians.

For now, Jukebox needs about nine hours to render one minute of a song. You can listen to the compositions written by the AI on OpenAI's website.

Automated music creation with artificial intelligence

As more and more people spend time at home, creating, listening to and using music in various projects is becoming more important in their lives. The early successes in creating, producing and editing music using artificial intelligence are stunning and will accelerate this trend even further.

However, automatically generating music is quite a challenge for many reasons. The biggest obstacle is that a simple three-minute song, which a group of people can easily memorize, contains too many variables for a computer. In addition, there is as yet no perfect way to train an artificial intelligence to be a musician.

And the goal itself is also far from obvious to us developers. Are we trying to create music out of thin air or from some form of input values? Or do we want to create a system that can accompany a person as they play?

We believe there is currently no reason for musicians to worry about their career prospects. It is unlikely that artificial intelligence will fully automate the industry in 2021. But creating professional-quality music will clearly become easier and cheaper in the near future.

Let's take a look at three companies that are trying to automatically generate music, and assess the possibility that a data scientist will soon be awarded the first Grammy.

OpenAI's Jukebox - for the futurists

OpenAI is one of the companies co-founded by Elon Musk, the deer-loving inventor of rockets and cars. OpenAI has several creative projects, the most notable of which is GPT-3, which generates text. But as a music lover, I've given a special place in my heart to Jukebox.

"We're introducing Jukebox, a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles."
- OpenAI

The basic idea is to take raw audio and encode it using convolutional neural networks (CNNs). Think of it as a way to compress a large number of variables down to a smaller number. This step is necessary because a single second of audio contains 44,100 samples, and a song lasts many seconds. The model then generates music in this reduced representation and decompresses the result back up to 44,100 samples per second.
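To get a feel for the numbers, here is a minimal sketch of the arithmetic in Python. The 44,100 samples per second come from the text; the compression factor of 128 and the helper names are only assumptions for illustration, not a description of how Jukebox is actually configured.

```python
import math

SAMPLE_RATE = 44_100  # samples per second of audio, as stated in the text


def raw_samples(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of raw audio samples in a clip of the given length."""
    return round(seconds * sample_rate)


def compressed_length(n_samples: int, factor: int) -> int:
    """Sequence length after compressing by `factor` (a hypothetical value)."""
    return math.ceil(n_samples / factor)


song = raw_samples(180)               # a three-minute song
codes = compressed_length(song, 128)  # assumed compression factor
print(f"raw samples: {song:,}, compressed sequence: {codes:,}")
```

Even with an aggressive compression factor, a three-minute song remains a very long sequence, which helps explain why generation is so slow.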

Amper Music - for everyone

A completely different approach is taken by Amper Music. The Amper generator creates the music itself and does not allow a human to control the process. It does this by using so-called descriptors.

"Descriptors are musical algorithms that reproduce a particular style of music. One descriptor might be created to play New York punk rock and another to play laid-back beach folk."
- Amper Music

When generating, you can choose two parameters: the length of the song and a set of characteristics for the descriptor. I chose "playful futuristic documentary," and the result is quite nice and potentially usable. After that, you are asked to choose a set of instruments to go with the descriptor. I settled on forks and knives.

The difference in approach, illustrated with the player piano roll

The approach used in Amper most likely combines the first two solutions, and it's hard for me to speculate exactly how it works.

The main difference between AIVA and Jukebox is the data structure, that is, the way the music is stored. To understand it, we must first understand the difference between an audio recording and the MIDI standard. For our purposes, MIDI can be thought of as a set of rolls for a player piano (pianola), where each roll is essentially a single instrument.

The piano roll is one of the oldest specialized data structures. Its foundation was laid in 1896. Originally developed for automatic piano playing, today it serves as a canvas for music.
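The piano roll idea can be sketched in a few lines of Python. This is a simplified illustration, not the MIDI file format itself (real MIDI files store timed note-on and note-off events); the `Note` class, the `to_piano_roll` and `render` helpers, and the pitch range are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int     # MIDI note number, 60 = middle C
    start: int     # time step at which the note begins
    duration: int  # length of the note in time steps


def to_piano_roll(notes, n_steps, low=60, high=72):
    """Render notes into a grid: one row per pitch, one column per time step."""
    roll = [[0] * n_steps for _ in range(high - low + 1)]
    for note in notes:
        if low <= note.pitch <= high:
            end = min(note.start + note.duration, n_steps)
            for t in range(note.start, end):
                roll[note.pitch - low][t] = 1
    return roll


def render(roll):
    """Draw the roll as text, highest pitch on top, '#' where a hole is punched."""
    return "\n".join(
        "".join("#" if cell else "." for cell in row) for row in reversed(roll)
    )


# One "roll" for one instrument: the notes C, E and G played in sequence.
piano = [Note(60, 0, 2), Note(64, 2, 2), Note(67, 4, 4)]
roll = to_piano_roll(piano, n_steps=8)
print(render(roll))
```

A multi-instrument piece would simply be a collection of such rolls, one per instrument. Compare this handful of cells with the tens of thousands of samples per second that a raw audio recording needs.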
 
