Musico: AI Generated Music

[+] throw149102|3 years ago|reply

To my ear, they all sound about okay for 4 seconds, until my brain recognizes that there's no tension being built or story being told. It's like every track is 4 seconds of music followed by 4 seconds of music followed by 4 seconds of music rather than a track with a real sense of progression.

Many have said in this thread already that maybe we ought to expect that an ML approach in the next few months/years could be much better. I'm not so confident that it will happen so soon. Audio might end up being a much harder problem than visuals, for a variety of different reasons. Having the time domain built into the medium requires some concept of memory, and even modern neural nets seem to struggle to remember what they said before the most recent prompt.

Once again though, it's not impossible. It just requires the right techniques and enough people focused on it.

[+] lm28469|3 years ago|reply

The thing is even if you can make a machine reproduce it, it's missing the human component, and the fact that you (I) know it's not human made already degrades the experience.

What AI gives you is a mash up, a mix of people's intent, a mix of people's feelings. What I want is the result of a singular person expressing his singularity through his work, I don't want the "average of the best" music or the "average of the best picture". This is good for content creation, when you need to pump out the maximum amount of "content" for people to "consume" (see marvel, netflix&co), but not for art.

Art that leaves a mark is always weird/quirky/personal/deep/&c. The fact that a machine can replicate the result removes the most interesting part of the equation, the human part. It's like making your own bread vs buying supermarket bread, the latter is cheaper and faster, it might even taste better if you fucked it up, but it's a completely different experience.

[+] dwringer|3 years ago|reply

Several years ago I implemented some very basic rules outlining some distance metrics between two chords and ran some multiobjective evolutionary algorithms to generate, say, a 16 bar progression while trying to minimize these distances between any two subsequent jumps. I added a couple or three other objective functions for judging the progressions by my idea of structure (i.e., starting and ending on the same chord), and found the results to be very promising.
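A minimal sketch of the kind of setup described above, not the original implementation. Everything concrete here is an assumption made for illustration: the chord vocabulary (the seven diatonic triads of C major), the distance metric (count of pitch classes the two chords do not share), the third "variety" objective (added so that repeating one chord for 16 bars is not the only optimum), and the simple Pareto-based survivor selection, which stands in for whatever multiobjective algorithm was actually used.

```python
import random

# Assumed chord vocabulary: diatonic triads of C major as pitch-class sets.
CHORDS = {
    "C": {0, 4, 7}, "Dm": {2, 5, 9}, "Em": {4, 7, 11}, "F": {5, 9, 0},
    "G": {7, 11, 2}, "Am": {9, 0, 4}, "Bdim": {11, 2, 5},
}
NAMES = list(CHORDS)
BARS = 16


def chord_distance(a, b):
    """Assumed metric: number of pitch classes not shared by the two chords."""
    return len(CHORDS[a] ^ CHORDS[b])


def objectives(prog):
    """Return (smoothness, structure, variety); every component is minimized."""
    smooth = sum(chord_distance(x, y) for x, y in zip(prog, prog[1:]))
    structure = 0 if prog[0] == prog[-1] else 1  # start and end on the same chord
    variety = -len(set(prog))                    # more distinct chords is better
    return (smooth, structure, variety)


def dominates(f, g):
    """True if score f is no worse than g everywhere and better somewhere."""
    return f != g and all(a <= b for a, b in zip(f, g))


def pareto_front(pop):
    """Keep the progressions whose scores no other member dominates."""
    scored = [(objectives(p), p) for p in pop]
    return [p for f, p in scored if not any(dominates(g, f) for g, _ in scored)]


def make_child(a, b, rng):
    """One-point crossover of two parents followed by a single-bar mutation."""
    cut = rng.randrange(1, BARS)
    child = a[:cut] + b[cut:]
    child[rng.randrange(BARS)] = rng.choice(NAMES)
    return child


def evolve(generations=100, pop_size=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(NAMES) for _ in range(BARS)] for _ in range(pop_size)]
    for _ in range(generations):
        children = [make_child(rng.choice(pop), rng.choice(pop), rng)
                    for _ in range(pop_size)]
        # Drop exact duplicates, then let non-dominated progressions survive first.
        unique = list({tuple(p): p for p in pop + children}.values())
        front = pareto_front(unique)
        rest = [p for p in unique if p not in front]
        rng.shuffle(front)
        rng.shuffle(rest)
        pop = (front + rest)[:pop_size]
    return pareto_front(pop)


if __name__ == "__main__":
    for prog in evolve()[:5]:
        print(objectives(prog), " ".join(prog))
```

The result is a set of trade-offs rather than a single "best" progression: some are very smooth but repetitive, others more varied with larger jumps, and picking among them is still a matter of taste.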

With enough of a sophisticated rule system (which could be built from existing music) an AI should be able to optimize for tension building or storytelling quite easily. Of course it will only optimize for the definition of tension building or storytelling that it understands, via statistical methods or being told by the programmer explicitly. In the latter case the programmer is just doing one of the things composers always do, while in the former, the generated content is interesting - or not - in much the same way (IMHO) as transformer-based language generators like GPT-3.

[+] Tenoke|3 years ago|reply

Most real tracks don't have tension buildup or progression. That you judge music based on it mostly just speaks of your preferences. As far as I heard the tracks were coherent and not just 4s snippets glued together. Having said that, I don't think they were exceptional or anything.

[+] Rochus|3 years ago|reply

So far so true. But there are also better examples. Bachbot was rather good, or some of the examples of MuseNet (https://openai.com/blog/musenet/).

[+] PaulDavisThe1st|3 years ago|reply

102% on-point recent video from composer David Bruce ("The DALL-E2 of Music?")

https://www.youtube.com/watch?v=QN0DDD7B3oU

Bruce makes the point that early "AI" image generation was pretty shit, but a couple of specific developments (especially diffusion modelling) changed all that remarkably quickly; the corollary is that we might expect that for music too.

[+] z9znz|3 years ago|reply

Well, from listening to several on the first page, I say musicians and composers have nothing to worry about in terms of AI competition for some time.

As amazing as AI image generation is, AI music generation (to me) is not.

[+] rockarage|3 years ago|reply

Yes, I'm just going to quote my previous comment from 7 years ago (about a similar startup)

https://news.ycombinator.com/item?id=10707049

"This is a business I know well. Jukedeck is an example of how founders and investors do not conduct appropriate market research. There is a limited market demand for low-cost royalty-free music for videos. One could argue there is an oversupply* of royalty-free music relative to buyers. The quality is not good enough to disrupt the billion dollar Production music industry that is top heavy, a relative small amount of creators at the top get the majority of the money, the rest compete for the little that is left. Jukedeck has raised enough money ($3million #) to be around for a few years if they control their burn rate. But Jukedeck in its current form, is just another music startup destined for the Deadpool."

[+] blondin|3 years ago|reply

i am sorry but we (i make music in my spare time) need to worry.

to echo what the comment above is saying. not because this AI generated music is good. but because it is good enough for the vast majority! and this concerns other fields as well. the consumption economy has set the bar for music and entertainment in general very low. most people don't know (or think) that the bar is low. because they have not experienced anything better and probably never will.

what's gonna happen next will be sad.

[+] barbariangrunge|3 years ago|reply

Midjourney improved so much over the summer, it was staggering. Arms no longer grow and point wherever the wind blows them. Tables are covered in identifiable objects instead of nonsense shapes.

The gap between 'okay' and 'really good' is smaller than we think

[+] avodonosov|3 years ago|reply

Maybe you are biased because you know that's AI?

A blind test would be better. A musical Turing test: whether you like the music, and whether you can tell if it's by an AI or a real composer.

[+] babyshake|3 years ago|reply

> for some time.

As we have seen with text-to-image, sometimes AI models can improve at an astonishing rate.

[+] hallux|3 years ago|reply

These things make me sad. Not that the results are anywhere close to good enough to replace a human artist yet, but eventually it'll probably be at a level that's good enough for most people, and that'll be the day when music will be truly disposable. I imagine an endless stream of music equivalent to the average Netflix-produced movie. Perfect for people who want music to play all the time without actively listening to any of it.

[+] garyrob|3 years ago|reply

One of the aspects of Orwell's 1984 that makes it a dystopia is that popular music for the proles is composed by machines...

[+] PaulDavisThe1st|3 years ago|reply

> that'll be the day when music will be truly disposable

I would say we reached that day quite some time ago, due to a variety of factors, none of which included AI/ML/correlation analysis.

[+] frankzander|3 years ago|reply

Music is already disposable today. Just listen to most of the songs in the charts ... senseless crap. Even so, they're always better than this AI crap, at least to my ear. Don't worry, because AI won't replace a big part of music, if it ever does. Maybe AI will be used for generating catchy hook lines, but that's all. The final selection of what's catchy and what's boring will always be made by a human.

[+] bhedgeoser|3 years ago|reply

Now we just need something similar to https://github.com/nogasm with a different kind of "implant" that monitors my dopamine levels and optimizes based on that.

[+] allears|3 years ago|reply

I guess I have no taste. I had a lot of fun playing with the "live" music generator. Some of the settings produced interesting, catchy riffs.

As a musician and a listener, I vastly prefer real instruments played by real musicians, preferably acoustic. So I wouldn't actually listen to this stuff, but then I don't generally listen to any kind of electronic music. But this AI thing generates the same kind of shallow, bland, sometimes catchy stuff I hear all the time that's supposedly made by "creators."

[+] dalmo3|3 years ago|reply

Surprisingly good. Not Good good but I expected a lot worse.

I think there's great potential for that kind of music in video games, if they can procedurally generate it on the fly based on the current game state (think e.g. roguelikes).

[+] justusw|3 years ago|reply

I assume this is using a lot of soft synths and samples? Whatever they are, they sound really cheap. I hope I don't back myself into some Luddite corner here and this later turns out to be really impressive, like we are seeing in the image generation space. But given that we are living in a market of overabundance of stock photos, royalty free music, etc., I am sure that this is not endangering anything anytime soon.

And while comparing it to image generation, with Stable Diffusion and other models, a human has to be in the loop in order to generate the prompts, so we can't entirely replace them here either. How about an AI music generator that creates phrases, rhythms or sounds/VST presets based on a prompt for me?

If an average person can choose between listening to this, or listening to a musician that they have a personal connection to on their favorite streaming service, I wonder which one will be picked most of the time?

[+] anigbrowl|3 years ago|reply

Flawed but interesting. The Experimental page should have a nudge/spin-again option on each section; sometimes the results are musically very good and an interesting segue from the previous section, sometimes they're just kinda ass.

Also I think it's a mistake to just look at music (even electronic dance music) as a series of transformations that journey from A to B. A lot of out-there music relies on fairly safe tonalities and very straightforward musical structures, to provide an anchor for wild timbral experimentation. Listeners can enjoy the departure from conventional sonic reality in somewhat the same way as a theme park roller coaster balances existential terror and reliable predictability.

[+] midenginedcoupe|3 years ago|reply

Wow, this is terrible. Why would you release this in this state?

Edit: Wait, the Streams have had human involvement!?

> streams are "harvested" by human operators: after choosing some initial settings, the operator lets the engine run and collects the score it composes.

> Without changing this score, the operator then produces a sound file for it, by choosing the electronic instruments that play the score, making a simple mix and recording the result, using standard music production software.

Imagine this being the curated best they could do, even after explicitly choosing instruments after the fact to best fit and then mixing it.

[+] BFay|3 years ago|reply

I highly highly recommend listening to the track called "Test_energylevel", it is absolutely bonkers, and more interesting than any of the other tracks I clicked on here. (You have to click on "Explore" and then scroll down a bit to find it)

It starts with a choir of ambient vocals singing "it's a sunshine", there's some bird noises and traffic sounds, a snippet of organ synth flourishes.

Then it all gets started - guitar, sitar, horns, strings, a vocal duet singing "Shine your sweet loving down on me"

There's actually a ton going on, every few bars it changes things up, there's a clever little harpsichord.

Then a male singer starts proclaiming "It's a sunshine daaaayyy", backed up by a chorus of "yeah, yeah, yeah"s. Honestly it's kind of catchy

The last 30 seconds or so are truly cursed, there's a voice in the right speaker moaning "wide eyed retina. mostly logical", which gets delayed, bitcrushed, and pingponged between the speakers.

Wow! I wonder why this specific track has so much going on compared to the others.

[+] djmips|3 years ago|reply

I can't find that track. Any link you might have?

[+] random_upvoter|3 years ago|reply

It's going to be an interesting lesson for humankind to gradually come to the realization that anything worth reading, worth watching or worth listening to cannot be born from an algorithm. Any science that is based on averaging a gazillion things can only produce things that are average at best.

[+] radiojasper|3 years ago|reply

What I learned from this is that AI can't compose music.

[+] Nursie|3 years ago|reply

I wonder how people will react when this does become "good enough".

I'm thinking of those that consider themselves prompt artists or prompt engineers, and consider DALL-E/SD to be merely tools that creatives use to create their art, just like photoshop, and how dare you insinuate that the work isn't my original creation...

If I type in "Symphony, Beethoven, dramatic strings, romantic theme turning dark then triumphant, tenor flute solo in third movement" and such a thing is produced... am I a musician? A classical composer perhaps?

I'm not trying to say that there is no act of creation on the part of the user of such systems, but I do think it's an interesting area of discussion, because to me there is some sort of qualitative difference here.

[+] classichasclass|3 years ago|reply

If folks remember Ballblazer by Lucasfilm Games, Russ Lieblich wrote a music player that would string together riffs like weird chiptune jazz solos. The walking bass was the best part. It was constrained by the template but could come up with some surprising harmonies.

[+] marban|3 years ago|reply

Sounds like a toddler and a Corgi jamming away on a Roland Juno.

[+] bagelbruno|3 years ago|reply

Can we just not.

[+] 6stringmerc|3 years ago|reply

This is a wonderfully concise response and I plan to use it verbatim in the future to evidence my sincere and perhaps bottomless disdain for something. Bravo phrasing.

[+] jzb|3 years ago|reply

Interesting and listenable in a “don’t mind this, but forgettable” way. It’s currently okay as disposable background music.

But it seems very far from generating music that I’d actually connect with. Years, at least, away from creating a solid song, much less something that talented songwriters should worry about.

[+] xbar|3 years ago|reply

It's years, at least, away from creating anything that I'd connect with. But it's only a couple years away from filling all the lobbies and elevators of the world.

[+] catach|3 years ago|reply

"Years away" is shockingly closer than most of us would have assumed just a year ago, I think.

[+] ghoomketu|3 years ago|reply

I think this sort of music is really good for video games, or as repetitive loopy music like in the earlier video games. I even love this during programming sessions, as there is a certain quality to such tracks that helps you gain focus quickly.

[+] Rodeoclash|3 years ago|reply

Interesting, I think it needs better samples though - got a real early 2000s midi feel to it.

[+] omwow|3 years ago|reply

Seems they're trying to corner the market for more annoying elevator music.

99 comments