top | item 33182286

bloep | 3 years ago

Indeed, there is a lot of denial or ignorance in this thread (ignorance in the technical sense). AudioLM already produces impressive results, and it's a tiny fraction of what is possible, because performance simply improves with scale. Music generation could probably be solved today with a ~$1B budget for most purposes, like film or game music, or personalized soundtracks. This is not science fiction.

p1esk | 3 years ago

I don't see a lot of progress in AudioLM compared to results from 2018: https://storage.googleapis.com/magentadata/papers/maestro/in...

What's more interesting (and concerning): listen carefully to the first piano continuation example from AudioLM, and notice the similarity of the last 7 seconds to the Moonlight Sonata: https://youtu.be/4Tr0otuiQuU?t=516

I'm afraid we will see a lot of this with music generation models in the near future.

bloep | 3 years ago

There are fairly simple tricks to avoid repetition/copying in NNs, e.g.: (1) train an auxiliary model to predict the "popularity" of the main model's outputs, then backprop through it so as to penalize popular/copied productions by decreasing their predicted popularity; (2) condition on random inputs (LLMs can be prompted with an imaginary "ID XXX" prefix before each example to mitigate repetition); or (3) increase the sampling temperature or optimize directly for higher entropy. LLM outputs are already extremely diverse, and verbatim copying is not a huge issue in practice. The point being: all the evidence suggests this is not a showstopper if you apply one or more of these methods for long enough in the right way.
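The third trick is the easiest to make concrete. A minimal sketch of temperature scaling over a toy logit vector (the logits and values here are made up for illustration, not from AudioLM): dividing logits by a temperature above 1 before the softmax flattens the distribution, raising its entropy, so a dominant continuation (e.g. a memorized phrase) is sampled less often.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means more diverse sampling."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits where one continuation strongly dominates.
logits = [5.0, 1.0, 0.5, 0.2]

low_t = softmax(logits, temperature=0.5)   # sharpens: near-greedy sampling
high_t = softmax(logits, temperature=2.0)  # flattens: more diverse sampling

# At higher temperature the dominant token loses probability mass and
# entropy rises, making verbatim repetition less likely.
assert high_t[0] < low_t[0]
assert entropy(high_t) > entropy(low_t)
```

The other two tricks are harder to sketch briefly: (1) needs a second trained model to backprop through, and (2) is just a change to the conditioning text at training/prompting time.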