top | item 31348900


kastnerkyle | 3 years ago

I've had pretty good luck recently with a mix of SUNDAE (https://arxiv.org/abs/2112.06749) and coconet (https://arxiv.org/abs/1903.07227) and/or Music Transformer based internal models recently for modeling very small datasets of polyphonic "midified" music. Research paper hopefully soon to come... Not sure what your pipeline looks like, but those papers might be worth putting on your radar. And as you mention, symbolic music datasets are both surprisingly small, surprisingly low quality, and generally a huge pain to work with. Cool stuff - I like the sax!

For anyone unfamiliar with diffusion models (and coconet / OrderlessNADE), one of their really nice properties, as opposed to "standard" autoregressive (GPT / RNN) style models, is that you can specify any part and fill in any other part - rather than being forced to specify the "past" and predict only the "future". The coconet "doodle" is a good example of this interface at work (https://www.google.com/doodles/celebrating-johann-sebastian-...)
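To make the "specify any part, fill in any other part" idea concrete, here's a minimal toy sketch of a blocked-Gibbs-style infilling loop in the spirit of coconet / OrderlessNADE. Everything here is illustrative, not from any of the linked papers: `toy_model` is a hypothetical stand-in for a real trained network (it just proposes a pitch near its neighbors), and the point is only the control flow - user-specified positions stay clamped while masked positions get repeatedly re-sampled, something a pure left-to-right autoregressive sampler can't express.

```python
import random

PITCHES = list(range(60, 72))  # one octave of MIDI note numbers

def toy_model(seq, i):
    """Propose a pitch for position i given the rest of the sequence.
    Stand-in for a trained network: stays near a neighboring pitch
    (a crude smoothness prior), purely for illustration."""
    neighbors = [p for p in (seq[i - 1] if i > 0 else None,
                             seq[i + 1] if i < len(seq) - 1 else None)
                 if p is not None]
    if neighbors:
        base = random.choice(neighbors) + random.choice([-2, -1, 0, 1, 2])
        return max(min(base, PITCHES[-1]), PITCHES[0])
    return random.choice(PITCHES)

def infill(seq, masked, steps=50):
    """Fill in the `masked` positions of `seq`; all other positions
    are treated as user-specified and never overwritten."""
    seq = list(seq)
    for i in masked:
        seq[i] = random.choice(PITCHES)  # random init for masked slots
    for _ in range(steps):
        i = random.choice(sorted(masked))  # re-sample one masked position
        seq[i] = toy_model(seq, i)
    return seq

# Specify any part, fill in any other part: here the *endpoints* are
# given and the middle is generated - the model conditions on "future"
# context (the final note) as freely as on the "past".
observed = [60, None, None, None, 67]
masked = {i for i, p in enumerate(observed) if p is None}
result = infill(observed, masked)
```

A GPT/RNN-style sampler would have to generate left to right and could only hope the sequence lands on the final note; here that constraint is enforced by construction, since clamped positions are simply never touched.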

XLNet had some of this promise too (https://arxiv.org/abs/1906.08237) but I never had much luck with it as a pure generator. Autoregressive Diffusion Models (https://openreview.net/forum?id=Lm8T39vLDTE) have similar properties, but I haven't had time to suss out the subtle differences yet.


kastnerkyle | 3 years ago

Also, if you are seeking more melodies / catchy tunes to model, NES-MDB (https://github.com/chrisdonahue/nesmdb) might fit the bill. The modeling paper using that dataset + Transformer-XL - LakhNES (https://cseweb.ucsd.edu/~jmcauley/pdfs/ismir19.pdf) - is a pretty nice paper, with interesting samples. I have been wondering what a diffusion-based take on this data might sound like, and it might be a good fit if you are looking for stuff to try.

zone411 | 3 years ago

Thanks! I wasn't familiar with a couple of these; I will definitely check them out.