Very ambitious and looks like quite an accomplishment! I've been working recently on several DSP projects, including for web and embedded applications, and I definitely appreciate writing performant code without having to write C++ by hand.
I couldn't tell from your post and linked material how the runtime works — I see that the high-level graph is handed over to the runtime, but is it interpreted? Compiled? Does it require platform-specific binaries?
For my current projects I settled on Faust [1], which has the advantage of compiling to C++ source files (which can then be used with any platform/framework that supports C++), but that has the disadvantage that swapping components in and out (as you describe in the linked article) is not so easy.
I've used Faust quite a bit myself, and it has been an inspiration of mine for this project. But indeed, it has limitations that were prohibitive for me, and I'd rather not have to context switch from a DSL for my DSP into some completely different language for the rest of my app if I can avoid it.
> I couldn't tell from your post and linked material how the runtime works — I see that the high-level graph is handed over to the runtime, but is it interpreted? Compiled? Does it require platform-specific binaries?
Yea, check out this page of the docs: https://docs.elementary.audio/guides/native_rendering. The graph is interpreted dynamically by the runtime, and it currently does require platform-specific binaries, but I'm also building the runtime so that end users can embed the functionality into any app they need.
Does this support writing audio sample processing code in JavaScript? From the examples it looks like JavaScript code connects together pre-existing audio processing nodes, but it's unclear if you can write your own nodes in JavaScript.
Also, the use of the term "pure function" confused me. Pure functions have no side-effects (https://en.wikipedia.org/wiki/Pure_function). Why is this necessary? It seems like non-pure functions could generate the graph of nodes too, so restricting ourselves to pure functions seems like a limitation. Or does the term "pure function" actually mean "in functional programming style" (i.e. functions, not objects) in this blog post?
> Does this support writing audio sample processing code in JavaScript?
Currently, no. My intention is that the library of pre-existing nodes is both complete enough and low-level enough that 95% of things end users may wish to write can be done through this composition. That said, the embedded runtime engine is designed to be extended with your own C++ processors if you have an application which particularly needs that level of customization.
And why pure functions? Functions with side effects don't compose the way that pure functions do, because you can't rely on deterministic output given a consistent input. If you choose, you can still write your own non-pure functions to assemble your signal graphs, though I personally wouldn't recommend it :) The Elementary library will always aim to be strictly pure functions.
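As a concrete sketch of that composition property (using plain stand-in factories here, not the real `el` API):

```javascript
// Stand-in node factories, illustrative only (not the real Elementary API):
// each is a pure function returning a plain description of a node.
const node = (type, props, ...children) => ({ type, props, children });

const cycle = (freq) => node('cycle', { freq });
const mul = (...inputs) => node('mul', {}, ...inputs);

// Because the factories are pure, composition is deterministic:
// the same inputs always describe the same graph.
const tremolo = (freq, rate) => mul(cycle(freq), cycle(rate));

console.log(JSON.stringify(tremolo(440, 4)) === JSON.stringify(tremolo(440, 4))); // true
```

If `tremolo` read, say, a mutable global inside, two identical calls could describe different graphs, and you'd lose the ability to reason about (or cache) the result.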
What's the story for building VST plugins? How much C++ glue code do you need to write?
The amount of effort required to build a VST plugin is very off-putting. IMHO it's hard to justify doing unless it's a commercial project. I think we would see more idiosyncratic and creative plugins if building one took weeks instead of years.
As shown in the introduction video, with the `core.on('midi')` callback, MIDI voice triggering is not sample accurate. It's quite hard to invoke arbitrary user code in JS and still hit a sample-accurate deadline!
That said, the embedded runtime itself supports sample accurate interaction in this way if you're willing to get into a bit of C++.
Currently, building a VST plugin requires just about as much C++ glue as it takes to write the interface. The next step for Elementary though is a solution which addresses that bit too. My goal in the near term is a platform to write VSTs with no C++ at all.
I agree very much with your last comment there, and that is a major impetus for building this tool. I'd love to see what kind of plugins come out when it's that much quicker and easier!
OP here: good point, that's not well explained in the article! And thanks to the others who have answered already; you're pretty much spot on.
The core library, referenced here as `el`, contains a set of factory functions for describing various underlying processing nodes. So when you write `el.lowshelf()` you're just using one of the builtins (like how you might use a div, span, or button in a web app with React)
From checking the video, it's the default export of one of the project's core libraries. I gather it's meant to be in scope for all use of the project.
My only fear is that at first glance it seems biased towards someone familiar with React and its quirks, down to a `state` variable that gets passed around through callbacks. I would personally prefer a more "vanilla JavaScript" approach. But what constitutes "vanilla" these days?
Incredible! I have a toy webaudio livecoding project that I've been dying to get working efficiently in VST-form. Looks like Elementary could be a great way to do this.
Honest question, as I'm not really deep into this:
What are the differences from the JS runtime that Reaper integrates in its DAW (which can also be run outside of Reaper)?
Would've loved it if it wasn't a JUCE wrapper. Nevertheless, some great work here. The same approach can be applied to any non-JUCE stuff too, which would be kickass! :)
Why do you say that? I've built the core of the engine entirely from scratch, entirely independent of JUCE. The CLI tool does use JUCE for opening the audio device, but that's about it.
How does the dual licensing AGPLv3 plus commercial work? If one creates a derivative are the contributions also dual licensed automatically, or how is it done?
I'd say there are two primary differences between this and any current web audio implementation that I'm aware of:
First is the model/API for actually writing your app. Web Audio is very imperative and, in my experience, doesn't handle dynamic behavior terribly well. Elementary is designed both for functional/declarative expression and for change over time.
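To illustrate the declarative idea with a toy reconciler (the real runtime's diffing is internal, so this is only a sketch): the app re-describes the graph each render, and the engine works out what actually changed.

```javascript
// Two successive descriptions of the same (toy) graph node.
const graphA = { type: 'cycle', props: { freq: 440 } };
const graphB = { type: 'cycle', props: { freq: 880 } };

// A toy diff: which props differ between renders? Only those would need
// to be applied to the running audio graph.
const changedProps = (prev, next) =>
  Object.keys(next.props).filter((k) => prev.props[k] !== next.props[k]);

console.log(changedProps(graphA, graphB)); // [ 'freq' ]
```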
The second piece is that Elementary is built around a core runtime which can be embedded anywhere. Web Audio is, as its name might suggest, quite coupled to the browser environment. My goal is that if you've written an audio app that runs at the command line with Elementary, you could package that same thing up and ship it as a VST plugin, on embedded Linux hardware, etc.
EDIT: I didn't mean to sound so overconfident or showing off my credentials. This is my personal view given what I know, but there are things I don't know (like good vs. bad OS audio APIs, how to tune kernels and pick good audio drivers, and whether you can get away with allocations, mutexes, or a JITted/GC'd language on the audio thread). It's possible that writing audio code in JS or non-real-time C++ won't cause problems to users in practice, as long as it runs on a dedicated thread and doesn't contend with the GUI for the event loop, and processes notes synchronously and deterministically, but I haven't tried. For various opinions on this from people who have done more experimentation than me, see https://news.ycombinator.com/item?id=27128087 and http://www.rossbencina.com/code/real-time-audio-programming-....
----
I've spent many years working on computer music and DSP in general, and the last two years writing interactive audio apps like DAWs and trackers (traditionally written in real-time code).
Personally I feel that not only should the signal processing be handled in compiled code, but the process of allocating voices for incoming notes and sweeping filter parameters for existing notes (like a VST/VSTi plugin) should be real-time/allocation-free and deterministic.
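For example, here's a minimal sketch (all names illustrative) of allocation-free voice assignment: a fixed pool built up front, oldest-voice stealing, and no objects created during note handling.

```javascript
// Fixed-size voice pool: everything is preallocated, so noteOn/noteOff
// never allocate on the (conceptually real-time) note-processing path.
const makeVoicePool = (size) => {
  const voices = Array.from({ length: size }, () => ({ note: -1, age: 0 }));
  let clock = 0;
  return {
    noteOn(note) {
      // Prefer a free voice; otherwise steal the least recently started one.
      let v = voices.find((x) => x.note === -1);
      if (!v) v = voices.reduce((a, b) => (a.age < b.age ? a : b));
      v.note = note;
      v.age = ++clock;
      return voices.indexOf(v); // which voice to (re)trigger
    },
    noteOff(note) {
      const v = voices.find((x) => x.note === note);
      if (v) v.note = -1;
    },
  };
};
```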
Additionally for a sequencer or DAW-like program playing a scripted project file, the process of scanning through a document and determining the timing of events should also be real-time and deterministic, requiring it be synchronous with the code driving synthesis/notes/envelopes. (Some Web Audio music demos I've seen on HN, like the 64-step drum sequencer[1] or The Endless Acid Banger[2], are not deterministic. On a slow computer, when the browser's JS engine stutters, notes stop being triggered even though existing notes sustain.) I think that requires that the "document" is stored in C/C++/Rust structs and containers, rather than GC'd objects like in JS.
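Concretely, deterministic sequencing means deriving events from the audio clock rather than from GUI-thread timers; a sketch (names illustrative) of finding which steps fall inside one audio block:

```javascript
// Pure function of the block position: the same tempo and the same block
// always yield the same events, regardless of GUI-thread stutter.
const stepsInBlock = (blockStart, blockSize, sampleRate, bpm, stepsPerBeat) => {
  const samplesPerStep = (60 / bpm) * sampleRate / stepsPerBeat;
  const first = Math.ceil(blockStart / samplesPerStep);
  const last = Math.ceil((blockStart + blockSize) / samplesPerStep) - 1;
  const events = [];
  for (let s = first; s <= last; s++) {
    events.push({ step: s, offset: Math.round(s * samplesPerStep) - blockStart });
  }
  return events;
};

stepsInBlock(0, 512, 44100, 120, 4); // [ { step: 0, offset: 0 } ]
```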
(Processing user input depends on low latency without latency spikes rather than determinism and synchronous processing, but I don't know if browsers are good at that either.)
At this point, I find it significantly easier to write a GUI in native toolkits like Qt than to learn and write bindings for a GC'd language to access native data structures. And unfortunately there is a limited selection of mature native toolkits that are both not buggy (Qt is buggy) and have accessibility and internationalization, and optionally theming and native widgets. I still believe that writing a GUI in a native language can become a better user and developer experience than browsers, if more people invest effort and resources into better paradigms or implementations to fix native's weaknesses compared to browsers (buggy libraries, no billion-dollar megacorps funding web browsers, apparently people nowadays think that React using a virtual DOM and recomputing parts of the UI that don't change is a strength), while maintaining its strengths (fewer resources required, more predictable memory and CPU usage, and trivial binding to native code).
What's the current state of native desktop GUIs? Qt is nearly good enough, but is run by a company trying to antagonize the open-source community and rip off commercial users, binds you to C++ (which is less pleasant than Rust), suffers from bugs (some neglected, deep-seated API design flaws), and handles reactivity poorly (though QProperty and transactions[3] promise to improve that aspect). GTK has cross-language support, but in gtk-rs, GTK's refcounted design and inheritance and Rust's move-based design and composition are fighting against each other. There are other, older APIs like wxWidgets, FLTK, FOX, etc., many of which I personally dislike as a user. Flutter is promising, but still buggy, the upper layers are virtual-DOM-based (I have reservations), and it feels foreign on desktops (many missing keyboard shortcuts; one app, FluffyChat's unfinished desktop port, is missing right-click menus, has broken keybinds, and has painfully slow animations).
Is there an alternative? Where would you draw the seam between a GC'd GUI and a realtime audio engine, to minimize the boilerplate glue code, and ensure that note processing and audio synthesis are real-time and deterministic (does this require being synchronous with each other?)?
I've seen a lot of software with a non-real-time or nondeterministic audio engine/sequencer (even though I personally think it's bad for users and unacceptable when I design a program). For example, ZorroTracker has an audio processing path written in JS (a RtAudio thread calling into V8 running in parallel with the GUI thread, calling JS bindings to native .dll/.so audio generators), coupled with a (planned) sequencer written in JS and operating on JS objects, giving up on real-time. As I've mentioned, several browser-based audio projects I've seen on HN generate audio in Web Audio (real-time), but trigger notes asynchronously in non-real-time JS code.
> Eventually it dawned on me that the task of composing audio processing blocks is itself free of those realtime performance constraints mentioned above: it's only the actual rendering step that must meet such requirements.
Given my current understanding of real-time programming, I think that the task of feeding inputs into audio processing blocks is not free of realtime performance constraints. In any program with an audio sequencer, I'm not aware of how to get deterministic playback with predictable CPU runtime (which I do not think is worth compromising), without worrying about "memory management, object lifetimes, thread safety, etc.", preallocating memory for synthesis, and picking a way to communicate with the UI (eg. lock-free queues for processing commands sequentially, and atomics or triple buffers for viewing the latest state). If you have a solution, do let me know!
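For what it's worth, the queue half of that recipe is small. A toy single-producer/single-consumer ring buffer of the kind one might build over a SharedArrayBuffer with Atomics (this single-threaded sketch omits the atomics, so it's illustrative only):

```javascript
// SPSC ring buffer: fixed storage, never blocks, never allocates after setup.
// Holds capacity - 1 items; one slot stays empty to tell "full" from "empty".
const makeQueue = (capacity) => {
  const buf = new Float32Array(capacity);
  let head = 0; // next write index (producer side)
  let tail = 0; // next read index (consumer side)
  return {
    push(x) {
      if ((head + 1) % capacity === tail) return false; // full: drop, don't block
      buf[head] = x;
      head = (head + 1) % capacity;
      return true;
    },
    pop() {
      if (tail === head) return undefined; // empty
      const x = buf[tail];
      tail = (tail + 1) % capacity;
      return x;
    },
  };
};
```

In a real two-thread version, `head` and `tail` would live in a SharedArrayBuffer and be accessed with `Atomics.load`/`Atomics.store` so producer and consumer never take a lock.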
Off-topic, but the author's negative sentiment toward C++ is interesting to me, as it's quite the opposite of mine.
I use C++ mainly, but when I have to use JavaScript at work, I really hate it for whatever reason. I wonder if other C++ programmers feel the same way.
[1] https://faust.grame.fr/
ur-whale | 4 years ago
Let's see (and hear) a drum module built with the framework.
trypwire | 4 years ago
In the meantime, check out https://github.com/nick-thompson/elementary for some examples that you can `npm install && npm start` to hear.
crucialfelix | 4 years ago
Congratulations on the release, looks great! I think audio apps are a great fit for this virtual audio graph paradigm.
It's the dryadic components here: https://crucialfelix.github.io/supercolliderjs/#/packages/su... The examples repository has more.
There was still work to do to bring it to its full potential; work and babies cut me short from finishing it, though.
fancy_hammer | 4 years ago
Is voice triggering with MIDI sample accurate?
moritzwarhier | 4 years ago
Is `el` related to the user interface, or does it already describe signal processing in some declarative way?
At first glance, I thought the actual processing is invoked when core.render is called.
Anyway, maybe I just have to read more about the project or watch the video.
JohnCurran | 4 years ago
One note: minor typo here: “I want the signal flow through my applicatioin to look like S."
Applicatioin -> application
[1]: https://news.ycombinator.com/item?id=27112573
[2]: https://news.ycombinator.com/item?id=26870666
[3]: https://doc-snapshots.qt.io/qt6-dev/template-typename-t-qpro...