Agreed. I also wish fewer libraries started their own supervision tree, and instead gave you a child spec to drop into your supervision tree. There are definitely use cases where shipping libraries as an application makes sense, but oftentimes that sort of design causes problems for me, because it means not being able to start multiple copies of the dependency with different configurations.

I think Phoenix PubSub is a perfect example of how libraries should be structured, in that you just need to drop the module + options into your supervision tree, and you have the freedom of starting multiple independent copies of the tree, in different contexts, and with their own configurations: https://hexdocs.pm/phoenix_pubsub/Phoenix.PubSub.html#module...
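To make the drop-in style concrete, here is a minimal sketch that uses the standard library's Registry as a stand-in rather than Phoenix PubSub itself. The names MyApp.Sessions and MyApp.Listeners are hypothetical, and the explicit child ids are just a precaution for running two copies under one supervisor.

```elixir
# Registry ships with Elixir and is started by dropping {Module, options}
# into a supervision tree, so it stands in here for a library like PubSub.
# MyApp.Sessions and MyApp.Listeners are hypothetical names.
children = [
  # Explicit ids keep the two child specs distinct within one supervisor.
  Supervisor.child_spec({Registry, keys: :unique, name: MyApp.Sessions}, id: :sessions),
  Supervisor.child_spec({Registry, keys: :duplicate, name: MyApp.Listeners}, id: :listeners)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Each copy is independent and carries its own configuration:
# one enforces unique keys, the other allows duplicates.
{:ok, _} = Registry.register(MyApp.Sessions, "user-1", :web)
{:ok, _} = Registry.register(MyApp.Listeners, "topic", :first)
{:ok, _} = Registry.register(MyApp.Listeners, "topic", :second)
```

Both registries live in the caller's tree, so the caller decides where they sit, how many there are, and how each is configured.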

derefr|4 years ago

Alternately, the dependency can start its own supervision tree with any global processes/tables hanging off it from the beginning; and then export a Mod:start_link/1 function which clients will call, which will 1. start a child tree owned/managed by the dep's supervision tree; but which then 2. links that child subtree's root into the caller as well.

Such deps are integrated by adding a stub GenServer that calls Mod:start_link/1 in its init/1 callback, and then adding a child-spec for that stub GenServer in your client app's supervision hierarchy.
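A rough sketch of that arrangement in Elixir, assuming a hypothetical dependency called Dep whose application would normally start a DynamicSupervisor named Dep.Sup. None of these modules come from a real library, and an Agent stands in for whatever subtree the dependency would actually start.

```elixir
# Dep stands in for a hypothetical dependency; all names here are made up.
defmodule Dep do
  # Starts a child under the dependency's own supervisor (Dep.Sup),
  # then links it to the calling process as well.
  def start_link(opts) do
    # :temporary leaves restarts to the caller's tree, not to Dep.Sup.
    spec = Supervisor.child_spec({Agent, fn -> opts end}, restart: :temporary)

    with {:ok, pid} <- DynamicSupervisor.start_child(Dep.Sup, spec) do
      Process.link(pid)
      {:ok, pid}
    end
  end
end

# The stub that the client app adds to its own supervision tree.
defmodule MyApp.DepStub do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  @impl true
  def init(opts) do
    # Called from the stub process, so the link lands on the stub.
    {:ok, dep_pid} = Dep.start_link(opts)
    {:ok, dep_pid}
  end

  @impl true
  def handle_call(:dep_pid, _from, dep_pid), do: {:reply, dep_pid, dep_pid}
end

# Normally the dependency's application would start Dep.Sup at boot.
{:ok, _} = DynamicSupervisor.start_link(name: Dep.Sup, strategy: :one_for_one)

# The client app only adds a child spec for the stub.
{:ok, _} = Supervisor.start_link([{MyApp.DepStub, timeout: 5_000}], strategy: :one_for_one)

dep_pid = GenServer.call(MyApp.DepStub, :dep_pid)
```

If the dependency's child crashes, the link takes the stub down with it, and the client's supervisor restarts the stub, which asks the dependency for a fresh child.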

The ssh daemon module in the stdlib works this way. Most connection-pooler abstractions (e.g. pg2, gproc) do as well.

QuinnWilton|4 years ago

Yes! This is a great approach, and I'd be happy to see more examples like this in the wild. It's similar to how Phoenix PubSub works: the PubSub application starts a pg scope as part of its supervision tree, which client PubSub servers can join if configured to use the pg adapter.

I was a little bit flippant in my initial comment, but my main criticism was of libraries that don't support any sort of hooks like this into their supervision strategy, and instead rely entirely on a global and static supervision tree, usually configured using app config.

dnautics|4 years ago

I'm 50-50 on that one (used to agree with you more but have since retreated a bit). This may be an overly nitpicky detail, but you sort of want your own sup tree to not necessarily have a differently scoped "microservice" tied to it, both in terms of failure domains and for plain visual organization in your observer/livedashboard. For the 90% use case (e.g. http process pools) an independent sup tree is correct, but to your points:

1. It would be nice to have a choice. The library writer should think about their users and choose which case is more correct, then make the "other case" opt-out and easy to implement (let's say 2-3 loc), and spell it out explicitly in the readme/docs landing page.

2. PubSub indeed made (IMO) the correct choice when it migrated over from being its own sup tree to moving into the app's sup tree.

Thank you for listening to my TED talk.

jolux|4 years ago

This is pretty much exactly how I feel and I appreciate that Ranch gives you this option.

jolux|4 years ago

You do sometimes have to be careful about how you handle configuration with embedding multiple copies of other supervision trees though: https://ninenines.eu/docs/en/ranch/2.0/guide/embedded/

sandbags|4 years ago

IIUC you're suggesting that what I am depending upon here is convenient but problematic?

My understanding is not yet sophisticated enough to follow your point about "not being able to start multiple copies of the dependency with different configurations".

Do you have any explanatory examples that could help me (and presumably others like me)? Thanks. m@t

QuinnWilton|4 years ago

Problematic is probably too strong of a term, and I think I'd use the word inflexible instead.

I want to be clear though: my issue isn't with applications -- the functionality you're talking about is powerful and useful -- it's purely with the tendency of starting a static and global supervision tree as part of a dependency: see some of the other comments in this thread for some neat examples of how applications like ssh and pg2 handle supervision.

When libraries are written like this, they usually start everything up automatically, and pull from their application environment in order to configure everything. This means that this configuration is global and shared amongst all consumers of the library.

Imagine an HTTP client, for example, that provides a config key for setting the default timeout. This key would be shared among all callers, and so if multiple libraries depended on this client, their configurations would override each other.
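A tiny illustration of that override, assuming a made-up application name `:hypothetical_http` and key `:default_timeout` (neither belongs to a real client):

```elixir
# :hypothetical_http is a made-up application name for the HTTP client.
# Library A configures the client for its fast internal service...
Application.put_env(:hypothetical_http, :default_timeout, 1_000)
# ...and library B, loaded later, configures it for a slow third party.
Application.put_env(:hypothetical_http, :default_timeout, 30_000)

# Every caller reads the same key, so library A now gets B's value.
timeout = Application.get_env(:hypothetical_http, :default_timeout)
```

The application environment holds one value per key, so whichever write happens last wins for every consumer.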

Fortunately, Elixir now recommends against libraries setting app config, so this problem is partially mitigated, but it's still a concern within your app: if I'm calling two different services, I want to use different timeouts for each, based on their SLA, so having a global timeout isn't helpful.

Instead, in this situation, I'd prefer something like what Finch provides, where I'm able to start different HTTP pools within my supervision tree, for different use-cases, and each can be configured independently: https://github.com/keathley/finch#usage
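The shape of that idea can be sketched with a hypothetical PoolSketch module. This is not Finch's API, only a stand-in GenServer that keeps its timeout in per-instance state, and the pool names are invented.

```elixir
# PoolSketch is a hypothetical stand-in for a pool process, not a real library.
defmodule PoolSketch do
  use GenServer

  def start_link(opts) do
    # Each instance registers under the name it was given.
    GenServer.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  def timeout(pool), do: GenServer.call(pool, :timeout)

  @impl true
  def init(opts), do: {:ok, %{timeout: Keyword.get(opts, :timeout, 5_000)}}

  @impl true
  def handle_call(:timeout, _from, state), do: {:reply, state.timeout, state}
end

# Two pools in the caller's tree, one per downstream service.
children = [
  Supervisor.child_spec({PoolSketch, name: BillingPool, timeout: 2_000}, id: BillingPool),
  Supervisor.child_spec({PoolSketch, name: ReportsPool, timeout: 30_000}, id: ReportsPool)
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)
```

Because the configuration travels with the child spec instead of the application environment, each pool can match the SLA of the service it talks to.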

Another approach would be to do something like what ssh does, and have the Finch application start a pool supervisor automatically, but then provide functions for creating new pools against that supervisor, and linking or monitoring them from the caller.

There are a few other techniques you can use too, with different tradeoffs and benefits, like Ecto's approach of requiring that you define your own repo and add that to your tree. Chris Keathley describes some of those ideas here: https://keathley.io/blog/reusable-libraries.html

Global trees like this are also harder to test, especially if they rely on hardcoded unique names, and usually restrict you to synchronous tests, since you can't duplicate the tree for every test and run them independently of each other.
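For contrast, here is a sketch of what per-test isolation looks like when the process can be started on demand. An unnamed Agent stands in for the library's process, and the test module name is hypothetical.

```elixir
ExUnit.start()

defmodule IsolatedCounterTest do
  # async: true is safe here because no test shares a named process.
  use ExUnit.Case, async: true

  setup do
    # start_supervised!/1 starts an unnamed Agent under the test supervisor
    # and stops it when the test ends, so every test gets its own copy.
    %{counter: start_supervised!({Agent, fn -> 0 end})}
  end

  test "increments its own copy", %{counter: counter} do
    Agent.update(counter, &(&1 + 1))
    assert Agent.get(counter, & &1) == 1
  end

  test "starts from zero regardless of other tests", %{counter: counter} do
    assert Agent.get(counter, & &1) == 0
  end
end
```

A library that only exposes one globally named tree gives the test suite no equivalent of that `setup` block.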

Again though, I want to stress that running processes in the library's application is not my problem: it's just not having any control over when or how those processes are started.

I'm just responding on my phone, and I need to run for a few hours, but feel free to ask for more info or reach out. I'm always happy to talk about this stuff! I enjoyed your article, and I apologize if my initial comment came across as an attack on your core points.

Fire-Dragon-DoL|4 years ago

If a library is not written like that, it's poorly designed. It could provide one globally started version as a convenience, but not being able to start it multiple times would be a big no.

QuinnWilton|4 years ago

This has historically been fairly common among early Elixir libraries, and I'd imagine that's a byproduct of many early adopters coming from the Ruby ecosystem without prior experience of the patterns used in Erlang. I think some of the initial confusion surrounding how application config should be used also led to some misguided decisions.

Fortunately it's something that I've seen improve over time, but it's a pain-point I've run into with a lot of dependencies, so I try to call it out when I see it.