I'd just call the function once by avoiding the global; construct your database access object at the start of your asynchronous main method and dependency-inject it into the tasks that need it.
His asyncpg example doesn't make much sense to me. What if there was a config change with a bad password? I would like to know this immediately on startup, else my rolling deploy is going to bring down all the previously well configured instances, and by the time we lazily try to connect to postgres it's too late.
I'm not a big python user, but I do find it kind of surprising there isn't an awaitable and thread safe mutex in the stdlib.
Can you clarify what you mean by dependency injection in python? Did you mean a DI framework or something more informal?
I've seen DI frameworks in python but not really used them. At a glance they don't strike me as pythonic. Rolling your own kind of inversion of control can result in unruly "config" or "context" objects that bring difficulties as well.
Been coming across a lot of these issues. Asyncio requires a slightly different thought process.
As soon as you have an `await` anywhere in the code, you've got to assume that your code will be re-entered. Lots of asyncio.Locks all over the place for me.
Glad people are bringing this up. I had to learn this on my own.
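A minimal sketch of that (the `Database` class and `connect` method are hypothetical stand-ins): guard the whole check-then-set with one `asyncio.Lock`, since the `await` in the middle is a re-entry point:

```python
import asyncio

class Database:
    """Toy stand-in for a DB client; `connect` is the guarded one-time setup."""

    def __init__(self):
        self._conn = None
        self._lock = asyncio.Lock()
        self.init_calls = 0

    async def connect(self):
        async with self._lock:
            # without the lock, every task could pass this check before the
            # first one resumes from its await and assigns self._conn
            if self._conn is None:
                self.init_calls += 1
                await asyncio.sleep(0)  # stand-in for real connection setup
                self._conn = "connection"
        return self._conn

async def main():
    db = Database()
    await asyncio.gather(db.connect(), db.connect(), db.connect())
    return db.init_calls
```

With the lock, the three concurrent callers perform exactly one initialization; drop the `async with` and all three can interleave past the `None` check.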
> As soon as you have an `await` anywhere in the code, you've got to assume that your code will be re-entered.
At least the re-entry points are explicitly marked with `await`. IMO that's the main benefit of async-await (stackless coroutines) over stackful coroutines or threads, which allow your code to be suspended and re-entered almost anywhere.
Of course the drawback of async-await is the "function color" issue [0], in which it's difficult for functions that don't suspend to call those which do.
[0] http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y...
Every time I find something that seems unnecessarily awkward in asyncio, I eventually find out there's a good reason. But plenty of things that are written with it aren't using it exactly right.
> Unfortunately this has a serious downside: asyncio locks are associated with the loop where they were created. Since the lock variable is global, maybe_initialize() can only be called from the same loop that loaded the module. asyncio.run() creates a new loop so it’s incompatible.
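A common workaround is to defer lock creation until first use, from inside the running loop (a sketch, reusing the article's `maybe_initialize` name; note that on older Pythons this still breaks if `asyncio.run()` is called more than once):

```python
import asyncio

_lock = None

def get_lock():
    # create the lock on first use, inside a running loop, rather than at
    # module import time; on Python < 3.10 a module-level asyncio.Lock()
    # binds to whatever loop exists at import, which asyncio.run() replaces
    global _lock
    if _lock is None:
        _lock = asyncio.Lock()
    return _lock

async def maybe_initialize(state):
    async with get_lock():
        if "ready" not in state:
            await asyncio.sleep(0)  # stand-in for the real one-time setup
            state["ready"] = True
    return state
```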
I work on several async projects, but I never had to use multiple event loops. What are use cases for using multiple event loops?
There may be other use cases, but it can be a useful pattern for mixing async code into a non-async project. In the specific places where using async for some task makes sense, you would just spawn a thread with an event loop, then push work into the new loop from non-async code using run_coroutine_threadsafe.
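That pattern, sketched (the `fetch` coroutine is a hypothetical example of some async work):

```python
import asyncio
import threading

# a dedicated loop living in a background thread
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def fetch(x):
    await asyncio.sleep(0)  # stand-in for real async I/O
    return x * 2

# from ordinary synchronous code: push work into the background loop
future = asyncio.run_coroutine_threadsafe(fetch(21), loop)
result = future.result(timeout=5)  # blocks this thread only, not the loop
```

The synchronous caller gets back a `concurrent.futures.Future`, so it can block, poll, or attach callbacks without ever touching the event loop directly.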
There is more than one way to make awaitables in asyncio -- at the core, this is about sharing a single future, for which there's a joyfully boring native standard constructor.
For example, when working w/ immutable GPU dataframes to represent our users' datasets, we often get into situations where loading a dataset takes a while, so multiple services request it before the ETL is done. So, we want to only trigger the parser once per file and have any subsequent calls wait on the first one:
    datasets = {}

    async def load_once(name):
        if name not in datasets:                              # sync, many
            fut = asyncio.get_running_loop().create_future()  # sync, once
            datasets[name] = fut                              # sync, once
            fut.set_result(await load(name))                  # async, once
        return await datasets[name]                           # async, many
Unfortunately, this naive method is buggy, I have had to debug and fix this exact code in production :)
The issue is with exception safety - first, this does not handle exceptions in load() properly, but that is a trivial fix.
The more insidious problem is due to the fact that Python futures are cancellable - and exceptions cancel futures.
What this means is that if two callers call load_once() in parallel, and the first caller encounters an exception (eg. from calling something else in parallel), the load() future will be cancelled for _all_ callers (eg. the second one), and will remain in a permanently wedged state.
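One way to patch that up (a sketch with a hypothetical `_load_and_cache` helper, not necessarily the fix used in production): run the load in its own task, so no single caller "owns" it, and `asyncio.shield` each caller's await so one caller's cancellation can't wedge the shared result:

```python
import asyncio

datasets = {}

async def load(name):
    await asyncio.sleep(0)  # stand-in for the real parser/ETL
    return name.upper()

async def _load_and_cache(name):
    try:
        return await load(name)
    except BaseException:
        datasets.pop(name, None)  # drop the entry so a later caller can retry
        raise

async def load_once(name):
    if name not in datasets:
        # the load runs in its own task, detached from any one caller
        datasets[name] = asyncio.ensure_future(_load_and_cache(name))
    # shield: cancelling this caller doesn't cancel the shared task
    return await asyncio.shield(datasets[name])
```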
How about we just use actors instead? Preemptible actors are the only good concurrency model I've ever come across. Everything else has massive problems.
Actors aren’t a panacea either - your logic ends up more spread out. You’re still able to shoot yourself in the foot quite easily too, e.g. when deciding whether to use a “pull” or “push” model for concurrency.
I found async testing in Python to be annoying, although I found a couple of libraries to make it nicer (pytest-async and I forget the name of the other).
Async await scales well to codebases with millions of lines and thousands of developers. As a result, large companies and ecosystems have mostly adopted async/await, and the tooling and runtimes in those languages are now much more mature.
If you're on CPython 3.2 or later, you don't need a lock. You can use `dict.setdefault` or another similar method that is guaranteed to be atomic.
Ticket: https://bugs.python.org/issue13521
Patch: https://hg.python.org/cpython/rev/90572ccda12c
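For instance (a sketch; `get_singleton` and the registry are hypothetical names), a thread-safe once-only registration with no lock at all:

```python
import threading

_registry = {}

def get_singleton(key, factory):
    # dict.setdefault is a single atomic operation under the GIL, so even if
    # two threads race, they both end up with the same stored object; note the
    # factory may still run more than once, but only one result is ever kept
    return _registry.setdefault(key, factory())
```

The caveat in the comment matters: this dedupes the *stored object*, not the factory calls, so it only fits factories that are cheap or side-effect-free.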
This can be a lot simpler. Just set "one_time_setup" to a single shared Task wrapping the coroutine, and all calls await the exact same run. (A bare coroutine object can only be awaited once; a Task can be awaited many times.)
If that doesn't work, then set it to an `asyncio.Event`, and run the one_time_setup "in the background" (create_task), and when it's done it marks the event as set.
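A sketch of the `asyncio.Event` variant (the `OneTimeSetup` class and its method names are hypothetical):

```python
import asyncio

class OneTimeSetup:
    def __init__(self):
        self._done = asyncio.Event()
        self._started = False

    async def _run(self):
        await asyncio.sleep(0)  # stand-in for the real setup work
        self._done.set()        # wakes every waiter at once

    async def wait_ready(self):
        # plain sync check with no await before it, so tasks can't race here
        if not self._started:
            self._started = True
            asyncio.get_running_loop().create_task(self._run())
        await self._done.wait()

async def main():
    setup = OneTimeSetup()
    await asyncio.gather(*(setup.wait_ready() for _ in range(5)))
    return setup._done.is_set()
```

The first caller kicks off the background task; everyone (including the first caller) then blocks on the same event.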
Go offers this out of the box via the sync.Once type. Do other languages? Kind of surprised Python doesn't, as this sort of pattern is common in applications dealing with concurrency.
Erlang has features for this baked in. What's more, if initialization of any subcomponent fails (say one of its dependencies hadn't finished booting yet due to a race condition) and the author made it throw, the dependent subcomponent will automatically restart itself and try again. There are also one-line strategies for retrying later, etc., so you don't even have to worry about blocking to prevent those race conditions.
> Kind of surprised python doesn’t as this sort of pattern is common in applications dealing with concurrency
Apple’s Grand Central Dispatch concurrency library has dispatch_once [0], which does something similar. It relies on non-standard “block” extensions to C, which are a way of defining lambda functions, and in practice you only see it used on Apple platforms.
[0] https://developer.apple.com/documentation/dispatch/1447169-d...
lazy init in kotlin and scala is essentially the same thing.
The good thing with Go's sync.Once is that it's implemented as a library instead of something in the language itself, so it's easy for a curious user to see how it's actually implemented. They even have comments there pointing out wrong implementations, and I have seen people make the exact same mistake during code reviews (in other languages).
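For illustration, a rough Python analog of that idea (this `Once` class is hypothetical, not a stdlib type):

```python
import threading

class Once:
    """Rough Python analog of Go's sync.Once (illustrative sketch)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False

    def do(self, fn):
        # the classic wrong implementation checks self._done without taking
        # the lock first: a racing caller could then return before fn() has
        # actually finished running
        with self._lock:
            if not self._done:
                try:
                    fn()
                finally:
                    self._done = True
```

Holding the lock for the whole call means late arrivers block until the first `fn()` completes, which is exactly the guarantee the naive fast-path version loses.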
I would add a note that if you are running in a cluster environment like Kubernetes this won’t work, because your containers could be running on different machines. In those scenarios you would need another service just for the locks.
On k8s, for example when running multiple parallel jobs that need to initialize only once, the Redis redlock worked for me (there are multiple implementations around). The first job takes the lock while initializing; the rest just wait for the release, then start working on the items prepared by the first.
With asyncio caches, we used a lock to prevent dogpiling on cache initialization, i.e. to prevent multiple tasks caching the same thing in parallel.
So when your function is not reentrant, `params = await command.get()` runs in a loop inside a task (`command.put_nowait(params)` is called elsewhere).
You can also use this to distribute tasks to different class methods.
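A minimal sketch of that queue pattern (`command` as an `asyncio.Queue`; the `worker` consumer and its doubling step are hypothetical):

```python
import asyncio

async def worker(command, results):
    # the non-reentrant work happens in exactly one task; everything else
    # communicates with it only through the queue
    while True:
        params = await command.get()
        if params is None:          # sentinel: shut the worker down
            return
        results.append(params * 2)  # stand-in for the real work

async def main():
    command = asyncio.Queue()
    results = []
    task = asyncio.get_running_loop().create_task(worker(command, results))
    for p in (1, 2, 3):
        command.put_nowait(p)  # producers never await; they just enqueue
    command.put_nowait(None)
    await task
    return results
```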
And then throw in an async lru.. :)
Fixing that cancellation issue is, well, quite a bit more code...
The actor model is very boilerplate-y though (Mopidy uses Pykka) and takes some time getting used to, coming from other frameworks.
> Kind of surprised python doesn’t as this sort of pattern is common in applications dealing with concurrency
Well yeah, python was not designed for that.
Please write classes, people!
"global" is a fine way to do that when you need it. Simple and says what it means.