You could also just send output to separate files by re-opening stdout/stderr, and use https://www.vanheusden.com/multitail/index.html (or GNU screen or tmux or whatever) to multiplex a terminal in a more organized way. This also solves the article's "open problem" of stray prints.
If you really want to share one terminal / stdout but also prevent timing-based record splitting, you could send outputs to FIFOs / named pipes and then have a simple record-boundary-honoring merge program like https://github.com/c-blake/bu/blob/main/funnel.nim, with docs at https://github.com/c-blake/bu/blob/main/doc/funnel.md. As long as the "record format" is shared (e.g. newline-terminated), this also solves the stray-print problem.
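funnel does this merging across processes via FIFOs; the property it leans on can be sketched in-process too. POSIX guarantees that a single write() of at most PIPE_BUF bytes (at least 512) to a pipe is atomic, so one complete newline-terminated record per write() can never be split by timing:

```python
import os
import threading

read_fd, write_fd = os.pipe()

def worker(worker_id):
    for step in range(100):
        # One complete newline-terminated record per write() call: POSIX
        # makes pipe writes of <= PIPE_BUF bytes atomic, so records from
        # concurrent writers never interleave mid-line.
        os.write(write_fd, f"worker {worker_id}: step {step}\n".encode())

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
os.close(write_fd)

# Read everything back: 400 intact records, none split across lines.
with os.fdopen(read_fd) as reader:
    lines = reader.read().splitlines()
```

(The total written here stays well under the kernel pipe capacity, so the writers never block; a real merger like funnel reads concurrently.)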
When you say "multiplex a terminal" with tmux, do you mean splitting the screen so the same terminal window has multiple shell prompts in it? I'm trying to understand how that would be used to address the problem in the demo at the bottom of the post.
I've got the impression that in practice, what keeps developers from letting trivially parallelizable tasks run in parallel is a) the overhead of dealing with poor parallelization primitives and b) the difficulty in properly showing the status of parallel invocations.
Having good features to support this in standard libraries would go a long way to incentivizing devs to actually parallelize.
In a way, it's a subset of the distributed systems tracing problem - you have multiple tasks running in parallel on the same node, but they will have been initiated as different (sub) tasks, and should be tracked by the specific task via which they were initiated. So systems like OpenTelemetry and Honeycomb can be great for this, allowing you to see events in aggregate as well as in the context of a trace that propagates between different threads and systems.
But there's so much complexity there that IMO it's best left outside of standard libraries - and it's indeed a daunting amount of new vocabulary for newcomers. I'm not aware of simpler abstractions on top of the broader telemetry ecosystem for monitoring simple parallelization, but arguably there should be one that keeps things quite simple.
Agreed. I have been using joblib for a good few months. It is fine, but I still haven't figured out basic things like printing the status of process-based jobs.
[Parsl is much better, e.g., logging is built-in, but it can be a little overwhelming.]
Yes, totally agree. I've written some code and I'd rather convert it to C using Cython before I parallelize it. Python is horrible for both of these things, and you may not even get a speed increase because of the overhead. It's like: use Cython and get 10-100x better speed with a few lines of code, or spend my whole day in a horrible mess of data structures, getting my functions to work properly with map, with maybe nothing to show for it.
Yes, this is 100% the type of thing that should be in a standard library, but also the type of thing that, I have no doubt, Python steering would feel belongs in a 3rd-party library.
We do see some cool stuff under the hood from core Python devs but interest in further quality of life features seems to be lacking.
Funnily enough, glibc takes a lock internally for threaded printing. You can disable the lock in glibc with the __fsetlocking function and the FSETLOCKING_BYCALLER parameter.
I had a threaded server that we were debugging which would only dump state correctly if we deleted a printf right before in a different thread. Really confused me until I figured this out.
I like the self-built approach, especially for the learning value.
If you’re using this in a CLI tool you’re writing in Python you might be using the library rich anyway, which provides this functionality as well including some extra features.
What I don't like about rich is that, dependencies and all, its installed size comes out to around 20 MB. 9 MB of that is due to its dependency on pygments for syntax highlighting, which a lot of people probably don't even want/need.
If anyone knows of a smaller, more focused library providing something similar to rich's Live Display functionality, I'd appreciate it.
You probably don't want to run this gist directly. It looks like there's a risk of runaway process creation, depending on the platform. The process creation code should be guarded by `if __name__ == "__main__"`.
I prefer using a purpose-built tool like GNU parallel. Parallel's only purpose is to run things in parallel and collect the results together. The advantage is you only have to learn to use it once, rather than learn to do this again and again in all the different languages/tools you might use.
I can’t help but notice you have not explained how to handle progress reporting in GNU parallel.
Also, do you really use parallel from software trying to parallelise its internal workload? Note that in TFA the workers are an implementation detail of a wider program.
I don't like the wild use of globals, even if they are "guarded" by locks. And then, oh boy, there are locks!
But it surely works, so that's nice. It would be cool to have a small lib that solves this nicely :thinking:...
I've used a separate printing thread that prints everything from a queue, and had the other threads push everything they want to print onto that queue. Is there some advantage to doing it like in the post over the queue method?
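For comparison, the queue approach you describe might look like this (names are made up): one thread owns stdout, everyone else just enqueues messages, and a sentinel shuts the printer down.

```python
import queue
import threading

log_queue = queue.Queue()

def printer():
    # Sole owner of stdout: drains the queue until it sees the sentinel.
    while True:
        msg = log_queue.get()
        if msg is None:
            break
        print(msg)

def worker(worker_id):
    for step in range(3):
        # Workers never print directly; they hand records to the printer.
        log_queue.put(f"worker {worker_id}: step {step}")

printer_thread = threading.Thread(target=printer)
printer_thread.start()
workers = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()
log_queue.put(None)  # sentinel: all workers are done
printer_thread.join()
```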
tqdm has a position parameter which allows offsetting concurrent progress bars. It should work automatically for intra-process concurrency anyway. I don’t know if it works correctly with multiple processes though.
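A small sketch of the intra-process case (worker count and timings invented): each bar gets its own terminal row via position, so concurrent bars stack instead of overwriting each other.

```python
import time
from concurrent.futures import ThreadPoolExecutor

from tqdm import tqdm

def work(task_id):
    # position pins this bar to its own row; leave=False clears it when done.
    for _ in tqdm(range(20), desc=f"task {task_id}", position=task_id, leave=False):
        time.sleep(0.01)
    return task_id

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(work, range(3)))
```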
btown: https://opentelemetry.io/docs/languages/python/getting-start... https://docs.honeycomb.io/getting-data-in/opentelemetry/pyth...
acover: For a one-off project it seems simpler to just write an HTML UI.
rciorba: An alternative would be to have only the main process do the updating and have the workers message it about progress, using a queue.
wolfskaempf: https://rich.readthedocs.io/en/stable/progress.html
vijucat: https://docs.python.org/3/library/logging.handlers.html#logg...
pynchia: You need to consume the iterator that map returns. Please use Python as it is supposed to be used, not as Fortran.
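That laziness is easy to demonstrate with the built-in map (illustrative function): nothing runs until the iterator is consumed.

```python
# In Python 3, map() is lazy: the function is not called until the
# returned iterator is consumed.
calls = []

def record(n):
    calls.append(n)
    return n * 2

lazy = map(record, [1, 2, 3])
assert calls == []        # nothing has executed yet

results = list(lazy)      # consuming the iterator does the work
assert calls == [1, 2, 3]
assert results == [2, 4, 6]
```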
isoprophlex: Cool stuff. Now I'm eager to find a way to make this work for multiple tqdm progress bars, running in parallel. Edit: never mind, ignore that. There is a link on the page which I overlooked, with a more complete example: https://gist.githubusercontent.com/tekknolagi/4bee494a6e4483...
LtWorf: https://pythondialog.sourceforge.io/images/screenshots/mixed...