Full disclosure: I'm a software performance nerd who is building a deeper understanding of hardware using whatever tools are available.
I work with a lot of assembly traces emitted from CPU sims in Verilator. These traces can be several gigabytes in size, with tens of millions of entries. I've found that the typical interactive Matplotlib backends don't handle tens of millions of points well on my M1 MacBook Pro. This is no slight against Matplotlib, or cairo, or agg, or the default macOS backend; they prioritize cross-platform "It Just Works" support over extreme performance.
But it does irk me that there seems to be such a gap between the amount of data we can visualize in the gaming domain compared to the scientific domain[1]. The difference between 50 million and 5 billion is 100X; I'd be ecstatic if I could get within a 10X difference. I understand that this gap exists for lots of reasons, and I'd be happy to hear about them in detail in the replies to this comment.
Game engines put a whole lot of work into not drawing things. Even if you have an 8K monitor, that's still only around 33 million pixels. If the data set you want to visualize has 5 billion elements, then even if you decided to use every single pixel on the screen, you'd still have to collapse those 5 billion elements into a visual space less than 1% their size.
In a video game it's easy to figure out what not to show - just hide everything that would normally be too small or far away to see. But in a scientific domain, what to show and what to hide might be a much more interesting question.
On the other hand, if you're just looking for a plot of a few billion elements with the ability to zoom in and out, then a straightforward decimation of the data and drawing it to the screen can work. In the past I've written custom tools to do just that, but at this point in life I would probably throw it at something like SciChart and let it take care of that.
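A straightforward decimation like the one described can be sketched in a few lines. This is a hypothetical standalone helper, not any particular library's API; it simply keeps evenly spaced samples so the series fits the display width:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive decimation: keep roughly `max_points` evenly spaced samples.
// Good enough for a quick zoomable overview, but it can miss narrow
// spikes, which is why min/max binning is often preferred for signals.
std::vector<float> decimate(const std::vector<float>& data,
                            std::size_t max_points) {
    if (max_points == 0 || data.size() <= max_points) return data;
    std::vector<float> out;
    out.reserve(max_points);
    const double stride = static_cast<double>(data.size()) / max_points;
    for (std::size_t i = 0; i < max_points; ++i)
        out.push_back(data[static_cast<std::size_t>(i * stride)]);
    return out;
}
```

Re-running this over just the visible slice on every zoom or pan keeps the point count bounded by the display, regardless of how large the underlying data set is.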
There are indeed several reasons, but in this case I suspect the problem isn't even with matplotlib (well, most of it isn't). You mention traces that are gigabytes in size, and that's the first big problem.
For reference, the newest Zelda game takes 18.2GB of storage in total, and of course Zelda isn't trying to audit the entire game on every load. Games spend a lot of effort making sure their assets are lean, and in the end everything is deployed in some binary format to further reduce its footprint. Trace files that prioritize human readability give up that compactness.
So regardless of how fast the graphical plotting capabilities are, I imagine such an app is CPU-bound simply from parsing that trace data. That'd be the first place I'd look to optimize (without having properly profiled the app, that is; probably something I'd say in an interview setting).
You can't draw that data on the screen in any meaningful manner without zooming.
At various zoom levels, what you actually want is a low-pass filter over a slice of the data, such that the filter matches the display. E.g. I can display 1080 pixels across, but there are 1 billion points. Ok... the meaningful data is limited.
Optionally you could do what other device displays do and show a sort of fuzzy intensity gradient in each display x-axis column, spanning the y points, to represent the number of actual samples at that position.
It's kind of crazy, honestly. There are many options for displaying billions of points across a 1080-pixel-wide display, but none of them really show you the reality, only some subset of the information that's there.
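One common version of that "subset of reality" is min/max binning: one (min, max) pair per pixel column, so the envelope of the signal survives the reduction even though individual samples don't. A minimal sketch, assuming one bin per display column:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Reduce `data` to at most `columns` (min, max) pairs, one per pixel
// column. Narrow spikes are preserved because the extremes of every
// bin are kept, unlike naive stride decimation.
std::vector<std::pair<float, float>> minmax_bins(
        const std::vector<float>& data, std::size_t columns) {
    std::vector<std::pair<float, float>> bins;
    if (data.empty() || columns == 0) return bins;
    bins.reserve(columns);
    const std::size_t per_bin = (data.size() + columns - 1) / columns;
    for (std::size_t start = 0; start < data.size(); start += per_bin) {
        const std::size_t end = std::min(start + per_bin, data.size());
        const auto [lo, hi] = std::minmax_element(data.begin() + start,
                                                  data.begin() + end);
        bins.emplace_back(*lo, *hi);
    }
    return bins;
}
```

Drawing a vertical line from min to max per column is exactly the oscilloscope-style envelope many waveform viewers use.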
There exist scientific visualization tools that go far beyond what you can visualize in the gaming domain; e.g. some applications are deployed on supercomputers so that you can change parameters interactively during the visualization.
Games are highly specialized applications, so you should be comparing them to equally specialized applications in the scientific domain.
It's worth noting that, as far as I know, Nanite involves preprocessing the mesh to convert it into Nanite's special format. You could easily do the same thing with data. The issue is taking a large batch of data you've never seen before and displaying it efficiently in real time, which Nanite doesn't do either.
I'm not sure I fully understand why some projects so proudly avoid standard library features and, e.g., favour a more manual approach to memory/data management. What are the key reasons for this?
> avoids ... C++ headers
Same question for this point - but is it the same reason as for avoiding STL containers or something else?
Sprinkling small mallocs and frees everywhere, where every object has its own individually managed lifetime, is very often not the best approach. Using STL containers often means using that prolific allocation strategy; it just makes the numerous small allocations implicit. A region-based approach is often far simpler conceptually, and more performant. This is a decent article on the topic: https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...
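A minimal sketch of the region-based idea the article describes: a bump allocator, simplified here with no growth or chaining, and assuming power-of-two alignment:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minimal bump-allocator arena: all allocations share one lifetime and
// are released together with a single free, instead of tracking many
// small malloc/free pairs individually.
struct Arena {
    unsigned char* base;
    std::size_t cap;
    std::size_t used;
};

Arena arena_create(std::size_t cap) {
    return Arena{static_cast<unsigned char*>(std::malloc(cap)), cap, 0};
}

void* arena_alloc(Arena& a, std::size_t size,
                  std::size_t align = alignof(std::max_align_t)) {
    std::size_t start = (a.used + align - 1) & ~(align - 1);  // round up
    if (start + size > a.cap) return nullptr;  // exhausted; real code grows
    a.used = start + size;
    return a.base + start;
}

void arena_destroy(Arena& a) {
    std::free(a.base);  // one call frees every allocation in the region
    a = Arena{nullptr, 0, 0};
}
```

Allocation is a pointer bump, and per-frame or per-task data can be thrown away wholesale by resetting `used` to zero, which is a common pattern in immediate-mode UIs.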
C++ is a big language and there is a lot of diversity of opinion about what a sensible subset of it is, with numerous concerns influencing those decisions; one common one is compile times. Using C-only headers generally ensures a much lower ceiling for compile times compared to typical C++ headers. And C++ headers are slightly more likely to use the prolific allocation strategy from above. As such, C++ headers require far more scrutiny before acceptance, in my view.
Historically, in high-performance resource-constrained applications (e.g. console video games), developers would use a "stripped down" version of C++ (e.g. no RAII, no exceptions), which often included avoiding the STL. It's less true these days, but still somewhat true. In truth, the STL will never fully cater to these uses; that's not its goal.
One reason STL was avoided is that control over memory allocation was quite poor. std::allocator was somewhat conceptually broken for a long time, only recently remedied in C++14 and 17. So, there's still about 20 years of code out there from the days of old bad C++ (:
Another reason is that early STL implementations were poor, and not well-suited to high-performance use cases. Again, this has gotten better (some), but the legacy lives on.
So, if you are designing a library to be used in video games (such as in TFA), and want wide adoption, then avoiding STL, RAII, exceptions is still generally a good idea, IMO.
One important reason is compile time. C++ stdlib headers are so entangled with each other that each one brings in tens of thousands of lines of complex template code, which can increase the compilation time of a single source file to seconds, and it's getting worse with each new C++ version.
Also, it's not like writing your own growable array is rocket science, and a simple, specialized version can be done in a few dozen to a few hundred lines of code.
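For illustration, a bare-bones growable array along those lines. This is a sketch for trivially copyable elements only, with allocation-failure handling omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// A minimal "your own std::vector" for floats: no exceptions, no
// destructors, geometric growth via realloc. Suitable only for
// trivially copyable element types.
struct FloatArray {
    float* data;
    std::size_t len;
    std::size_t cap;
};

void fa_push(FloatArray* a, float v) {
    if (a->len == a->cap) {
        a->cap = a->cap ? a->cap * 2 : 16;  // grow geometrically, start at 16
        a->data = static_cast<float*>(
            std::realloc(a->data, a->cap * sizeof(float)));
    }
    a->data[a->len++] = v;
}

void fa_free(FloatArray* a) {
    std::free(a->data);
    a->data = nullptr;
    a->len = a->cap = 0;
}
```

Zero-initializing the struct is a valid empty array, and the whole thing frees with one call, which pairs naturally with the arena-style lifetime management discussed above.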
The STL also has some somewhat broken features; e.g. it's hard to write generic code over std::vector<T> because everything is different when T = bool, and you get confusing errors if you hit that case accidentally. std::vector<char> is also a bit broken, though that's the language's fault rather than the STL's.
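The std::vector<bool> quirk can be demonstrated directly; this sketch shows both the type-level surprise and the runtime one:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// std::vector<bool> is a packed specialization: operator[] returns a
// proxy object rather than a bool&, which is why generic code written
// for std::vector<T> can break when T happens to be bool.
static_assert(
    !std::is_same_v<decltype(std::declval<std::vector<bool>&>()[0]), bool&>,
    "vector<bool>::operator[] yields a proxy, not bool&");

// The proxy is also surprising at runtime: `auto` deduces the proxy
// type, so what looks like a copy still aliases the original element.
bool proxy_writes_through() {
    std::vector<bool> v{true};
    auto x = v[0];  // x is a proxy into v, not an independent bool
    x = false;      // silently modifies v[0]
    return v[0];    // now false
}
```

With any other element type, `auto x = v[0]; x = ...;` leaves the vector untouched, which is exactly the kind of inconsistency the comment is complaining about.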
C++ compiler inlining capabilities have had to become magically powerful to keep up with the layers of small functions under every STL container method. Meanwhile, in some applications (games) high performance in debug builds is necessary.
Same for headers: compile times matter. I'm very much looking forward to widespread, fully modularized standard libraries.
When working in Unity, I was honestly shocked by how much performance I gained in a small app by moving away from LINQ to custom allocated containers. That convenience has a cost, and games can't afford such a cost.
I’m using this from Julia and both the user and developer experience is great.
It’s much more limited than publication style plotting libraries but the instant 60fps reactivity is amazing.
This is awesome. ImGui is the only native UI library I've seen that makes intuitive sense to me. Other options are all bogged down in their own frameworks rather than just providing a simple interface for creating UI elements. I think most ImGui users are in game development, so I wonder what kinds of uses this would be good for. Maybe some debugging information, but personally I've never needed aggregate statistics for debugging.
Years ago, in the early days of ImGui, I wrote an oscilloscope-style monitor for an ATI load cell with it. This library would have been extremely useful. Kudos!
There's no caching. Most people advocating or theorizing about caching haven't measured the performance of Dear ImGui or ImPlot. Computers are incredibly powerful when you don't waste their resources.
The scientific programmers I work with, who have only beginner-to-intermediate Python knowledge, have struggled with visualization libraries. I might recommend the Python bindings for this to them.
[1]: https://youtu.be/eviSykqSUUw
semi-extrinsic|2 years ago
https://holoviews.org/user_guide/Large_Data.html
turtledragonfly|2 years ago
Here's a good one to read: https://github.com/electronicarts/EASTL/blob/master/doc/Desi...
_gabe_|2 years ago
> Same question for this point - but is it the same reason as for avoiding STL containers or something else?
ABI compatibility and the ability to easily create FFIs for other languages are the strongest appeals afaik.
andsoitis|2 years ago
As soon as you need to support multi-line text or Unicode text, you will run into a hard wall fast.
Oh and accessibility. No accessibility.
There are other issues too, but these two are typically deal breakers for any serious (or even hobby) application with a non-trivial UI.
Some prior discussion: https://news.ycombinator.com/item?id=24987964
ToppDev|2 years ago
Screenshot: https://i.imgur.com/8Mc04NB.png
Repository: https://github.com/UniStuttgart-INS/INSTINCT
Solvency|2 years ago
D3, Domo, Highcharts, Tableau, or dozens of other charting and dashboarding options?
fdkz|2 years ago
Screenshot of a program that uses this library: https://fdkz.net/static/images/20171215-aniplot.png
djoshea|2 years ago
Screenshot: https://cloud.githubusercontent.com/assets/77752/5382993/9f6...