V1ndaar | 1 year ago

And not only that, in many cases they will tell you (if they reply) "oh, we can't find the source of that plot anymore". Happened to me quite a few times (although in physics).

I'm pretty sure I'm not the only one who's written themselves a mini tool to extract data from a bitmap plot based on the axes. It involves some manual steps (mainly cropping), but it is very convenient for the cases where people don't even use vector graphics, and sometimes just screenshots of plots... Do I like it? Hell no! It's why I've put quite some effort into doing it better for my PhD thesis.
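
For anyone wondering what such a tool boils down to, here is a minimal sketch in Python. It is not the tool mentioned above, just an illustration of the usual idea: read off the pixel positions of two labelled ticks per axis, then interpolate linearly (or in log10 space for a log axis). The function name `make_axis_mapper` and all pixel values are made up for the example, and finding the marker pixels is assumed to have been done already (by hand or by color thresholding).

```python
import math


def make_axis_mapper(pix_a, val_a, pix_b, val_b, log=False):
    """Return a function mapping a pixel coordinate to a data value.

    pix_a / pix_b are the pixel positions of two labelled ticks on one axis,
    val_a / val_b are the data values printed next to those ticks.
    """
    if pix_a == pix_b:
        raise ValueError("calibration ticks must sit at different pixels")
    if log:
        # On a log axis, pixels are linear in log10(value).
        val_a, val_b = math.log10(val_a), math.log10(val_b)
    scale = (val_b - val_a) / (pix_b - pix_a)

    def to_data(pix):
        value = val_a + (pix - pix_a) * scale
        return 10 ** value if log else value

    return to_data


# Hypothetical calibration read off a cropped screenshot:
# x ticks "0" at pixel column 100 and "10" at pixel column 600,
# y ticks "0" at pixel row 450 and "5" at pixel row 50
# (image rows grow downward, which the negative scale handles).
x_of = make_axis_mapper(100, 0.0, 600, 10.0)
y_of = make_axis_mapper(450, 0.0, 50, 5.0)

# Hypothetical marker centers in pixel coordinates (column, row).
marker_pixels = [(350, 250), (600, 50)]
points = [(x_of(px), y_of(py)) for px, py in marker_pixels]
print(points)
```

The recovered values are only as good as the calibration clicks and the image resolution, which is exactly why having the original data is better.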

godelski|1 year ago

Yeah, it's very annoying, especially these days when there's no real excuse not to have a copy. You can easily store all code and data for free and in an accessible manner. Even just GitHub is good enough for 90+% of cases. Hugging Face helps, and there are many other ways too.

I remember in my first year of grad school I was trying to replicate a work by a very prestigious university. It definitely wasn't reproducible from the text, but I did my best. I couldn't get close to their claims, so I emailed the lead author (another grad student). No response. Luckily my advisor knew their advisor. I got a meeting and then I got sent code. It was nothing like what they claimed in the paper, so I have no idea what they gave me. Anyway, my paper never got published because I couldn't beat them. It is what it is.

WanderPanda|1 year ago

To be fair, sometimes (e.g. in the case of scatter plots with many dots) PDF renderers become very slow and/or mess up the rendering. In this case the easiest option is rasterizing the plot (for performance and a consistent appearance).

V1ndaar|1 year ago

That is certainly true (and why I added a general "embed plot data as bitmap into SVG/PDF" option to https://github.com/Vindaar/ggplotnim that works not only for raster heatmaps). But realistically such plots are often not ideal anyway (too many data points in a plot is often a sign that a different type of plot would be better; typically one that aggregates in some way) and it's just another argument to make the data for plots available as well.
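
As a rough illustration of "a plot that aggregates", here is a small Python sketch that bins points into a 2D grid of counts, which could then be drawn as a heatmap instead of millions of individual dots. This is not how ggplotnim does anything; the function name `bin_points`, the grid size, and the sample points are assumptions made up for the example.

```python
from collections import Counter


def bin_points(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid covering the given ranges.

    Returns a Counter keyed by (ix, iy) cell index. Points outside the
    ranges are skipped; points on the upper edge go into the last cell.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue
        ix = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        iy = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[(ix, iy)] += 1
    return counts


# Hypothetical data: five points, one of them outside the plotted range.
sample = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (1.0, 1.0), (2.0, 0.5)]
grid = bin_points(sample, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
print(dict(grid))
```

The output size depends only on the grid, not on the number of points, so the resulting figure stays small and renders quickly as vector graphics.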

jszymborski|1 year ago

If you have the misfortune of having to use Word for writing manuscripts and/or have scatter plots with a good number of points, SVGs will ruin your day in my experience.

(Yes, I'd much rather use LaTeX)

mirekrusin|1 year ago

Somebody tell them that huggingface, github, gitlab, codeberg etc exist.