angdis | 10 years ago

At the end of the day, the whole point of ggplot is to produce a graphical representation of some aspect of the data. How much information can you possibly cram into ONE graphic and have it be readable by a human? Your problem is really a data reduction problem and not a plotting/graphics problem.

tenfingers|10 years ago

The amount of data that goes into a plot is unrelated to its visual complexity.

The "problem" is that ggplot also takes care of the transformation/reduction step for you.

For example, a KDE plot can draw on a potentially limitless amount of data while still producing a very simple plot. The same goes for most smoothers.
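To make that concrete, here is a rough sketch in plain Python (not ggplot's actual implementation) of a Gaussian KDE evaluated on a fixed grid. The function name `kde_on_grid`, the 200-point grid, and the normal-reference bandwidth rule are all assumptions chosen for illustration. The point is only that the curve handed to the plotting layer has the same number of points however many samples go in.

```python
import math
import random
import statistics


def kde_on_grid(samples, grid_size=200, bandwidth=None):
    """Gaussian KDE on a fixed grid (hypothetical helper, for illustration).

    The output always has grid_size points, regardless of len(samples).
    """
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb: 1.06 * sd * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)

    # Extend the grid three bandwidths past the data so the tails are covered.
    lo = min(samples) - 3 * bandwidth
    hi = max(samples) + 3 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]

    # Average of Gaussian kernels centred on each sample.
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]
    return grid, density


random.seed(0)
small = [random.gauss(0, 1) for _ in range(500)]
large = [random.gauss(0, 1) for _ in range(2000)]

grid_s, dens_s = kde_on_grid(small)
grid_l, dens_l = kde_on_grid(large)

# Four times the input, same number of points to draw.
print(len(dens_s), len(dens_l))
```

Both calls return 200 points, so the rendering cost depends on the grid and not on the raw data. The reduction step itself still scales with the number of samples.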

However, if I have to produce the KDE or smoothed line myself, I lose almost all the advantages of using ggplot: I have to calculate the visual density manually, and scaling and attaching labels is another PITA.

On top of that, as others have said, ggplot already struggles with mere thousands of entries. A simple 5x5 faceted scatterplot with ~10k points might take seconds to render on recent hardware. When I plot data interactively for exploration, I might do this hundreds of times a day. The time wasted on rendering cancels out all the convenience.

hunterratliff1|10 years ago

This is a very valid point that I feel we often overlook. Most folks don't think like a statistician, and over-complicating figures is the best way to render them useless. All is lost if your audience can't understand what you're trying to convey (: