Daniel Z. tweeted about my post from last week. In particular, he took a deeper look at the chart of energy demand that put all hourly data onto the same plot, originally published at the StackOverflow blog:
I noted that this is not a great chart, particularly since what catches our eyes is not the key features of the underlying data. Daniel made a clearly better chart:
This is a dot plot, rather than a line chart. The dots are painted in light gray, pushed to the background, because readers should be looking at the orange line. (I'm not sure what is going on with the horizontal scale as I could not get the peaks to line up on the two charts.)
What is this orange line? It's supposed to prove the point that the apparent dark band seen in the line chart does not represent the most frequently occurring values, as one might presume.
Looking closer, we see that the gray dots do not show all the hourly data but binned values.
We see vertical columns of dots, each representing a bin of values. The size of the dots represents the frequency of values of each bin. The orange line connects the bins with the highest number of values.
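To make the binning concrete, here is a minimal Python sketch of the idea, not Daniel's actual code. The hourly numbers are made up, the function names `bin_counts` and `modal_bin` are my own, and the bin width of 1000 is borrowed from the P.S. at the end of this post. Each day's values are grouped into bins, the count per bin would drive the dot size, and the fullest bin is the one a "highest number of values" line would pass through.

```python
from collections import Counter

BIN_SIZE = 1000  # assumed bin width, in the same units as demand


def bin_counts(values, bin_size=BIN_SIZE):
    """Count how many values fall in each bin, keyed by the bin's lower edge."""
    return Counter(int(v // bin_size) * bin_size for v in values)


def modal_bin(values, bin_size=BIN_SIZE):
    """Lower edge of the bin holding the most values (ties go to the lowest bin)."""
    counts = bin_counts(values, bin_size)
    return min(counts, key=lambda b: (-counts[b], b))


# Hypothetical hourly demand for two days (invented numbers, for illustration only)
days = {
    "day 1": [9200, 9400, 9800, 10500, 11800, 12100, 12300, 13100],
    "day 2": [8100, 8300, 8700, 8900, 9600, 10400, 11200, 11900],
}

for day, hourly in days.items():
    counts = bin_counts(hourly)
    # Each (bin, count) pair is one dot in that day's column; count sets the dot size.
    print(day, sorted(counts.items()), "fullest bin:", modal_bin(hourly))
```

Each day becomes one vertical column of dots, so a year of hourly data collapses to a few hundred columns with a handful of dots each.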
Daniel commented that
"The visual aggregation doesn't in fact map to the most frequently occurring values. That is because the ink of almost vertical lines fills in all the space between start and end."
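A toy calculation shows why the ink misleads. The Python sketch below is my own simplification, with an invented series and hypothetical helper names (`dot_counts`, `ink_counts`): it treats "ink" as the number of line segments that pass through each horizontal band and ignores line width. The series lingers at low and high values and never takes a value in the middle, yet the connecting segments sweep through the middle bands every time the series jumps.

```python
def dot_counts(values, bin_size=1):
    """How many data points fall in each horizontal band (what a dot plot shows)."""
    counts = {}
    for v in values:
        band = int(v // bin_size)
        counts[band] = counts.get(band, 0) + 1
    return counts


def ink_counts(values, bin_size=1):
    """How many connecting line segments pass through each horizontal band."""
    counts = {}
    for a, b in zip(values, values[1:]):
        lo, hi = sorted((int(a // bin_size), int(b // bin_size)))
        # A segment from a to b puts ink in every band between its endpoints.
        for band in range(lo, hi + 1):
            counts[band] = counts.get(band, 0) + 1
    return counts


# Invented series: hovers around 1-2, jumps to 8-9, and repeats
series = [1, 2, 1, 2, 9, 8, 9, 8] * 3

dots = dot_counts(series)
ink = ink_counts(series)
for band in range(1, 10):
    print(band, "points:", dots.get(band, 0), "segments crossing:", ink.get(band, 0))
```

The middle bands contain no data points at all, but they still collect ink from every jump, which is the effect Daniel describes.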
Xan Gregg investigated further, and made a gif to show this effect better. Here is a screenshot of it (see this tweet):
The top chart is a true dot plot, so the darker areas are the denser ones, where the dots overlap. The bottom chart is the line chart that has the see-saw pattern. As Xan noted, the values shown are strangely well behaved (aggregated? modeled?): within each day, the values appear to sweep up and down consistently. This means the values are somewhat evenly spaced along the underlying trendline, so I think this dataset is not the best one to illustrate Daniel's excellent point.
It's usually not a good idea to connect lots of dots with a single line.
[P.S. 3/21/2022: Daniel clarified what the orange line shows: "In the posted chart, the orange line encodes the daily demand average (the mean of the daily distribution), rounded, for displaying purposes, to the closed bin. Bin size = 1000. Orange could have encode the daily median as well."]