
The ebb and flow of an effective dataviz showing the rise and fall of GE

A WSJ chart caught my eye the other day – I spotted someone looking at it in a coffee shop, and immediately got a hold of a copy. The chart plots the ebb and flow of GE’s revenues from the 1980s to the present.

What grabbed my attention? The less-used chart form, and the appealing but not too gaudy color scheme.

The chart presents a highly digestible view of the structure of GE’s revenues. We learn about GE’s major divisions, as well as how certain segments split from or merged with others over time. Major acquisitions and divestitures are also depicted; if these events are the main focus, the designer should find ways to make these moments stand out more.

An interesting design decision concerns the sequence of the divisions. One possible order is by increasing or decreasing importance, typically indicated by proportional revenues. This is complicated by the changing nature of the business over the decades. For example, financial services went from nothing to the largest division by far, and then almost disappeared.

The sequencing need not be data-driven; it can be design-constrained. The merging and splitting of business units are conveyed via linking arrows. Longer arrows are unsightly, and meshes of arrows are confusing.

On this chart, the long arrow pointing from the orange to the gray around 2004 feels out of place. What if the financial services block is moved to the right of the consumer block? That will significantly shorten the long arrow. It won’t create other entanglements as the media block is completely disjoint and there are no other arrows tying financial services to another division.



To improve readability, the bars are spaced out horizontally. The addition of whitespace distorts the proportionality. So, in 2001, the annotation states that financial services (orange) accounted for “about half of the revenues,” which is directly contradicted by the visual perception – readers find the orange bar to be clearly shorter than the total length of the other bars. This is a serious deficiency of the chart form, but this chart still conveys the "ebb and flow" very well.

NYT hits the trifecta with this market correction chart

Yesterday, on the front page of the Business section, the New York Times published a pair of charts that perfectly captures the story of the ongoing turbulence in the stock market.

Here is the first chart:


Most market observers are very concerned about the S&P entering "correction" territory, which the industry arbitrarily defines as a drop of 10% or more from a peak. This corresponds to the shortest line on the above chart.
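For readers who want to see the 10% rule in action, here is a minimal Python sketch. It measures each value's drop from the highest value seen so far, and flags the points at or beyond the threshold. The index levels are made up for illustration (they are not S&P data), the function names are my own, and the Times may well mark the start and end of a correction differently.

```python
def drawdowns(prices):
    """Percent drop of each price from the highest price seen so far."""
    peak = prices[0]
    result = []
    for p in prices:
        peak = max(peak, p)  # the running peak only ever moves up
        result.append(100.0 * (peak - p) / peak)
    return result


def in_correction(prices, threshold=10.0):
    """True wherever the drop from the running peak is at least `threshold` percent."""
    return [d >= threshold for d in drawdowns(prices)]


# Hypothetical index levels, chosen only to illustrate the rule.
levels = [100.0, 110.0, 104.0, 98.0, 97.0, 112.0]
flags = in_correction(levels)
print(flags)
```

The drop to 104 is only about 5% off the peak of 110, so it is not flagged; the drops to 98 and 97 are, and the flag clears once the index makes a new high.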

The chart promotes a longer-term reflection on the recent turbulence, using two reference points: the index has returned to about the level at the start of 2018, and is about 16 percent higher than at the beginning of 2017.

This is all done tastefully in a clear, understandable graphic.

Then, in a bit of a rhetorical flourish, the bottom of the page makes another point:


Stretching the view back over a 10-year period, this chart shows that the S&P has exploded by 300% since 2009.

A connection is made between the two charts via the color of the lines, plus the simple, effective annotation "Chart above".

The second chart adds even more context, through vertical bands indicating previous corrections (drops of at least 10%). These moments are connected to the first graphic via the beige color. The extra material conveys the message that the market has survived multiple corrections during this long bull period.

Together, the pair of charts addresses a pressing current issue, and presents a direct, insightful answer in a simple, effective visual design, so it hits the Trifecta!


There are a couple of interesting challenges related to connecting plots within a multiple-plot framework.

While the beige color connects the concept of "market correction" in the top and bottom charts, it can also be a source of confusion. The orientation and the visual interpretation of those bands differ. The first chart uses one horizontal band while the chart below shows multiple vertical bands. In the first chart, the horizontal band refers to a definition of correction while in the second chart, the vertical bands indicate experienced corrections.

Is there a solution in which the bands have the same orientation and same meaning?


These graphs solve a visual problem concerning the visualization of growth over time. Growth rates are anchored to some starting time. A ten-percent reduction means nothing unless you are told ten-percent of what.

Using different starting times as reference points, one gets different values of growth rates. With highly variable series of data like stock prices, picking starting times even a day apart can lead to vastly different growth rates.

The designer here picked several obvious reference times, and superimposed multiple lines on the same plotting canvas. Instead of having four lines on one chart, we have three lines on one, and four lines on the other. This limits the number of messages per chart, which speeds up cognition.

The first chart depicts this visual challenge well. Look at the start of 2018. It appears as if you could produce the second line by just resetting the start point to 0, and dragging the remaining portion of the top line down. The part of the top line (to the right of Jan 2018) looks just like the second line that starts at Jan 2018.


However, a closer look reveals that the shape may be the same but the magnitude isn't. There is a subtle re-scaling in addition to the re-set to zero.

The same thing happens at the starting moment of the third line. You can't just drag the portion of the first or second line down - there is also a needed re-scaling.
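A small worked example makes the re-scaling concrete. If g(t) is the growth since the original start and g(a) is the growth already achieved at the new anchor, then the growth measured from the anchor is (1 + g(t)) / (1 + g(a)) - 1, which equals (g(t) - g(a)) / (1 + g(a)). The subtraction is the "drag down"; the division is the re-scaling. The Python sketch below uses made-up index levels, not actual S&P values, and the names are my own.

```python
def growth_since(prices, base_index):
    """Percent change of every price relative to prices[base_index]."""
    base = prices[base_index]
    return [100.0 * (p / base - 1.0) for p in prices]


# Hypothetical index levels, chosen so the arithmetic is easy to follow.
prices = [100.0, 120.0, 132.0, 114.0]

from_start = growth_since(prices, 0)   # anchored at the first point
from_second = growth_since(prices, 1)  # re-anchored at the second point

# The naive "drag the line down": subtract the growth at the new anchor.
shifted = [g - from_start[1] for g in from_start]

# The correct re-anchoring also divides by (1 + growth at the anchor).
rescaled = [s / (1.0 + from_start[1] / 100.0) for s in shifted]

print(from_second)
print(shifted)
print(rescaled)
```

With these numbers, dragging the line down gives about +12% and -6% for the last two points, while the true re-anchored growth rates are about +10% and -5%. The shape is the same; every value is shrunk by a factor of 1.2, because the index had already grown 20% by the new anchor.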

Appreciating population mountains

Tim Harford tweeted about a nice project visualizing the world's distribution of population, and wondered why he likes it so much.

That's the question we'd love to answer on this blog! Charts make us emotional - some we love, some we hate. We like to think that designers can control those emotions, via design choices.

I also happen to like the "Population Mountains" project. It fits nicely into a geography class.

1. Chart Form

The key feature is to adopt a 3D column chart form, instead of the more conventional choropleth or dot density. The use of columns is particularly effective here because it is natural - cities do tend to expand vertically upwards when ever more people cram into the same amount of surface area.


Imagine the same chart form is used to plot the number of swimming pools per square meter. It just doesn't make the same impact. 

2. Color Scale

The designer also made judicious choices on the color scale. The discrete, 5-color scheme is a clear winner over the more conventional, continuous color scale. The designer made a deliberate choice because most software by default uses a continuous color scale for continuous data (population density per square meter).


Also, notice that the color intervals in the 5-color scale are not set uniformly because there is a power law in effect - the dense areas are orders of magnitude denser than the sparsely populated areas, and most locations are low-density.
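To see why uniform intervals fail with this kind of data, here is a minimal Python sketch that compares equal-width classes with classes spaced evenly on a log scale. The density values, the range, and the function names are all hypothetical; I don't know how the project's designer actually chose the breaks, and log spacing is just one common way to get non-uniform intervals.

```python
import math


def log_breaks(low, high, n_classes):
    """Class boundaries spaced evenly on a log scale between low and high."""
    step = (math.log10(high) - math.log10(low)) / n_classes
    return [10 ** (math.log10(low) + i * step) for i in range(n_classes + 1)]


def uniform_breaks(low, high, n_classes):
    """Class boundaries of equal width between low and high."""
    width = (high - low) / n_classes
    return [low + i * width for i in range(n_classes + 1)]


def classify(value, breaks):
    """0-based index of the class containing value, clamped to the last class."""
    for i in range(1, len(breaks) - 1):
        if value < breaks[i]:
            return i - 1
    return len(breaks) - 2


# Hypothetical densities spanning several orders of magnitude.
densities = [3, 40, 700, 9000, 60000]

log_classes = [classify(d, log_breaks(1, 100000, 5)) for d in densities]
uniform_classes = [classify(d, uniform_breaks(0, 100000, 5)) for d in densities]

print(log_classes)
print(uniform_classes)
```

With equal-width classes, four of the five values land in the lowest color and the map goes flat. With log-spaced classes, each value gets its own color.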

These decisions have a strong influence on the perception of the information: they affect the heights of the peaks, the contrasts between the highs and lows, etc. They also inject a degree of subjectivity into the data visualization exercise that some find offensive.

3. Background

The background map is stripped of unnecessary details so that the attention is focused on these "population mountains". No unnecessary labels, roads, relief, etc. This demonstrates an acute awareness of foreground/background issues.

4. Insights on the "shape" of the data 

The article makes the following comment:

What stands out is each city’s form, a unique mountain that might be like the steep peaks of lower Manhattan or the sprawling hills of suburban Atlanta. When I first saw a city in 3D, I had a feel for its population size that I had never experienced before.

I'd strike out "population size" and replace it with "population density". In theory, the sum of the volumes of the columns in any given surface area gives you the "population size" but given the fluctuating heights of these columns, and the different surface areas (sprawls) of different cities, it is an Olympian task to estimate the volumes of the population mountains!
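The arithmetic behind that remark is simple even if the visual estimation is not: each column contributes its height (density) times its footprint (area). A minimal Python sketch, using two made-up cities and my own function name, shows how very different mountains can hold the same number of people.

```python
def population(columns):
    """Total population as the sum of density * footprint area over all columns."""
    return sum(density * area for density, area in columns)


# Hypothetical (density, area) pairs; any consistent units will do,
# e.g. people per square km and square km.
compact_city = [(20000.0, 1.0), (10000.0, 1.0), (5000.0, 2.0)]
sprawling_city = [(4000.0, 10.0)]

print(population(compact_city))
print(population(sprawling_city))
```

The compact city has a peak five times as tall as anything in the sprawling one, yet both add up to the same total. The eye picks up the peak and the sprawl, not the sum.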

The more salient features of these mountains, most easily felt by readers, are the heights of the peak columns, the sprawl of the cities, and the general form of the mass of columns. The volume of the mountain is one of the tougher things to see. Similarly, the taller 3D columns hide what's behind them, and you'd need to spin and rotate the map to really get a good feel.

Here is the contrast between Paris and London, with comparable population sizes. You can see that the population in Paris (and by extension, France) is much more concentrated than in the U.K. This difference is a surprise to me.


5. Sourcing

Some of the other mountains, especially those in India and China, look a bit odd to me, which leads me to wonder about the source of the data. This project has a great set of footnotes that not only point to the source of the data but also discuss its limitations, including the possibility of inaccuracies in places like India and China.


Check out Population Mountains!






A second take on the rural-urban election chart

Yesterday, I looked at the following pictograms used by Business Insider in an article about the rural-urban divide in American politics:


The layout of this diagram suggests that the comparison of 2010 to 2018 is a key purpose.

The following alternate directly plots the change between 2010 and 2018, reducing the number of plots from 4 to 2.


The 2018 results are emphasized. Then, for each party, there can be a net add or loss of seats.

The key trends are:

  • a net loss in seats in "Pure rural" districts, split by party;
  • a net gain of 3 seats in "rural-suburban" districts;
  • a loss of 10 Democratic seats balanced by a gain of 13 Republican seats.


Another experiment with enhanced pictogram

In a previous post, I experimented with an idea around enhancing pictograms. These are extremely popular charts used to show countable objects. I found another example in Business Insider's analysis of the mid-term election results. Here is an excerpt of a pair of pictograms that show the relative performance of Republicans and Democrats in districts that are classified as "Pure Rural" or "Rural-Suburban":


(Note that there is an error in the bottom left chart. There should be 24 blue squares not 34! In the remainder of the post, I will retain this error so that the revisions are comparable to the original.)

There are quite a few dimensions going on in this deceptively simple chart. There is the red domination of these rural districts to the tune of 75 to 80% share. There is the further weakening of Democrats from 2010 to 2018. There is a shift of seats out of pure rural areas (-13) and into rural-suburban (+14) from 2010 to 2018.

Anyone who learns of the above trends probably did so by reading off the data tables on the sides. It's a given that those tables, or simple bar charts, can be more effective with this dataset.

What I'd like to explore is the pictogram, assuming that we are required to use a pictogram. Can the pictogram be enhanced to overcome some of its weaknesses?

The defining characteristic of the pictogram is the presence of individual units, which means the reader can count the units. This feature is also its downfall. In most pictograms, it is a bear to count the units. Try counting out the blue and red squares in the above image - and don't cheat by staring at the data tables!

My goal is to enhance the pictogram by making it easier for readers to count the units. The strategy is to place cues so that the units can be counted in larger groups like 5 or 10. Also, when possible, exploit symmetry.
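The grouping idea can be tried out in plain text before opening any charting software. The Python sketch below is only an illustration of the strategy, with a function name and defaults of my own choosing: it leaves a gap after every 5 units and starts a new row after every 10, so a reader can count by fives and tens rather than one by one.

```python
def grouped_pictogram(count, symbol="#", group=5, per_row=10):
    """Render `count` symbols with a gap after every `group` units
    and a new row after every `per_row` units."""
    rows = []
    for start in range(0, count, per_row):
        n = min(per_row, count - start)  # units on this row
        chunks = [symbol * min(group, n - g) for g in range(0, n, group)]
        rows.append(" ".join(chunks))
    return "\n".join(rows)


# 24 units: two full rows of ten, plus a partial row of four.
print(grouped_pictogram(24))
```

The output reads as 10 + 10 + 4 at a glance, which is the whole point of placing the cues.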

Here is an example:


The squares are arranged to facilitate comparing the 2010 and 2018 numbers. So for rural-suburban, there were 10 fewer blue squares and 10 + 3 = 13 more red squares.

To be continued in the next post ....