Powerful photos visualizing housing conditions in Hong Kong

I was going to react to Alberto's post about the New York Times article on economic inequality in Hong Kong, which is offered as one explanation for the current protest movement. I agree that the best graphic in this set is the "photoviz" showing the "coffins" or "cages" that many residents live in, because of the population density.

Nyt_hongkong_apartment_photoviz

Then I searched the archives, and found this old post from 2015 which is the perfect response to it. What's even better, that post was also inspired by Alberto.

The older post featured a wonderful campaign by human rights organization Society for Community Organization that uses photoviz to draw attention to the problem of housing conditions in Hong Kong. They organized a photography exhibit on this theme in 2014. They then updated the exhibit in 2016.

Here is one of the iconic photos by Benny Lam:

Soco_trapped_B1

I found more coverage of Benny's work here. There is also a book that we can flip through on Vimeo.

In 2017, the South China Morning Post (SCMP) published drone footage showing the outside view of the apartment buildings.

***

What's missing is the visual comparison to the luxury condos where the top 1 percent live. For these, one can visit real estate sites, such as Sotheby's. Here is their "12 luxury homes for sales" page.

Another comparison: a 1,000-square-foot apartment that sits between those extremes. The photo by John Butlin comes from SCMP's Post Magazine's feature on the apartment:

Butlin_scmp_home

***

Also check out my review of Alberto's fantastic, recent book, How Charts Lie.

Cairo_howchartslie_cover

 

 


Measles babies

Mona Chalabi has made this remarkable graphic to illustrate the effect of the anti-vaccine movement on measles cases in the U.S.: (link)

Monachalabi_measles

As a form of agitprop, the graphic seizes upon the fear engendered by the defacing red rash of the disease. And it's very effective in articulating its social message.

***

I wasn't able to find the data except for a specific year or two. So, this post is more inspired by the graphic than a direct response to it.

I think the left-side legend should say "1 case of measles in someone who was not vaccinated" (as opposed to 1 case of measles in aggregate).

The chart encodes the data in the density of the red dots. What does the density of the red dots signify? There are two possibilities: case counts or case rates.

2013 is a year in which I could find data. In 2013, the U.S. saw 187 cases of measles, only 4 of them in someone who was vaccinated. In other words, there are 49 times as many measles cases among the unvaccinated as the vaccinated.

But note that about 90 percent of the population (using 13-17 year olds as a proxy) are vaccinated. The chance of getting measles in the unvaccinated is 0.8 per million, compared to 0.002 per million in the vaccinated - 422 times higher.
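The difference between comparing counts and comparing rates can be checked with a back-of-envelope sketch. It uses only the figures quoted above and makes two simplifying assumptions: every case not in a vaccinated person is treated as unvaccinated, and coverage is exactly 90 percent. With these inputs the count ratio comes out at about 46 and the rate ratio at about 412, close to but not identical to the 49 and 422 quoted above, which presumably rest on slightly different inputs.

```python
# Back-of-envelope comparison of case counts versus case rates.
# Inputs are the 2013 figures quoted in the post. Treating every case that is
# not in a vaccinated person as "unvaccinated" is a simplifying assumption.
total_cases = 187
vaccinated_cases = 4
unvaccinated_cases = total_cases - vaccinated_cases

vaccinated_share = 0.90  # proxy used in the post (13-17 year olds)


def rate_ratio(unvax_cases, vax_cases, vax_share):
    """Ratio of per-person case rates, unvaccinated over vaccinated.

    The total population size cancels out, so only the shares are needed.
    """
    unvax_rate = unvax_cases / (1 - vax_share)
    vax_rate = vax_cases / vax_share
    return unvax_rate / vax_rate


# Ratio of raw counts ignores how many people are in each group.
count_ratio = unvaccinated_cases / vaccinated_cases

# Ratio of rates adjusts for the unvaccinated being a small minority.
relative_risk = rate_ratio(unvaccinated_cases, vaccinated_cases, vaccinated_share)

print(f"count ratio: {count_ratio:.1f}")
print(f"rate ratio:  {relative_risk:.1f}")
```

Because only about one person in ten is unvaccinated, the rate ratio is roughly nine times the count ratio, which is why the rate-based picture looks so much worse.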

The following chart shows the relative appearance of the dot densities. The bottom row, which compares the relative chance of getting measles, uses the more appropriate metric, and it looks much worse.

Junkcharts_monachalabi_measles

***

Mona's instagram has many other provocative graphics.

 


Pretty circular things

National Geographic features this graphic illustrating migration into the U.S. from the 1850s to the present.

Natgeo_migrationtreerings

 

What to Like

It's definitely eye-catching, and some readers will be enticed to spend time figuring out how to read this chart.

The inset reveals that the chart is made up of little colored strips that mix together. This produces a pleasing effect of gradual color gradation.

The white rings that separate decades are crucial. Without those rings, the chart becomes one long run-on sentence.

Once the reader invests time in learning how to read the chart, the reader will grasp the big picture. One learns, for example, that migrants from the most recent decades have come primarily from Latin America (orange) or Asia (pink). Migrants from Europe (green) and Canada (blue) came in waves but have been muted in the last few decades.

 

What's baffling

Initially, the chart is disorienting. It's not obvious whether the compass directions mean anything. We can immediately understand that the further out we go, the larger the number of migrants. But what about the direction?

The key appears in the legend - which should be moved from bottom right to top left as it's so important. Apparently, continent/country of origin is coded in the directions.

This region-to-color coding seems to be rough-edged by design. The color mixing discussed above provides a nice artistic effect. Here, the reader finds out that mixing is primarily between two neighboring colors, that is, two regions placed side by side on the chart. Because Europe (green) and Asia (pink) are on opposite sides of the rings, those two colors do not mix.

Another notable feature of the chart is the lack of any data other than the decade labels. We won't learn how many migrants arrived in any decade, or the extent of migration as it impacts population size.

A couple of other comments on the circular design.

The circles necessarily expand as time moves from the inside out. Thus, this design only works well for "monotonic" data, that is to say, data in which migration always increases as time passes.

The appearance of the chart is only mildly affected by the underlying data. Swapping the regions of origin changes the appearance of this design drastically.

 

 

 

 

 


The art of contaminating data

Schwab_indexfundassets_sm

This is one of those innocent-looking charts that could have been a poster child for artistic embellishment. The straightforward time-series chart is deemed too boring. The designer shows admirable restraint in inserting “information-free” content, such as the dense gridlines (graph paper) and the 3D effect (ticker).

These seem harmless, but not really.

Here I turn off the color.

Redo_schwab_indexassets_bw_sm

After the 3D effect is applied, the reader no longer knows whether to look at the top or bottom edge of the ticker.

This view makes this point even clearer.

Jc_redo_schwab_indexassets_bw2_sm

The art contaminates the data.


Light entertainment: Making art by making data

Chris P. sent in this link to a Wired feature on "infographics."

The first entry is by Giorgia Lupi and Stefanie Posavec.

Wired_Stefanie-Data-Final

These are fun images and I enjoy looking at them as hand-drawn art. But it's a stretch to call them "data visualization," "data," or "data analysis," which are all tags used by the Wired editing staff.

(PS. Wired chose a particular example of their work. There are many examples of Lupi's work that strike a balance between handicraft and data communications.)

 


Mapping the two Americas

If you type "two Americas map" into Google image search, you get the following top results:

Google_twoAmericasmaps

Designers overwhelmingly pick the choropleth map as the way to depict the two nations.

Now, look at these maps from the New York Times (link):


Nytimes_election2016_mapDem

and this:

Nytimes_election2016_mapRep

I believe the background is a relief map. I would like to see one where the color is based on the strength of support for Democrats or Republicans.

The pair of maps is extremely effective at bringing out the story about the splitting of the U.S. population. From a design standpoint, I really like it.

I love, love, love the cute annotations everywhere on the page. I imagine the designer had fun coming up with them.

Nytimes_election2016_mapRep_inset

Pittsburgh Puddle, Cleveland Cove, Cincinnati Slough, ...

***

There is an artistic (or data journalistic) license behind the way the data are processed. Most likely, a 50% cutoff is applied to determine which map a county sits atop. The analysis is at the county level so there is necessarily some simplification... in fact, this aggregation is needed to make the "islands" and other features contiguous.
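A minimal sketch of what such a cutoff rule might look like. The county names and vote shares are made up, and the 50% threshold is only my guess at what the designers did.

```python
# Hypothetical county records: (name, Democratic share of the two-party vote).
# These names and numbers are invented for illustration only.
counties = [
    ("County A", 0.62),
    ("County B", 0.48),
    ("County C", 0.50),
    ("County D", 0.31),
]

CUTOFF = 0.5  # assumed threshold; the actual rule used by the NYT is not known


def assign_map(dem_share, cutoff=CUTOFF):
    """Return the single map on which a county would be drawn."""
    # Ties at exactly the cutoff go to the Republican map here, an arbitrary choice.
    return "Democratic map" if dem_share > cutoff else "Republican map"


assignments = {name: assign_map(share) for name, share in counties}

for name, share in counties:
    print(f"{name}: {share:.0%} Democratic -> {assignments[name]}")
```

Under a rule like this, the hypothetical County B, with 48% Democratic support, is drawn entirely on the Republican map. That is the simplification: every county becomes all one color.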

I am a bit sad that at this moment, we are so focused on what sets us apart, and not what binds us together as a nation.

 

PS. Via twitter, Maciej reacted negatively to these maps: "Horribly tendentious map visualization from the NYT makes the candidate who won more votes look like a tiny minority."

This is a good illustration of selecting the chart form to bring out one's message. If the goal of the chart is to show that Clinton has more votes, I agree that these maps fail to convey that message.

What I believe the NYT designer wants to point out is that the supporters of Clinton are clustered into these densely populated urban areas, leaving the Republicans with most of the land mass. (Like I said above, because of the 50% cutoff criterion, we are over-simplifying the picture. There are definitely Democrats living somewhere in Trump's nation, and likewise Republicans residing in Clinton strongholds.)


Raining, data art, if it ain't broke

Via Twitter, reader Joe D. asked a few of us to comment on the SparkRadar graphic by WeatherSpark.

At the time of writing, the picture for Baltimore is very pretty:

Sparkradar

The picture for New York is not as pretty but still intriguing. We are having a bout of summer and hence the white space (no precipitation):

Sparkradar_newyork

Interpreting this innovative chart is a tough task - this is a given with any innovative chart. Explaining the chart requires all the text on this page.

The difficulty of interpreting the SparkRadar chart is twofold.

Firstly, the axes are unnatural. Time runs vertically, defying the horizontal convention. Also, "now" - the most recent time depicted - is at the very bottom, which tempts readers to read bottom to top, meaning we are reading time running backwards into the past. In most charts, time runs left to right from past to present (at least in the left-right-centric part of the world that I live in).

Location has been reduced to one dimension. The labels "Distance Inside" and "Distance from Storm" confuse me - perhaps those who follow weather more closely can justify the labels. Conventionally, location is shown in two dimensions.

The second difficulty is created by the inclusion of irrelevant data (aka noise). The square grid prescribes a fixed box inside which all data are depicted. In the New York graphic, something is going on in the top right corner - far away in both time and space - how does it help the reader?

***

Now, contrast this chart to the more standard one, a map showing rain "clouds" moving through space.

Bing_precipitationradar_baltimore

(From Bing search result)

The standard one wins because it matches our intuition better.

Location is shown in two dimensions.

Distance from the city is shown on the map as scaled distance.

Time is shown as motion.

Speed is shown as speed of the motion. (In SparkRadar, speed is shown by the slope of imaginary lines.)

Severity is shown by density and color.

Nonetheless, a panel of the new charts makes great data art.

 

 


After seeing this chart, my mouth needed a rinse

The credit for today's headline goes to Andrew Gelman, who said something like that when I presented the following chart at his Statistical Graphics class yesterday:

Fidelityad_consumerstaples_adj_sm

With this chart (which appeared in a large ad in the NY Times), Fidelity Investments wants to tell potential customers to move money into the consumer staples category because of "greater return" and "lower risk". You just might wonder what a "consumer staple" is. Toothbrushes, you see.

There are too many issues with the chart to fit into one blog post. My biggest problem concerns the visual trickery used to illustrate "greater" and "lower". The designer wants to focus readers on the two orange brushes: return for consumer staples is higher, and risk is lower, you see.

The "greater" (i.e. right-facing) toothbrush is associated with longer brushes and higher elevation; the "lower" (left-facing) toothbrush, with shorter brushes and lower elevation.

But looking carefully at the scales reveals that the return ranges from 6% to 14% and the risk ranges from 10% to 25%. So larger numbers are depicted by shorter brushes and lower elevation, exactly the opposite of one's expectation. The orange brushes happen to represent the same value of 14.3% but the one on the right is at least four times as large as the one on the left. As the dentist says, time to rinse out!

The vertical axis represents ranking of the investment categories in terms of decreasing return and/or risk so on both toothbrushes, the axis should run from 1 to 10.

***

How would the dentist fix this?

The first step is to visit the Q corner of the Trifecta Checkup. The purpose of this chart is for investors to realize that (using the chosen metrics) consumer staples have the best combination of risk and return. In finance, risk is measured as the volatility of return. So, in effect, all the investors care about is the probability of getting a certain level of return.

The trouble with any chart that shows both risk and return is that readers have no way of going from the pair of numbers to the probability of getting a certain level of return.

The fix is to plot the probability of returns directly.

Redo_fidelity_staples

In the above sketch, I just assumed a normal probability model, which is incorrect; but it is not hard to substitute this with an empirical distribution, if you obtain the raw data.
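For readers who want to try this, here is a minimal sketch of turning a (return, risk) pair into probabilities under the same normal assumption. The 14.3% figures for consumer staples are the values read off the ad. The second category and the 0% and 10% thresholds are made up for illustration, and the normal model is a convenience, not a claim about how returns actually behave.

```python
from statistics import NormalDist

# (mean annual return in %, risk as standard deviation of return in %)
categories = {
    "Consumer staples": (14.3, 14.3),     # both values as read off the ad
    "Hypothetical sector": (12.0, 20.0),  # invented numbers for comparison
}


def prob_return_at_least(mean, sd, threshold):
    """P(return >= threshold) under an assumed normal model."""
    return 1 - NormalDist(mu=mean, sigma=sd).cdf(threshold)


def prob_loss(mean, sd):
    """P(return < 0) under the same assumed normal model."""
    return NormalDist(mu=mean, sigma=sd).cdf(0.0)


for name, (mean, sd) in categories.items():
    p_loss = prob_loss(mean, sd)
    p_ten = prob_return_at_least(mean, sd, 10.0)
    print(f"{name}: P(return < 0%) = {p_loss:.2f}, P(return >= 10%) = {p_ten:.2f}")
```

With an empirical distribution, the two helper functions would be replaced by simple proportions computed from the raw returns.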

Unlike in the original chart, consumer staples does not appear to be a clear-cut winner here.

 

 


Fixing the visual versus fixing the story

It's great for me when my friend Alberto Cairo lends a helping hand (link). Here is the original chart showing deaths in African and Middle Eastern countries due to recent unrest:

Cairo_arabspring_timeline

This is Cairo's redesign:

Cairo_arabspring_redo

There is no doubt the new version brings out the data more clearly. I like the cropping of the continent. I'd color-code the countries using the same legend as above.

I'm troubled by the concept of the original chart. I struggle to find any interesting correlation of deaths, whether with time, with government reaction, or with geography. Of the three, I think geography is the most correlated so a good design should bring that out. (Of course, geographical bias is expected and thus rather boring.)

If the intention of the chart is to answer the question of what factors affect deaths, then the wrong variables are being utilized.

So, as regards the Trifecta Checkup, Cairo solved the V problem while the D problem remains.