
Milan EXPO: further thoughts

I promised to blog more about the Milan EXPO, so here it is.

My first reactions were recorded here. (link)

This post is primarily intended for those who are planning a visit.



One of the smartest design decisions is to line everything up along one street (the Decumano). It will take some genius to get lost even though there are many dozens of buildings. Once you get to the far end of the Decumano, there is a smaller road that runs perpendicular to it, which houses the buildings that showcase individual regions of Italy. This smaller road leads to the Tree of Life structure, where I found those delightful, swirling chairs. Here they are again:




The EXPO site is in the Milan suburbs. It is easily accessible by the Metro (subway) or by train. Either means of transportation takes about 20 minutes. The train takes riders right to the entrance, saving 10 minutes of walking from the subway stop, but depending on your origin, the train may be inconvenient. I later discovered that there are two subway exits: one exit links to an overpass while the other leads to an underpass. Choose carefully if under/over makes a difference for you.

You need to carry a printed copy of your ticket. Your bags will be scanned. Liquids are allowed and are also scanned. This process is painless unless you fight with the crowds that appear at 7 pm because of reduced-price entry. Most pavilions close by 9 pm, leaving only restaurants open.



The food is great if you bring realistic expectations. You’re at a fair, not a gourmet food market. I was very happy with what I ate, and here are some highlights.

Eataly is there in a big way. They have 10 or 12 restaurants, representing different regions in Italy. Eataly is this high-end supermarket / restaurant chain that started in Italy and now also has stores in New York, Boston and Chicago. Not spectacular but way better than your average meal. If you want Italian food, you won’t go wrong here. I particularly like the Tuscany (Toscana) menu, serving two of my favorites: panzanella (bread salad), and pici (an extra-thick spaghetti) with duck ragu. You have to walk all the way to the back of the Eataly row to find the Toscana section.



Inside the Pavilions. You can fill yourself by sampling snacks as you run around the pavilions. I recommend this strategy because your schedule will be dominated by trying to get into certain pavilions (or more pavilions). The food is going to be hit or miss. Austria (left) has great stuff. France looks good. Belgium serves pub grub and beer. Holland has food trucks, mostly fast food. I liked the summer rolls in the Vietnam pavilion (right).



Vietnam and Belgium

Russia was giving away caviar on toast, which attracted a mob. Heard Chile has good food. Mexico has a food line. If you like cannoli, go to the “Civil Society” building and visit the Sicilian vendor.


You can always go to McDonald’s for American fast food. There are also various places where you can get Italian fast food, such as simple pastas and pizzas.

Several pavilions have proper sit-down restaurants. I can’t vouch for them as I didn’t try them. The French pavilion for example has a restaurant upstairs. I think Russia also has a restaurant.

Gelato. When I am in Italy, I eat gelato every day. Gelato is a godsend on these hot summer days. There are many places to get gelato at the EXPO. My favorite is Pernigotti, which has a booth in the chocolate area. I also got gelato behind the Israel pavilion. There is a small stand outside the Italy Pavilion. Also across from the Italy Pavilion, the Love It food store serves gelato on the far side. Granita (slushed ice drinks) would have been even better but I didn’t find any worth mentioning here.


Espresso. The safe and great options include Lavazza and Illy. Lavazza is in the Italian regions street, which runs perpendicular to the Decumano. Lavazza has some great-looking tarts and cakes, in addition to coffee. Illy is in the coffee exhibition area.


Other Pavilions

I also enjoyed France (most on-subject), Morocco, Slow Food, and especially the chocolate area.

I didn’t make it to Japan, Kazakhstan, China or Italy. Those attracted excellent reviews but the lines were too long. Several countries (Japan, Kazakhstan, etc.) produce staged experiences, which means once you are inside, you have to spend at least 30-45 minutes.

Have fun!



I try hard to not hate all hover-overs. Here is one I love

One of the smart things Noah (at WNYC) showed to my class was his NFL fan map, based on Facebook data.

This is the "home" of the visualization:


The fun starts by clicking around. Here are the Green Bay fans on Facebook:


Also, you can see these fans relative to other teams in the same division:


A team like Jacksonville has a tiny footprint:



What makes this visualization work?

Notice the "home" image and those straight black lines. They are the "natural" regions of influence, if you assume that all fans root for the team that they are physically closest to.

To appreciate this, you have to look at a more generic NFL fan map (this is one from Deadspin):


This map is informative but not as informative as it ought to be. The reference points provided here are the state boundaries but we don't have one NFL team per state. Those "Voronoi" boundaries Noah added are more reasonable reference points to compare to the Facebook fan data.
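To make the "closest team" idea concrete, here is a minimal sketch of how a Voronoi-style region can be computed: each location is simply assigned to the nearest team. This is not Noah's code. The team coordinates are rough city locations that I am assuming for illustration, and the function names are hypothetical.

```python
import math

# Approximate (latitude, longitude) of four team cities in one division.
# These are rough, illustrative values, not official stadium coordinates.
TEAMS = {
    "Green Bay": (44.5, -88.0),
    "Chicago": (41.9, -87.6),
    "Minnesota": (45.0, -93.3),
    "Detroit": (42.3, -83.0),
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_team(point, teams=TEAMS):
    """Return the team whose city is closest to `point`.

    The set of all points that map to the same team is that team's
    Voronoi cell, i.e. its "natural" region under the distance assumption.
    """
    return min(teams, key=lambda name: haversine_km(point, teams[name]))


# A point near Milwaukee (approximate coordinates).
print(nearest_team((43.04, -87.91)))
```

With these rough coordinates, the point near Milwaukee lands in Chicago's cell on distance alone. Cases like this are where comparing the straight-line boundaries against the actual fan data becomes interesting.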

When looking at the fan map, the most important question you have is what each team's region of influence is. This work reminds me of what I wrote before about the Beer Map (link). Putting all beer labels (or NFL teams) onto the same map makes it hard to get quick answers to that question. A small-multiples presentation is more direct, as the reader can see the brands/teams one at a time.

Here, Noah makes use of interactivity to present these small multiples on the same surface. It's harder to compare multiple teams but that is a secondary question. He does have two additions in case readers want to compare multiple teams. If you click instead of mousing over a team, the team's area of influence sticks around. Also, he created tabs so you can compare teams within each division.

I usually hate hover-over effects. They often hide things that readers want (creating what Noah calls "scavenger hunts"). The hover-over effect is used masterfully here to organize the reader's consumption of the data.


Moving to the D corner of the Trifecta checkup. Here is Noah's comment on the data:

Facebook likes are far from a perfect method for measuring NFL fandom. In sparsely-populated areas of the country, counties are likely to have a very small sample size. People who like things on Facebook are also not a perfect cross-section of football fans (they probably skew younger, for example). Other data sources that could be used as proxies for fan interest (but are subject to their own biases) are things like: home game attendance, merchandise sales, TV ratings, or volume of tweets about a team.
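The small-sample caveat can be illustrated with a rough sketch. The counts below are made up, the function name is hypothetical, and the margin uses the simple normal approximation, which is itself crude for very small counts. The point is only to show how much wider the uncertainty gets when a county has few likes.

```python
import math


def share_with_margin(likes_for_team, total_likes, z=1.96):
    """Return (share, margin) for a team's share of likes in one county.

    The margin is z * sqrt(p * (1 - p) / n), the normal-approximation
    half-width of a confidence interval for a proportion.
    """
    if total_likes <= 0:
        raise ValueError("total_likes must be positive")
    p = likes_for_team / total_likes
    margin = z * math.sqrt(p * (1 - p) / total_likes)
    return p, margin


# Two hypothetical counties with the same 60% share but very different sizes.
for label, team_likes, all_likes in [("sparse county", 12, 20),
                                     ("dense county", 6000, 10000)]:
    share, margin = share_with_margin(team_likes, all_likes)
    print(f"{label}: {share:.0%} +/- {margin:.1%}")
```

Under these assumptions, the sparse county's share carries a margin of roughly 21 percentage points, against about 1 point for the dense county, so the same color on the map can mean very different levels of certainty.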


United Nations gets dataviz

The UN, as I noted before, is getting into the dataviz game. Here is an announcement about a Data Viz Challenge that has just started. Flood them with ideas!


I am writing to invite you and your network of students and colleagues to participate in the United Nations data visualization challenge for the 2015 World Statistics Day.

The #WSD2015 challenge is to build an info-graphic or dynamic visualization featuring the latest data from the 2015 Millennium Development Goals report. This challenge is particularly geared towards designers, programmers and data scientists who are passionate about contributing open source software and data visualizations to promote peace, development, human rights and environmental sustainability. The winners will be announced during the United Nations World Statistics Day 2015 and featured on some UN websites. The deadline for submissions is Sunday, 20 September 2015.

In addition, please note that the Unite Ideas website will feature other challenges in the months to come. I will personally send you an e-mail to let you know when these new challenges are posted.

Feel free to share this invitation with your students, colleagues and friends, and to contact me with any questions or ideas related to data science and visualization at the United Nations.

Best regards,

Jorge Martinez Navarrete
Project Leader, Unite Ideas
Twitter: @UN_UniteIdeas

Incomprehensible, and even insidious

A reader, Alex V., nominated this chart as one of the most incomprehensible ever:


This comes from the Annual Report 2014 of Allison Transmission.

I applaud the fact that they obviously spent time making the charts. This is not something that comes straight out of Excel.

And someone really tried here--but you'd hope someone else came to the rescue and let them know this is impossible to understand.

The use of leader lines to point to the actual data doesn't work, not least because there are only two margins to fit three lists of numbers. It's like the two little kids being forced to share one seat on the left margin.

The rightmost column adds to 100%. The largest three sections appear to say Allison used cash to pay dividends, to buy back shares and to repay debt. The three uses accounted for almost 95% of the positive change in cash (really not sure what the base is of these percentages). The gap of 5% is split into two parts which are explained by labels that are quite uninformative ("Other, net", "Change in cash, Net").

Even this interpretation is flawed because the blue section is net change in cash, which presumably was positive in 2014 (and 2013). However, dividends, share repurchase and debt repayment all cause a negative shift in cash so how could they point in the same direction as a positive net change in cash?

Things fall apart if I apply this interpretation to 2012. The -28% blue section seems to indicate that Allison had a cash deficit that year. This is weird because that would imply cash increased 100% exactly in 2013 and in 2014.

Further, someone is trying to hide bad news. Compare the -28% section in blue for 2012 and the 25% section in red for 2013. The blue section is slightly smaller than it should be. Part of the trick is to draw the horizontal axis in the same blue as the blue block. The top edge of the blue block is really not part of the block!

Now you might argue that the distortion is so small it could be accidental. But then this happens again on a different chart on the same page:


At a glance, the highest sales number was achieved in 2014 (the blue column). But no! The number in 2012 is 2,142, which is larger than the sales in 2014.


This is a reproduction of the first chart, using a line chart:


It doesn't present the relationship between the five statistics, which I'm not clear about, but at least in this version, you can see the trends.


PS. [7/19/2015: Corrected the labeling of the line chart above.]




Maps and legends

This chart, which I found flipping through Stern magazine in Germany, accomplishes one important goal. It makes me stop flipping, and look.


The chart presents a point of view that is refreshing. The Airbus A320 is a true collaborative effort. The chart presents a good amount of information efficiently. Reminds me of diagrams in instruction manuals for building airplane models.

It is in essence a map. And as with maps, it has a built-in bias. The size of a part is not proportional to its importance or value. So, one issue with this diagram is it draws attention to large parts with uncomplicated shapes.

One way to address this is to use an informative legend. Notice that the map up top takes up a lot of space while serving little purpose. Instead, one can use a bar chart with a colored bar for each country. This bar chart allows one to add an extra measure. For example, the proportion of value accounted for by each country.

European readers: I wonder if there is a standard color scheme for different countries. What do you think of their choice of color?



Would you be willing to miss a train to admire art?

This wonderful data visualization made me stop in my tracks at a train station somewhere in Bavaria.


It conveys so much information in such an efficient manner.

At a glance, the diagram tells passengers the configuration of the train they will be getting on, how many carriages, what types of carriage and crucially at which location the train will come to a stop at the current station.

The most important item is the curvy red line running vertically. This tells you where you are standing in relationship to the entire platform. I was standing right near the middle. If someone is standing on the sides, there are many trains they will not be able to get on.

The entire chart is in German but I didn't need to know German. This is what great data visualization accomplishes.


Would you be willing to miss a train just to admire this work? I would.


PS. I was a bit overexcited when I wrote the above. I hope my German readers will tell me what the red, yellow, green colors signify. Also why do some trains appear to have two or three disconnected carriages?

Visualizing survey results excellently

Surveys generate a lot of data. And, if you have used a survey vendor, you know they generate a ton of charts.

I was in Germany to attend the Data Meets Viz workshop organized by Antony Unwin. Paul and Sascha from Zeit Online presented some of their work at the German publication, and I was highly impressed by this effort to visualize survey results. (I hope the link works for you. I found that the "scroll" fails on some platforms.)

The survey questions attempted to assess the gap between West and East Germans 25 years after reunification.

The best feature of this presentation is the maintenance of one chart form throughout. This is the general format:



The survey asks whether working mothers are a good thing or not. They chose to plot how the percentage agreeing that working mothers are a good thing changes over time. The blue line represents the East German average and the yellow line the West German average. There is a big gap in attitude between the two sides on this issue, although both regions have experienced an increase in acceptance of working mothers over time.

All the other lines in the background indicate different subgroups of interest. These subgroups are accessible via the tabs on top. They include gender, education level, and age.

The little red "i" conceals some text explaining the insight from this chart.

Hovering over the "Men" tab leads to the following visual:


Both lines for men sit under the respective average but the shape is roughly the same. (Clicking on the tab highlights the two lines for men while moving the aggregate lines to the background.)

The Zeit team really does an amazing job keeping this chart clean while still answering a variety of questions.

They did make an important choice: not to put every number on this chart. We don't see the percent disagreeing or those who are ambivalent or chose not to answer the question.


Like I said before, what makes this set of charts work is the seamless transition between one question and the next. Every question is given the same graphical treatment. This eliminates learning time going from one chart to the next.

Here is one using a Likert scale, and accordingly, the vertical axis goes from 1 to 7. They plotted the average score within each subgroup and the overall average:


Here is one where they combined the top categories into a "Top 2 Box" type metric:



Finally, I appreciate the nice touch of adding tooltips to the series of dots used to aid navigation.


The theme of the workshop was interactive graphics. This effort by the Zeit team is one of the best I have seen. Market researchers take note!