
GDPR: justice for data visualization

Reader LG found the following chart, tweeted by @EU_Justice.

EU_justice_GDPRinnumbers

This chart is a part of a larger infographic, which is found here.

The following points out a few issues with this effort:

Redo_eujustice_gdpr_complaints_1

The time axis is quite embarrassing. The first six months or so are squeezed into less than half the axis while the distance between Nov and Dec is not the same as that between Dec and Jan. So the slope of each line segment is what the designer wants it to be!

The straight edges of the area chart imply that there were only three data points, with straight lines drawn between successive measurements. Sadly, the month labels are not aligned to the data on the line.

Lastly, the dots between May and November, intended to facilitate reading this chart, backfire. There are six dots dividing the May-Nov segment when there should only be five.
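The spacing problem can be checked with a little date arithmetic. The following is a minimal sketch, assuming the axis labels are May 2018, Nov 2018, Dec 2018 and Jan 2019 (the years and the day of month are assumptions, and `months_between` is a hypothetical helper). It computes where each label should sit on an evenly spaced monthly axis and how many tick marks belong between May and Nov.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Assumed axis labels; the chart itself only shows month names.
labels = [date(2018, 5, 1), date(2018, 11, 1), date(2018, 12, 1), date(2019, 1, 1)]

# Position of each label on an undistorted axis, in months since the first label.
positions = [months_between(labels[0], d) for d in labels]

# Share of the axis that each labeled segment should occupy.
total = positions[-1]
shares = [(b - a) / total for a, b in zip(positions, positions[1:])]

# Monthly tick marks strictly between May and Nov (Jun through Oct).
interior_ticks = months_between(labels[0], labels[1]) - 1

print(positions)       # months since May for each label
print(shares)          # fraction of the axis per segment
print(interior_ticks)  # dots that belong between May and Nov
```

Under these assumptions, the May-Nov segment should take up three quarters of the axis, and the Nov-Dec and Dec-Jan segments should be the same width.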

***

The chart looks like this when not distorted:

Redo_eujustice_gdpr_complaints_2

 


Labels, scales, controls, aggregation all in play

JB @barclaysdevries sent me the following BBC production over Twitter.

Johnbennett_barclaysdevries_bbc_chinagrowth

He was not amused.

This chart pushes a number of my hot buttons.

First, I like to assume that readers don't need to be taught that 2007 and 2018 are examples of "Year".

Second, starting an area chart away from zero is just as bad as starting a bar chart away from zero! The area is distorted and does not reflect the relative values of the data.
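To see how much a truncated axis can distort, here is a rough sketch with made-up numbers. The growth rates and the baseline below are hypothetical and are not read off the BBC chart; `drawn_ratio` is an illustrative helper that compares the heights a reader actually sees.

```python
def drawn_ratio(a: float, b: float, baseline: float) -> float:
    """Ratio of the drawn heights of a and b when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)


# Hypothetical growth rates in percent, NOT taken from the chart.
high, low = 14.0, 7.0

# What the data say: the high value is twice the low value.
true_ratio = high / low

# What the reader sees when the axis starts at 6 instead of 0.
truncated_ratio = drawn_ratio(high, low, baseline=6.0)

# Starting at zero restores the true ratio.
zero_based_ratio = drawn_ratio(high, low, baseline=0.0)

print(true_ratio, truncated_ratio, zero_based_ratio)
```

With these numbers, a twofold difference in the data is drawn as an eightfold difference in height.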

Third, I suspect the 2007 high point is a local peak, which they chose in order to forward a sky-is-falling narrative related to China's growth.

So I went to a search engine and looked up China's growth rate, and it helpfully generated the following chart automatically:

Google_chinagrowth

Just wow! This chart does a number of things right.

First, it confirms my hunch above. 2007 is a clear local peak and it is concerning that the designer chose that as a starting point.

Second, this chart understands that the zero-growth line has special meaning.

Third, there are more year labels.

Fourth, and very importantly, the chart offers two "controls". We can look at China's growth relative to India's and relative to the U.S.'s. Those two other lines bring context.

JB's biggest complaint is that the downward-sloping line confuses the issue, which is that slowing growth is still growth. The following chart conveys a completely different message but the underlying raw data are the same:

Redo_chinagdpgrowth
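The arithmetic behind "slowing growth is still growth" can be sketched in a few lines. The growth rates below are hypothetical placeholders, not the actual figures for China, and `gdp_index` is an illustrative helper that compounds annual growth rates into a level, starting from an arbitrary base of 100.

```python
def gdp_index(growth_rates_pct, base=100.0):
    """Compound a sequence of annual growth rates (in percent) into index levels."""
    levels = [base]
    for g in growth_rates_pct:
        levels.append(levels[-1] * (1 + g / 100))
    return levels


# Hypothetical, steadily declining growth rates in percent.
rates = [14.0, 10.0, 8.0, 7.0, 6.5]

levels = gdp_index(rates)

# The rate series slopes downward, yet the level series rises every year.
for rate, level in zip(rates, levels[1:]):
    print(f"growth {rate:4.1f}%  ->  index {level:6.1f}")
```

As long as every rate stays above zero, the level keeps rising, which is why plotting the level tells a very different story from plotting the rate.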

 


Visual Exploration of Unemployment Data

The charts on unemployment data I put up last week are best viewed as a collection. 

I have put them up on the (still in beta) JMP Public website. You can find the project here.

Screen Shot 2019-01-20 at 1.47.59 PM

I believe that if you make an account, you can grab the underlying dataset.

 


Men and women faced different experiences in the labor market

Last week, I showed how an aggregate statistic, the unemployment rate, masked some unusual trends in the labor market in the U.S. Despite the unemployment rate in 2018 being equal to, and even a little below, that in 2000, the peak of the last tech boom, there are now significantly more people "not in the labor force," and these people are not counted in the unemployment rate statistic.

The analysis focuses on two factors that are not visible in the unemployment rate aggregate: the proportion of people considered not in labor force, and the proportion of employees who have part-time positions. The analysis itself masks a difference across genders.

It turns out that men and women had very different experiences in the labor market.

For men, things have looked progressively worse with each recession and recovery since 1990. After each recovery, more men exit the labor force, and more men become part-timers. The Great Recession, however, hit men even worse than previous recessions, as seen below:

Jc_unemployment_rate_explained_men

For women, it's a story of impressive gains in the 1990s, and a sad reversal since 2008.

Jc_unemployment_rate_explained_women

P.S. See here for Part 1 of this series. In particular, the color scheme is explained there. Also, the entire collection can be viewed here.


What to make of the historically low unemployment rate

One of the amazing economic stories of the moment is the unemployment rate, which at around 4% has returned to the level last reached during the peak of the tech boom in 2000. The story is much more complex than it seems.

I devoted a chapter of Numbersense (link) to explaining how the government computes unemployment rates. The most important thing to realize is that an unemployment rate of 4 percent does NOT mean that four out of 100 people in the U.S. are unemployed, and 96 out of 100 are employed.

It doesn't even mean that four out of 100 people of working age are unemployed, and 96 out of 100 of working age are employed.

What it means is that, of the people that the government decides are "employable", 96 out of 100 are employed. Officially, this employability is known as "in labor force." There are many ways to be disqualified from the labor force; one example is if the government decides that the person is not looking for a job.

On the flip side, who the government counts as "employed" also matters! Part-timers are considered employed. They are counted just like full-time employees in the unemployment metric. Part-time, according to the government, is one to 34 hours worked during the week the survey is administered.

***

So two factors can affect the unemployment rate a lot: the proportion of the population considered "not in labor force" (thus not counted at all), and the proportion of those considered employed who are part-timers. (Those are two disjoint groups.)
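A toy calculation makes the point concrete. The two scenarios below are entirely hypothetical (1,000 working-age people each, with invented counts), and `unemployment_rate` is an illustrative helper that follows the definition described above: unemployed divided by the labor force, with part-timers counted as employed.

```python
def unemployment_rate(unemployed: int, full_time: int, part_time: int) -> float:
    """Unemployed as a percent of the labor force; part-timers count as employed."""
    labor_force = unemployed + full_time + part_time
    return 100 * unemployed / labor_force


# Two hypothetical populations of 1,000 working-age people each.
scenario_a = {"not_in_labor_force": 250, "unemployed": 30, "full_time": 600, "part_time": 120}
scenario_b = {"not_in_labor_force": 375, "unemployed": 25, "full_time": 450, "part_time": 150}

for name, s in (("A", scenario_a), ("B", scenario_b)):
    rate = unemployment_rate(s["unemployed"], s["full_time"], s["part_time"])
    population = sum(s.values())
    employed = s["full_time"] + s["part_time"]
    print(
        f"Scenario {name}: unemployment rate {rate:.1f}%, "
        f"employed share of population {100 * employed / population:.1f}%, "
        f"part-time share of employed {100 * s['part_time'] / employed:.1f}%"
    )
```

Both scenarios report the same unemployment rate, yet in the second one far fewer people have a job and a larger share of those who do are part-timers.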

The following chart then shows that despite the unemployment rate looking great, the U.S. labor market in 2018 looks nothing like what it looked like from 1990 to 2008.

Jc_unemployment_rate_explained

Technical notes: all the data are seasonally adjusted by the Bureau of Labor Statistics. I used a spline to smooth the data first; the top chart shows the smoothed version of the unemployment rates. Smoothing removes month-to-month sharp edges from the second chart. The color scale is based on standardized values of the smoothed data.
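For readers who want to try the smooth-then-standardize idea, here is a rough sketch. It is not the procedure used for the charts: a centered moving average stands in for the spline so that the example needs only the standard library, and the monthly values are invented. Both `moving_average` and `standardize` are illustrative helpers.

```python
from statistics import mean, pstdev


def moving_average(values, window=3):
    """Centered moving average; the window shrinks near the ends of the series.

    This is a stand-in for spline smoothing, not the same technique.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def standardize(values):
    """Convert values to z-scores (mean 0, standard deviation 1).

    Assumes the values are not all identical, otherwise the deviation is zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Invented rates in percent, NOT Bureau of Labor Statistics data.
series = [4.0, 4.2, 5.5, 7.8, 9.6, 9.0, 8.1, 7.3, 6.1, 5.3, 4.9, 4.4, 3.9]

smoothed = moving_average(series, window=3)
z_scores = standardize(smoothed)

# A color scale could then map each z-score to a shade, for example
# darker for values far above the mean and lighter for values far below it.
for raw, z in zip(series, z_scores):
    print(f"{raw:4.1f}  z = {z:+.2f}")
```

Standardizing puts series with different units or ranges on a common footing, which is presumably why a single color scale can be shared across them.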

 

P.S. Part 2 of this series explores the different experiences of male and female workers. Also, the entire collection can be viewed here.