The blue mist

The New York Times printed several charts about Twitter "blue checks," and they aren't among the paper's best efforts (link).

Blue checks used to be credentials given to legitimate accounts, typically associated with media outlets, celebrities, brands, professors, etc. They were free but had to be approved by Twitter. Since Elon Musk acquired Twitter, blue checks have become a revenue generator: yet another subscription service (but you're buying "freedom"!). Anyone can get a blue check for US$8 per month.

[The charts shown here are scanned from the printed edition.]


The first chart is a scatter plot showing the day an account joined Twitter against its total number of followers as of early November 2022. Those are very strange things to pair up on a scatter plot, but I get it: the designer could only work with the data that can be pulled from Twitter's API.

What's wrong with the data? The interesting question would seem to be whether blue checks are associated with follower counts, but the chart shows only Twitter Blue users, so there is nothing to compare to. Also, the day of joining Twitter is almost surely not the day of becoming "Twitter Blue" for any user (the latter is not a standard data element released by Twitter). The chart has a built-in time bias, too: the longer an account has existed, the higher its follower count should be, all else equal. Some kind of follower rate (e.g. number of followers per year of existence) might be more informative.
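The follower-rate normalization is easy to compute from the two fields the chart already uses. Here is a minimal sketch; the accounts and numbers below are hypothetical, used only to show how the rate separates old accounts from new ones:

```python
from datetime import date

def follower_rate(followers, join_date, as_of=date(2022, 11, 1)):
    """Followers gained per year of account existence (a crude normalization)."""
    years = (as_of - join_date).days / 365.25
    return followers / max(years, 1 / 12)  # floor at one month to avoid blow-ups

# Two hypothetical accounts with the same follower count but very different ages
veteran = follower_rate(10_000, date(2012, 11, 1))   # ~1,000 followers per year
newcomer = follower_rate(10_000, date(2022, 8, 1))   # ~40,000 followers per year
print(veteran, newcomer)
```

On the raw scatter plot these two accounts would look identical; the rate makes the newcomer's rapid growth visible.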

Still, it's hard to know what the chart is saying. That most Blue accounts have fewer than 5,000 followers? I also suspect that they chopped off the top of the chart (outliers) and forgot to mention it. Surely, some of the celebrity accounts have way over 150,000 followers. Another sign that the top of the chart was removed is the absence of an expected funnel effect. Since follower counts accumulate from the day of registration, we'd expect accounts created in the last few months to have markedly lower counts than those created years ago. (This is even more true if there is survivorship bias - less successful accounts are more likely to be deleted over time.)

The designer arbitrarily labelled six specific accounts ("Crypto influencer", "HBO fan", etc.), but this feature risks sending readers the wrong message. There might be one HBO fan account that quickly grew to 150,000 followers in just a few months, but does the data label suggest to readers that HBO fan accounts as a group tend to attain high follower counts quickly?


The second chart, which is an inset of the first, attempts to quantify the effect of the Musk acquisition on the number of "registrations and subscriptions". In the first chart, the story was described as "Elon Musk buys Twitter sparking waves of new users who later sign up for Twitter Blue".


The second chart confuses me. I was trying to figure out what is counted on the vertical axis. This was before I noticed the inset in the first chart, easy to miss as it is tucked into the lower right corner. I had presumed that the axis would be the same as in the first chart since there weren't any specific labels. In that case, I am looking at accounts with 0 to 500 followers, pretty inconsequential accounts. Then, the chart title uses the words "registrations and subscriptions." If the blue dots on this chart also refer to blue-check accounts as in the first chart, then I fail to see how this chart conveys any information about registrations (which presumably would include free accounts). As before, new accounts that aren't blue checks won't appear.

Further, to the extent that this chart shows a surge in subscriptions, we are restricted to accounts with fewer than 500 followers, and it's really unclear what proportion of total subscribers is depicted. Nor is it possible to estimate the magnitude of this surge.

Besides, I'm seeing similar densities of dots across the entire time window between October 2021 and October 2022. Perhaps the entire surge is hidden behind the black lines indicating the specific days when Musk announced and completed the acquisition, respectively. If so, this design manages to block the precise spots readers are supposed to notice.

Here is where we can use the self-sufficiency test. Imagine the same chart without the text. What story would you have learned from the graphical elements themselves? Not much, in my view.


The third chart isn't more insightful. This chart purportedly shows suspended accounts, only among blue-check accounts.


From what I could gather (and what I know about Twitter's API), the chart shows any Twitter Blue account that got suspended at any time. For example, all the black open circles occurring prior to October 27, 2022 represent suspensions by the previous management, and presumably have nothing to do with Elon Musk, or his decision to turn blue checks into a subscription product.

There appears to be a cluster of suspensions since Musk took over. I am not sure what that means. Certainly, it says he's not about "total freedom". Most of these suspended accounts have fewer than 50 followers and have only been around for a few weeks. And as before, I'm not sure why the analyst decided to focus on accounts with fewer than 500 followers.

What could have been? Given that the number of suspended accounts is relatively small, an interesting analysis would be to form clusters of suspended accounts, and report on how the types of accounts being suspended changed before and after the change of management.


The online article (link) is longer, filling in some details missing from the printed edition.

There is one view that shows the larger accounts:


While more complete, this view isn't very helpful as the biggest accounts are located in the sparsest area of the chart. The data labels again pick out strange accounts, like those of adult film stars and an Arabic news site. It's not clear if the designer is trying to tell us that most Twitter Blue accounts belong to those categories.

See here for commentary on other New York Times graphics.





This chart advises webpages to add more words

A reader sent me the following chart. In addition to the graphical glitch, I was asked about the study's methodology.


I was able to trace the study back to this page. The study's own page uses a line chart, not the bar chart with the axis that doesn't start at zero. The line shows that web pages ranked higher on Google's first page tend to have more words, i.e. longer content may help with Google ranking.


On the bar chart, Position 1 is more than 6 times as big as Position 10, if one compares the bar areas. But it's really only 20% larger in the data.
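The arithmetic of this distortion is worth making explicit. In the sketch below, the word counts are hypothetical stand-ins (about 20% apart, like the study's data); the baseline of 1,930 is likewise made up to illustrate how a truncated axis inflates the apparent ratio:

```python
def apparent_ratio(a, b, baseline):
    """Ratio of displayed bar heights when the axis starts at `baseline` instead of 0."""
    return (a - baseline) / (b - baseline)

pos1, pos10 = 2400, 2000                   # hypothetical word counts, 20% apart
print(pos1 / pos10)                        # true ratio: 1.2
print(apparent_ratio(pos1, pos10, 0))      # axis at zero preserves it: 1.2
print(apparent_ratio(pos1, pos10, 1930))   # truncated axis: roughly 6.7x
```

The closer the baseline creeps to the smaller value, the more the apparent ratio explodes, with no change at all in the underlying data.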

In this case, even the line chart is misleading. If we extend the Google Position to 20, the line would quickly dip below the horizontal axis if the same trend applies.

The line chart also includes too many gridlines, one of Tufte's favorite complaints. Google position is an integer, and yet the chart's gridlines imply that a rank of 0.5 is possible.

Any chart of this data should supply information about the variance around these average word counts. I would like to see a side-by-side box plot, for example.

Another piece of context is the word counts for results on the second or third pages of Google results. Where are the short pages?


Turning to methodology, we learn that the research team analyzed 1 million pages of Google search results, and they also "removed outliers from our data (pages that contained fewer than 51 words and more than 9999 words)."

When you read a line like this, you have to ask some questions:

How do they define "outlier"? Why do they choose 51 and 9,999 as the cut-offs?

What proportion of the data was removed at either end of the distribution?

If these proportions are small, then the outliers are not going to affect that average word count by much, and thus there is no point to their removal. If they are large, we'd like to see what impact removing them might have.

In any case, the median is a better summary to use here; better yet, show us the distribution, not just the average.

It could well be true that Google's algorithm favors longer content, but we need to see more of the data to judge.



Speed demon quartered and shrunk

Reader Richard K. submitted a link to Microsoft Edge's website.


This chart uses three speedometers to tell the story that Microsoft's Edge browser is faster than Chrome or Firefox. These speedometer charts are disguised racetrack charts. Read last week's post first if you haven't.

Richard complained that the visual design distorts the data. How the distortion entered the picture is a long story. Let's begin with an accurate representation of the data:


Next, we pull those speedometer curves straight:


While the three values are within 10 percent of each other, the lengths of the two shorter curves are only 40-50 percent of the length of the longest one! This massive distortion is due to not starting the axis (i.e., speedometer) at zero.

We now put the missing 25,000 back onto the chart, proportionally expanding each bar. As seen below, fixing the axis does not get us back to the desired relative lengths, so some other distorting factor is at play.


The culprit is that the middle speedometer is 44 percent larger than the other two. If we inflate the side bars by 44 percent, the world is made right again. Phew!
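The two distortions compound multiplicatively. Here is a sketch with hypothetical benchmark scores (chosen to be within 10 percent of each other, as in the chart); only the 25,000 axis start and the 44 percent size difference come from the analysis above:

```python
def displayed_length(value, axis_start, dial_scale=1.0):
    """Arc length a speedometer shows: proportional to (value - axis_start),
    then scaled by the physical size of the dial."""
    return (value - axis_start) * dial_scale

# Hypothetical scores within ~10 percent of each other
chrome, edge = 27_500, 30_000
start = 25_000   # the axis begins here, not at zero

print(displayed_length(chrome, 0) / displayed_length(edge, 0))                # honest: ~0.92
print(displayed_length(chrome, start) / displayed_length(edge, start))        # truncated axis: 0.5
print(displayed_length(chrome, start) / displayed_length(edge, start, 1.44))  # plus bigger dial: ~0.35
```

A 9 percent deficit in the data becomes a roughly 65 percent deficit on the page once both tricks are stacked.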





If Clinton and Trump go to dinner, do they sit face to face, or side by side?

One of my students tipped me to an August article in the Economist, published back when the media last proclaimed Donald Trump's campaign to be in deep water. The headline said "Donald Trump's Media Advantage Falters."

Who would have known, judging from the chart that accompanies the article?


There is something very confusing about the red line, showing "Trump August 2015 = 1." The data are disaggregated by media channel, and yet the index is hitched to the total of all channels. It is also impossible to figure out how Clinton is doing relative to Trump in each channel.

Here is a small-multiples rendering that highlights the key comparisons:


Alternatively, one can plot the Clinton advantage versus Trump in each channel, like this:


One sees that Clinton has caught up in the last month (July 2016), primarily through more coverage by "online news."

Imagine Mr. Trump and Mrs. Clinton dining at a restaurant. Are they seated side by side (Economist) or face to face (junkcharts)?

After seeing this chart, my mouth needed a rinse

The credit for today's headline goes to Andrew Gelman, who said something like that when I presented the following chart at his Statistical Graphics class yesterday:

With this chart (which appeared in a large ad in the NY Times), Fidelity Investment wants to tell potential customers to move money into the consumer staples category because of "greater return" and "lower risk". You just might wonder what a "consumer staple" is. Toothbrushes, you see.

There are too many issues with the chart to fit into one blog post. My biggest problem concerns the visual trickery used to illustrate "greater" and "lower". The designer wants to focus readers on the two orange brushes: return for consumer staples is higher, and risk is lower, you see.

The "greater" (i.e. right-facing) toothbrush is associated with longer brushes and higher elevation; the "lower" (left-facing) toothbrush, with shorter brushes and lower elevation.

But looking carefully at the scales reveals that the return ranges from 6% to 14% and the risk ranges from 10% to 25%. So larger numbers are depicted by shorter brushes and lower elevation, exactly the opposite of one's expectation. The orange brushes happen to represent the same value of 14.3%, but the one on the right is at least four times as large as the one on the left. As the dentist says, time to rinse out!

The vertical axis represents the ranking of the investment categories in decreasing order of return and/or risk, so on both toothbrushes, the axis should run from 1 to 10.


How would the dentist fix this?

The first step is to visit the Q corner of the Trifecta Checkup. The purpose of this chart is for investors to realize that (using the chosen metrics) consumer staples have the best combination of risk and return. In finance, risk is measured as the volatility of return. So, in effect, all investors care about is the probability of getting a certain level of return.

The trouble with any chart that shows both risk and return is that readers have no way of going from the pair of numbers to the probability of getting a certain level of return.

The fix is to plot the probability of returns directly.


In the above sketch, I just assumed a normal probability model, which is incorrect; but it is not hard to substitute an empirical distribution if one obtains the raw data.
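Under the normal assumption, a (return, risk) pair converts directly into the probability of clearing any return threshold. In the sketch below, the 14.3% figure echoes the consumer staples return cited above; everything else (the volatilities and the comparison category) is hypothetical:

```python
from statistics import NormalDist

# Hypothetical (mean return, volatility) pairs, in percent per year
staples = NormalDist(mu=14.3, sigma=12.0)
other   = NormalDist(mu=12.0, sigma=20.0)

# Probability of earning at least 5%, and probability of losing money
for name, d in [("staples", staples), ("other", other)]:
    print(name, round(1 - d.cdf(5), 2), round(d.cdf(0), 2))
```

Plotting these two density curves side by side is exactly the comparison the sketch proposes: overlapping curves, not a clear-cut winner.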

Unlike in the original chart, consumer staples no longer appears to be a clear-cut winner.



Tufte soundbites

Tom B. alerted me to an interview with Ed Tufte, by Ad Age (link). It's a good read. The journalist attended one of Tufte's courses but then the interview was conducted via email. So it reads like a condensed version of Tufte's writing, stuffed with his many colorful coinages.

I like this comment related to Big Data:

First: "overwhelming data" is a bit of a hoax. Many of the time measurements have enormous serial correlation (just because you can measure to the millisecond doesn't mean you've learned anything about a process that moves to a monthly rhythm) and extreme high collinearities in the things measured (as in the endless web metrics, many of which are measuring the same thing over and over). Finally, most website data bizarrely and deliberately overstates the extent and intensity of website activity.

Pets may need shelter from this terrible chart

Josh tweeted quite a shocking attack ad to me last week. He told me it came from the DC Metro. The ad is taken out by a group called HumaneWatch.Org, which apparently is a watchdog checking up on charity organizations. The ad attacks a specific group called the Humane Society of the United States. Here is the map that is the centerpiece of the copy:


I like to use the Trifecta Checkup to evaluate graphics. It's a nice way to organize your visualization critique. You progress through three corners: figuring out what practical question is being addressed by the graphic, then evaluating what data is being deployed, and finally whether the graphical elements (the chart itself) are well executed in relation to the question and the data.


Based on the map, it appears that HumaneWatch is interested in the spending on pet shelters. Every number shown is tiny: on a quick scan, the range may be from 0% to 0.35%. The all-caps title "A Whole Lotta Nothing" confirms that this is the intended message.

Knowing nothing about either of these organizations leaves me confused. Should the "Humane Society" be spending the bulk of its budget on pet shelters? If it doesn't, is it because the staff is pilfering money, or because it has wasteful spending, or because pets are not its major cause, or because pet shelters are not the key way this organization helps pets?

I did look up Humane Society to learn that it is an animal rights group. The four bullet points at the bottom of the ad provide a clue as to what the designer wanted to convey: namely, that this charity is a scam, with too much overhead spending, and spending on pensions.


So I think the question being asked is sufficiently clarified, and it's a pretty important one. How is this organization spending its donations? Is it irresponsible compared to other similar organizations?


The data should be in sync with the question being addressed; that's why there is a link between those two corners of the Trifecta. Given the trouble I endured in understanding the question being addressed, it should come as no surprise that this chart scores poorly on the DATA corner.

I don't understand why budget spent on pet shelters is the key bone of contention. Based on the perceived objectives, it seems that they should display directly what proportion of the budget went to overhead, and what proportion went to pensions, with suitable comparisons.

The analysis by state is a disease of having too much data. Let's imagine that the proportions averaged across all states come to 0.1%. If we replaced those 50 numbers with one number printed across all states: "The Humane Society spends less than 0.1% of its budget on pet shelters.", the message would have been identical, while being less confusing.

And it's not just confusion. Cutting the data by state introduces complications. The analyst would need to make sure that any differences between states are not due to factors such as the number of pets, the proportion of households owning pets, the average spending per pet, the supply and demand for pet shelters, the existence of alternatives to pet shelters, etc. None of these issues need to worry the designer who does not slice the data down.

The same reasoning explains why the absolute amount of spending (encoded in the colors of the individual states) is not worth the ink it's printed on. The range between 0% and 0.35% has been chopped into seven pieces, which creates artificial gaps between the states. This design muddles the graphic's key message, "A Whole Lotta Nothing".


As we land on the final corner of the Trifecta, we ignore our previous complaint and accept that the proportion of budget is an interesting data series to visualize, and turn attention to the graphical elements. This chart scores poorly on chart execution as well!

Notice that the designer simultaneously plots two data series on the same map: the dollar value of pet shelter spending, and that spending as a proportion of the budget. The former is encoded in the color of the state areas while the latter is printed directly as data labels. This is the map equivalent of "dual-axes" line charts, and equally unreadable.

Based on the color legend, our brain tells us the yellow states are better than the blue states, but the huge numbers printed on the map convey the opposite message. The progression of colors makes little sense. The red and yellow stand out, but those states are in the middle of the range.

It's a little blurry, but I think there are a number of New England states in the high-spending category (black and dark gray colors), and the map just happens to obscure this key feature.




DATA: Very Poor


An inspired picture of Blackberry's dying inspiration

The New York Times has a splendid example of an infographic this weekend, showing the rise and fall of the Blackberry.


Notice the inspired touch of the black circles to trace the outline of Blackberry's market share. They are a guide to experiencing the chart.

I wish they had put the Palm section above Blackberry. In an area chart, the only clean section is the bottom section in which the market share is not cumulated. Given the focus on Blackberry, it's a pity readers have to perform subtractions to tease out the shares.

I also wonder if the black circles should contain Blackberry's market share rather than the year labels.

But I enjoyed this chart. Thanks for producing it.