If you blink, you'd miss this axis trick

When I set out to write this post, I was intending to make a quick point about the following chart found in the current issue of Harvard Magazine (link):

Harvardmag_humanities

This chart concerns the "tectonic shift" of undergraduates to STEM majors at the expense of humanities in the last 10 years.

I like the chart. The dot plot is great for showing this data. They placed the long text horizontally. The use of color is crucial, allowing us to visually separate the STEM majors from the humanities majors.

My intended post is to suggest dividing the chart into four horizontal slices, each showing one of the general fields. It's a small change that makes the chart even more readable. (It has the added benefit of not needing a legend box.)

***

Then, the axis announced itself.

I was baffled, then disgusted.

Here is a magnified view of the axis:

Harvardmag_humanitiesmajors_axis

It's not a linear scale, as one would have expected. What kind of transformation did they use? It's baffling.

Notice the following features of this transformed scale:

  • It can't be a log scale because many of the growth values are negative.
  • The interval for 0%-25% is longer than for 25%-50%. The interval for 0%-50% is also longer than for 50%-100%. On the positive side, the larger values are pulled in and the smaller values are pushed out.
  • The interval for -20%-0% is the same length as that for 0%-25%. So, the transformation is not symmetric around 0

I have no idea what transformation was applied. I took the growth values, measured the locations of the dots, and asked Excel to fit a polynomial function, and it gave me a quadratic fit, R square > 99%.

Redo_harvardmaghumanitiesmajors_scale2

This formula fits the values within the range extremely well. I hope this isn't the actual transformation. That would be disgusting. Regardless, they ought to have advised readers of their unusual scale.

***

Without having the fitted formula, there is no way to retrieve the actual growth values except for those that happen to fall on the vertical gridlines. Using the inverse of the quadratic formula, I deduced what the actual values were. The hardest one is for Computer Science, since the dot sits to the right of the last gridline. I checked that value against IPEDS data.

The growth values are not extreme, falling between -50% and 125%. There is nothing to be gained by transforming the scale.

The following chart undoes the transformation, and groups the majors by field as indicated above:

Redo_harvardmagazine_humanitiesmajors

***

Yesterday, I published a version of this post at Andrew's blog. Several readers there figured out that the scale is the log of the relative ratio of the number of degrees granted. In the above notation, it is log10(100%+x), where x is the percent change in number of degrees between 2011 and 2021.

Here is a side-by-side view of the two scales:

Redo_harvardmaghumanitiesmajors_twoscales

The chart on the right spreads the negative growth values further apart while slightly compressing the large positive values. I still don't think there is much benefit to transforming this set of data.

 

P.S. [1/31/2023]

(1) A reader on Andrew's blog asked what's wrong with using the log relative ratio scale. What's wrong is exactly what this post is about. For any non-linear scale, the reader can't make out the values between gridlines. In the original chart, there are four points that exist between 0% and 25%. What values are those? That chart is even harder because now that we know what the transform is, we'd need to first think in terms of relative ratios, so 1.25 instead of 25%, then think in terms of log, if we want to know what those values are.

(2) The log scale used for change values is often said to have the advantage that equal distances on either side represent counterbalancing values. For example, (1.5) (0.66) = (3/2) (2/3)  = 1. But this is a very specific scenario that doesn't actually apply to our dataset.  Consider these scenarios:

History: # degrees went from 1000 to 666 i.e. Relative ratio = 2/3
Psychology: # degrees went from 2000 to 3000 i.e. Relative ratio = 3/2

The # of History degrees dropped by 334 while the number of Psychology degrees grew by 1000 (Psychology I think is the more popular major)

History: # degrees went from 1000 to 666 i.e. Relative ratio = 2/3
Psychology: from 1000 to 1500, i.e. Relative ratio = 3/2

The # of History degrees dropped by 334 while # of Psychology degrees grew by 500
(Assume same starting values)

History: # degrees went from 1000 to 666 i.e. Relative ratio = 2/3
Psychology: from 666 to 666*3/2 = 999 i.e. Relative ratio = 3/2

The # of History degrees dropped by 334 while # of Psychology degrees grew by 333
(Assume Psychology's starting value to be History's ending value)

Psychology: # degrees went from 1000 to 1500 i.e. Relative ratio = 3/2
History: # degrees went from 1500 to 1000 i.e. Relative ratio = 2/3

The # of Psychology degrees grew by 500 while the # of History degrees dropped by 500
(Assume History's starting value to be Psychology's ending value)

 

 


Where have the graduates gone?

Someone submitted this chart on Twitter as an example of good dataviz.

Washingtonpost_aftercollege

The chart shows the surprising leverage colleges have on where students live after graduation.

The primary virtue of this chart is conservation of space. If our main line of inquiry is the destination states of college graduations - by state, then it's hard to beat this chart's efficiency at delivering this information. For each state, it's easy to see what proportion of graduates leave the state after graduation, and then within those who leave, the reader can learn which are the most popular destination states, and their relative importance.

The colors link the most popular destination states (e.g. Texas in orange) but they are not enough because the designer uses state labels also. A next set of states are labeled without being differentiated by color. In particular, New York and Massachusetts share shades of blue, which also is the dominant color on the left side.

***

The following is a draft of a concept I have in my head.

Junkcharts_redo_washpost_postgraddestinations_1

I imagine this to be a tile map. The underlying data are not public so I just copied down a bunch of interesting states. This view brings out the spatial information, as we expect graduates are moving to neighboring states (or the states with big cities).

The students in the Western states are more likely to stay in their own state, and if they move, they stay in the West Coast. The graduates in the Eastern states also tend to stay nearby, except for California.

I decided to use groups of color - blue for East, green for South, red for West. Color is a powerful device, if used well. If the reader wants to know which states send graduates to New York, I'm hoping the reader will see the chart this way:

Junkcharts_redo_washpost_postgraddestinations_2

 


Funnels and scatters

I took a peek at some of the work submitted by Ray Vella's students in his NYU dataviz class recently.

The following chart by Hosanah Bryan caught my eye:

Rich Get Richer_Hosanah Bryan (v2)

The data concern the GDP gap between rich and poor regions in various countries. In some countries, especially in the U.K., the gap is gigantic. In other countries, like Spain and Sweden, the gap is much smaller.

The above chart uses a funnel metaphor to organize the data, although the funnel does not add more meaning (not that it has to). Between that, the color scheme and the placement of text, it's visually clean and pleasant to look at.

The data being plotted are messy. They are not actual currency values of GDP. Each number is an index, and represents the relative level of the GDP gap in a given year and country. The gap being shown by the colored bars are differences in these indices 15 years apart. (The students were given this dataset to work with.)

So the chart is very hard to understand if one focuses on the underlying data. Nevertheless, the same visual form can hold other datasets which are less complicated.

One can nitpick about the slight misrepresentation of the values due to the slanted edges on both sides of the bars. This is yet another instance of the tradeoff between beauty and precision.

***

The next chart by Liz Delessert engages my mind for a different reason.

The Rich Get Richerv2

The scatter plot sets up four quadrants. The top right is "everyone gets richer". The top left, where most of the dots lie, is where "the rich get richer, the poor get poorer".  This chart shows a thoughtfulness about organizing the data, and the story-telling.

The grid setup cues readers toward a particular way of looking at the data.

But power comes with responsibility. Such scatter plots are particularly susceptible to the choice of data, in this case, countries. It is tempting to conclude that there are no countries in which everyone gets poorer. But that statement more likely tells us more about which countries were chosen than the real story.

I like to see the chart applied to other data transformations that are easier. For example, we can start with the % change in GDP computed separately for rich and for poor. Then we can form a ratio of these two percent changes.

 

 


The what of visualization, beyond the how

A long-time reader sent me the following chart from a Nature article, pointing out that it is rather worthless.

Nautre_scihub

The simple bar chart plots the number of downloads, organized by country, from the website called Sci-Hub, which I've just learned is where one can download scientific articles for free - working around the exorbitant paywalls of scientific journals.

The bar chart is a good example of a Type D chart (Trifecta Checkup). There is nothing wrong with the purpose or visual design of the chart. Nevertheless, the chart paints a misleading picture. The Nature article addresses several shortcomings of the data.

The first - and perhaps most significant - problem is that many Sci-Hub users are expected to access the site via VPN servers that hide their true countries of origin. If the proportion of VPN users is high, the entire dataset is called into doubt. The data would contain both false positives (in countries with VPN servers) and false negatives (in countries with high numbers of VPN users). 

The second problem is seasonality. The dataset covered only one month. Many users are expected to be academics, and in the southern hemisphere, schools are on summer vacation in January and February. Thus, the data from those regions may convey the wrong picture.

Another problem, according to the Nature article, is that Sci-Hub has many competitors. "The figures include only downloads from original Sci-Hub websites, not any replica or ‘mirror’ site, which can have high traffic in places where the original domain is banned."

This mirror-site problem may be worse than it appears. Yes, downloads from Sci-Hub underestimate the entire market for "free" scientific articles. But these mirror sites also inflate Sci-Hub statistics. Presumably, these mirror sites obtain their inventory from Sci-Hub by setting up accounts, thus contributing lots of downloads.

***

Even if VPN and seasonality problems are resolved, the total number of downloads should be adjusted for population. The most appropriate adjustment factor is the population of scientists, but that statistic may be difficult to obtain. A useful proxy might be the number of STEM degrees by country - obtained from a UNESCO survey (link).

A metric of the type "number of Sci-Hub downloads per STEM degree" sounds odd and useless. I'd argue it's better than the unadjusted total number of Sci-Hub downloads. Just don't focus on the absolute values but the relative comparisons between countries. Even better, we can convert the absolute values into an index to focus attention on comparisons.

 


Why you should expunge the defaults from Excel or (insert your favorite graphing program)

Yesterday, I posted the following chart in the post about Cornell's Covid-19 case rate after re-opening for in-person instruction.

Redo_junkchats_fraziercornellreopeningsuccess2

This is an edited version of the chart used in Peter Frazier's presentation.

Pfrazier_cornellreopeningupdate

The original chart carries with it the burden of Excel defaults.

What did I change and why?

I switched away from the default color scheme, which ignores the relationships between the two lines. In particular, the key comparison on this chart should be the actual case rate versus the nominal case rate. In addition, the three lines at the top are related as they all come from the same underlying mathematical model. I used the same color but different shades.

Also, instead of placing the legend as far away from the data labels as possible, I moved the line labels next to the data labels.

Instead of daily date labels, I moved to weekly labels, and set the month names on a separate level than the day names.

The dots were removed from the top three lines but I'd have retained them, perhaps with some level of transparency, if I spent more time making the edits. I'd definitely keep the last dot to make it clear that the blue lines contain one extra dot.

***

Every graphing program has defaults, typically computed by some algorithm tuned to the average chart. Don't settle for the average chart. Get rid of any default setting that slows down understanding.

 

 


A testing mess: one chart, four numbers, four colors, three titles, wrong units, wrong lengths, wrong data

Twitterstan wanted to vote the following infographic off the island:

Tes_Alevelsresults

(The publisher's website is here but I can't find a direct link to this graphic.)

The mishap is particularly galling given the controversy swirling around this year's A-Level results in the U.K. For U.S. readers, you can think of A-Levels as SAT Subject Tests, which in the U.K. are required of all university applicants, and represent the most important, if not the sole, determinant of admissions decisions. Please see the upcoming post on my book blog for coverage of the brouhaha surrounding the statistical adjustments (to be posted sometime this week, it's here.).

The first issue you may notice about the chart is that the bar lengths have no relationship with the numbers printed on them. Here is a scatter plot correlating the bar lengths and the data.

Junkcharts_redo_tes_alevels_scatter


As you can see, nothing.

Then, you may wonder what the numbers mean. The annotation at the bottom right says "Average number of A level qualifications per student". Wow, the British (in this case, English) education system is a genius factory - with the average student mastering close to three thousand subjects in secondary (high) school!

TES is the cool name for what used to be the Times Educational Supplement. I traced the data back to Ofqual, which is the British regulator for these examinations. This is the Ofqual version of the above chart:

Ofqual_threeAstar

The data match. You may see that the header of the data table reads "Number of students in England getting 3 x A*". This is a completely different metric than number of qualifications - in fact, this metric measures geniuses. "A*" is the U.K. equivalent of "A+". When I studied under the British system, there was no such grade. I guess grade inflation is happening all over the world. What used to be A is now A+, and what used to be B is now A. Scoring three A*s is tops - I wonder if this should say 3 or more because I recall that you can take as many subjects as you desire but most students max out at three (may have been four).

The number of students attaining the highest achievement has increased in the last two years compared to the two years before. We can't interpret these data unless we know if the number of students also grew at similar rates.

The units are students while the units we expect from the TES graphic should be subjects. The cutoff for the data defines top students while the TES graphic should connote minimum qualification, i.e. a passing grade.

***
Now, the next section of the Ofqual infographic resolves the mystery. Here is the chart:

Ofqual_Alevelquals

This dataset has the right units and measurement. There is almost no meaningful shift in the last four years. The average number of qualifications per student is only different at the second decimal place. Replacing the original data with this set removes the confusion.

Junkcharts_redo_tes_alevels_correctdata

While I was re-making this chart, I also cleaned out the headers and sub-headers. This is an example of software hegemony: the designer wouldn't have repeated the same information three times on a chart with four numbers if s/he wasn't prompted by software defaults.

***

The corrected chart violates one of the conventions I described in my tutorial for DataJournalism.com: color difference should reflect data difference.

In the following side-by-side comparison, you see that the use of multiple colors on the left chart signals different data - note especially the top and bottom bars which carry the same number, but our expectation is frustrated.

Junkcharts_redo_tes_alevels_sidebyside

***

[P.S. 8/25/2020. Dan V. pointed out another problem with these bar charts: the bars were truncated so that the bar lengths are not proportional to the data. The corrected chart is shown on the right below:

Junkcharts_redo_tes_alevels_barlengths

8/26/2020: added link to the related post on my book blog.]


Cornell must remove the logs before it reopens the campus in the fall

Against all logic, Cornell announced last week it would re-open in the fall because a mathematical model under development by several faculty members and grad students predicts that a "full re-opening" would lead to 80 percent fewer infections than a scenario of full virtual instruction. That's what was reported by the media.

The model is complicated, with loads of assumptions, and the report is over 50 pages long. I will put up my notes on how they attained this counterintuitive result in the next few days. The bottom line is - and the research team would agree - it is misleading to describe the analysis as "full re-open" versus "no re-open". The so-called full re-open scenario assumes the entire community including students, faculty and staff submit to a full program of test-trace-isolate, including (mandatory) PCR diagnostic testing once every five days throughout the 16-week semester, and immediate quarantine and isolation of new positive cases, as well as those in contact with such persons, plus full compliance with this program. By contrast, it assumes students do not get tested in the online instruction scenario. In other words, the researchers expect Cornell to get done what the U.S. governments at all levels failed to do until now.

[7/8/2020: The post on the Cornell model is now up on the book blog. Here.]

The report takes us back to the good old days of best-base-worst-case analysis. There is no data for validating such predictions so they performed sensitivity analyses, defined as changing one factor at a time assuming all other factors are fixed at "nominal" (i.e. base case) values. In a large section of the report, they publish a series of charts of the following style:

Cornell_reopen_sensitivity

Each line here represents one of the best-base-worst cases (respectively, orange-blue-green). Every parameter except one is given the "nominal" value (which represents the base case). The parameter that is manpulated is shown on the horizontal axis, and for the above chart, the variable is the assumption of average number of daily contacts per person. The vertical axis shows the main outcome variable, which is the percentage of the community infected by the end of term.

This flatness of the lines in the above chart appears to say that the outcome is quite insensitive to the change in the average daily contact rate under all three scenarios - until the daily contact rises above 10 per person per day. It also appears to show that the blue line is roughly midway between the orange and the green so the percent infected is slightly less-than halved under the optimistic scenario, and a bit more than doubled under the pessimistic scenario, relative to the blue line.

Look again.

The vertical axis is presented in log scale, and only labeled at values 1% and 10%. About midway between 1 and 10 on the horizontal axis, the outcome value has already risen above 10%. Because of the log transformation, above 10%, each tick represents an increase of 10% in proportion. So, the top of the vertical axis indicates 80% of the community being infected! Nothing in the description or labeling of the vertical axis prepares the reader for this.

The report assumes a fixed value for average daily contacts of 8 (I rounded the number for discussion), which is invariable across all three scenarios. Drawing a vertical line about eight-tenths of the way towards 10 appears to signal that this baseline daily contact rate places the outcome in the relatively flat part of the curve.

Look again.

The horizontal axis too is presented in log scale. To birth one log-scale may be regarded as a misfortune; to birth two log scales looks like carelessness. 

Since there exists exactly one tick beyond 10 on the horizontal axis, the right-most value is 20. The model has been run for values of average daily contacts from 1 to 20, with unit increases. I can think of no defensible reason why such a set of numbers should be expressed in a log scale.

For the vertical axis, the outcome is a proportion, which is confined to within 0 percent and 100 percent. It's not a number that can explode.

***

Every log scale on a chart is birthed by its designer. I know of no software that automatically performs log transforms on data without the user's direction. (I write this line with trepidation wishing that I haven't planted a bad idea in some software developer's head.)

Here is what the shape of the original data looks like - without any transformation. All software (I'm using JMP here) produces something of this type:

Redo-cornellreopen-nolog

At the baseline daily contact rate value of 8, the model predicts that 3.5% of the Cornell community will get infected by the end of the semester (again, assuming strict test-trace-isolate fully implemented and complied).  Under the pessimistic scenario, the proportion jumps to 14%, which is 4 or 5 times higher than the base case. In this worst-case scenario, if the daily contact rate were about twice the assumed value (just over 16), half of the community would be infected in 16 weeks!

I actually do not understand how there could only be 8 contacts per person per day when the entire student body has returned to 100% in-person instruction. (In the report, they even say the 8 contacts could include multiple contacts with the same person.) I imagine an undergrad student in a single classroom with 50 students. This assumption says the average student in this class only comes into contact with at most 8 of those. That's one class. How about other classes? small tutorials? dining halls? dorms? extracurricular activities? sports? parties? bars?

Back to graphics. Something about the canonical chart irked the report writers so they decided to try a log scale. Here is the same chart with the vertical axis in log scale:

Redo-cornellreopen-logy

The log transform produces a visual distortion. On the right side, where the three lines are diverging rapidly, the log transform pulls them together. On the left side, where the three lines are close together, the log transform pulls them apart.

Recall that on the log scale, a straight line is exponential growth. Look at the green line (worst case). That line is approximately linear so in the pessimistic scenario, despite assuming full compliance to a strict test-trace-isolate regimen, the cases are projected to grow exponentially.

Something about that last chart still irked the report writers so they decided to birth a second log scale. Here is the chart they ultimately settled on:

Redo-cornellreopen-logylogx

As with the other axis, the effect of the log transform is to squeeze the larger values (on the right side) and spread out the smaller values (on the left side). After this cosmetic surgery, the left side looks relatively flat while the right side looks steep.

In the next version of the Cornell report, they should replace all these charts with ones using linear scales.

***

Upon discovering this graphical mischief, I wonder if the research team received a mandate that includes a desired outcome.

 

[P.S. 7/8/2020. For more on the Cornell model, see this post.]


Gazing at petals

Reader Murphy pointed me to the following infographic developed by Altmetric to explain their analytics of citations of journal papers. These metrics are alternative in that they arise from non-academic media sources, such as news outlets, blogs, twitter, and reddit.

The key graphic is the petal diagram with a number in the middle.

Altmetric_tetanus

I have a hard time thinking of this object as “data visualization”. Data visualization should visualize the data. Here, the connection between the data and the visual design is tenuous.

There are eight petals arranged around the circle. The legend below the diagram maps the color of each petal to a source of data. Red, for example, represents mentions in news outlets, and green represents mentions in videos.

Each petal is the same size, even though the counts given below differ. So, the petals are like a duplicative legend.

The order of the colors around the circle does not align with its order in the table below, for a mysterious reason.

Then comes another puzzle. The bluish-gray petal appears three times in the diagram. This color is mapped to tweets. Does the number of petals represent the much higher counts of tweets compared to other mentions?

To confirm, I pulled up the graphic for a different paper.

Altmetric_worldwidedeclineofentomofauna

Here, each petal has a different color. Eight petals, eight colors. The count of tweets is still much larger than the frequencies of the other sources. So, the rule of construction appears to be one petal for each relevant data source, and if the total number of data sources fall below eight, then let Twitter claim all the unclaimed petals.

A third sample paper confirms this rule:

Altmetric_dnananodevices

None of the places we were hoping to find data – size of petals, color of petals, number of petals – actually contain any data. Anything the reader wants to learn can be directly read. The “score” that reflects the aggregate “importance” of the corresponding paper is found at the center of the circle. The legend provides the raw data.

***

Some years ago, one of my NYU students worked on a project relating to paper citations. He eventually presented the work at a conference. I featured it previously.

Michaelbales_citationimpact

Notice how the visual design provides context for interpretation – by placing each paper/researcher among its peers, and by using a relative scale (percentiles).

***

I’m ignoring the D corner of the Trifecta Checkup in this post. For any visualization to be meaningful, the data must be meaningful. The type of counting used by Altmetric treats every tweet, every mention, etc. as a tally, making everything worth the same. A mention on CNN counts as much as a mention by a pseudonymous redditor. A pan is the same as a rave. Let’s not forget the fake data menace (link), which  affects all performance metrics.


Graph literacy, in a sense

Ben Jones tweeted out this chart, which has an unusual feature:

Malefemaleliteracyrates

What's unusual is that time runs in both directions. Usually, the rule is that time runs left to right (except, of course, in right-to-left cultures). Here, the purple area chart follows that convention while the yellow area chart inverts it.

On the one hand, this is quite cute. Lines meeting in the middle. Converging. I get it.

On the other hand, every time a designer defies conventions, the reader has to recognize it, and to rationalize it.

In this particular graphic, I'm not convinced. There are four numbers only. The trend on either side looks linear so the story is simple. Why complicate it using unusual visual design?

Here is an entirely conventional bumps-like chart that tells the story:

Redo_literacyratebygender

I've done a couple of things here that might be considered controversial.

First, I completely straightened out the lines. I don't see what additional precision is bringing to the chart.

Second, despite having just four numbers, I added the year 1996 and vertical gridlines indicating decades. A Tufte purist will surely object.

***

Related blog post: "The Return on Effort in Data Graphics" (link)


The rule governing which variable to put on which axis, served a la mode

When making a scatter plot, the two variables should not be placed arbitrarily. There is a rule governing this: the outcome variable should be shown on the vertical axis (also called y-axis), and the explanatory variable on the horizontal (or x-) axis.

This chart from the archives of the Economist has this reversed:

20160402_WOC883_icecream_PISA

The title of the accompanying article is "Ice Cream and IQ"...

In a Trifecta Checkup (link), it's a Type DV chart. It's preposterous to claim eating ice cream makes one smarter without more careful studies. The chart also carries the xyopia fallacy: by showing just two variables, readers are unwittingly led to explain differences in "IQ" using differences in per-capita ice-cream consumption when lots of other stronger variables will explain any gaps in IQ.

In this post, I put aside my objections to the analysis, and focus on the issue of assigning variables to axes. Notice that this chart reverses the convention: the outcome variable (IQ) is shown on the horizontal, and the explanatory variable (ice cream) is shown on the vertical.

Here is a reconstruction of the above chart, showing only the dots that were labeled with country names. I fitted a straight regression line instead of a curve. (I don't understand why the red line in the original chart bends upwards when the data for Japan, South Korea, Singapore and Hong Kong should be dragging it down.)

Redo_econ_icecreamIQ_1A

Note that the interpretation of the regression line raises eyebrows because the presumed causality is reversed. For each 50 points increase in PISA score (IQ), this line says to expect ice cream consumption to raise by about 1-2 liters per person per year. So higher IQ makes people eat more ice cream.

***

If the convention is respected, then the following scatter plot results:

Redo_econ_icecreamIQ_2

The first thing to note is that the regression analysis is different here from that shown in the previous chart. The blue regression line is not equivalent to the black regression line from the previous chart. You cannot reverse the roles of the x and y variables in a regression analysis, and so neither should you reverse the roles of the x and y variables in a scatter plot.

The blue regression line can be interpreted as having two sections, roughly, for countries consuming more than or less than 6 liters of ice cream per person per year. In the less-ice-cream countries, the correlation between ice cream and IQ is stronger (I don't endorse the causal interpretation of this statement).

***

When you make a scatter plot, you have two variables for which you want to analyze their correlation. In most cases, you are exploring a cause-effect relationship.

Higher income households cares more on politics.
Less educated citizens are more likely to not register to vote.
Companies with more diverse workforce has better business performance.

Frequently, the reverse correlation does not admit a causal interpretation:

Caring more about politics does not make one richer.
Not registering to vote does not make one less educated.
Making more profits does not lead to more diversity in hiring.

In each of these examples, it's clear that one variable is the outcome, the other variable is the explanatory factor. Always put the outcome in the vertical axis, and the explanation in the horizontal axis.

The justification is scientific. If you are going to add a regression line (what Excel calls a "trendline"), you must follow this convention, otherwise, your regression analysis will yield the wrong result, with an absurd interpretation!

 

[PS. 11/3/2019: The comments below contain different theories that link the two variables, including theories that treat PISA score ("IQ") as the explanatory variable and ice cream consumption as the outcome. Also, I elaborated that the rule does not dictate which variable is the outcome - the designer effectively signals to the reader which variable is regarded as the outcome by placing it in the vertical axis.]