The message left the visual

The following chart showed up in Princeton Alumni Weekly, in a report about China's population:

Sciam_chinapop_19802020

This chart was one of several that appeared in a related Scientific American article.

The story itself is not surprising. As China develops, its birth rate declines while the death rate also falls; thus, the population ages. The same story has played out in all advanced economies.

***

From a Trifecta Checkup perspective, this chart suffers from several problems.

The text annotation on the top right suggests what message the authors intended to deliver. Pointing to the group of people aged between 30 and 59 in 2020, they remarked that this large cohort would likely cause "a crisis" when they age. There would be fewer youngsters to support them.

Unfortunately, the data and visual elements of the chart do not align with this message. Instead of looking forward in time, the chart compares the 2020 population pyramid with that from 1980, looking back 40 years. The chart shows an insight from the data, just not the right one.

A major feature of a population pyramid is the split by gender. The trouble is gender isn't part of the story here.

In terms of age groups, the chart treats each subgroup "fairly". As a result, the reader isn't shown which of the 22 subgroups to focus on. There are really 44 subgroups if we count each gender separately, and 88 subgroups if we include the year split.

***

The following redesign traces the "crisis" subgroup (those who were 30-59 in 2020) both backwards and forwards.

Junkcharts_redo_chinapopulationpyramids

The gender split has been removed; the columns now show the total population. Color is used to focus attention on one cohort as it moves through time.

Notice that I switched up the sample times: I pulled the population data for 1990 and 2060 (from this website). The original design used the population data from 1980 instead of 1990, but that choice is at odds with the message. People who were 30 in 2020 were not yet born in 1980! They first show up in the 1990 dataset.

At the other end of the "crisis" cohort, the oldest (59 years old in 2020) would have died long before 2100, since 59 + 80 = 139. Even the youngest (30 in 2020) would be 110 by 2100, so almost everyone in the pink section of the 2020 chart would have fallen off the right side of the chart by then.

These design decisions insert a gap between the visual and the message.
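For readers who want to play with the cohort-tracing idea, here is a minimal sketch in Python, using made-up population counts (the numbers are mine, purely for illustration). The birth-year window (1961-1990, i.e., those aged 30-59 in 2020) determines which columns get the highlight color in each snapshot year.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up population counts (millions) by 10-year age group, for illustration only
ages = np.arange(0, 100, 10)   # lower bound of each age group
pops = {
    1990: [120, 110, 95, 80, 60, 45, 30, 18, 8, 2],
    2020: [80, 85, 100, 110, 105, 90, 65, 40, 18, 5],
    2060: [55, 60, 65, 70, 75, 80, 85, 75, 50, 20],
}
COHORT = (1961, 1990)  # birth years of the "crisis" cohort (aged 30-59 in 2020)

fig, axes = plt.subplots(1, 3, sharey=True, figsize=(9, 3))
for ax, (year, pop) in zip(axes, pops.items()):
    mid_birth = year - ages - 5   # approximate birth year at each group's midpoint
    colors = ["deeppink" if COHORT[0] <= b <= COHORT[1] else "lightgray"
              for b in mid_birth]
    ax.bar(ages, pop, width=9, align="edge", color=colors)
    ax.set_title(str(year))
    ax.set_xlabel("age")
plt.show()
```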


Anti-encoding

Howie H., sometime contributor to our blog, found this chart in a doctor's office:

WhenToExpectAReturnCall_sm

Howie writes:

Among the multitude of data visualization sins here, I think the worst is that the chart *anti*-encodes the data; the longest wait time has the shortest arc!

While I waited I thought about a redesign.  Obviously a simple bar chart would work.  A properly encoded radial bar could work, or small multiple pie charts.  But I think the design brief here probably calls for a bit of responsible data art, as this is supposed to be an eye-catching poster.

I came up with a sort of bar superimposed on a calendar for reference.  To quickly draft the design it was easier to do small multiples, but maybe all three arrows could be placed on a two-week grid and the labels could be inside the arrows, or something like that.  It’s a very rough draft but I think it points toward a win-win of encoding the actual data while retaining the eye-catching poster-ness that I’m guessing was a design goal.

Here is his sketch:

JunkCharts-redo_howardh_WhenToExpectAReturnCall redesign sm

***

I found a couple of interesting ideas in Howie's re-design.

First, he tried to embody the concept of a week's wait by visual reference to a weekly calendar.

Second, in the third section, he wanted readers to experience "hardship" by making them wrap their eyes around to a second row.

He wanted the chart to be both accurate and eye-catching.

It's a nice attempt that will improve as he fiddles more with it.

***

Based on Howie's ideas, I came up with two sketches myself.

In the first sketch, instead of the arrows, I put numbers into the cells.

Junkcharts_redo_whentoexpectareturncall_1

In the second sketch, I emphasized eye-catching appeal while sacrificing accuracy. It uses spiral imagery, and I think it does a good job showing the extra pain of a week-long wait. Each trip around the circle represents 24 hours.

Junkcharts_redo_whentoexpectacall_2

The wait time is actually encoded in the traversal of angles, rather than the length of the spiral. I call this creation less than accurate because most readers will assume the spiral length to be the wait time, and thus misread the data.
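For the curious, here is a rough sketch of the spiral encoding in Python. The wait times are hypothetical stand-ins (the poster's actual numbers and labels are not reproduced here); the data is carried by the angle traversed, with one full turn equal to 24 hours.

```python
import numpy as np
import matplotlib.pyplot as plt

waits = {"nurse line": 2, "test results": 24, "refill": 168}  # hours (hypothetical)

fig, axes = plt.subplots(1, 3, subplot_kw={"projection": "polar"}, figsize=(9, 3))
for ax, (label, hours) in zip(axes, waits.items()):
    theta = np.linspace(0, 2 * np.pi * hours / 24, 300)  # one full turn = 24 hours
    r = 1 + theta / (2 * np.pi)   # radius grows by one unit per day, forming a spiral
    ax.plot(theta, r, lw=3)
    ax.set_title(f"{label}: {hours} hrs")
    ax.set_xticks([]); ax.set_yticks([])
    ax.set_ylim(0, 9)
plt.show()
```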

Which one(s) do you like?


Making major things easy, and minor things hard

A recent issue of Significance magazine carried the following stacked column chart showing how the driver license status of men and women changes as they age. The data came from the U.K.

Siginificance_olddrivers_1

Quick question - what percentage of British men in their sixties hold full driver licenses?

***

I was just kidding. Such questions can't be quickly answered from a stacked column chart. That's because you have to find the axis, and then mentally invert it.

On that chart, larger values are shown pointing down (green) and also pointing up (blue), and ... well, I don't have words for the yellow. In fact, the yellow segments, showing people without licenses, are possibly the most important category for this report.

In making decisions about visualizing data, it's important to separate out the major things from the minor things.

***

Here is a reimagination of the chart using connected dots:

Junkcharts_redo_significanceolderdrivers

What is hard to do using this chart is to verify that the three proportions add to 100%. What is easy is to read off the proportion for any gender, age and license status subgroup.

The way these researchers binned the age data is quite intricate: there are bins of size 1, 4, 5 and 10, plus a top group of 85 and above. I handled these by converting everything to 1-year bins. I assume that within the wider bins we don't have precise data for each age, and that the bin value is the average across the bin; it is as if someone had drawn a horizontal line across the bin width. (I left the top bin alone, as I don't know the maximum age of a person in this study.)
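Here is a minimal sketch of that conversion, with hypothetical bin boundaries and values standing in for the published data:

```python
import pandas as pd

# Hypothetical bins: (lowest age, highest age, % holding a full license)
bins = [(17, 17, 30.0), (18, 21, 65.0), (22, 26, 78.0), (27, 36, 84.0)]

# Spread each bin's value across every year of age it covers, which is
# equivalent to drawing a horizontal line across the bin width
rows = [(age, pct) for lo, hi, pct in bins for age in range(lo, hi + 1)]
df = pd.DataFrame(rows, columns=["age", "pct_full_license"])
print(df.head(6))
```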

***

Those of you who have laminated the flowchart of data visualization are probably irate. According to such a flowchart, one must use a column chart because the x variable (age band) has irregularly-sized discrete values, and one must use a stacked column chart because the y variable is a percentage, grouped by a third variable (license status).

Don't be mad, just ditch the flowchart.



Deliberately obstructing chart elements as a plot point

Bbc_globalwarming_ridgeplot sm

These "ridge plots" have become quite popular in recent times. The following example, from this BBC report (link), shows the change in global air temperatures over time.

***

This chart is in reality a panel of probability density plots, one for each year of the dataset. The years are arranged with the oldest at the top and the most recent at the bottom. You take those plots and squeeze every ounce of the space out, so that each chart overlaps massively with the ones above it.
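To make the construction concrete, here's a minimal sketch of a ridge plot with made-up data (the drift and spread parameters are mine): each year's density curve sits on a baseline only slightly below the previous year's, and is drawn on top of it.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
x = np.linspace(-2, 4, 300)

fig, ax = plt.subplots(figsize=(5, 8))
for i, year in enumerate(range(1990, 2024)):
    # fake daily excess temperatures that drift warmer over time
    temps = rng.normal(loc=0.03 * (year - 1990), scale=0.5, size=365)
    density = gaussian_kde(temps)(x)
    baseline = -0.3 * i   # each year sits slightly below the one above...
    ax.fill_between(x, baseline, baseline + density, color="white",
                    edgecolor="black", zorder=i)   # ...and is drawn on top of it
ax.set_yticks([])
ax.set_xlabel("Excess temperature (°C)")
plt.show()
```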

The plot at the bottom is the only one that can be seen unobstructed.

Overplotting chart elements, deliberately obstructing them, doesn't sound useful. Is there something gained for what's lost?

***

The appeal of the ridge plot is the metaphor of ridges, or crests if you see ocean waves. What do these features signify?

The legend at the bottom of the chart gives a hint.

The main metric used to describe global warming is the amount of excess temperature, defined as the temperature relative to a historical average, set as the average temperature during the pre-industrial age. In recent years, the average global temperature has been about 1.5 degrees Celsius above the reference level.

One might think that the higher the peak in a given plot, the higher the excess temperature. Not so. The heights of those peaks do not indicate temperatures.

What's the scale of the vertical axis? The labels suggest years, but that's a distractor also. If we consider the panel of non-overlapping probability density charts, the vertical axis should show probability density. In such a panel, the year labels should go to the titles of individual plots. On the ridge plot, the density axes are sacrificed, while the year labels are shifted to the vertical axis.

Admittedly, probability density is not an intuitive concept, so not much is lost by its omission.

The legend appears to suggest that the vertical scale is expressed in number of days so that in any given year, the peak of the curve occurs where the most likely excess temperature is found. But the amount of excess is read from the horizontal axis, not the vertical axis - it is encoded as a displacement in location horizontally away from the historical average. In other words, the height of the peak still doesn't correlate with the magnitude of the excess temperature.

The following probability density curves (drawn with made-up data) all have the same average excess temperature of 1.5 degrees. Going from top to bottom, the variability of the excess temperatures increases, and the height of the peak decreases accordingly, because a density plot requires the total area under the curve to be fixed. Thus, the higher the peak, the lower the daily variability of the excess temperature.

Kfung_pdf_variances
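A sketch of how such a figure can be generated, with normal curves standing in for the real distributions: the mean stays at 1.5 while the standard deviation grows, and the peak falls because each curve's area must equal 1.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.linspace(-2, 5, 400)
fig, axes = plt.subplots(4, 1, sharex=True, sharey=True, figsize=(5, 6))
for ax, sd in zip(axes, [0.3, 0.6, 1.0, 1.5]):
    ax.plot(x, norm.pdf(x, loc=1.5, scale=sd))   # same mean, growing spread
    ax.axvline(1.5, color="gray", ls=":")        # identical average of 1.5 degrees
axes[-1].set_xlabel("Excess temperature (°C)")
plt.show()
```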

A problem with this ridge plot is that it draws our attention to the heights of the peaks, which provide information about a secondary metric.

If we want to find the story that the amount of excess temperature has been increasing over time, we would have to trace a curve through the ridges, which strangely enough is a line that moves top to bottom, initially somewhat vertically, then moving sideways to the right. In a more conventional chart, the line that shows growth over time moves from bottom left to top right.

***

The BBC article (link) features several charts. The first one shows how the average excess temperature trends year to year. This is a simple column chart. By supplementing the column chart with the ridge plot, I assume that the designer wants to tell readers that the average annual excess temperature masks daily variability. Therefore, each annual average has been disaggregated into 366 daily averages.

In the column chart, the annual average is compared to a historical average taken over 50 years. In the ridge plot, the daily average is compared to ... the same 50-year historical average. That's what the reference line labeled "pre-industrial average" is saying to me.

It makes more sense to compare the 366 daily averages to 366 daily averages from those 50 years.

But now I've ruined the dataviz because in each probability density plot, there are 366 different reference points. But not really. We just have to think a little more abstractly. These 366 different temperatures are all mapped to the number zero, after adjustment. Thus, they all coincide at the same location on the horizontal axis.
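In code, this adjustment is a one-line subtraction once a daily climatology has been computed. A minimal sketch with simulated data (the reference window and column names are my assumptions):

```python
import numpy as np
import pandas as pd

# Simulated daily global temperatures, one row per day
dates = pd.date_range("1850-01-01", "2024-12-31", freq="D")
df = pd.DataFrame({"date": dates,
                   "temp": 14 + np.random.default_rng(0).normal(size=len(dates))})

# Daily climatology over an assumed pre-industrial reference window
ref = df[df["date"].dt.year <= 1899]
climatology = ref.groupby(ref["date"].dt.dayofyear)["temp"].mean()

# Subtract the same-calendar-day average: all 366 reference temperatures
# are mapped to zero, so they coincide on the horizontal axis
df["anomaly"] = df["temp"] - df["date"].dt.dayofyear.map(climatology)
```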

(It's possible that they actually used 366 daily averages as references to construct the ridge plot. I'm guessing not but feel free to comment if you know how these values are computed.)


Organizing time-stamped data

In a previous post, I looked at the Economist chart about Elon Musk's tweeting compulsion. It's a chart that contains lots of data (every tweet is included), yet one can't tell the number or frequency of tweets.

In today's post, I'll walk through a couple of sketches of other charts. I was able to find a dataset on Github that does not cover the same period of time but it's good enough for illustration purposes.

As discussed previously, I took cues from the Economist chart, in particular that the hours of the day should be divided up into four equal-width periods. One thing Musk is known for is tweeting at any hour of the day.

Junkcharts_redo_musktweets_columnsbyhourgroup

This is a small-multiples arrangement of column charts. Each column chart represents the tweets that were posted during a six-hour window, across all days in the dataset. A column covers half a year of tweets. We note that there were more tweets in the afternoon hours as he started tweeting more. In the first half of 2022, he sent roughly 750 tweets between 7 pm and midnight.
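For those who want to reproduce this kind of chart, here is a sketch of the bucketing logic. The six-hour cut points and the time zone are my assumptions, and the two stand-in rows take the place of the Github dataset.

```python
import numpy as np
import pandas as pd

# Stand-in rows; in practice, load the Github dataset with one row per tweet
tweets = pd.DataFrame({"created_at": ["2022-03-01 03:15:00+00:00",
                                      "2022-06-15 23:45:00+00:00"]})
tweets["local"] = pd.to_datetime(tweets["created_at"]).dt.tz_convert("US/Pacific")

# Four equal six-hour windows of the (assumed) local day
tweets["window"] = pd.cut(tweets["local"].dt.hour, bins=[0, 6, 12, 18, 24],
                          right=False,
                          labels=["12am-6am", "6am-12pm", "12pm-6pm", "6pm-12am"])

# Half-year labels, e.g. 2022H1 / 2022H2
tweets["half_year"] = (tweets["local"].dt.year.astype(str) +
                       np.where(tweets["local"].dt.month <= 6, "H1", "H2"))

# One count per (window, half-year) cell: each window becomes a column chart
counts = tweets.groupby(["window", "half_year"], observed=False).size().unstack()
```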

***

In this next sketch, I used a small-multiples arrangement of line charts. Each line chart represents tweets posted during a six-hour window, as before. Instead of counting tweets, here I "smoothed" the daily tweet count, so that each number is an average daily tweet count, computed over a rolling time window.

Junkcharts_redo_musktweets_sidebysidelines
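Continuing the sketch above, the smoothing step might look like this (the 28-day window length is my choice). Note that resampling to a daily grid fills the days with no tweets with zero counts, which matters given the holes in the series.

```python
# Count tweets per calendar day; resample() inserts zeros for days with no
# tweets, so gaps in the series don't silently shrink the rolling window
daily = tweets.set_index("local").sort_index().resample("D").size()

# Average daily tweet count over a rolling 28-day window (window length assumed)
smoothed = daily.rolling(window=28, min_periods=1).mean()
```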


***

Finally, let's cover a few details that only people who make charts would care about. The time-of-day variable only makes sense if all times are expressed in "local time", i.e. the time at the location from which Musk was tweeting. This knowledge is not necessary to make a chart, but it is essential to make the chart interpretable. A statement like "Musk tweets a lot around midnight" assumes that it was midnight where he was when he sent each tweet.

Since we don't have his travel schedule, we will definitely be wrong at times. In my charts, I assumed he was in the Pacific time zone and never tweeted from anywhere else.

(Food for thought: the server that posts tweets certainly had the record of the time and time zone for each tweet. Typically, databases store these time stamps standardized to one time zone - call it Greenwich Mean Time. If you have all time stamps expressed in GMT, is it now possible to make a statement about midnight tweeting? Does standardizing to one time zone solve this problem?)
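A GMT timestamp alone can't answer the midnight question; you still need a location to convert it. A small illustration (the timestamp and zones are made up):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

stamp = datetime(2022, 6, 16, 7, 5, tzinfo=timezone.utc)  # a timestamp stored in GMT

# The same instant reads very differently depending on where the tweeter was;
# without knowing the location, "midnight tweeting" remains unanswerable
print(stamp.astimezone(ZoneInfo("America/Los_Angeles")))  # 2022-06-16 00:05 (midnight)
print(stamp.astimezone(ZoneInfo("Europe/London")))        # 2022-06-16 08:05 (morning)
```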

In addition, I suspect there may be problems with the function used to compute those rolling sums and averages, so take the actual numbers on those sketches with a grain of salt. Specifically, it's hard to tell from any of these charts, but Musk did not tweet every single day, so there are lots of holes in the time series.


Simple presentations

In the previous post, I looked at this chart that shows the distributions of four subgroups found in a dataset:

Davidcurran_originenglishwords

This chart takes quite some effort to decipher, as does another version I featured.

The key messages appear to be: (i) most English words are of Germanic origin, (ii) the most popular English words are even more skewed towards Germanic origin, (iii) words of French origin started showing up around rank 50, those of Latin origin around rank 250.

***

If we are making a graphic for presentation, we can simplify the visual clutter tremendously by - hmmm - a set of pie charts.

Junkcharts_redo_originenglishwords_pies

For those allergic to pies, here's a stacked column chart:

Junkcharts_redo_originenglishwords_columns

Both of these can be thought of as "samples" from the original chart, selected to highlight shifts in the relative proportions.

Davidcurran_originenglishwords_sampled

I also reversed the direction of the horizontal axis, as I think the story is better told starting from the whole dataset and homing in on subsets.


P.S. [1/10/2025] A reader who has expertise in this subject also suggested a stacked column chart with reversed axis in a comment, so my recommendation here is confirmed.


Gaining precision by deleting data

The title is a bit of a paradox, isn't it? When we want more precise knowledge about something, we want to gather more data, at greater granularity. But it's not that simple.

Here is the famous "wind map" by the New York Times (link) showing vote margin shifts in the U.S. Presidential elections from 2020 to 2024, at the county level. A red arrow pointing rightward indicates a county in which the voters shifted toward the Republican candidate (Trump). It paints the red wave story loud and clear.

Nyt_votemarginshiftmap

Even though every county is on the map, this map alone doesn't answer all possible questions about vote margin shift. For example, someone might be interested in the vote margin shift in counties with high Hispanic populations. It's impossible to learn this from the above map, even if one has a list of the names of these counties.

The answer is found in the following map, published by NBC News here:

Nbcnews_votemarginshiftmap_hispanics

The story is also very clear. This map can be thought of as the NYT map minus the counties that have negligible Hispanic populations. By deleting all unrelated data, the designer highlights the story about Hispanic voters.

The reader can use the tab up top to see partial shift maps that emphasize different demographic groups. Well done!



Fantastic auto show from the Bloomberg crew

I really enjoyed the charts in this Bloomberg feature on the state of Japanese car manufacturers in the Southeast Asian and Chinese markets (link). This article contains five charts, each of which is both engaging and well-produced.

***

Each chart has a clear message, and the visual display is clearly adapted for purpose.

The simplest chart is the following side-by-side stacked bar chart, showing the trend in share of production of cars:

Bloomberg_japancars_production

Back in 1998, Japan was the top producer, making about 22% of all passenger cars in the world, while China did not have much of a car industry. By 2023, China dominated global car production with almost 40% of the share, Japan had slipped to second place, and its share had halved.

The designer is thoughtful about each label that is placed on the chart. If something is not required to tell the story, it's not there. Consistently across all five charts, they code Japan in red, and China in a medium gray color. (The coloring for the rest of the world is a bit inconsistent; we'll get to that later.)

Readers may misinterpret the cause of this share shift if this were the only chart presented to them. By itself, the chart suggests that China simply "stole" share from Japan (and other countries). What is true is that China has invested in a car manufacturing industry. A more subtle factor is that the global demand for cars has grown, with most of the growth coming from the Chinese domestic market and other emerging markets - and many consumers favor local brands. Said differently, the total market size in 2023 is much higher than that in 1998.

***

Bloomberg also made a chart that shows market share based on demand:

Bloomberg_japancars_marketshares

This is a small-multiples chart consisting of line charts. Each line chart shows market-share trends in one of five markets (China and four Southeast Asian nations) from 2019 to 2024. Take the Chinese market for example. The darker gray line says Chinese brands have gained 20 percentage points of additional market share since 2019; note that the data series is cumulative over the entire window. Meanwhile, brands from all other countries lost market share, with the Japanese brands (in red) losing the most.

The numbers are relative, which means that the other brands have not necessarily suffered declines in sales. This chart by itself doesn't tell us what happened to sales; all we know is the market shares of brands from different countries relative to their baseline market shares in 2019. (A strange period to pick, as it includes the entire pandemic.)

The designer demonstrates complete awareness of the intended message of the chart. The lines for Chinese and Japanese brands were bolded to highlight the diverging fortunes, not just in China, but also in Southeast Asia, to various extents.

On this chart, the designer splits out US and German brands from the rest of the world. This is an odd decision because the categorization is not replicated in the other four charts. Thus, the light gray color on this chart excludes U.S. and Germany while the same color on the other charts includes them. I think they could have given U.S. and Germany their own colors throughout.

***

The primacy of local brands is hinted at in the following chart showing how individual brands fared in each Southeast Asian market:

Bloomberg_japancars_seasiamarkets


This chart takes the final numbers from the line charts above, that is to say, the change in market share from 2019 to 2024, but now breaks them down by individual brand names. As before, the red bubbles represent Japanese brands, and the gray bubbles Chinese brands. The American and German brands are lumped in with the rest of the world and show up as light gray bubbles.

I'll discuss this chart form in a future post. For now, I want to draw your attention to the Malaysian market, which occupies the last row of this chart.

What we see there are two dominant brands (Perodua, Proton), both lumped into "rest of the world" even though both are Malaysian. These two brands are the biggest in Malaysia, and they account for two of the three fastest-growing brands there. The other high-growth brand is Chery, a Chinese brand; even though it is growing faster, its market share is still much smaller than the Malaysian brands', and smaller than Toyota's and Honda's. Honda has suffered a lot in this market while Toyota eked out a small gain.

The impression given by this bubble chart is that Chinese brands have not made much of a dent in Malaysia. But that would not be correct, if we believe the line chart above. According to the line chart, Chinese brands earned roughly the same increase in market share (about 3 percentage points) as "other" brands.

What about the bubble chart might be throwing us off?

It seems that the Chinese brands were starting from zero, so the growth is the whole bubble. For the Malaysian brands, the growth is the outer ring of the bubble, and the larger the bubble, the thinner the ring. Our attention is dominated by the bubble size, which represents a snapshot in the ending year and provides no information about the growth (which is shown on the horizontal axis).

***

For more discussion of Bloomberg graphics, see here.


the wtf moment

You're reading some article that contains a standard chart. You're busy looking for the author's message on the chart. And then, the wtf moment strikes.

It's the moment when you discover that the chart designer has done something unexpected, something that changes how you should read the chart. It's when you learn that time is running right to left, for example. It's when you realize that negative numbers are displayed up top. It's when you notice that the columns are ordered by descending y-value despite time being on the x-axis.

Tell me about your best wtf moments!

***

The latest case of the wtf moment occurred to me when I was reading Rajiv Sethi's blog post on his theory that Kennedy voters crowded out Cheney voters in the 2024 Presidential election (link). Was the strategy to cosy up to Cheney and push out Kennedy wise?

In the post, Rajiv has included this chart from Pew:

Pew_science_confidence

The chart is actually about the public's confidence in scientists. Rajiv summarizes the message as: 'Public confidence in scientists has fallen sharply since the early days of the pandemic, especially among Republicans. There has also been a shift among Democrats, but of a slightly different kind—the proportion with “a great deal” of trust in scientists to act in our best interests rose during the first few months of the pandemic but has since fallen back.'

Pew produced a stacked column chart, with three levels for each demographic segment and month of the survey. The question about confidence in scientists admits three answers: a great deal, a fair amount, and not too much/None at all. [It's also possible that they offered 4 responses, with the bottom two collapsed as one level in the visual display.]

As I scanned the chart to understand the data, I suddenly realized that the three responses were not listed in the expected order. The top (light blue) section is the middling response of "a fair amount", while the middle (dark blue) section is the "a great deal" answer.

wtf?

***

Looking more closely, this stacked column chart has bells and whistles, indicating that the person who made it expended quite a bit of effort. Whether the effort was worthwhile is for us readers to decide.

By placing "a great deal" right above the horizon, the designer made it easier to see the trend in the proportion responding with "a great deal". It's also easy to read the trend of those picking the "negative" response because of how the columns are anchored. In effect, the designer is expressing the opinion that the middle group (which is also the most popular answer) is just background, and readers should not pay much attention to it.

The designer expects readers to care about one other trend: the "top 2 box" proportion. This is why data labels called "NET" sit atop the columns; each is the sum of the proportions responding "a great deal" or "a fair amount".

***

For me, it's interesting to know whether the prior believers in science who lost faith went down one notch or two. Looking at the Republicans, the proportion saying "a great deal" went down by roughly 10 percentage points while the proportion saying "Not too much/None at all" went up by about 13 percentage points. Thus, the shift in the middle segment wasn't enough to explain all of the jump in negative sentiment; a good portion went straight from believer to skeptic during the pandemic.

As for Democrats, the proportion of believers also dropped by about 10 percentage points, while the proportion saying "a fair amount" went up by almost 10 points, accounting for most of the shift. The proportion of skeptics increased by only about 2 points.

So, for Democrats, I'm imagining a gentle slide in confidence that applies to the whole distribution while for Republicans, if someone loses confidence, it's likely straight to the bottom.

If I'm interested in the trends of all three responses, it's more effective to show the data in a panel like this:

Junkcharts_redo_pew_scientists

***

Remember to leave a comment when you hit your wtf moment next time!



Election coverage prompts good graphics

The election broadcasts in the U.S. are full-day affairs, and they make a great showcase for interactive graphics.

The election setting is optimal as it demands clear graphics that are instantly digestible. Anything else would have left viewers confused or frustrated.

The analytical concepts conveyed by the talking heads during these broadcasts are quite sophisticated, and they did a wonderful job of conveying them.

***

One such concept is the value of comparing statistics against a benchmark (or even multiple benchmarks). This analytics tactic comes in especially handy in the 2024 election, because both leading candidates are in some sense incumbents: Kamala was part of the Biden ticket in 2020, while Trump competed in both the 2016 and 2020 elections.

Msnbc_2024_ga_douglas

In the above screenshot, taken around 11 pm on election night, the MSNBC host (who looks like Steve K.) was searching for Kamala votes because it appeared that she was losing the state of Georgia. The question of the moment: were there enough votes left for her to close the gap?

In the graphic (first numeric column), we saw Kamala winning 65% of the votes in Douglas county, Georgia, against Trump's 34%. At first sight, one would conclude that Kamala did spectacularly well here.

But, is 65% good enough? One can't answer this question without knowing past results. How did Biden-Harris do in the 2020 election when they won the presidency?

The host touched the interactive screen to reveal the second column of numbers, which allowed viewers to compare the results directly. At the time of the screenshot, with 94% of the votes counted, Kamala was performing better in this county than the Biden-Harris ticket did in 2020 (65% vs 62%). This should help her narrow the gap.

If in 2020, they had also won 65% of the Douglas county votes, then, we should not expect the vote margin to shrink after counting the remaining 6% of votes. This is why the benchmark from 2020 is crucial. (Of course, there is still the possibility that the remaining votes were severely biased in Kamala's favor but that would not be enough, as I'll explain further below.)

All stations used this benchmark; some did not show the two columns side by side, making it harder to do the comparison.

Interesting side note: Douglas county has been rapidly shifting blue in the last two decades. The proportion of whites in the county dropped from 76% to 35% since 2000 (link).

***

Though Douglas county was encouraging for Kamala supporters, the vote gap in the state of Georgia at the time was over 130,000 in favor of Trump. The 6% in Douglas represented only about 4,500 votes (= 70,000*0.06/0.94). Even if she won all of them (extremely unlikely), it would be far from enough.

So, the host flipped to Fulton county, the most populous county in Georgia, and also a Democratic stronghold. This is where the battle should be decided.

Msnbc_2024_ga_fulton

Using the same format (an interactive version of a small-multiples arrangement), the host looked at the situation in Fulton. The encouraging sign was that 22% of the votes there had not yet been counted. Moreover, she had captured 73% of the votes already tallied, 8 percentage points better than her performance in Douglas county. So we knew that many more votes were coming in from Fulton, with the vast majority being Democratic.

But that wasn't the full story. We have to compare these statistics to our 2020 benchmark. This comparison revealed that she faced a tough road ahead. That's because Biden-Harris also won 73% of the Fulton votes in 2020. She might not earn additional votes here that could be used to close the state-wide gap.

If the 73% margin held to the end of the count, she would win 90,000 additional votes in Fulton but Trump would win 33,000, so that the state-wide gap should narrow by 57,000 votes. Let's round that up, and say Fulton halved Trump's lead in Georgia. But where else could she claw back the other half?
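Here is the back-of-envelope arithmetic behind those Fulton numbers (the count of remaining votes is my estimate, and third-party votes are ignored):

```python
remaining = 123_000       # estimated Fulton votes still uncounted (the 22%)
harris_share = 0.73       # assumes her counted share holds to the end

harris = harris_share * remaining          # ~90,000 additional votes
trump = (1 - harris_share) * remaining     # ~33,000 additional votes
print(round(harris - trump, -3))           # gap narrows by ~57,000 votes
```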

***

From this point, the analytics can follow one of two paths, which should lead to the same conclusion. The first path runs down the list of Georgia counties. The second path goes up a level to a state-wide analysis, similar to what was done in my post on the book blog (link).

Cnn_2024_ga

Around this time, Georgia had counted 4.8 million votes, with another 12% outstanding. So, about 650,000 votes had not been assigned to any candidate. The margin was about 135,000 in Trump's favor, which amounted to 20% of the outstanding votes. But that was 20% on top of her base share of 48%, meaning she had to claim 68% of all remaining votes. (If she won the same 48% share of the outstanding votes as of those already counted, she would lose the state by the same percentage margin, and by even more absolute votes.)

The situation was even more hopeless than it sounds here, because the 48% base value came from the 2024 votes already counted; thus, it already included her better-than-benchmark performance in Douglas county. She would have to do even better elsewhere to close the gap! In Fulton, which had the biggest potential, she was unable to push her vote share above the 2020 level.

That's why in my book blog (link), I suggested that the networks could have called Georgia (and several other swing states) earlier, if they used "numbersense" rather than mathematical impossibility as the criterion.

***

Before ending, let's praise the unsung heroes - the data analysts who worked behind the scenes to make these interactive graphics possible.

The graphics require data feeds covering a broad scope, from real-time vote tallies to total votes cast, at both the county and state levels. While the focus is on the two leading candidates, any votes going to other candidates have to be tabulated, even if not displayed. The talking heads don't just want raw vote counts; in order to tell the story of the election, they need some understanding of how many votes are still to be counted, where those votes are coming from, what their partisan lean is, how likely the result is to deviate from past elections, and so on.

All those computations must be automated, but manually checked. The graphics software has to be reliable; the hosts can touch any part of the map to reveal details, and it's not possible to predict all of the user interactions in advance.

Most importantly, things go wrong unexpectedly on election night, so many data analysts were on standby, scrambling to fix issues such as the breakage of a data feed from some county in some state.