Break it down, build it up
Oct 09, 2008
Thought of the day:
While commuting today, I wondered why we use the term "data analysis" or "data analyst". I recalled that in chemistry class, we learnt that analysis means breaking things down while synthesis means building things up.
With regard to data, typically we try to collect data at the most detailed level and we build up messages and stories from the little pieces. We don't break things down. We can't break things down, in fact, if the data come to us in aggregated form. (Think ecological fallacy.)
So why don't we say "data synthesis" rather than "data analysis"?
I want some of what you've been smoking.
Posted by: Andrew | Oct 09, 2008 at 12:52 AM
Probably similar to the reason we say freedom-fighter and fire-fighter.
Posted by: michael | Oct 09, 2008 at 02:27 AM
Because "synthesis" means "making things up" :-)
Posted by: derek | Oct 09, 2008 at 02:42 AM
This is actually a valid question.
Analyzing data means finding out about the structures in a dataset, or, in more statistical terms, separating signal from noise by applying models and distribution assumptions. This is usually a process where you start to take your dataset apart.
Once you have succeeded with this process, you could actually sample a synthetic dataset using this model assumption. Here you really build up again, which directly relates to synthesis.
More generally speaking: an analysis generates knowledge, a synthesis applies knowledge.
Posted by: Martin | Oct 09, 2008 at 03:53 AM
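Martin's round trip (take the data apart into a model, then build a dataset back up from it) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `observed` values are made up, the normal distribution is an assumed model, and the function names `analyze` and `synthesize` are hypothetical labels, not anything from the comment.

```python
import random
import statistics


def analyze(data):
    """Analysis: reduce the data to the parameters of an assumed normal model."""
    return statistics.mean(data), statistics.stdev(data)


def synthesize(mean, stdev, n, seed=0):
    """Synthesis: sample a new dataset from the fitted model."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Made-up measurements, purely for illustration.
observed = [4.8, 5.1, 5.5, 4.9, 5.2, 5.0, 4.7, 5.3]

mean, stdev = analyze(observed)               # break it down
synthetic = synthesize(mean, stdev, n=1000)   # build it up

print(f"fitted mean={mean:.3f}, stdev={stdev:.3f}")
print(f"synthetic mean={statistics.mean(synthetic):.3f}")
```

The synthetic values resemble the observed ones only as far as the assumed model is right, which is the point of the analysis step.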
Derek - That's exactly what I was thinking.
When I worked as a metallurgist, I was involved with terms like "forge", "fabricate", and so forth. When my group all got Word with its first rudimentary thesaurus, we were very amused that most of the synonyms related to creating false statements.
Analysis means developing understanding about a data set. When you start to apply the analysis to a broader system, you are synthesizing a model.
Posted by: Jon Peltier | Oct 09, 2008 at 07:55 AM
Martin:
I liked your comment so much that I visited your site. I am extremely impressed with your work.
Your books look great, and your papers are a gold mine. I particularly like your paper on Trellis Displays vs. Interactive Statistical Graphics. The fact that you wrote it in 1996, 12 years ago, is both impressive and disturbing to me.
We Excel chart users are way behind in charting technology, and falling behind at a faster rate.
Posted by: Kelly O'Day | Oct 09, 2008 at 09:50 AM
Well, I would say that we are analysts, as we have to drill down into the data and go as deep as needed to find the facts.
Going from the big picture to the small details is what makes an analyst.
Making a summary of it is only the end result, but the biggest part is always finding the details.
(If you don't know the details, you cannot explain the big picture.)
Posted by: Adrien Rochereau | Oct 10, 2008 at 09:53 AM
Although I typically try to get the full detailed data set, I usually start with the big picture, then ask myself how I would like to break it down, ignoring cuts that do not seem insightful. I guess I would call myself an analyst.
There are different types of people though: the Myers-Briggs "S" type (Sensor, who loves details first, then builds up) and the "N" type (Intuitive, big picture first, and stops breaking down when the goal is reached). I usually score N, but am not a slam dunk case. Background reading: http://en.wikipedia.org/wiki/Myers-Briggs_Type_Indicator
Posted by: Jan Schultink | Oct 10, 2008 at 01:39 PM
Maybe I had "data mining" in mind... the proverbial case of having a lot of data and trying to figure out what it all means. This concept is seeping into statistical modeling via variable selection, regularization, etc.
Posted by: Kaiser | Oct 13, 2008 at 01:09 AM
I've always thought that:
Analysis = "state of knowledge"
Synthesis = "state of wisdom"?
So to me, in "BI/BA" terms, even as I pull and correlate data I'm still "breaking it down" just in interesting ways. It isn't until I start performing what the industry calls "predictive analytics" that I can approach synthesis.
Posted by: Alex | Oct 21, 2008 at 08:05 AM
Data synthesis is usually referred to as "modeling". I'd say "data analysis" when I'd be analyzing models built from data - not all models are synthesized from data.
Posted by: Aleks | Oct 21, 2008 at 09:47 AM
I think it differs by occupation. I analyze demographic, survey, & test score data. When I speak of analyzing data, I'm not referring to a process that begins with aggregated data. My analyses begin with "raw" data -- one record per individual, or whatever the unit of analysis is. When I take aggregated data & put it into an understandable & useful form for a particular audience, I usually call that simply reporting or presenting the data. "Synthesizing" is not a word I use often.
Posted by: Georgia Sam | Oct 22, 2008 at 02:55 PM
Georgia Sam: that's why I raised this issue. For any of us doing "data mining", the data already exist in the most disaggregated form, and the "modeling" process actually synthesizes the data. There is no analysis in the sense of breaking things down.
Posted by: Kaiser | Oct 31, 2008 at 12:09 AM