Error spotting
Feb 01, 2007
My friend Augustine pointed me to this interesting graph showing the time of sunset over the course of a year. (The original author's write-up is here.)
Of course, one can produce a perfect chart by looking up meteorological records. The main interest in this graph is how it was constructed. Each cell in the graph represents an hour of a day, with days running across and time running down. The cells that are not dark each contain a photograph of the sunset contributed to Flickr, the photo-sharing site. So this is in effect a graph created through mass collaboration (about 35,000 photos).
The "white" band roughly indicates the sunset. What intrigues me is the variability... what are the reasons for lighted cells appearing all over the graph?
Some ideas include:
- Different time zones
- Incorrect time setting by some photographers
- Erroneous tagging of photos as "sunset"
The photo shows a number of things:
It proves that the vast majority of the photos uploaded originate from users located in the northern hemisphere. There is no significant difference between sunset times on a given day for people living at the same latitude. If the providers of photos were evenly distributed across the latitudes, the sunset 'band' would be straight, not curved.
Although I tried, I can't spot a sunrise pattern as well. I think this is good evidence that most of the tagging of the photos is actually OK. Think about it: why would you tag a photo "sunset" if it wasn't actually a sunset?
I think it is more likely that a photo is incorrectly timed than incorrectly tagged.
Jens
Posted by: Jens | Feb 01, 2007 at 04:27 AM
In the comments, the original Flickr-composite creator ('jbum') suggests two reasons for the possible errors:
Posted by: son1 | Feb 01, 2007 at 09:16 AM
People take a lot of photographs of things like sunsets on holiday, but they don't often change the time on their cameras.
Posted by: Tom Carden | Feb 01, 2007 at 02:50 PM
This is a pretty amazing chart, in large part because of the way it was made. I'd add to Jens' comment the fact that the photographers represented here were working mostly at the mid latitudes in the northern hemisphere (i.e. relatively few photographers in the equatorial regions or significantly northern latitudes -- or as Jens mentioned, in the southern hemisphere). If we had many photographers in, say, the northern latitudes (like 50 or 60 degrees north) we'd also see the same curved sinusoidal shape but a more extreme version than this one.
Some part of the width of this must come from observations from slightly different latitudes, and some from differences in longitude within a time zone. A guess at how much error exists from these (and other) sources probably could be made by looking at the point on the curve at the two equinoxes (~ March 21, ~ Sep 21) when the width of overlapped latitude graphs should be almost zero -- essentially a point. I can (maybe) see a little narrowing there, but not much. But this really is an interesting plot.
Posted by: horbrastar | Feb 01, 2007 at 10:13 PM
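The latitude argument in the comment above can be checked with a rough calculation. The sketch below is not how the chart was made; it uses a textbook approximation for solar declination and the sunset hour angle, ignores the equation of time, atmospheric refraction, time zones and daylight saving, and reports local solar time. The function names (`sunset_hour`, `seasonal_swing`, `spread_across_latitudes`) are made up for illustration.

```python
import math

def sunset_hour(latitude_deg, day_of_year):
    """Approximate sunset in local solar time (hours after midnight).

    Assumption: simple cosine model for solar declination, no refraction,
    no equation of time. Good enough to see the shape of the curve.
    """
    # Solar declination: about -23.44 degrees near the December solstice.
    decl = math.radians(-23.44) * math.cos(2 * math.pi * (day_of_year + 10) / 365)
    lat = math.radians(latitude_deg)
    # Sunset hour angle: cos(h) = -tan(latitude) * tan(declination).
    cos_h = -math.tan(lat) * math.tan(decl)
    cos_h = max(-1.0, min(1.0, cos_h))  # clamp for polar day / polar night
    hour_angle_deg = math.degrees(math.acos(cos_h))
    # The sun moves 15 degrees of hour angle per hour; solar noon is 12.
    return 12 + hour_angle_deg / 15

def seasonal_swing(latitude_deg):
    """Difference in hours between the latest and earliest sunset of the year."""
    times = [sunset_hour(latitude_deg, d) for d in range(1, 366)]
    return max(times) - min(times)

def spread_across_latitudes(latitudes, day_of_year):
    """Width in hours of the sunset band on one day for a mix of latitudes."""
    times = [sunset_hour(lat, day_of_year) for lat in latitudes]
    return max(times) - min(times)

if __name__ == "__main__":
    for lat in (0, 40, 60):
        print(f"latitude {lat:2d}: seasonal swing {seasonal_swing(lat):.2f} h")
    mix = [30, 40, 50, 60]
    print(f"band width near March equinox: {spread_across_latitudes(mix, 80):.2f} h")
    print(f"band width near June solstice: {spread_across_latitudes(mix, 172):.2f} h")
```

Under these assumptions the seasonal swing grows with latitude, and the band for a mix of northern latitudes narrows to a few minutes near the equinox but spans more than two hours near the June solstice. Any width left at the equinoxes in the real chart would therefore have to come from other sources, such as longitude within a time zone or wrongly set camera clocks.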
Is it my imagination or is there a shadow band that might correspond to sunrise in the upper part of the chart? If so this raises the credibility of the mislabelling theory or perhaps a misassociation if the images are pulled out by a machine. If there is a label with sunset nearby in some sense then other pictures might be pulled out.
Posted by: am | Feb 04, 2007 at 12:03 PM
Also, where a person lives in relationship to the time zone boundaries will account for some of the noise.
Posted by: John Johnson | Feb 04, 2007 at 02:13 PM
am, I perceived the shadow band as well. It looks like it is 12 hours off the main band, which could be explained by a significant number of people having their cameras' clocks set 12 hours off--confusing the am and the pm.
Posted by: SamW | Feb 07, 2007 at 03:30 PM