When a Flickr user uploads a photo, she has the option of assigning one or more labels ("tags") to it. Flickr then produces a frequency count for each tag and plots the top 120 (?) tags, with the font size of each tag proportional to its frequency of use. Tagging is hailed as a massively distributed and participative method of classifying information, and I think it works brilliantly.
The data itself is nothing more than a frequency table ("wedding" 132,356; "party" 120,222; etc.) but this presentation is visually appealing and aptly functional. Compare with this typical histogram presentation:
- The Flickr version is ordered alphabetically, whereas the histogram is ordered by frequency. The tag cloud therefore serves both people who are looking for the most popular categories (the largest words) and those who are looking for a specific term (found by scanning alphabetically).
- Flickr uses a clean interface without excessive underlining, highlighting, dots and so on. No chartjunk! To see chartjunk, go here and here.
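To make the mechanics concrete, here is a minimal sketch of how such a tag cloud could be computed. It is not Flickr's actual code: the function name `build_tag_cloud`, the point sizes, and the legibility floor `min_pt` are my own assumptions, and the cutoff of 120 is only my guess from above.

```python
from collections import Counter


def build_tag_cloud(photo_tags, top_n=120, min_pt=10, max_pt=36):
    """Return (tag, count, font_size) tuples in alphabetical order.

    photo_tags: iterable of tag lists, one list per photo.
    top_n, min_pt and max_pt are illustrative defaults, not Flickr's values.
    """
    # Frequency table: how many times each tag was used.
    counts = Counter(tag for tags in photo_tags for tag in tags)

    # Keep only the most frequently used tags.
    top = counts.most_common(top_n)
    if not top:
        return []
    highest = top[0][1]

    cloud = []
    # Alphabetical order, so a reader can scan for a specific term.
    for tag, count in sorted(top):
        # Font size proportional to frequency, with a floor so that
        # rare tags stay legible (the floor is an assumption).
        size = max(min_pt, round(max_pt * count / highest))
        cloud.append((tag, count, size))
    return cloud
```

Calling `build_tag_cloud(photos, top_n=2)` on a handful of tagged photos returns the two most used tags in alphabetical order, each with its count and a font size scaled against the most popular tag.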
Here are some ideas for extension:
- Be flexible in selecting the underlying population of tags: clicking on "wedding" gives a list of all photos that were labelled "wedding", but because it is the most popular tag overall, this leads to too many results and too little relevance. Flickr already has little tag clouds for each user
- Be flexible with the metric being plotted: aside from frequency of use, the size of the words can vary with other measurements such as recency of use and frequency of clicks
- Introduce a hierarchy of tags: for example, clicking on "wedding" leads to another tag cloud so users can drill down. This can be implemented using a hierarchical clustering algorithm, for example
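The drill-down idea can be sketched without a full clustering algorithm. The version below is a simpler stand-in, not hierarchical clustering: it restricts the population to photos carrying the selected tag and counts the tags that co-occur with it. The name `drill_down` and its parameters are hypothetical.

```python
from collections import Counter


def drill_down(photo_tags, selected, top_n=120):
    """Return (tag, count) pairs for tags that co-occur with `selected`.

    photo_tags: iterable of tag lists, one list per photo.
    This is a plain co-occurrence filter, used here as a stand-in for
    a real hierarchical clustering step.
    """
    counts = Counter()
    for tags in photo_tags:
        unique = set(tags)  # count each tag at most once per photo
        if selected in unique:
            # Exclude the selected tag itself from the second-level cloud.
            counts.update(unique - {selected})
    return counts.most_common(top_n)
```

The resulting counts can be sized and laid out exactly like the top-level cloud, so clicking on "wedding" would show a smaller cloud of the tags most often used alongside it. Swapping the count for another measurement, such as recency of use, only changes what goes into the counter.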
P.S. The idea to write this post came to me while chatting with Scott Matthews, who has created an interesting browser add-on, found at www.bitty.com