
Audio bookmarks

I look at a fair number of online videos, especially those embedded on blogs. But I haven't seen this feature implemented broadly. It is a wow feature.

Look at the dots above the progress bar: they tell you what topic is being discussed and allow you to jump back and forth between segments. (The particular dot I moused over said "Randy Moss".) The video I saw came from this link.


This simple-looking feature is immensely useful to users. You can efficiently search through the audio file and find the segments you're interested in. It's like bookmarks students might put on pages of a textbook for easy reference, except these are audio bookmarks.

Why isn't this feature more prevalent? I think it's because of the amount of manual effort needed to set this up. Imagine how the data has to be processed. In the digital age, the audio file is a bunch of bits (ones and zeroes), so neither a computer nor a human can identify topics just by looking at data stored in that way. So someone would need to listen to the audio file, mark off the segments manually, and tag them. Then the audio bookmarks can be plotted on the progress bar: basically a dot plot with time on the horizontal axis.
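To make the "dot plot with time on the horizontal axis" idea concrete, here is a minimal Python sketch. It is not how the video player in question works (I have no idea how they built it); it only assumes that someone has already tagged each segment with a start time and a label. The `Bookmark` class, the function names, and the timestamps are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    start_seconds: float  # where the segment begins (hand-tagged, hypothetical)
    label: str            # topic shown when the dot is moused over


def dot_positions(bookmarks, duration_seconds, bar_width=60):
    """Map each bookmark's start time to a column on a bar of bar_width characters."""
    positions = []
    for b in sorted(bookmarks, key=lambda b: b.start_seconds):
        # Scale time linearly onto the bar; clamp so the last column stays in range.
        column = min(int(b.start_seconds * bar_width / duration_seconds), bar_width - 1)
        positions.append((column, b.label))
    return positions


def render_bar(bookmarks, duration_seconds, bar_width=60):
    """Draw the progress bar as text, with an 'o' wherever a segment starts."""
    bar = ["-"] * bar_width
    for column, _ in dot_positions(bookmarks, duration_seconds, bar_width):
        bar[column] = "o"
    return "".join(bar)


def segment_at(bookmarks, t):
    """Return the label of the segment playing at time t, or None if before the first one."""
    current = None
    for b in sorted(bookmarks, key=lambda b: b.start_seconds):
        if b.start_seconds <= t:
            current = b.label
        else:
            break
    return current


if __name__ == "__main__":
    # Invented timestamps for a five-minute clip.
    marks = [Bookmark(0, "Intro"), Bookmark(95, "Randy Moss"), Bookmark(240, "Wrap-up")]
    print(render_bar(marks, duration_seconds=300, bar_width=30))
    print(segment_at(marks, 120))
```

Once the tagging is done, the plotting is the easy part: a linear scale from seconds to pixels (here, characters) and a lookup for the tooltip. The expensive step is still producing the list of bookmarks in the first place.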

In theory, you can train a computer to listen to an audio file and approximate this task. The challenge is to attain the required accuracy so you don't need to hire an army of people to correct mistakes.

A very simple concept but immensely functional. Great job!



dan l

I agree, this sort of thing would be great for, say, radio, especially if I were interested in skipping the parts that involve, say, Peter King, as in the example above.

FWIW, I think many radio shows 'chunk' their audio down so that topics can be discerned easily. But obviously, this is like sending me a book in 10 different Word docs.

Alan B

This is more than just convenient, it is a first attempt at allowing audiovisual content to be experienced in a way that is not completely linear. Having at-a-glance access to the full, top-down structure of the document is invaluable to me, and this is almost never available for a/v content. I look forward to seeing this feature implemented widely.


This is super useful. The audio sharing service SoundCloud has a similar display, but with comments on a track. Sometimes it's neat to read it as a heatmap showing where many people have commented on a particular passage.


yeah, the accuracy of automatic speech recognition is pretty awful, so this almost always requires at least human-guided quality control of a transcript. usually more efficient just to have human-guided transcription. it's expensive!


converting audio speech into a text transcript (using proper software, of course) could ease the whole process of bookmarking
