Open standards for annotating and indexing networked media

Three core components enable the Continuous Media Web:

  • the Annodex encapsulation format, which interleaves time-continuous data with XML markup in a streamable manner.
  • CMML (the Continuous Media Markup Language), which is used to author the markup and the interleaving information from which Annodex streams are created.
  • a means to address clips of Annodex streams and time offsets into time-continuous data through temporal URI hyperlinks.
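To illustrate the third component, here is a minimal Python sketch that builds temporal URIs for a time offset or a named clip. The query-parameter forms used here (`?t=npt:...` for an offset in seconds and `?id=...` for a clip) are an assumption made for illustration, as are the function name `temporal_uri` and the example host; check the temporal URI specification for the exact syntax.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def temporal_uri(base, *, offset_seconds=None, clip_id=None):
    """Return `base` with a query addressing a time offset or a named clip.

    Assumed (not confirmed by this document) query syntax:
      ?t=npt:<seconds>  for a time offset
      ?id=<clip id>     for a named clip
    """
    if (offset_seconds is None) == (clip_id is None):
        raise ValueError("give exactly one of offset_seconds or clip_id")

    if offset_seconds is not None:
        # Keep the ':' in "npt:15" unescaped for readability.
        query = urlencode({"t": f"npt:{offset_seconds:g}"}, safe=":")
    else:
        query = urlencode({"id": clip_id})

    parts = urlsplit(base)
    # Replace any existing query and drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(temporal_uri("http://example.com/galaxies.anx", offset_seconds=15))
    print(temporal_uri("http://example.com/galaxies.anx", clip_id="findingGalaxies"))
```

Putting the offset in the query rather than the fragment means the server sees it and can return only the requested substream.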

Continuous Media Web

Annodex is the format in which media with interspersed XML markup is transferred over the wire. After a Web client issues a URI request for an Annodex resource, the Web server delivers the appropriate stream or substream in the requested composition. A Web crawler, for example, can request only the annotations (i.e. a CMML file), which gives it a bandwidth-friendly way of crawling media content.
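One way a crawler might ask for the annotations only is HTTP content negotiation. The sketch below builds, but does not send, such a request with Python's standard library. The media type `text/x-cmml`, the function name `annotations_request`, and the example URL are assumptions for illustration; a given server may expect a different type or mechanism.

```python
from urllib.request import Request

# Assumed media type for CMML documents; adjust to what the server advertises.
CMML_MEDIA_TYPE = "text/x-cmml"


def annotations_request(annodex_url):
    """Build a GET request that asks for the CMML annotations of a resource."""
    return Request(annodex_url, headers={"Accept": CMML_MEDIA_TYPE})


if __name__ == "__main__":
    req = annotations_request("http://example.com/galaxies.anx")
    print(req.get_method(), req.full_url)
    print("Accept:", req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the fetch; it is left out here so the example runs without network access.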

Annodex creation

Annodex files consist of markup interspersed throughout the media data at its time of relevance. Markup comes in two forms: header markup, which applies to the complete Annodex stream, and clip markup, which applies only to a section of the stream.
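The ordering idea can be sketched in a few lines of Python. This is a toy model of the interleaving only, not the actual Annodex bitstream: the `Packet` type, the `interleave` function, and the rule that clip markup precedes media with the same timestamp are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    time: float    # time of relevance, in seconds
    kind: str      # "head", "clip", or "media"
    payload: str


def interleave(head, clips, media):
    """Place header markup first, then clip markup and media in time order.

    At equal timestamps, clip markup is emitted before the media it describes.
    """
    body = sorted(
        list(clips) + list(media),
        key=lambda p: (p.time, p.kind != "clip"),
    )
    return [head] + body


if __name__ == "__main__":
    head = Packet(0, "head", "Hidden Galaxies")
    clips = [Packet(15, "clip", "findingGalaxies")]
    media = [Packet(0, "media", "frame 0"),
             Packet(15, "media", "frame 1"),
             Packet(30, "media", "frame 2")]
    for packet in interleave(head, clips, media):
        print(packet.time, packet.kind, packet.payload)
```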

Annodex example

An example Annodex file is given in the image below, zooming into the head and the first clip markup.

CMML example

An example CMML file is given below. It includes a description of the Annodex stream to be created and the markup to be included in that stream.

<cmml>
  <stream timebase="0">
    <import src="galaxies.mpg" contenttype="video/mpeg"/>
  </stream>
  <head>
    <title>Hidden Galaxies</title>
    <meta name="author" content="CSIRO"/>
  </head>
  <clip id="findingGalaxies" start="15">
    <a href="">
      Related video on detection of galaxies
    </a>
    <img src="galaxy.jpg"/>
    <desc>What's out there? ...</desc>
    <meta name="KEYWORDS" content="Radio Telescope"/>
  </clip>
</cmml>
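Because CMML is plain XML, a crawler could index clips with any standard XML parser. The Python sketch below assumes a well-formed document in which the clips sit under a single root element. The embedded sample is a shortened, hypothetical variant of the example above, and the function name `index_clips` and the shape of the returned records are illustrative choices.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed sample for illustration only.
CMML = """<cmml>
  <stream timebase="0">
    <import src="galaxies.mpg" contenttype="video/mpeg"/>
  </stream>
  <head>
    <title>Hidden Galaxies</title>
  </head>
  <clip id="findingGalaxies" start="15">
    <desc>What's out there? ...</desc>
    <meta name="KEYWORDS" content="Radio Telescope"/>
  </clip>
</cmml>"""


def index_clips(cmml_text):
    """Return one record per clip: id, start offset, description, keywords."""
    root = ET.fromstring(cmml_text)
    index = []
    for clip in root.iter("clip"):
        keywords = [
            meta.get("content")
            for meta in clip.findall("meta")
            if (meta.get("name") or "").lower() == "keywords"
        ]
        index.append({
            "id": clip.get("id"),
            "start": float(clip.get("start", "0")),
            "desc": (clip.findtext("desc") or "").strip(),
            "keywords": keywords,
        })
    return index


if __name__ == "__main__":
    for record in index_clips(CMML):
        print(record)
```

Each record carries the clip id and start time, which is the information a search engine would need to link straight to the clip.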

