Why XML?

This short article explains what XML is and why the UIS Informatics Commission has set up an XML Working Group to develop an XML-based cave survey language.

What's XML?

XML stands for eXtensible Markup Language and is an initiative of the World Wide Web Consortium (the W3C). Most of you will know a little HTML, and perhaps a lot if you have designed Web pages. XML is the successor to HTML and will slowly replace much of the HTML that currently dominates the Web. Microsoft, Netscape, many other companies and open source developers are creating the next generation of browsers with XML capability.

However, in addition to being the next WWW language, XML will be the basic data format for many of the documents that you will create and read. XML is already embedded in HTML documents created by Microsoft Word, and much Web content is generated from XML and converted to HTML before it is sent to your browser.

Why? Because XML structures data. Instead of being a flat repository of information, often in a proprietary format, the data can be a richly structured hierarchy; as information, the data becomes more useful. To see how this comes about, look at the small example in the section What does XML look like? below.

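In summary, these are some of the advantages of XML:

- it can be used as an exchange format to enable users to move their data between similar applications
- it provides a structure to data so that it is richer in information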

Who is developing XML and what are they developing?

Here are two technical examples: Chemical Markup Language for describing chemical structures and Mathematical Markup Language for displaying complex mathematical expressions and formulae. There are already several hundred organisations, companies and individuals developing XML for a wide range of fields, from geography to economics. A short list can be found here. And of course the site you are at now is the home page of the UISIC XML Working Group, which is developing Cave Survey Markup Language, or CaveXML, for describing cave survey data.

What does XML look like?

HTML uses tags to describe how information should appear. Your Web browser reads these tags and displays the information on your screen. For example, the list of XML advantages shown above was expressed in HTML as:

<UL>
<LI>it can be used as an exchange format to enable users 
    to move their data between similar applications</LI>
<LI>it provides a structure to data so that it is 
    richer in information</LI>
</UL>

The HTML tags for Unordered List (UL) and List Item (LI) are fixed and set by the W3C. XML, however, allows developers to define their own set of tags to describe documents in a particular field such as cave surveying. Here is an example of what a short XML survey file might look like. The actual tag names and structures are still being developed and are likely to change.

<CAVESURVEY>
<INSTRUMENTS>
    <TAPE    id="#3" units="metres" description="Old fibreglass one" />
    <COMPASS id="#1" units="degrees" zero_correction="+0.5" />
</INSTRUMENTS>
<SURVEY>
    <STATION name="10" description="big stalagmite" />
    <STATION name="11" />
    <SHOT from="10" to="11" dist="3.6" azim="225" elev="55" />
    <SHOT from="11" to="12" dist="7.3" azim="82"  elev="43" />
</SURVEY>
</CAVESURVEY>

Notice that the data is structured. There is information content within this structure, and there are several XML tools which can process XML documents and make queries on this information, such as "list all surveys where a compass correction was required", i.e. where the zero_correction was not "0.0".
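
As a rough illustration of such a query, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree parser. It works on a copy of the sample above and reports which compasses needed a correction. The function name corrected_compasses is made up for this example, and treating a missing zero_correction attribute as "0.0" is an assumption; neither is part of any CaveXML specification.

```python
import xml.etree.ElementTree as ET

# A copy of the sample survey file, using the draft tag names shown above.
SAMPLE = """\
<CAVESURVEY>
<INSTRUMENTS>
    <TAPE    id="#3" units="metres" description="Old fibreglass one" />
    <COMPASS id="#1" units="degrees" zero_correction="+0.5" />
</INSTRUMENTS>
<SURVEY>
    <STATION name="10" description="big stalagmite" />
    <STATION name="11" />
    <SHOT from="10" to="11" dist="3.6" azim="225" elev="55" />
    <SHOT from="11" to="12" dist="7.3" azim="82"  elev="43" />
</SURVEY>
</CAVESURVEY>
"""


def corrected_compasses(xml_text):
    """Return the ids of COMPASS elements whose zero_correction is non-zero."""
    root = ET.fromstring(xml_text)
    ids = []
    for compass in root.iter("COMPASS"):
        # Assumption: a missing attribute means no correction was needed.
        correction = float(compass.get("zero_correction", "0.0"))
        if correction != 0.0:
            ids.append(compass.get("id"))
    return ids


if __name__ == "__main__":
    print(corrected_compasses(SAMPLE))  # prints ['#1']
```

Run over a directory of survey files, a non-empty result for a file would mark that survey as one where a compass correction was required.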

There is another half to the story though! With the eXtensible Stylesheet Language (XSL), the presentation of an XML document in a Web browser can be controlled. It is possible, for example, to hide particular elements or to change the sequence or appearance of elements in the display. Complex data transformations can also be performed.
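
To give a feel for what such a transformation does, the sketch below uses Python as a stand-in for an XSL stylesheet; it is not XSL itself. It hides the INSTRUMENTS element and presents only the SHOT elements as an HTML table. The function name shots_as_html_table and the column titles are invented for this example, and the tag names are the draft ones from the sample above.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample survey file.
SAMPLE = """\
<CAVESURVEY>
<INSTRUMENTS>
    <TAPE id="#3" units="metres" description="Old fibreglass one" />
</INSTRUMENTS>
<SURVEY>
    <SHOT from="10" to="11" dist="3.6" azim="225" elev="55" />
    <SHOT from="11" to="12" dist="7.3" azim="82"  elev="43" />
</SURVEY>
</CAVESURVEY>
"""

# Attribute name in the survey file, and the column title to display for it.
COLUMNS = [
    ("from", "From"),
    ("to", "To"),
    ("dist", "Distance"),
    ("azim", "Azimuth"),
    ("elev", "Elevation"),
]


def shots_as_html_table(xml_text):
    """Build an HTML table with one row per SHOT; other elements are left out."""
    root = ET.fromstring(xml_text)
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for _, title in COLUMNS:
        ET.SubElement(header, "th").text = title
    for shot in root.iter("SHOT"):
        row = ET.SubElement(table, "tr")
        for attr, _ in COLUMNS:
            ET.SubElement(row, "td").text = shot.get(attr, "")
    return ET.tostring(table, encoding="unicode", method="html")


if __name__ == "__main__":
    print(shots_as_html_table(SAMPLE))
```

A real XSL stylesheet would express the same selection and reordering declaratively, and a browser with XSL support could apply it directly to the XML file.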

I just survey caves; will it be useful to me?

A common exchange format
Do you reduce your cave survey data to close loops or plot coordinates? If so, you probably use one of the many cave survey software packages. They all use a different format for their data. XML has the potential to become a common exchange format, in much the same way that Rich Text Format is between word processors. XML, though, is open and non-proprietary. That means you can exchange your data with others who use different cave survey software.
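
As a small sketch of what exchange could look like in practice, the Python below reads shots from an XML survey file and writes them out as comma-separated lines, the sort of flat listing another program might import. The tag and attribute names are the draft ones from the sample earlier in this article, and the CSV layout and the function name shots_to_csv are assumptions made for this example, not the format of any particular survey package.

```python
import csv
import io
import xml.etree.ElementTree as ET

# A trimmed copy of the sample survey file.
SAMPLE = """\
<CAVESURVEY>
<SURVEY>
    <SHOT from="10" to="11" dist="3.6" azim="225" elev="55" />
    <SHOT from="11" to="12" dist="7.3" azim="82"  elev="43" />
</SURVEY>
</CAVESURVEY>
"""

# Attributes to export, in column order (a hypothetical layout).
FIELDS = ["from", "to", "dist", "azim", "elev"]


def shots_to_csv(xml_text):
    """Return the SHOT elements as CSV text, one line per shot."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(FIELDS)
    for shot in root.iter("SHOT"):
        writer.writerow([shot.get(name, "") for name in FIELDS])
    return buffer.getvalue()


if __name__ == "__main__":
    print(shots_to_csv(SAMPLE), end="")
```

Because every package would read and write the same XML, each one needs only a single converter like this rather than one for every other package's format.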

Many views of the one data
Because XML data is structured, XML applications and browsers will be able to display and process it in many different ways: from using different fonts for different sections of the data, to reorganising the data into views based on survey date or survey team, to hiding or showing pertinent data.


Michael Lake, 31st Jan 2001