Working with XML
The Java API for XML Parsing (JAXP) Tutorial
by Eric Armstrong
[Version 1.1, Update 31 -- 21 Aug 2001]
This tutorial covers the following topics:
Part I: Understanding XML and the Java XML APIs explains the basics of XML
and gives you a guide to the acronyms associated with it. It also provides an overview
of the Java™ XML APIs you can use to manipulate XML-based data, including the Java
API for XML Parsing (JAXP). To focus on XML with a minimum of programming,
follow The XML Thread, below.
Part II: Serial Access with the Simple API for XML (SAX) tells you how to read
an XML file sequentially, and walks you through the callbacks the parser makes to
event-handling methods you supply.
Part III: XML and the Document Object Model (DOM) explains the structure of
DOM, shows how to use it in a JTree, and shows how to create a hierarchy of objects
from an XML document so you can randomly access it and modify its contents. This is
also the API you use to write an XML file after creating a tree of objects in memory.
Part IV: Using XSLT shows how the XSL transformation package can be used to
write out a DOM as XML, convert arbitrary data to XML by creating a SAX parser,
and convert XML data into a different format.
Additional Information contains a description of the character encoding schemes
used in the Java platform and pointers to any other information that is relevant to, but
outside the scope of, this tutorial.
http://java.sun.com/xml/jaxp-1.1/docs/tutorial/index.html (1 of 2) [8/22/2001 12:51:28 PM]
The XML Thread
Scattered throughout the tutorial there are a number of sections devoted more to explaining
the basics of XML than to programming exercises. They are listed here so as to form an
XML thread you can follow without covering the entire programming tutorial:
A Quick Introduction to XML
Writing a Simple XML File
Substituting and Inserting Text
Defining a Document Type
Defining Attributes and Entities
Referencing Binary Entities
Defining Parameter Entities
Designing an XML Document
Part I. Understanding XML and the Java XML APIs
This section describes the Extensible Markup Language (XML), its related specifications,
and the APIs for manipulating XML files. It contains the following files:
What You'll Learn
This section of the tutorial covers the following topics:
1. A Quick Introduction to XML shows you how an XML file is structured and gives you some
ideas about how to use XML.
2. XML and Related Specs: Digesting the Alphabet Soup helps you wade through the acronyms
surrounding the XML standard.
3. An Overview of the APIs gives you a high-level view of the JAXP and associated APIs.
4. Designing an XML Data Structure gives you design tips you can use when setting up an XML
data structure.
1. A Quick Introduction to XML
This page covers the basics of XML. The goal is to give you just enough information to get
started, so you understand what XML is all about. (You'll learn more about XML in later
sections of the tutorial.) We then outline the major features that make XML great for
information storage and interchange, and give you a general idea of how XML can be used.
This section of the tutorial covers:
What Is XML?
Why Is XML Important?
How Can You Use XML?
What Is XML?
XML is a text-based markup language that is fast becoming the standard for data
interchange on the Web. As with HTML, you identify data using tags (identifiers enclosed
in angle brackets, like this: <...>). Collectively, the tags are known as "markup".
But unlike HTML, XML tags identify the data, rather than specifying how to display it.
Where an HTML tag says something like "display this data in bold font" (<b>...</b>), an
XML tag acts like a field name in your program. It puts a label on a piece of data that
identifies it (for example: <message>...</message>).
Note:
Since identifying the data gives you some sense of what it means (how to
interpret it, what you should do with it), XML is sometimes described as a
mechanism for specifying the semantics (meaning) of the data.
In the same way that you define the field names for a data structure, you are free to use any
XML tags that make sense for a given application. Naturally, though, for multiple
applications to use the same XML data, they have to agree on the tag names they intend to
use.
Here is an example of some XML data you might use for a messaging application:
<message>
  <to>you@yourAddress.com</to>
  <from>me@myAddress.com</from>
  <subject>XML Is Really Cool</subject>
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>
Note: Throughout this tutorial, we use boldface text to highlight things we
want to bring to your attention. XML does not require anything to be in
bold!
The tags in this example identify the message as a whole, the destination and sender
addresses, the subject, and the text of the message. As in HTML, the <to> tag has a
matching end tag: </to>. The data between the tag and its matching end tag defines
an element of the XML data. Note, too, that the content of the <to> tag is entirely
contained within the scope of the <message>..</message> tag. It is this ability for
one tag to contain others that gives XML its ability to represent hierarchical data structures.
Once again, as with HTML, whitespace is essentially irrelevant, so you can format the data
for readability and yet still process it easily with a program. Unlike HTML, however, in
XML you could easily search a data set for messages containing "cool" in the subject,
because the XML tags identify the content of the data, rather than specifying its
representation.
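As a rough illustration of that kind of search, the following Java sketch uses the JAXP DOM parser to check whether the subject of a message contains a given word. The class and method names (SubjectSearch, subjectContains) are hypothetical, and the element names are assumed to match the messaging example above. The comparison ignores case so that "cool" matches "Cool".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative, not part of the tutorial.
class SubjectSearch {

    // Returns true if any <subject> element contains the word, ignoring case.
    static boolean subjectContains(String xml, String word) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // The tag name tells us which piece of data is the subject.
            NodeList subjects = doc.getElementsByTagName("subject");
            for (int i = 0; i < subjects.getLength(); i++) {
                String text = subjects.item(i).getTextContent();
                if (text.toLowerCase().contains(word.toLowerCase())) {
                    return true;
                }
            }
            return false;
        } catch (Exception e) {
            // Parsing failed, for example because the input is not well-formed.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<message>"
            + "<to>you@yourAddress.com</to>"
            + "<from>me@myAddress.com</from>"
            + "<subject>XML Is Really Cool</subject>"
            + "<text>How many ways is XML cool? Let me count the ways...</text>"
            + "</message>";
        System.out.println(subjectContains(xml, "cool"));
    }
}
```

Because the search is restricted to the subject element, a message that mentions "cool" only in its text would not match.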
Tags and Attributes
Tags can also contain attributes -- additional information included as part of the tag itself,
within the tag's angle brackets. The following example shows an email message structure
that uses attributes for the "to", "from", and "subject" fields:
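<message to="you@yourAddress.com" from="me@myAddress.com"
         subject="XML Is Really Cool">
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>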
As in HTML, the attribute name is followed by an equal sign and the attribute value, and
multiple attributes are separated by spaces. Unlike HTML, however, in XML commas
between attributes are not ignored -- if present, they generate an error.
Since you could design a data structure like <message> equally well using either
attributes or tags, it can take a considerable amount of thought to figure out which design
is best for your purposes. The last part of this tutorial, Designing an XML Data Structure,
includes ideas to help you decide when to use attributes and when to use tags.
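To make the attribute form concrete, here is a minimal Java sketch that reads attribute values from the root element of such a message with the JAXP DOM parser. The class and method names (MessageAttributes, attribute) are hypothetical, and the attribute names are assumed to be the "to", "from", and "subject" fields described above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative, not part of the tutorial.
class MessageAttributes {

    // Returns the named attribute of the root element,
    // or an empty string if the attribute is absent.
    static String attribute(String xml, String name) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return root.getAttribute(name);
        } catch (Exception e) {
            // Parsing failed, for example because of a comma between attributes.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<message to=\"you@yourAddress.com\" from=\"me@myAddress.com\""
            + " subject=\"XML Is Really Cool\">"
            + "<text>How many ways is XML cool? Let me count the ways...</text>"
            + "</message>";
        System.out.println(attribute(xml, "to"));
        System.out.println(attribute(xml, "subject"));
    }
}
```

With the element-based design, the same values would instead be read as the text content of child elements, which is one of the trade-offs to weigh when choosing between the two.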
Empty Tags
One really big difference between XML and HTML is that an XML document is always
constrained to be well formed. There are several rules that determine when a document is
well-formed, but one of the most important is that every tag has a closing tag. So, in XML,
the </to> tag is not optional. The <to> element is never terminated by any tag other
than </to>.
Note: Another important aspect of a well-formed document is that all tags
are completely nested. So you can have
<message>..<to>..</to>..</message>, but never
<message>..<to>..</message>..</to>. A complete list of
requirements is contained in the list of XML Frequently Asked Questions
(FAQ) at http://www.ucc.ie/xml/#FAQ-VALIDWF. (This FAQ is
on the W3C "Recommended Reading" list at
http://www.w3.org/XML/.)
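These rules can be checked mechanically, because a conforming parser must reject a document that is not well-formed. The following Java sketch, under the assumption that a JAXP DOM parser is available, reports whether a string parses cleanly. The class and method names (WellFormedCheck, isWellFormed) are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; the names are illustrative, not part of the tutorial.
class WellFormedCheck {

    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser found a well-formedness violation.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Properly nested tags.
        System.out.println(isWellFormed("<message><to>a</to></message>"));
        // Overlapping tags.
        System.out.println(isWellFormed("<message><to>a</message></to>"));
        // Missing end tag.
        System.out.println(isWellFormed("<message><to>a</message>"));
        // Comma between attributes.
        System.out.println(isWellFormed("<message to=\"a\", from=\"b\"/>"));
    }
}
```

Only the first input is accepted. The other three break the rules described in this section: overlapping tags, a missing end tag, and a comma between attributes.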
Sometimes, though, it makes sense to have a tag that stands by itself. For example, you
might want to add a "flag" tag that marks a message as important. A tag like that doesn't
enclose any content, so it's known as an "empty tag". You can create an empty tag by
ending it with /> instead of >. For example, the following message contains such a tag:
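<message to="you@yourAddress.com" from="me@myAddress.com"
         subject="XML Is Really Cool">
  <flag/>
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>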
Note: The empty tag saves you from having to code <flag></flag> in order to have a
well-formed document. You can control which tags are allowed to be empty by creating a
Document Type Definition, or DTD. We'll talk about that in a few moments. If there is no
DTD, then the document can contain any kinds of tags you want, as long as the document
is well-formed.
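To a parser, the short form and the explicit start-tag/end-tag pair describe the same thing: an element with no content. The following Java sketch illustrates this with the JAXP DOM parser, assuming a "flag" element as in the example above. The class and method names (EmptyTagCheck, isEmptyElement) are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative, not part of the tutorial.
class EmptyTagCheck {

    // Returns true if the first element with the given name exists
    // and has no child nodes (no text and no nested elements).
    static boolean isEmptyElement(String xml, String tagName) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList matches = doc.getElementsByTagName(tagName);
            return matches.getLength() > 0 && !matches.item(0).hasChildNodes();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String shortForm = "<message><flag/><text>Hello</text></message>";
        String longForm = "<message><flag></flag><text>Hello</text></message>";
        // Both forms produce an empty <flag> element.
        System.out.println(isEmptyElement(shortForm, "flag"));
        System.out.println(isEmptyElement(longForm, "flag"));
        // The <text> element has content, so it is not empty.
        System.out.println(isEmptyElement(shortForm, "text"));
    }
}
```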
Comments in XML Files
XML comments look just like HTML comments:
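<message to="you@yourAddress.com" from="me@myAddress.com"
         subject="XML Is Really Cool">
  <!-- This is a comment -->
  <text>
    How many ways is XML cool? Let me count the ways...
  </text>
</message>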
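Comments are not part of the document's data, but a DOM parser can still expose them to a program. As a small sketch, assuming the default JAXP DOM configuration (which keeps comments), the following Java code collects the text of every comment in a document. The class and method names (CommentLister, comments, collect) are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; the names are illustrative, not part of the tutorial.
class CommentLister {

    // Returns the trimmed text of every comment, in document order.
    static List<String> comments(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            List<String> found = new ArrayList<>();
            collect(doc, found);
            return found;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walks the tree depth-first and records each comment node.
    private static void collect(Node node, List<String> found) {
        if (node.getNodeType() == Node.COMMENT_NODE) {
            found.add(node.getNodeValue().trim());
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collect(child, found);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<message><!-- This is a comment -->"
            + "<text>How many ways is XML cool?</text></message>";
        System.out.println(comments(xml));
    }
}
```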