Unit 4: New Web Technologies: XML, XHTML, CSS
Lesson 1: XML
Reference: What is XML? by Norm Walsh
- Describe what XML is
- Describe what an XML document looks like
What is XML?
- eXtensible Markup Language
- a simplified version of SGML (Standard Generalized Markup Language)
- XML is really a meta-language for describing markup languages.
- provides a facility to define tags and the structural relationships between them
Why XML?
- XML is not a replacement for HTML
- XML was designed to describe data and to focus on what data is.
- XML was created so that richly structured documents could be used over the web
Why not HTML?
- HTML was designed to display data and to focus on how data looks.
- HTML is about displaying information; XML is about describing information.
Example
- Example XML document:
<?xml version="1.0"?>
<oldjoke>
<burns>Say <quote>goodnight</quote>, Gracie.</burns>
<allen><quote>Goodnight, Gracie.</quote></allen>
<applause/>
</oldjoke>
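To see how a program reads this structure, here is a minimal sketch that parses the joke document. It assumes Python and its standard xml.etree.ElementTree module, which the lesson itself does not mention; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The example document from above, written as one string.
OLDJOKE = (
    '<?xml version="1.0"?>'
    "<oldjoke>"
    "<burns>Say <quote>goodnight</quote>, Gracie.</burns>"
    "<allen><quote>Goodnight, Gracie.</quote></allen>"
    "<applause/>"
    "</oldjoke>"
)

root = ET.fromstring(OLDJOKE)

# The root element and the names of its direct children.
child_tags = [child.tag for child in root]

# itertext() joins the text of <burns> with the text of its nested <quote>.
burns_text = "".join(root.find("burns").itertext())

print(root.tag)     # oldjoke
print(child_tags)   # ['burns', 'allen', 'applause']
print(burns_text)   # Say goodnight, Gracie.
```

The tag names carry the meaning (who speaks, what is quoted); nothing in the document says how it should be displayed.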
Components of XML
- XML documents are composed of markup and content.
- There are six kinds of markup:
- elements
- entity references
- comments
- processing instructions
- marked sections (not covered)
- document type declaration (DTD)
Elements
- Most common form of markup
- Delimited by angle brackets
- Most elements identify the nature of the content they surround
- Begins with a start-tag, <element>, and ends with an end-tag, </element>
- Some elements may be empty, <element></element> or <element/> for short
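The two ways of writing an empty element should be indistinguishable to a parser. A small sketch, assuming Python's standard xml.etree.ElementTree module (the helper is_empty is a name made up for this illustration):

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<element></element>")
short_form = ET.fromstring("<element/>")

def is_empty(elem):
    """True when the element has no child elements and no text content."""
    return len(elem) == 0 and not elem.text

# Both spellings produce an element with the same name and no content.
same_name = long_form.tag == short_form.tag
both_empty = is_empty(long_form) and is_empty(short_form)

print(same_name, both_empty)  # True True
```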
Attributes
- Attributes are name-value pairs that occur inside start-tags after the element name
- All attribute values must be quoted
- For example:
<div class="preface">
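As a rough illustration of the quoting rule, the sketch below reads the attribute from the example start-tag and then tries the same tag with the quotes removed. It assumes Python's standard xml.etree.ElementTree module; the element text "Some text" is added only so the fragment is a complete element.

```python
import xml.etree.ElementTree as ET

# Attributes are exposed as name-value pairs on the parsed element.
div = ET.fromstring('<div class="preface">Some text</div>')
attributes = dict(div.attrib)
class_value = div.get("class")

# The same start-tag without quotes is rejected by the parser.
try:
    ET.fromstring("<div class=preface>Some text</div>")
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(attributes)         # {'class': 'preface'}
print(class_value)        # preface
print(unquoted_accepted)  # False
```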
Entity References
- Some characters have been reserved to identify the start of markup
- For example, the left angle bracket, <
- Every entity must have a unique name
- For example, the left angle bracket is &lt;, the ampersand is &amp;
- You can define your own entities
- A special form, called a character reference, can be used to insert arbitrary Unicode characters
- For example, the decimal reference &#8478; and the hexadecimal reference &#x211E; both insert the character ℞
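The sketch below shows entity and character references being resolved by a parser, and the reverse step of escaping reserved characters before writing XML. It assumes Python's standard xml.etree.ElementTree and xml.sax.saxutils modules; the <note> element and the sample strings are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Entity references (&lt; &amp; &gt;) and character references (&#8478; &#x211E;)
# are replaced by the characters they stand for when the document is parsed.
note = ET.fromstring(
    "<note>1 &lt; 2 &amp; 3 &gt; 2, take &#8478; or &#x211E;</note>"
)
resolved = note.text

# Going the other way: escape reserved characters before placing text in XML.
escaped = escape("AT&T <rocks>")

print(resolved)  # 1 < 2 & 3 > 2, take ℞ or ℞
print(escaped)   # AT&amp;T &lt;rocks&gt;
```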
Comments
- Comments begin with <!-- and end with -->
- Comments can contain any data except the literal string --
- For example:
<!-- This is a comment -->
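A short sketch of how a parser treats comments, assuming Python's standard xml.etree.ElementTree module (whose default parser discards comments). The <para> element and the helper parses are made up for this example.

```python
import xml.etree.ElementTree as ET

# The comment is skipped; only the character data is kept.
para = ET.fromstring("<para><!-- This is a comment -->Visible text</para>")
visible = para.text
child_count = len(para)

def parses(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# A double hyphen inside a comment is not allowed.
double_hyphen_ok = parses("<para><!-- bad -- comment --></para>")

print(visible)           # Visible text
print(child_count)       # 0
print(double_hyphen_ok)  # False
```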
Processing Instructions
- Provide information to an application
- Not textually part of the XML document but the XML processor is required to pass them to an application
- Have the form: <?name pidata?>
- Applications should process only the targets they recognize and ignore all others
- Names beginning with xml are reserved for XML standardization
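One way an application might pick out only the processing instructions it recognizes is sketched below. It assumes Python's standard xml.dom.minidom module, which keeps processing instructions as nodes; the targets render-hint and other-app are hypothetical names invented for this example.

```python
from xml.dom import Node, minidom

# Hypothetical targets "render-hint" and "other-app", invented for this sketch.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?render-hint mode="compact"?>'
    "<?other-app ignore me?>"
    "<doc/>"
)

# Targets this imaginary application knows how to handle.
KNOWN_TARGETS = {"render-hint"}

doc = minidom.parseString(SOURCE)

# Collect (target, data) pairs for every processing instruction the parser passed on.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

# Act only on recognized targets and silently ignore the rest.
handled = [(target, data) for target, data in instructions if target in KNOWN_TARGETS]

print(instructions)  # [('render-hint', 'mode="compact"'), ('other-app', 'ignore me')]
print(handled)       # [('render-hint', 'mode="compact"')]
```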
Document Type Declaration (DTD)
- Defines which elements, attributes, and entities are allowed and where
- Used to identify an XML document as an instance of a particular document type
Well Formed Documents
- All syntax is correct (for example, every start-tag has a matching end-tag and elements are properly nested)
- See article for full set of rules
- All XML documents must be well formed (no sloppy coding like that allowed in HTML)
Example
An example of a well formed XML Document:
<?xml version="1.0"?>
<CDs>
<CD>
<artist>Bob Dylan</artist>
<title>Blonde On Blonde</title>
<release_date>1966</release_date>
</CD>
<CD>
<artist>The Doors</artist>
<title>L.A. Woman</title>
<release_date>1971</release_date>
</CD>
</CDs>
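Because the document is well formed, a parser can load it and a program can pull out the data directly. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

CDS = """<?xml version="1.0"?>
<CDs>
<CD>
<artist>Bob Dylan</artist>
<title>Blonde On Blonde</title>
<release_date>1966</release_date>
</CD>
<CD>
<artist>The Doors</artist>
<title>L.A. Woman</title>
<release_date>1971</release_date>
</CD>
</CDs>"""

root = ET.fromstring(CDS)

# One (artist, title, year) tuple per <CD> element.
albums = [
    (cd.findtext("artist"), cd.findtext("title"), int(cd.findtext("release_date")))
    for cd in root.findall("CD")
]

for artist, title, year in albums:
    print(f"{artist}: {title} ({year})")
```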
Example
An example of an XML Document that is not well formed:
<?xml version="1.0"?>
<cds>
<CD>
<artist>Bob Dylan
<title>Blonde On Blonde</artist></title>
<release_date>1966</release_date>
</cd>
<cd>
<artist>The Doors</artist>
<title>L.A. Woman</title>
<release_date>1971</release_date>
</CD>
</CDs>
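This document breaks the rules in several places: <artist> and <title> overlap instead of nesting, and the tag names differ in case (<cds> vs </CDs>, <CD> vs </cd>), which matters because XML names are case-sensitive. A parser should refuse it outright. The sketch below checks this, assuming Python's standard xml.etree.ElementTree module; is_well_formed is a helper name made up for the example.

```python
import xml.etree.ElementTree as ET

BROKEN = """<?xml version="1.0"?>
<cds>
<CD>
<artist>Bob Dylan
<title>Blonde On Blonde</artist></title>
<release_date>1966</release_date>
</cd>
<cd>
<artist>The Doors</artist>
<title>L.A. Woman</title>
<release_date>1971</release_date>
</CD>
</CDs>"""

def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

broken_result = is_well_formed(BROKEN)
fixed_result = is_well_formed("<CDs><CD><artist>Bob Dylan</artist></CD></CDs>")

print(broken_result)  # False
print(fixed_result)   # True
```

Unlike a browser reading sloppy HTML, the parser does not try to guess what was meant; it stops at the first error.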
Valid Documents
- Documents that validate against a document type declaration (DTD)
- XML documents are not required to validate, so DTDs are optional