XHTML Syntax


HTML was the first markup language I learned, and its elegant simplicity provided a shallow learning curve that allowed users with even the most basic concept of computing to produce visually appealing websites. As multimedia content began to flood the Internet, HTML co-evolved with different server- and client-side scripting and programming languages to meet the new demands for interactive and dynamic content. The segregation of structural and visual attributes into XHTML and CSS represents a tremendous, but natural, leap forward in the language's evolution. XHTML allows authors to focus on a logical document structure with quality content by passing the majority of the difficult visual settings off to the Style Sheets. This logical division of labour may also explain why many scientists favour the LaTeX markup language over WYSIWYG word processors: writers can worry about their writing instead of wrestling with the typesetting.

XHTML also makes HTML XML-compliant (hence the 'X'). XML is the formal markup metalanguage that represents a degree of improvement over its SGML ancestor comparable to that offered by XHTML over its HTML predecessor. Many of the tags have been carried over from HTML, but with two important constraints introduced by XML. The first is that all attribute values must be enclosed in "quotation marks". Whereas <td colspan=2> is a valid HTML tag, it must be <td colspan="2"> to be a valid XHTML tag. The second constraint is that every opened tag must be closed. Even tags without natural closing counterparts (e.g. <br>) must be closed by including the closing slash in the tag itself (e.g. <br />). Once you remember these two rules, transitioning from HTML to XHTML is simply a matter of learning to separate content from appearance.
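As a small sketch of the two rules side by side, the fragment below shows a table cell written the way HTML tolerates it (inside a comment) and then the XHTML equivalent. The cell text is a placeholder of my own.

```xml
<!-- HTML would accept: <td colspan=2>First line<br>Second line -->

<!-- XHTML: the attribute value is quoted and every element is closed -->
<td colspan="2">First line<br />Second line</td>
```

The space before the slash in <br /> is not required by XML; it is a common habit that keeps older HTML parsers from tripping over the tag.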

This article is a slowly evolving work in progress. It began with a simple list of text parameters, but I have been adding other information to this document as I find the time. I have added information on document specifications and anchor tags, and I am presently working on the table model. The syntax described here pertains to XHTML 1.1 and will also work with version 1.0 Strict [1]. Eventually, I imagine that it will serve as a slightly enhanced XHTML cheatsheet. I have elected to include only the more commonly used attributes, and have largely chosen to exclude event attributes in favour of quicker access to information that is more likely to be of use.


Document Specification

An XHTML document opens with its type specification. An XML declaration may optionally be entered before the DOCTYPE specification. This is not required if the character encoding is the standard UTF-8, and this XML declaration may cause the browser to choke on your document. If used, however, the declaration looks like so:

<?xml version="1.0" encoding="UTF-8"?>

Document Type Definition

The XHTML document usually begins with the Document Type Declaration, which identifies the Document Type Definition (now you see why the former is called DOCTYPE and not DTD) to be used to check the validity of the document. The XHTML DOCTYPE declaration tells the browser which DTD your document conforms to, and contains both a publicly recognized identifier and a specific URL as a backup. The DOCTYPE declaration for XHTML 1.1 is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

XML Namespace

Because an XHTML document is an XML document, the namespace must also be identified. This is done after the DOCTYPE declaration with the line:

<html xmlns="http://www.w3.org/1999/xhtml">

This points to the XHTML namespace at W3.
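Putting the three pieces together, a minimal XHTML 1.1 document might look like the sketch below. The title and paragraph text are placeholders of my own, and the XML declaration on the first line can be dropped if it causes trouble.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <!-- the DTD requires a title inside the head -->
    <title>Example page</title>
  </head>
  <body>
    <p>Hello, XHTML.</p>
  </body>
</html>
```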

Document Header

As with traditional HTML, the remainder of the document is divided into the Header and the Body. With the exception of the tag closings, the elements in the header haven't changed much from their HTML counterparts.

<base /> Specify the base pathname for all relative URLs in the document. You can also use this tag to specify a default URL target.
href="url"
Specify the URL to be used.
id="text"
A unique identifier for the element.
target="name|_blank|_self|_new"
Direct all un-targeted links to name. _blank directs links to a blank window (or opens a new one), _self directs links to the current window and _new directs them to a new window.
<link /> Define the relationship between the current document and another. Commonly used to define stylesheets.
charset="charset"
Specify the target document's character encoding.
href="url"
Target document's URL.
hreflang="language code"
Base language of the target document.
media="all|screen|print|..."
Identify the target media for the linked document so that the proper stylesheet is accessed. Check the More Information section for a full list of media.
rel="relationships"
Describe the relationship from the source to the target document. These include stylesheet, next, prev, index and others.
rev="relationships"
Describe the relationships from the target to the source.
target="name"
Defines the default target window for links. _blank and _new don't seem to work in this case.
type="resource"
Indicate the media or content type of the outside link. text/css indicates a Cascading Style Sheet.
<meta /> Provide document metadata, including media, character set, author, description, etc.
content="text"
Specify the value of the meta element property. Is always required and used in conjunction with either name or http-equiv.
http-equiv="text"
Information is treated as though it was part of the HTTP header sent ahead by the server. Is associated with the content attribute instead of the name attribute.
id="text"
Unique identifier for the element.
name="text"
Specify a name for the meta information property (e.g. "Author").
scheme="text"
Provide additional information about interpreting the metadata.
<object /> A generic element used to embed media objects. The attributes associated with this element vary with the type of object. See the W3 information on the XHTML Object Module for a detailed description of the syntax.
<script> Place a script inside the document. Can also occur in the body.
charset="character set"
Indicate the script's character encoding.
defer="defer"
Indicates that the script does not generate document content.
id="text"
Assign a unique identifier to the script element.
src="url"
Provide the URL for an external script.
type="content-type"
Specify the scripting language.
xml:space="preserve"
Instruct XML processors to preserve whitespace in the element.
<style> Insert a stylesheet or style specifications.
id="text"
Assign a unique identifier to the style element.
media="all|screen|print|..."
Identify the target media for the style information.
title="text"
Assign a title to the stylesheet.
type="content-type"
Specify the stylesheet language. For CSS this is text/css.
xml:space="preserve"
Instruct XML processors to preserve whitespace in the element.

Required attribute
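As a rough illustration of how these header elements fit together, here is a sketch of a document head. The URLs, file names and author name are placeholders I have made up, and the title element is included only because the DTD requires one.

```xml
<head>
  <title>Example page</title>
  <!-- relative URLs in the document resolve against this address -->
  <base href="http://www.example.com/docs/" />
  <!-- http-equiv and name are each paired with content -->
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta name="Author" content="Jane Doe" />
  <!-- one stylesheet per target medium -->
  <link rel="stylesheet" type="text/css" media="screen" href="screen.css" />
  <link rel="stylesheet" type="text/css" media="print" href="print.css" />
  <!-- external script; note the explicit closing tag -->
  <script type="text/javascript" src="site.js"></script>
  <!-- embedded style rules -->
  <style type="text/css" media="screen">
    body { margin: 2em; }
  </style>
</head>
```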

Document Body

The body is where the document's content resides. An id attribute can be assigned to almost any element inside the body to allow the specification of presentational aspects in a Style Sheet.

Content Dividers

<br /> Break the current line without breaking the paragraph.
<div id="idname"> Introduce a generic division in the document.
<h1> Top-level heading. Heading levels extend from 1 to 6. Headings 5 and 6 are usually smaller than the default text.
<hr /> Insert a horizontal rule bar. Section borders are preferred.
<p> Denotes a paragraph.
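A short sketch of these dividers in use follows. The id value and all of the text are placeholders.

```xml
<body>
  <div id="intro">
    <h1>Page title</h1>
    <p>First line of the paragraph,<br />
    second line of the same paragraph.</p>
    <hr />
    <h2>Subsection</h2>
    <p>Another paragraph.</p>
  </div>
</body>
```

The id on the <div> gives a Style Sheet something to select, for example with the CSS selector #intro.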

Anchors Aweigh

Anchors and their role in hyperlinking could arguably be considered the backbone of the World Wide Web. They are what links documents (and parts thereof) together.

<a>
charset="charset"
Specify the target document's character encoding.
href="URL"
Specify the target URL. This can be another anchor inside the current document, another hypertext document, a media file, an email address or an FTP address.
hreflang="language code"
Specify the base language of the target document.
id="text"
Unique identifier, e.g. for use with style sheets.
rel="relationships"
Establish a relationship between the current and target document. Valid relationships include stylesheet, next, prev, copyright, index and glossary.
rev="relationship"
Specify the reverse relationship (from the target document back to the current one). The same values are available as with rel.
target="text"
Specify the name of the window where the target will be displayed. Use _blank to specify a new, blank window and _self to specify the current frame.
type="media type"
Specify the content type of the target.
Image Maps
coords="x,y"
Specify the x-,y-co-ordinates for a "hot" area in an image map.
shape="rect|circle|poly|default"
Defines the shape of the "hot" area. This may not work with all browsers.
Examples
Link local file (in same directory) <a href="file.html">File</a>
Link file on another server <a href="http://server.address/path/to/file.html">File</a>
Link to email address <a href="mailto:username@domain.type">Email User</a>
Link to FTP address <a href="ftp://server.address/path/to/file">Download File</a>
Name an anchor <a id="anchor">Anchor</a>
Link to named anchor in current document <a href="#anchor">Go to Anchor</a>
Link to named anchor in external document <a href="http://server.address/path/to/file.html#anchor">Go to Anchor</a>

Text Elements

The following elements can be used to specify text structures. While a few presentational elements still remain, their use as such is discouraged.

<abbr title="text"> Identify the enclosed text as an abbreviation
title -- the full form of the abbreviation
<acronym title="text"> Identify the enclosed text as an acronym
title -- the full form of the acronym
<address> Identify the enclosed text as the author's contact information, but not an address listing.
<b> Text will appear bold face [2]
<bdo dir="ltr|rtl" lang="language code" xml:lang="text"> "Bidirectional override" allows you to change the direction of the text.
Either ltr or rtl must be specified
The language can be specified using the lang or xml:lang attribute
<big> Text will appear slightly larger than normal [3]
<blockquote cite="URL"> Enclosed text is a block quote (multiple paragraphs)
<caption> A caption for a figure or table
<cite> A reference to another document
<code> A piece of code
<del title="text" cite="URL" datetime="YYYY-MM-DDThh:mm:ssTZD"> Indicates deleted text
title -- reason for deletion
cite -- URL of a source document describing changes
datetime -- Standard HTML timestamp
<dfn> Indicates the defining instance of the enclosed term
<em> Denotes text to be emphasized
<h1>...<h6> Heading designation, from level 1 to level 6
<ins title="text" cite="URL" datetime="YYYY-MM-DDThh:mm:ssTZD"> Indicates inserted text
title -- reason for insertion
cite -- URL of a source document describing changes
datetime -- Standard HTML timestamp
<kbd> Denotes text entered by the user ("keyboard")
<p> Denotes a paragraph
<pre> Denotes preformatted text
<q cite="URL"> Denotes a brief (inline) quotation
<samp> Denotes sample output from a program, script, etc.
<small> Text will appear smaller than its surrounding text [4]
<strong> Denotes text to receive strong emphasis (e.g. bold)
<sub> Indicates subscript text
<sup> Indicates superscript text
<tt> Denotes teletype text
<var> Indicates a variable or program argument
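The fragment below is a sketch that puts several of these text elements to work. The sentences, reasons and timestamps are invented for illustration only.

```xml
<p>The <abbr title="World Wide Web Consortium">W3C</abbr> publishes the
<acronym title="Extensible Hypertext Markup Language">XHTML</acronym>
specifications.</p>

<!-- del and ins record an edit; title gives the reason, datetime the time -->
<p>The meeting is on
<del title="Rescheduled" datetime="2010-01-15T09:00:00Z">Monday</del>
<ins title="Rescheduled" datetime="2010-01-15T09:00:00Z">Tuesday</ins>.</p>

<p>Type <kbd>ls</kbd> and the shell replies with <samp>index.html</samp>.
Water is H<sub>2</sub>O, and E = mc<sup>2</sup>.</p>

<!-- bdo forces the text direction; dir is required -->
<p><bdo dir="rtl" xml:lang="en">This text runs right to left.</bdo></p>
```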

Tables

The combination of stylesheets and generic structures like <div> has largely usurped the role of the table as the weapon of choice for controlling page layout. This is a good thing, as it frees tables up to do what they do best: present multidimensional information in an aesthetically pleasing and cognitively meaningful format. The relegation of the table back to its original purpose has reshaped its attribute set, and many generic display attributes have been deprecated in favour of ones that provide more structural control over the information content.

<table> Places a table within the document body. While the below attributes are valid, specifying these attributes with CSS is recommended.
border="int"
Specify a border thickness. A value of zero translates into no border, and in the absence of style instructions, the value defaults to one.
cellpadding="int"
Specify the number of blank pixels between the cell border and its contents. The default is one pixel.
cellspacing="int"
Specify the number of pixels between cells. The default is two pixels.
frame="void|above|below|hsides|vsides|lhs|rhs|box|border"
Provides more refined control over table borders. The values translate as follows:
  • void: No frame (default)
  • above: Top side only
  • below: Bottom side only
  • hsides: Top and bottom sides only
  • vsides: Right and left sides only
  • lhs: Left-hand side
  • rhs: Right-hand side
  • box: All four sides
  • border: All four sides
rules="all|cols|groups|none|rows"
Provides control over interior table ruled lines. The values translate as follows:
  • all: Place rules between all cells in both directions
  • cols: Between columns only
  • groups: Between column and row groups
  • none: No rules appear (default)
  • rows: Between rows only
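To see how frame and rules combine, here is a sketch of a small table that draws only its top and bottom edges and rules between rows. The caption and cell contents are placeholders.

```xml
<table border="1" cellpadding="4" cellspacing="0" frame="hsides" rules="rows">
  <caption>Placeholder table</caption>
  <tr>
    <th>Heading 1</th>
    <th>Heading 2</th>
  </tr>
  <tr>
    <td>Row 1, cell 1</td>
    <td>Row 1, cell 2</td>
  </tr>
  <tr>
    <td>Row 2, cell 1</td>
    <td>Row 2, cell 2</td>
  </tr>
</table>
```

As noted above, the same effect is better achieved with CSS borders; the attributes are shown here only to illustrate what the values mean.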

The information provided here is intended to serve largely as a quick reference. For more in-depth information, consult the resources listed below.

  1. More astute visitors may notice that some of the code I use in this article does not conform to the XHTML Strict specifications itself. This is a regrettable consequence of my inability to find the time to customize the Style Sheet specifications that control this site's display. This is also a good example of a rare instance where users might find the rigidity of XHTML to be somewhat disadvantageous.
  2. Although these tags have been preserved in the XHTML 1.0 Strict and XHTML 1.1 DTDs, their use is discouraged as they are presentational rather than structural tags.
  3. Ditto here.
  4. And again.
