Working together for standards The Web Standards Project


Serving XHTML with the Right MIME Type

In the previous issue of “WaSP Asks the W3C”, we examined how to correctly specify the character encoding
for a document. In this edition, we consult the W3C’s Quality Assurance Group about serving XHTML documents with
the correct MIME type.

MIME originated as an extension to email and is reused by HTTP as a means to declare the type of content (or media type) being served. Each resource has a specific MIME type
made up of two parts, the main type and a subtype, separated by a slash (“/”). The MIME type tells the user-agent, as it receives the document, how to handle it, which lets you associate a particular application or behavior with a particular media type in your browser.
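As a small illustration of the two-part structure, here is a minimal Python sketch that splits a MIME type into its main type and subtype. The function name `split_mime_type` is our own invention for this example and is not part of any specification or library.

```python
def split_mime_type(mime_type: str) -> tuple[str, str]:
    """Split a MIME type such as 'application/xhtml+xml' into (main type, subtype)."""
    # Drop any parameters (for example '; charset=utf-8') before splitting.
    essence = mime_type.split(";", 1)[0].strip().lower()
    main_type, sep, subtype = essence.partition("/")
    if not sep or not main_type or not subtype:
        raise ValueError(f"not a valid MIME type: {mime_type!r}")
    return main_type, subtype


print(split_mime_type("application/xhtml+xml"))
print(split_mime_type("text/html; charset=utf-8"))
```

The first call yields the pair `application` and `xhtml+xml`; the second shows that parameters such as `charset` are not part of the type itself.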

WaSP asks

Which MIME type should XHTML be served with?

The W3C Responds

The short answer is application/xhtml+xml, as described by the XHTML Media-types W3C Note, but the long answer is somewhat more involved and provides a couple of alternatives.

Why not text/html?

The main reason to use a new MIME type for XHTML is that it is an XML
language, which means that it is subject to stricter validation and
hence less prone to becoming the tag soup that too many people have
called HTML. It is therefore reasonable to indicate the difference to
browsers so they will be able to handle the resulting code more
efficiently.

The fact that XHTML is based on XML also involves important syntax
differences, the most significant being that empty elements such as
<br> need not be closed in HTML, whereas they must be closed in XHTML (à la <br/>). These changes in syntax are another reason to clearly
distinguish HTML from XHTML, and thus to use a different MIME type.
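To make the syntax difference concrete, the following sketch feeds both forms to a generic XML parser from the Python standard library. This only checks well-formedness, not validity against the XHTML DTD, and the helper name `is_well_formed` is ours.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style empty tag: the parser never sees <br> being closed.
html_style = "<p>First line<br>second line</p>"
# XHTML-style empty tag: explicitly closed, as XML requires.
xhtml_style = "<p>First line<br/>second line</p>"

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

An XML processor rejects the first fragment outright, which is the kind of strictness a browser applies once a document is served as XML.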

But some browsers don’t know about application/xhtml+xml.

Indeed, and that’s one of the biggest current issues with the adoption of the new MIME
type, especially since Internet Explorer doesn’t recognize it
(at least for any version up to 6.x on both Windows and Mac OS). Of
course, this is a common problem with adoption of new technologies and
it usually improves over time. However, for the time being, there are
ways out of this vicious cycle:

  • XHTML 1.0 defines a backward compatibility mode
    that allows one to use XHTML 1.0 while maintaining compatibility with
    legacy browsers. If you follow these guidelines, you are allowed to
    serve your XHTML as text/html. The backward compatibility
    mode defines some syntactic tricks that allow an XHTML document to be understood by most HTML browsers.
  • Using content negotiation on your Web server, you can actually serve your XHTML 1.0 either as text/html or as application/xhtml+xml,
    depending on the user-agent’s capabilities, so that you are able to maintain backward compatibility for legacy browsers while exploiting the capabilities of modern ones.

The first technique makes your content understandable by the vast
majority of web browsers, but in doing so, you lose all the advantages of having
a different MIME type: the power of being treated as XML, the ability to distinguish your document from tag soup, and fast-track rendering on modern browsers.

The second technique caters for existing browsers while keeping the new MIME
type for those which are able to understand it; the downside is that it can be
quite tricky to implement on your server, depending on how much access
you have to its configuration.
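The decision at the heart of the second technique can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `parse_accept` and `choose_mime_type` are hypothetical, the Accept header parsing is simplified, and a wildcard such as `*/*` is deliberately not treated as proof that the browser understands application/xhtml+xml.

```python
XHTML = "application/xhtml+xml"
HTML = "text/html"


def parse_accept(header: str) -> dict[str, float]:
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def choose_mime_type(accept_header: str) -> str:
    """Pick the MIME type for an XHTML 1.0 document from the Accept header."""
    prefs = parse_accept(accept_header or "")
    xhtml_q = prefs.get(XHTML, 0.0)
    html_q = prefs.get(HTML, 0.0)
    # Only an explicit mention of application/xhtml+xml counts as support.
    if xhtml_q > 0 and xhtml_q >= html_q:
        return XHTML
    return HTML


print(choose_mime_type("text/html,application/xhtml+xml,*/*;q=0.8"))
print(choose_mime_type("*/*"))
```

A browser that lists application/xhtml+xml explicitly receives the new MIME type, while one that sends only a wildcard falls back to text/html. Note that the text/html fallback is only appropriate for XHTML 1.0 documents that follow the backward compatibility guidelines.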

Alternatively, you can serve your XHTML (any version)
as application/xml, or even as text/xml. However,
in serving your XHTML document as text/xml you may
run into issues with character sets because the rules which apply to
text/* MIME types are more complex than those for
application/*. It is also important to note
that for either of these MIME types, Internet Explorer will display the
source code instead of interpreting it as XHTML.
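The character set issue can be illustrated with a rough sketch, assuming the RFC 3023 rules for XML media types: when no charset parameter is given, text/xml falls back to US-ASCII regardless of what the document declares, whereas application/xml defers to the document's own XML declaration. The function `effective_encoding` is a hypothetical helper, and it ignores byte order mark detection for brevity.

```python
from typing import Optional


def effective_encoding(content_type: str, declared_in_xml: Optional[str]) -> str:
    """Simplified guess at the encoding an XML processor would use.

    content_type    -- value of the Content-Type header
    declared_in_xml -- encoding from the <?xml ... encoding="..."?> declaration, if any
    """
    media_type, *params = [p.strip() for p in content_type.split(";")]
    for param in params:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value:
            # An explicit charset parameter takes precedence for both families.
            return value.strip().strip('"').lower()
    if media_type.lower().startswith("text/"):
        # Assumption (RFC 3023): text/xml without charset means US-ASCII,
        # even if the document itself declares another encoding.
        return "us-ascii"
    # application/xml and friends: trust the XML declaration, else UTF-8.
    return (declared_in_xml or "utf-8").lower()


print(effective_encoding("text/xml", "utf-8"))
print(effective_encoding("application/xml", "iso-8859-1"))
```

Under these assumptions the first call reports us-ascii even though the document declares UTF-8, which is exactly the kind of surprise that makes application/xml the safer of the two alternatives.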

How do I set up the content negotiation you’re speaking about?

This depends on the Web server you’re using: if you’re not an
administrator of the Web server where you want to use this type of content
negotiation, point your Web server administrator to this document and
ask him or her to set it up as needed for you.

If you’re the administrator of the Web server in question, you
can have a look at the linked content negotiation techniques for
your web server. Even better, send the techniques that you
know to the W3C Web Standards Education list
(publicly archived) so we can add them to our knowledge base.
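For readers without a ready-made server recipe, here is a minimal sketch of the idea as a Python WSGI application. It is not one of the linked techniques: the names `PAGE` and `app` are ours, the check is a plain substring test that ignores q-values, and a production setup would more likely rely on the Web server's own configuration.

```python
# A tiny XHTML 1.0 Strict page; it omits the XML declaration so that
# it can also be served as text/html to legacy browsers.
PAGE = b"""<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>Negotiated XHTML</title></head>
<body><p>Hello<br />world</p></body>
</html>
"""


def app(environ, start_response):
    """Serve PAGE as application/xhtml+xml only to clients that ask for it."""
    accept = environ.get("HTTP_ACCEPT", "").lower()
    if "application/xhtml+xml" in accept:
        content_type = "application/xhtml+xml; charset=utf-8"
    else:
        content_type = "text/html; charset=utf-8"
    start_response("200 OK", [
        ("Content-Type", content_type),
        ("Content-Length", str(len(PAGE))),
        # Tell caches that the response depends on the Accept header.
        ("Vary", "Accept"),
    ])
    return [PAGE]


# To try it locally with the standard library:
#   from wsgiref.simple_server import make_server
#   make_server("localhost", 8000, app).serve_forever()
```

The `Vary: Accept` header matters here: without it, a shared cache could hand the application/xhtml+xml response to a browser that cannot handle it.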

To sum it up

Let’s try to summarize what we’ve just discussed:

(X)HTML version: HTML 2.0, 3.2, 4.0, 4.01
  • Recommended MIME type: text/html
  • Limitations in browsers: none, but this MIME type has very often been abused as an umbrella for tag soup
  • Alternate MIME types: N/A
  • Techniques: N/A

(X)HTML version: XHTML 1.0
  • Recommended MIME type: application/xhtml+xml
  • Limitations in browsers: not recognized by Internet Explorer 6.x and previous versions
  • Alternate MIME types: text/html, if using the backward compatibility guidelines; application/xml (or text/xml, but with much caution with regard to charset setting)

(X)HTML version: XHTML 1.1, XHTML Basic, XHTML profiles
  • Recommended MIME type: application/xhtml+xml
  • Limitations in browsers: not recognized by Internet Explorer 6.x and previous versions
  • Alternate MIME types: application/xml (or text/xml, but with much caution with regard to charset setting)
  • Techniques: N/A

Discussion

For clarification and discussion on this
topic, please address your comments and questions to the W3C Web
Standards Education list.

To subscribe to the list, send an email to public-evangelist-request@w3.org with
“Subject: subscribe”. You can read archived posts at http://lists.w3.org/Archives/Public/public-evangelist/.

