Thursday, February 17, 2011

Why must we still close SCRIPT elements?

I was wondering why in the world we still have to terminate SCRIPT elements with </script> tags in 2011, fer Pete's sake.

i.e. why does this work:

<script src="foo.js"></script>

but this:

<script src="foo.js" />

does not?

For years I thought it was just a bad implementation from the NS4 days that nobody bothered to correct. Not so! Turns out there's a valid (if a bit irritating) reason.

The tl;dr answer is: <script /> is valid XML, but invalid HTML.

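If you serve your HTML pages as text/html, a browser will use its HTML parser to render the page. And since <script /> is invalid HTML, the HTML parser ignores the trailing slash and treats the tag as an ordinary opening <script>. Everything after it is swallowed as script content until the parser finds a </script>, so foo.js never runs the way you intended and the markup that follows can vanish from the page.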
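To make that concrete, here is a toy Python model of how an HTML-style parser finds the body of a SCRIPT element. It is only a sketch: real browsers follow a far larger parsing algorithm, and the function name script_bodies and the sample markup are made up for illustration.

```python
import re

# Toy model only. The trailing "/" is just another character inside the
# start tag, so it does nothing to close the element.
SCRIPT_OPEN = re.compile(r"<script\b[^>]*>", re.IGNORECASE)
SCRIPT_CLOSE = re.compile(r"</script\s*>", re.IGNORECASE)

def script_bodies(html):
    """Return the raw text that would be handed to each SCRIPT element."""
    bodies = []
    pos = 0
    while True:
        start = SCRIPT_OPEN.search(html, pos)
        if start is None:
            break
        end = SCRIPT_CLOSE.search(html, start.end())
        if end is None:
            # No </script> anywhere: the rest of the document is script text.
            bodies.append(html[start.end():])
            break
        bodies.append(html[start.end():end.start()])
        pos = end.end()
    return bodies

closed = '<script src="foo.js"></script><p>Hello</p>'
slashed = '<script src="foo.js" /><p>Hello</p>'

print(script_bodies(closed))   # ['']
print(script_bodies(slashed))  # ['<p>Hello</p>']
```

In the second case the paragraph ends up inside the script instead of on the page, which is roughly what you see in a browser when you forget the closing tag.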

(In theory, if you serve your pages as application/xhtml+xml, or as text/xml with the XHTML namespace declared, the browser would use its XML parser, recognize the <script /> as valid, and load/execute the associated JS. But that presumes your documents are well-formed, valid XML and not tag soup. Can you make that claim?)
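You can check the XML half of this without a browser. The sketch below uses Python's standard xml.etree.ElementTree parser as a stand-in for a browser's XML parser (an assumption made purely for illustration; the helper name describe is made up too). To an XML parser, the two spellings are the same element.

```python
import xml.etree.ElementTree as ET

def describe(markup):
    """Parse a one-element XML document and summarize what the parser saw."""
    element = ET.fromstring(markup)
    return (element.tag, dict(element.attrib), element.text)

slashed = describe('<script src="foo.js" />')
closed = describe('<script src="foo.js"></script>')

print(slashed)            # ('script', {'src': 'foo.js'}, None)
print(slashed == closed)  # True
```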

As to the question of why <script /> is invalid while <img /> is perfectly okay:

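The XHTML DTD gives the SCRIPT element a content model of #PCDATA, i.e. "document text." It is not, by definition, an "empty" element. HTML5 is not defined by a DTD, but it draws the same line: only "void" elements such as IMG may be written without an end tag, and SCRIPT is not one of them.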

SCRIPT is often considered analogous to the IMG element -- and mistakenly so -- because they both use the "src" attribute. In reality, SCRIPT is more akin to the P (paragraph) element, which is not declared empty either (in a text/html page, <p /> is just an opening <p> tag -- you must instead write <p></p>).

For comparison, here are the relevant element declarations from the XHTML DTD:

<!ELEMENT script (#PCDATA)>

<!ELEMENT img EMPTY>

This is why <img /> is fine, while <script /> is a no-no.
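
If you want to apply the same test to other elements, the two declarations above can be read mechanically. This is a minimal sketch, assuming simple one-line declarations like the ones quoted here; the regular expression and the function name content_models are illustrative, not a real DTD parser.

```python
import re

# The two declarations quoted above, verbatim.
DTD_FRAGMENT = """
<!ELEMENT script (#PCDATA)>
<!ELEMENT img EMPTY>
"""

# Matches simple one-line declarations only.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.+?)>")

def content_models(dtd_text):
    """Map each declared element name to its content model string."""
    return {name: model.strip() for name, model in ELEMENT_DECL.findall(dtd_text)}

models = content_models(DTD_FRAGMENT)
for name, model in models.items():
    if model == "EMPTY":
        print(name, "is declared EMPTY")
    else:
        print(name, "has content model", model)
```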