November 15, 2001
On whitespace in the DOM
While working on a code sample for the book, I ran into a snag regarding whitespace in HTML documents. Although browsers generally ignore (that is, do not display) extra whitespace between elements, it nonetheless still exists in the DOM. For example:
<ul>
  <li id="liOne">First Item</li>
  <li id="liTwo">Second Item</li>
</ul>
Now, consider this related bit of JavaScript…
var li = document.getElementById("liOne");
alert(li.nextSibling.nodeName);
Ideally, the alert should display “LI”, because the immediate sibling of the liOne list item appears to be the second one, liTwo. But it doesn’t. The alert displays “#text” instead. Why? Because the browser treats the line break and indentation between the LI elements as a whitespace text node. And you might be surprised to learn that this is correct behavior for XML parsers (Tim Bray: “if it ain’t markup, it’s data.”).
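To see everything the parser produced, it helps to list all of the UL's children rather than peek at a single sibling. Here is a minimal sketch; listChildNodeNames is a name made up for illustration, and it assumes nothing beyond the standard childNodes and nodeName properties.

```javascript
// Hypothetical helper (not from any library): returns the nodeName of every
// direct child of `parent`, in document order, so that whitespace text nodes
// show up alongside the elements.
function listChildNodeNames(parent)
{
    var names = [];
    for (var i = 0; i < parent.childNodes.length; i++)
    {
        names.push(parent.childNodes[i].nodeName);
    }
    return names;
}

// In a browser, with the list above on the page:
// var ul = document.getElementById("liOne").parentNode;
// alert(listChildNodeNames(ul).join(", "));
```

In a browser that keeps inter-element whitespace, I would expect the list above to come back as #text, LI, #text, LI, #text: one text node before, between, and after the two list items.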
At first I thought I could solve this with the white-space CSS property, which directs the browser to either preserve or collapse whitespace characters depending on the value specified: white-space:pre preserves whitespace, while white-space:normal collapses it.
But it didn’t work as expected. The text nodes were still there in the DOM. It occurred to me that since CSS is largely presentational, setting white-space might affect only the display of whitespace characters and leave the actual DOM intact. (I could be misinterpreting how the white-space property is supposed to work.)
So, back to the problem. I needed a way to select the next sibling element of any given element via JavaScript, skipping over text and whitespace nodes. After a few brief discussions on several lists, I came up with the following function:
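function getNextElement(elm)
{
    // Walk forward through the siblings until we reach an element node
    // (nodeType 1) or run out of siblings.
    var sib = elm.nextSibling;
    while (sib && sib.nodeType != 1)
    {
        sib = sib.nextSibling;
    }
    return sib; // null when no element follows elm
}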
(See here for an explanation of nodeValue and nodeType.)
It works. But I really wish I didn’t have to create a function to get the adjacent element. It would be very cool if there were a DOCTYPE or something that allowed inter-element whitespace to be dropped from the DOM without a lot of JavaScript fuss.
UPDATE: Mere seconds after posting this, Chris Nott sent me a link to this Mozilla technote, which outlines several JavaScript functions for dealing with whitespace. Sigh.
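One approach in that spirit is to strip the whitespace-only text nodes out of a parent before walking it. What follows is a rough sketch of the idea, not code from the technote; removeWhitespaceNodes is a made-up name, and it assumes only the standard childNodes, nodeType, nodeValue, and removeChild members.

```javascript
// Hypothetical helper: removes the direct children of `parent` that are text
// nodes (nodeType 3) containing nothing but spaces, tabs, and line breaks.
// Returns the number of nodes removed. It does not descend into child elements.
function removeWhitespaceNodes(parent)
{
    var removed = 0;
    // Walk backwards so that removing a child does not shift the
    // positions of the children we have yet to visit.
    for (var i = parent.childNodes.length - 1; i >= 0; i--)
    {
        var child = parent.childNodes[i];
        if (child.nodeType == 3 && !/[^ \t\r\n]/.test(child.nodeValue))
        {
            parent.removeChild(child);
            removed++;
        }
    }
    return removed;
}
```

After running this on the UL, liOne.nextSibling should be liTwo itself. The catch is that it changes the document, so I would keep it away from PRE elements or anything else where the whitespace is meant to be seen.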








