blank's blog: http://www.planabc.net/
The innerHTML property is very popular because it provides a simple way to completely replace the content of an HTML element. The alternative is to use the DOM Level 2 API (removeChild, createElement, appendChild), but using innerHTML is clearly the easiest and most effective way to modify the DOM tree. However, you need to know that innerHTML has some problems of its own:
There are several other minor disadvantages, which are worth mentioning:
I'm more concerned with the security and memory issues related to using the innerHTML property. Obviously, this is not a new problem, and people have already come up with ways around some of these issues.
Douglas Crockford wrote a cleanup function that breaks some of the circular references caused by registering event handlers on HTML elements, which allows the garbage collector to free the memory associated with those elements.
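As a rough illustration, a cleanup function in that spirit might look like the sketch below. The name purge and the exact traversal are assumptions made for this example, not a verbatim copy of Crockford's code. It walks a node's attributes, sets to null any same-named property that holds a function (an onclick handler, for instance), and then recurses into the child nodes.

```javascript
// Sketch of a purge-style cleanup; the name `purge` is an assumption for this example.
function purge(node) {
    var attrs = node.attributes,
        children = node.childNodes,
        i, name;

    if (attrs) {
        for (i = attrs.length - 1; i >= 0; i -= 1) {
            name = attrs[i].name;
            // A function stored under an attribute name is treated as an event handler.
            if (typeof node[name] === 'function') {
                node[name] = null;
            }
        }
    }

    if (children) {
        for (i = 0; i < children.length; i += 1) {
            // Clean the whole subtree, not only the top-level node.
            purge(children[i]);
        }
    }
}
```

Such a function would be called on an element just before its content is replaced or the element is removed. It only clears handlers stored as properties named after attributes; listeners registered through a library's own API need that library's cleanup routine.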
Removing script tags from HTML strings is not as easy as it seems. A regular expression can achieve the desired effect, although it is difficult to know whether all possibilities are covered. Here is my solution:
/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig
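To make the behavior of this expression concrete, here is a small sketch that wraps it in a helper. The names SCRIPT_TAG_RE and stripScripts are hypothetical and exist only for this example. The last call shows one case the expression does not cover: a script tag that is never closed is left in place, which illustrates why it is hard to be sure every possibility is handled.

```javascript
// Matches an opening script tag, its body (across newlines), and the closing tag.
var SCRIPT_TAG_RE = /<script[^>]*>[\S\s]*?<\/script[^>]*>/ig;

// Hypothetical helper: returns the HTML string with complete script elements removed.
function stripScripts(html) {
    return html.replace(SCRIPT_TAG_RE, '');
}

// Inline script between two paragraphs.
var a = stripScripts('<p>a</p><script>alert(1)</script><p>b</p>');
// a === '<p>a</p><p>b</p>'

// Upper-case tag with attributes and a multi-line body.
var b = stripScripts('<SCRIPT type="text/javascript">\nvar x = 1;\n</SCRIPT>ok');
// b === 'ok'

// An unclosed script tag does not match, so the string comes back unchanged.
var c = stripScripts('<p>a</p><script>alert(1)');
// c === '<p>a</p><script>alert(1)'
```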
Now, let's combine the two techniques into a single setInnerHTML function and attach it to YAHOO.util.Dom in YUI:
YAHOO.util.Dom.setInnerHTML = function (el, html) {
    el = YAHOO.util.Dom.get(el);
    if (!el || typeof html !== 'string') {
        return null;
    }
    // Break circular references
    (function (o) {
        var a = o.attributes, i, l, n, c;
        if (a) {
            l = a.length;
            for (i = 0; i < l; i += 1) {
                n = a[i].name;
                if (typeof o[n] === 'function') {
                    o[n] = null;
                }
            }
        }
        a = o.childNodes;
        if (a) {
            l = a.length;
            for (i = 0; i < l; i += 1) {
                c = o.childNodes[i];
                // Purge child nodes
                arguments.callee(c);
                // Remove all listeners attached to the element via YUI's addListener
                YAHOO.util.Event.purgeElement(c);
            }
        }
    })(el);
    // Remove scripts from the HTML string and set the innerHTML property
    el.innerHTML = html.replace(/<script[^>]*>[\S\s]*?<\/script[^>]*>/ig, "");
    // Return a reference to the first child node
    return el.firstChild;
};
If this function should do anything else, or if the regex is missing something, please let me know.
Obviously, there are many other ways to inject malicious code into a web page. The setInnerHTML function only normalizes the execution behavior of <script> tags across all A-grade browsers. If you need to inject HTML that you cannot trust, be sure to filter it on the server side first; there are already many libraries that can do this.
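As one example of another injection path, an inline handler attribute such as onerror passes through the script-tag regex untouched. When untrusted input only needs to be shown as text rather than rendered as markup, a simple option is to escape it before it reaches innerHTML. The sketch below assumes that use case; the name escapeHtml is hypothetical, and it is not a substitute for a proper server-side filtering library.

```javascript
// Hypothetical helper: escapes the characters that are significant in HTML,
// so the input is displayed as literal text instead of being parsed as markup.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first so later entities are not double-escaped
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

var escaped = escapeHtml('<img src=x onerror="alert(1)">');
// escaped === '&lt;img src=x onerror=&quot;alert(1)&quot;&gt;'
```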
Original text: Julien Lecomte's "The Problem With innerHTML"