Comment: One of the successes of the HTML 5 recommendation is that it provides a detailed specification for how HTML documents should be parsed. For too long, browser vendors have simply guessed at and copied what other browsers do, hoping that their parsers would not cause too many problems with the HTML documents they process.
Although some parts of HTML 5 are still controversial, the parsing section has been universally welcomed by browser vendors. As browsers begin to implement it, users will benefit from the improved compatibility that comes with it.
One of the first implementations of the HTML 5 parsing rules was created to power the HTML 5 validator. (If you want to test the validator, this page should be valid HTML 5.) The implementation is written in Java, provides SAX and DOM interfaces, and is open source.
Interestingly, Henri Sivonen (the author of the validator) recently developed a brand new HTML 5 parsing engine for Gecko, which will be used in the next version of Firefox.
This implementation is actually an automated conversion of Henri's Java HTML 5 parser to C++. The translation is done programmatically, and all of the changes will be committed to Mozilla's code base.
Normally I would cringe at the mention of such a large Java code base being programmatically converted to C++. The result, however, is surprising: page load performance improved by 3%.
This comes on top of the many bug fixes and conformance checks that the code base provides. You can follow the progress of the patch in Mozilla's bug tracker.
If you want to try the new parser (you are unlikely to notice many obvious changes, but any help in hunting down bugs is appreciated), download a nightly build of Firefox, open about:config, and set html5.enable to true.
If you want to upgrade to HTML 5, now is a good time. Because HTML 5 is a superset of the features provided by HTML 4 and XHTML 1, upgrading is easy: you only need to replace your current (X)HTML doctype declaration with the HTML 5 doctype:
<!DOCTYPE html>
You can find details on how to get the new HTML 5 elements working in all browsers on the HTML 5 Doctor website.
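For a site with many pages, the doctype replacement described above could be scripted. The following is only a minimal sketch in Python, not something from the original article: the function name upgrade_doctype is hypothetical, and the pattern assumes the old declaration contains no ">" character inside its identifiers, which holds for the standard HTML 4 and XHTML 1 doctypes.

```python
import re

HTML5_DOCTYPE = "<!DOCTYPE html>"

# Matches a doctype declaration such as the HTML 4.01 or XHTML 1.0 ones.
# Assumption: the public/system identifiers contain no '>' character.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\b[^>]*>", re.IGNORECASE)


def upgrade_doctype(markup: str) -> str:
    """Replace the first (X)HTML doctype in `markup` with the HTML 5 doctype.

    Markup without a doctype is returned unchanged.
    """
    return _DOCTYPE_RE.sub(HTML5_DOCTYPE, markup, count=1)


if __name__ == "__main__":
    old_page = (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
        "<html><head><title>Example</title></head><body></body></html>"
    )
    print(upgrade_doctype(old_page))
```

The sketch only touches the doctype line. It does not check that the rest of the page is valid HTML 5, so running the result through the validator is still worthwhile.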