Topic: Evaluating htmLawed
The htmLawed website provides a number of resources for evaluating htmLawed: a demo page, extensive documentation, source code, test cases, and filtering results against a large and varied collection of XSS code.
These resources should help one independently evaluate htmLawed.
htmLawed is fast, small in both size (a single file of about 50 KB) and memory use, highly customizable, and rich in features.
htmLawed allows both blacklisting and whitelisting of tags and attributes, and it does not require many configuration values to be set. Using htmLawed can be as simple as calling 'htmLawed($input)', and 'dangerous' HTML can be filtered out with just 'htmLawed($input, array('safe'=>1))'.
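The calls mentioned above can be put together in a minimal sketch. It assumes that 'htmLawed.php' is on the include path; the sample input is made up for illustration. The 'elements' and 'deny_attribute' configuration keys are not named in this text; they are taken from the htmLawed documentation and should be checked against the version in use.

```php
<?php
// Assumption: htmLawed.php has been downloaded and is on the include path.
require_once 'htmLawed.php';

// Made-up sample of untrusted input, for illustration only.
$input = '<p style="color:red" onclick="x()">Hello <script>alert(1)</script><b>world</b></p>';

// 1. Simplest use: default filtering with no configuration.
$default = htmLawed($input);

// 2. 'safe' mode: filter out 'dangerous' HTML with a single setting.
$safe = htmLawed($input, array('safe' => 1));

// 3. Whitelisting: permit only the listed elements.
//    ('elements' is an assumed key, per the htmLawed documentation.)
$whitelisted = htmLawed($input, array('elements' => 'a, em, strong, p'));

// 4. Blacklisting: permit every element except the ones subtracted,
//    and deny the 'style' attribute everywhere.
//    ('deny_attribute' is likewise an assumed key.)
$blacklisted = htmLawed($input, array(
    'elements'       => '*-script-iframe',
    'deny_attribute' => 'style',
));

// Print the results side by side to compare the effect of each setting.
foreach (array('default', 'safe', 'whitelisted', 'blacklisted') as $name) {
    echo $name, ': ', $$name, "\n";
}
```

Running the same input through each configuration and comparing the output, alongside the demo page on the htmLawed website, is a quick way to see what a given setting keeps and removes.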
A comparison of standalone HTML filters shows that only the HTMLPurifier script comes close to htmLawed in terms of efficacy and features. Though good, HTMLPurifier is slow, 15-20 times bigger, uses scores (hundreds?) of files, and consumes a few megabytes of RAM just to be loaded. It does not provide full HTML support, lacks some features like customizable code beautification, and has poor end-user documentation. Its code is also no longer PHP 4-compatible.
htmLawed is also a simpler alternative to HTML Tidy, as there is no need to install an external, non-PHP library or a PHP extension.
htmLawed does have minor limitations, which are also detailed in the documentation. For example, it permits '<table></table>' even though the standards do not permit empty 'table' elements. But from a practical perspective, does that break page display or layouts, introduce security vulnerabilities, or crash applications? No.
The logic of htmLawed puts a priority on safety, speed, tolerance for HTML as it is commonly used, and customization rather than on absolute adherence to standards. Note that no browser enforces 100% standards-compliance. HTML standards themselves continue to evolve, with multiple specifications and varying degrees of support among different browsers and their versions.