1

Topic: htmLawed: tags in attribute values

It looks like htmLawed 1.2 beta doesn't handle HTML tags in attribute values well. We're using data-* attributes for tooltips, and some of the tooltips contain tags, which produce junk in the output (safe mode). Are you going to fix this, or is this by design?

P.S. Thank you for creating htmLawed and especially for clear documentation!

2

Re: htmLawed: tags in attribute values

My understanding is that data-* attribute values cannot contain raw HTML.

That is, HTML entities have to be used instead of "<", ">", etc. of the HTML tags that are used for the tool-tips in your case. E.g., see http://stackoverflow.com/questions/7260195/escape-quotes-in-html5-data-attribute-using-javascript.

The very first step that htmLawed uses for parsing input text relies on the "<" and ">" characters. It is difficult (cumbersome) to make htmLawed identify such characters within data-* attribute values beforehand so as to ignore them.

The best option is to use HTML entities instead of "<", ">", etc. for the HTML tags used in the tool-tips. If that is not possible, you can pre-process the input, replacing those characters with their entities inside data-* attribute values before the text is passed to htmLawed. E.g.:

// Escape "<" and ">" inside double-quoted data-* attribute values
$in = preg_replace_callback(
        '`data-[^"]+"[^"]+`',
        function($matches){
          return str_replace(array('<', '>'), array('&lt;', '&gt;'), $matches[0]);
        },
        $in
);

$out = htmLawed($in, ...);
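
For reference, here is a slightly more general sketch of the same pre-processing step that also covers single-quoted values. The function name escape_tags_in_data_attributes is hypothetical (it is not part of htmLawed), and the pattern is an assumption about how the data-* attributes are written in your markup. It is regex-based, so it will also touch text outside of tags that happens to look like a data-* attribute.

```php
<?php
// Hypothetical helper, not part of htmLawed: escapes "<" and ">" inside
// quoted data-* attribute values so that tag parsing does not see them.
function escape_tags_in_data_attributes(string $html): string
{
    return preg_replace_callback(
        // whitespace, data-* name, "=", then a double- or single-quoted value
        '/(?<=\s)data-[\w-]+\s*=\s*(["\'])(.*?)\1/s',
        function (array $m): string {
            // $m[0] is the whole attribute (name, "=", and quoted value)
            return str_replace(['<', '>'], ['&lt;', '&gt;'], $m[0]);
        },
        $html
    );
}

$in = '<span data-tip="<b>Bold</b> text">hover</span>';
echo escape_tags_in_data_attributes($in), "\n";
// prints: <span data-tip="&lt;b&gt;Bold&lt;/b&gt; text">hover</span>
```

The result can then be passed to htmLawed in place of the raw input. Unquoted attribute values are not handled by this sketch.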

3

Re: htmLawed: tags in attribute values

Thank you, this makes sense.