1

Topic: Allow HTML5 data-* attributes?

Hello,
Thanks for creating htmLawed! It's been working out great so far in my testing.
I was wondering if it's possible to allow any data-* attribute on an element?

My current configuration is the following:

$config = array(
    'tidy' => -1,
    'hook_tag' => 'hl_css_filter',
    'comments' => 1, 
    'cdata' => 1,
    'schemes' => 'http, ht\tps', // the backslash was added only because I couldn't post without it
    'elements' => 'a, em, strong, i, u, del, sub, sup, strike,
                img, marquee, br, hr, 
                p, div, span, article,
                table, thead, tbody, tfoot, tr, td, th, caption, 
                blockquote, cite, code, pre, big, small, 
                h1, h2, h3, h4, h5, h6, 
                dl, dt, dd, fieldset, legend, 
                nav, ul, ol, li',
    'deny_attribute' => '* -id -class -style -title -href -target -colspan -rowspan -src -alt -direction'
);

2

Re: Allow HTML5 data-* attributes?

I will soon release a new version of htmLawed for HTML5 compatibility which will allow use of 'data-*' attributes.

Meanwhile, you can consider using the $spec function argument to allow data-* attributes; see the documentation section for $spec (note the last part of the section).

You can also modify htmLawed code as per posts like this one. I can help.

The problem with both approaches, however, is that the data-* attributes have to be completely named; i.e., you have to know their exact names ('data-' and the values of '*'). Probably this is not an issue for your case.

3

Re: Allow HTML5 data-* attributes?

Thanks for the response.
I was thinking about giving the $spec option a try for this, and if I get it working I'll report back here for other people to see. The data-* names would differ depending on what they're used with, but I should be able to use a regular expression for that, and the value could be anything really.
I'm thinking something like the following should work, but I haven't tested it yet. Maybe with a few more tweaks I can get it working:

$spec = 'div=data(-.+?)';

I'll report back here once I figure out something that works.

4

Re: Allow HTML5 data-* attributes?

Unfortunately, as I wrote in my first reply, the regular expression idea will not work. The logic in htmLawed looks for an exact name match. This, for example, may work:

$spec = 'div=data-firstname, data-lastname, data-middlename';
// we expect only these 3 data-* attributes in the input


To allow any 'data-*' attribute, perhaps your only option is to modify htmLawed code. You can test this:

// htmLawed.php file for version 1.14, line 494, in function hl_tag
// The line ends as: ...or isset($rl[$k])){

// Change it to following to allow 'data-*' in any element
...or isset($rl[$k]) or preg_match('`data-.+`', $k)){

// Or, change it to following to allow 'data-*' in only 'div'
...or isset($rl[$k]) or (($e == 'div') && preg_match('`data-.+`', $k))){
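
One thing to keep in mind with the pattern above: `data-.+` is not anchored, so preg_match() will also accept a name that merely contains 'data-' somewhere, such as 'metadata-user'. A minimal sketch of an anchored check follows. The helper name is_data_attr is hypothetical and not part of htmLawed; it only illustrates the pattern, which could be used in place of the unanchored one in the modified line.

```php
<?php
// Hypothetical helper, not part of htmLawed: returns true only when the
// whole attribute name is 'data-' followed by at least one character.
function is_data_attr(string $name): bool
{
    // ^ and $ tie the match to the full name; the D modifier stops $ from
    // matching just before a trailing newline.
    return preg_match('`^data-.+$`D', $name) === 1;
}

var_dump(is_data_attr('data-user'));     // bool(true)
var_dump(is_data_attr('metadata-user')); // bool(false)
var_dump(is_data_attr('data-'));         // bool(false)
```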

5

Re: Allow HTML5 data-* attributes?

Thanks for the feedback. I tried your first method and it worked out great for my needs. Sorry it took a while to get back to you about it; I've been busy and only just got back around to working on user input.

I saw you have a beta version of htmLawed, and I was looking through the documentation but didn't see a way to allow any data-* attribute in the beta version. Are you going to support that in future versions?

6

Re: Allow HTML5 data-* attributes?

There will be support for data-* in the new htmLawed. I hope to have it in the next beta release (1.2.beta.2) that I plan to release this weekend.

7

Re: Allow HTML5 data-* attributes?

The 1.2.beta.2 version of htmLawed with support for custom data-* (star) attributes has been released. By default htmLawed will allow in any element any attribute named 'data-*' where '*' is such that the first three of its characters when lower-cased do not equal 'xml', and it does not have a colon (:), equal-to (=) or white-space character.
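
For anyone who wants to check attribute names against that rule outside of htmLawed, here is a rough sketch of the rule as worded above. It is not htmLawed's actual code. The function name is_allowed_data_attr is hypothetical, and two details are assumptions of this sketch: the 'data-' prefix is compared case-insensitively, and a bare 'data-' with nothing after it is rejected.

```php
<?php
// Hypothetical helper mirroring the rule as described in this post;
// it is not taken from htmLawed.
function is_allowed_data_attr(string $name): bool
{
    // Assumption: the prefix is matched case-insensitively and at least
    // one character must follow 'data-'.
    if (strlen($name) <= 5 || strncasecmp($name, 'data-', 5) !== 0) {
        return false;
    }
    $suffix = substr($name, 5);
    // Reject a suffix whose first three characters, lower-cased, are 'xml'.
    if (strtolower(substr($suffix, 0, 3)) === 'xml') {
        return false;
    }
    // Reject a colon, an equal-to sign or any white-space character.
    return preg_match('`[:=\s]`', $suffix) === 0;
}

var_dump(is_allowed_data_attr('data-user-id')); // bool(true)
var_dump(is_allowed_data_attr('data-XMLfoo'));  // bool(false)
var_dump(is_allowed_data_attr('data-a:b'));     // bool(false)
var_dump(is_allowed_data_attr('title'));        // bool(false)
```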