In the current beta of htmLawed 1.2, there is no single, generic representation for data-* attributes. Thus, whereas all on* event attributes like 'onclick' can be specified in the 'deny_attribute' config. parameter using 'on*', one cannot use 'data*' in the same way. Instead, one has to use the exact data-* attribute names (the same is true for aria-* attributes).
In your case, therefore, each data-* attribute has to be named. Note that it should be 'on*' and not 'on', and that with 'safe' set to 1 there is no need to specify 'on*' separately (so the 'on*' in the code here is redundant):
$string = htmLawed($string, array(
    'safe' => 1,
    'deny_attribute' => 'style, class, id, data-abc, data-xyz, on*',
    'comment' => 1,
    'make_tag_strict' => 0
));
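If the data-* names are not known in advance, one possible workaround is to collect them from the input before calling htmLawed. The following is only a minimal sketch: collect_data_attribute_names() is a hypothetical helper, not part of htmLawed, and its regular expression is a rough scan of the raw string rather than a real HTML parse. It keeps names as written in the input; case normalization is left out.

```php
<?php
// Hypothetical helper (NOT part of htmLawed): list the distinct data-* attribute
// names found in a string so that each can be named in 'deny_attribute'.
function collect_data_attribute_names(string $html): array
{
    // A data-* name preceded by whitespace and followed by '=', whitespace, '/' or '>'.
    preg_match_all('/\sdata-[a-z0-9_.\-]+(?=[\s=\/>])/i', $html, $matches);
    // Drop the leading whitespace captured with each name, then de-duplicate.
    $names = array_map('trim', $matches[0]);
    return array_values(array_unique($names));
}

$string = '<div data-abc="1" class="x" data-xyz="2"><img src="a.png" data-abc="3" /></div>';

// Build the comma-separated value for 'deny_attribute'.
$deny = implode(', ', array_merge(
    array('style', 'class', 'id'),
    collect_data_attribute_names($string)
));

echo $deny, "\n"; // prints: style, class, id, data-abc, data-xyz
```

The resulting $deny string can then be passed as the 'deny_attribute' value in the htmLawed() call. Because the scan is not tag-aware, it may also pick up data-like text outside of tags; for a deny list, such extra names should be harmless.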
I know this htmLawed behavior needs improvement. I will soon release a new version for better handling of data-*.
For the time being, you can edit line 494 of htmLawed.php and remove -- or preg_match('`data-((?!xml)[^:]+$)`', $k) -- so that data-* attributes are disallowed by default. Then, to allow specific data-* attributes, use htmLawed's 'spec' argument. For example, with the following code, 'data-myAttr1' and 'data-myAttr2' will be permitted only in 'div', and 'data-myAttr3' only in 'img'.
htmLawed(
    $string,
    array(
        'safe' => 1,
        'deny_attribute' => 'style, class, id',
        'comment' => 1,
        'make_tag_strict' => 0
    ),
    'div = data-myAttr1, data-myAttr2; img = data-myAttr3'
);
The values in 'data-myAttr1', etc., can also be checked with htmLawed's 'spec' argument (see documentation).