1

Topic: Deny_attribute => 'style' not working?

Hi, I'm using htmLawed to purify RSS feeds, and today it failed.

My config is:

$config = array(
'safe' => 1,
'deny_attribute' => 'on*,style,',
);

The HTML that was received from an arbitrary RSS was:

<span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman"><strong>Hey there everyone!</strong><br /></font></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">This is my first blog post, so I just want to scribble some thoughts I've been having about this…</font></font></span><span style="FONT-FAMILY:David;"><font size="3">&nbsp;<br /><br /></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">I recently found out that a couple of the people I work with have blogs they publish over the internet. At first, I got a good laugh out of this idea. I really like the internet community, and I even try to help out in some forums every once in a while, but I didn't find the point in publishing a weekly/monthly/yearly/whatever blog about my current findings or projects that I'm working on.<br /><br /></font></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">Well, after a long chat with <a class="" title="Yossi" href="http://blogs.microsoft.co.il/blogs/ysa/" target="_blank">Yossi</a>, he finally convinced me to open my own blog. My arguments were that I think it's a waste of time, not that many people even read it, I rather work on newer projects then write about the old ones, and the list goes on. On the other side, he told me that it's a good idea to post these blogs for yourself and no one else. He told me that he seldom runs into a problem he had in the past, and if he doesn't remember the solution he searches his recent blog posts. 
More what, when he comes to publishing a blog, he isn't satisfied with just what he knows on the subject he's going to post, so he looks up more about it to cover the subject fully.</font></font></span><span style="mso-bidi-font-family:David;"><font face="Times New Roman" size="3">&nbsp;</font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">Now I'm starting to think that the same will be happening to me, I just opened this blog, and my head is already spinning with subjects I want to post about and I'm already searching the net to find out more than what I already know about these subjects.<br /><br /></font></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">I've been building web sites for the past couple of years, using mostly php and MySQL.</font></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">Just recently (a couple of months ago) I started working in the .NET field, learning C#, ASP.NET and started to work with MOSS 2007 (Microsoft Office SharePoint Server) which I like a lot! So I'm dedicating my blog to these new subjects I'm learning, hoping that it will help me advance and boost my motivation!<br /><br /></font></font></span><span style="mso-bidi-font-family:David;"><font size="3"><font face="Times New Roman">Well, I gotta go now… There's a lot I need to learn!...</font></font></span><img src="http://blogs.microsoft.co.il/aggbug.aspx?PostID=126313" width="1" height="1">

And the output is:

<span><span style="font-size: medium;"><span style="font-family: Times;"><strong>Hey there everyone!</strong><br /></span></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">This is my first blog post, so I just want to scribble some thoughts I've been having about this…</span></span></span><span><span style="font-size: medium;">&nbsp;<br /><br /></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">I recently found out that a couple of the people I work with have blogs they publish over the internet. At first, I got a good laugh out of this idea. I really like the internet community, and I even try to help out in some forums every once in a while, but I didn't find the point in publishing a weekly/monthly/yearly/whatever blog about my current findings or projects that I'm working on.<br /><br /></span></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">Well, after a long chat with <a class="" title="Yossi" href="http://blogs.microsoft.co.il/blogs/ysa/" target="_blank">Yossi</a>, he finally convinced me to open my own blog. My arguments were that I think it's a waste of time, not that many people even read it, I rather work on newer projects then write about the old ones, and the list goes on. On the other side, he told me that it's a good idea to post these blogs for yourself and no one else. He told me that he seldom runs into a problem he had in the past, and if he doesn't remember the solution he searches his recent blog posts. 
More what, when he comes to publishing a blog, he isn't satisfied with just what he knows on the subject he's going to post, so he looks up more about it to cover the subject fully.</span></span></span><span><span style="font-family: Times; font-size: medium;">&nbsp;</span></span><span><span style="font-size: medium;"><span style="font-family: Times;">Now I'm starting to think that the same will be happening to me, I just opened this blog, and my head is already spinning with subjects I want to post about and I'm already searching the net to find out more than what I already know about these subjects.<br /><br /></span></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">I've been building web sites for the past couple of years, using mostly php and MySQL.</span></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">Just recently (a couple of months ago) I started working in the .NET field, learning C#, ASP.NET and started to work with MOSS 2007 (Microsoft Office SharePoint Server) which I like a lot! So I'm dedicating my blog to these new subjects I'm learning, hoping that it will help me advance and boost my motivation!<br /><br /></span></span></span><span><span style="font-size: medium;"><span style="font-family: Times;">Well, I gotta go now… There's a lot I need to learn!...</span></span></span><img src="http://blogs.microsoft.co.il/aggbug.aspx?PostID=126313" width="1" height="1" alt="image" />

style attributes remain... Help?

2

Re: Deny_attribute => 'style' not working?

After spending some time in the documentation, I think the problem is related to "no_deprecated_attr", which transforms tags like <font> into tags carrying 'style' attributes, thus circumventing the expected behavior.

I don't want to allow 'style' attributes, and I don't want them to sneak in via deprecated HTML, either.

Is there a way?

3

Re: Deny_attribute => 'style' not working?

Successful tag-transformation of elements like 'font' and 'u' requires the use of the 'style' attribute. Unfortunately there is no way around it.

Note that with the particular configuration you cite, htmLawed will not let user-submitted 'style' attribute values be sneaked in this way. Thus, 'style' in input like

<font face="Times" style="font-variant: bold;">Hey there everyone</font>

*will* be removed, giving

<span style="font-family: Times;">Hey there everyone</span>

The only 'style' value appearing in the output is the one added by htmLawed.

I agree that this can be exploited to alter the appearance of text against the admin's wishes, as a user can change the font face or font size this way. One way to avoid this is to simply disallow the 'font' tag; the content of the 'font' element (the plain text) will still get through, which is what you want.
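
A minimal sketch of that suggestion, assuming the 'elements' parameter is given in its '*-font' form (every otherwise-allowed element minus 'font') and that htmLawed.php sits next to the script. The $in value is a made-up sample, not the feed from the first post:

```php
<?php
// Assumption: htmLawed.php is in the same directory as this script.
require_once 'htmLawed.php';

// Hypothetical input; in practice this would be the RSS item body.
$in = '<span style="color: red;"><font size="3" face="Times New Roman">Hey there everyone!</font></span>';

$config = array(
  'safe' => 1,                  // already covers the on* attributes
  'deny_attribute' => 'style',  // strips user-supplied 'style' attributes
  'elements' => '*-font',       // every otherwise-allowed element except 'font'
);

$out = htmLawed($in, $config);
echo $out;
```

With 'font' disallowed there is nothing for htmLawed to transform, so no htmLawed-generated 'style' should come from that element. Other deprecated elements or attributes that htmLawed converts to CSS may need the same treatment.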

As a side note, with 'safe' set to 1, 'on*' need not be specified in 'deny_attribute'. Also, 'deny_attribute' does not need the trailing comma (that bug was fixed some versions ago).

$config = array(
  'safe' => 1,
  'deny_attribute' => 'style',
);

4

Re: Deny_attribute => 'style' not working?

After some consideration, I've worked around the problem by removing the font tags before passing off to htmLawed.
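
One way such a pre-filter could look, as a rough sketch rather than the exact code used here. The function name strip_font_tags is hypothetical and not part of htmLawed, and the regular expression assumes no attribute value contains a literal '>':

```php
<?php
// Hypothetical helper, not part of htmLawed: removes opening and closing
// font tags but leaves everything between them untouched.
function strip_font_tags($html) {
  // \b stops the pattern from matching other tag names that merely start with "font".
  return preg_replace('~</?font\b[^>]*>~i', '', $html);
}

$in = '<font size="3"><font face="Times New Roman"><strong>Hey there everyone!</strong><br /></font></font>';
echo strip_font_tags($in);
// prints: <strong>Hey there everyone!</strong><br />
```

The result of strip_font_tags() would then be passed to htmLawed() with the usual $config.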

Thanks for your explanation, help, and great work on htmLawed!