1

Topic: Converting only completed tags

Hi,

First of all, please forgive me if this has been brought up before - I suspect it has but couldn't find it looking through the forum.

I'm specifically interested in using htmLawed for XSS prevention (thank you so much for creating it and making it available!). I present a form to the end user and they enter data - I pass it through htmLawed and the result is saved to the DB (it can't be done on output for legacy reasons in the app - I'm going to change that in a future version).

If the user submits something as simple as `<` it gets encoded as `&lt;`. Could you clarify why this is? I don't quite understand how a single bracket could be a security issue. Continuing that theme, `2014<<2015` for example would become `2014&lt;&lt;2015`.

Might it be possible that if only one of `<` or `>` were found in a string then there is no need to convert it?

Regards,
Allan

2

Re: Converting only completed tags

Hi Allan,

The `<` and `>` characters have a special meaning in HTML and therefore cannot be used literally in plain text. This is why htmLawed converts a lone `<` in the input, which obviously is not part of any HTML element, to `&lt;`.

I can see that this issue is inconvenient when, as in your case, the input is filtered and stored and later retrieved for editing or use outside the HTML context.

Since htmLawed is agnostic of the intent or workflow of the administrator setting it up, I don't think changing the htmLawed logic to address this issue would be sensible.

The best solution, as you are thinking, is to use htmLawed filtering only when text is displayed and not when it is stored. Another option, which may have some demerits, is to convert the pre-filtered stored text when it is provided for editing. E.g.,

```php
// Filter user input before storage
$savedText = htmLawed($userText...);

// For editing, retrieve stored text
...

// Replace in the retrieved text &lt;, &gt; and &amp; with <, > and &
$textToEdit = str_replace(array('&lt;'...), array('<'...), $textToEdit);

// Use htmlspecialchars() before putting the text in an HTML textarea element
$textToEdit = htmlspecialchars($textToEdit);
```
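
As a rough sketch of the second option, assuming the stored text needs only the three replacements listed above, the decode and re-encode steps might look like the following. The function names `decodeStoredText` and `encodeForTextarea` are made up for illustration and are not part of htmLawed, and the stored value is simply the one from Allan's example rather than the output of a real htmLawed call.

```php
<?php
// Hypothetical helper (not part of htmLawed): turn &lt;, &gt; and &amp;
// in the stored text back into <, > and &.
function decodeStoredText(string $storedText): string
{
    // &amp; is replaced last, so a stored "&amp;lt;" becomes "&lt;" and not "<".
    return str_replace(
        array('&lt;', '&gt;', '&amp;'),
        array('<', '>', '&'),
        $storedText
    );
}

// Hypothetical helper: prepare the stored text for an HTML textarea element.
function encodeForTextarea(string $storedText): string
{
    // Decode first, then encode once for the HTML context of the edit form.
    return htmlspecialchars(decodeStoredText($storedText), ENT_QUOTES, 'UTF-8');
}

// Assumed stored value, taken from the example in the first post.
$stored = '2014&lt;&lt;2015';

echo '<textarea name="comment">' . encodeForTextarea($stored) . '</textarea>' . "\n";
// prints: <textarea name="comment">2014&lt;&lt;2015</textarea>
```

The browser shows `2014<<2015` inside the textarea, and `decodeStoredText()` on its own gives the plain text for use outside an HTML context. PHP's built-in `htmlspecialchars_decode($text, ENT_NOQUOTES)` should perform the same three replacements in one call, if you prefer not to list them by hand.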

A third possible option is to use code that relies on htmLawed's `and_mark` config. option; see this documentation section.