1

Topic: Cut short HTML, tag balancing

When a piece of HTML is sent to a web app, the transmission is often interrupted, so that the message ends after a < and before the matching >, or between the quotes of an attribute value. It seems that htmLawed does not correct for this very common problem, even with tag balancing turned on. Is that right? Should I write my own PHP code to solve this problem?

Thank you for htmLawed, it is awesome!

2

Re: Cut short HTML, tag balancing

htmLawed can handle broken or incorrect HTML only to a certain degree. At some point it becomes difficult to know whether a mistake was intentional (e.g., for an XSS attack) or not, and if it was not, it can also be difficult to discern what the writer intended. For example, the '<' in b<a can be a less-than sign or the start of an opening 'a' tag. Input errors can occur for different reasons, which vary from one setup or use environment to another. htmLawed cannot know the cause and has to be cause-agnostic.

You may have to write some code to adjust the input before passing it to htmLawed. Perhaps checking that every < has a matching > and that " occurs an even number of times, and then appending the missing characters at the end of the input, will be good enough.
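
A rough sketch of such a pre-processing step is given here. The function name close_truncated_tag is made up for illustration and is not part of htmLawed. It assumes that the only damage is a cut inside the last tag, and that attribute values use double quotes and contain no > character:

```php
<?php
// Hypothetical helper, not part of htmLawed: closes a tag that was cut off
// at the very end of the input. Assumes double-quoted attribute values.
function close_truncated_tag(string $html): string
{
    $lastOpen  = strrpos($html, '<');
    $lastClose = strrpos($html, '>');

    // No '<' at all, or the last '<' is followed by a '>':
    // nothing looks cut off, so return the input unchanged.
    if ($lastOpen === false || ($lastClose !== false && $lastClose > $lastOpen)) {
        return $html;
    }

    // The input ends inside what may be a tag. An odd number of double
    // quotes in that fragment suggests an unterminated attribute value.
    $fragment = substr($html, $lastOpen);
    if (substr_count($fragment, '"') % 2 === 1) {
        $html .= '"';
    }

    // Append the missing closing bracket.
    return $html . '>';
}
```

The returned string would then be passed to htmLawed as usual, with tag balancing left on so that the unclosed elements get closed. Note that this runs into the ambiguity mentioned above: an input ending in b<a would become b<a>, even if the < was meant as a less-than sign. Whether that is acceptable depends on your setup.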

[thanks for the feedback]