1

Topic: why sup is not allowed in pre

Hi, thanks for such an awesome project.

I use htmLawed in graby to crawl and process web pages.

Recently, I found that some Wikipedia pages contain markup like this:

<pre>algorithm is...over 2<sup>32</sup> blablabla </pre>

However, there is a momkid relation that does not allow sup in pre, so the output becomes:

<pre>algorithm is...over 2</pre><sup>32</sup> blablabla

The closing pre tag is inserted in the wrong place.

I skimmed through the readme and searched a little on the internet, but I could not understand why this happens.

My questions are:

1. What is the reason for this invalid momkid relationship?
2. Is it the intended behavior to close pre before sup?

2

Re: why sup is not allowed in pre

Thank you for reporting this issue. Element sup is permitted in pre, and I will soon release a new version that fixes this. Until then, you can edit the htmLawed code: in function hl_bal, find array invalidMomKidAr and remove sup from the value for key pre.

In the older HTML 4 specification, sup was not permitted in pre*, and htmLawed's logic is still based on that old specification.
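
For anyone reading along, here is a rough sketch of how a parent/child exclusion table leads to the output reported above. It is only an illustration in Python, not htmLawed's actual code (htmLawed is PHP): the names balance and HTML4_PRE_EXCLUSION are made up for this example, and the table simply copies the pre.exclusion list from the HTML 4 strict DTD rather than the contents of invalidMomKidAr.

```python
import re

# Hypothetical table modelled on the HTML 4 strict DTD "pre.exclusion" list.
# It is NOT a copy of htmLawed's invalidMomKidAr array.
HTML4_PRE_EXCLUSION = {"pre": {"img", "object", "big", "small", "sub", "sup"}}

# Matches an opening/closing tag, or a run of plain text.
TOKEN = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|[^<]+")


def balance(html, invalid_mom_kid):
    """Simplified balancer: close an open parent that may not hold the next child.

    Void elements, comments and malformed input are deliberately ignored.
    """
    out, stack = [], []
    for match in TOKEN.finditer(html):
        closing, name = match.group(1), match.group(2)
        if name is None:  # plain text, copy through
            out.append(match.group(0))
            continue
        name = name.lower()
        if closing:
            if name in stack:
                # Close everything up to and including the matching element.
                while stack:
                    top = stack.pop()
                    out.append(f"</{top}>")
                    if top == name:
                        break
            # A closing tag with no open partner is dropped.
            continue
        # Opening tag: close the current parent while it excludes this child.
        while stack and name in invalid_mom_kid.get(stack[-1], ()):
            out.append(f"</{stack.pop()}>")
        stack.append(name)
        out.append(match.group(0))
    while stack:  # close anything left open at the end
        out.append(f"</{stack.pop()}>")
    return "".join(out)


src = "<pre>algorithm is...over 2<sup>32</sup> blablabla </pre>"

# With the HTML 4 rule, pre is closed just before sup.
print(balance(src, HTML4_PRE_EXCLUSION))

# With sup removed from the exclusion set, the input passes through unchanged.
relaxed = {"pre": HTML4_PRE_EXCLUSION["pre"] - {"sup"}}
print(balance(src, relaxed))
```

In this sketch the original closing pre tag is dropped because pre was already closed when sup was opened, which is why the trailing text ends up outside pre. Removing sup from the exclusion set is the same kind of change as the manual edit described above.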

Please report any similar nesting issues if you notice them.

*https://www.w3.org/TR/html4/sgml/dtd.html#pre.exclusion

3

Re: why sup is not allowed in pre

patnaik wrote:

Please report any similar nesting issues if you notice them.

Ok, thanks.

4

Re: why sup is not allowed in pre

New release 1.2.9 fixes this issue.