1

Topic: Bug in font-face matching

If you take the sample input from 3.3.2

<center>
The PHP <s>software</s> script used for this <strike>web-page</strike> web-page is <font style="font-weight: bold " face=arial size='+3' color   =  "red  ">htmLawedTest.php</font>, from <u style= 'color:green'>PHP Labware</u>.
</center> 

and run it through the test page as of 1.1.11 (http://www.bioinformatics.org/phplabwar … edTest.php), the "a" in "arial" gets chopped off.

This can be fixed by changing lines 625-627 from:

 if(preg_match('`face\s*=\s*(\'|")([^=]+?)\\1`i', $a, $m) or preg_match('`face\s*=\s*([^"])(\S+)`i', $a, $m)){
  $a2 .= ' font-family: '. str_replace('"', '\'', trim($m[2])). ';';
 }

to

 if(preg_match('`face\s*=\s*(\'|")([^=]+?)\\1`i', $a, $m)){
  $a2 .= ' font-family: '. str_replace('"', '\'', trim($m[2])). ';';
 }elseif(preg_match('`face\s*=\s*(\S+)`i', $a, $m)){
  $a2 .= ' font-family: '. str_replace('"', '\'', trim($m[1])). ';';
 }

The problem is that the first capture group in the second preg_match, ([^"]), consumes the first character of the value, and it doesn't need to: the first preg_match already handles quoted values, and the \s* already matches any spaces.
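
For anyone who wants to see this outside of htmLawed, here is a minimal standalone sketch. The $a string is an assumed stand-in for the attribute text of the font tag in the sample, and $old and $new are illustrative variable names, not names from the htmLawed source.

```php
<?php
// Assumed stand-in for the attribute string of the <font> tag in the sample.
$a = "style=\"font-weight: bold \" face=arial size='+3'";

// Second pattern from 1.1.11: ([^"]) takes the first character of the value,
// so only the remainder ends up in group 2.
preg_match('`face\s*=\s*([^"])(\S+)`i', $a, $old);
echo $old[2], "\n"; // rial

// Proposed pattern: the whole unquoted value is captured in group 1.
preg_match('`face\s*=\s*(\S+)`i', $a, $new);
echo $new[1], "\n"; // arial
```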

2

Re: Bug in font-face matching

Thank you for pointing out this bug. I have released version 1.1.12 of htmLawed to fix it. To avoid an if-else, I changed the second preg_match's regex pattern to

preg_match('`face\s*=(\s*)(\S+)`i', $a, $m)
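
With this pattern the optional whitespace after the equals sign lands in group 1, so the value stays in $m[2] and the existing line that builds font-family needs no change. A minimal sketch to check that, where unquoted_face() is a hypothetical helper written for illustration and not a function in htmLawed:

```php
<?php
// Hypothetical helper (not part of htmLawed) wrapping the 1.1.12 pattern.
function unquoted_face(string $a): ?string {
    // Group 1 is the optional whitespace after '=', group 2 is the value.
    if (preg_match('`face\s*=(\s*)(\S+)`i', $a, $m)) {
        return $m[2];
    }
    return null; // no face attribute found
}

echo unquoted_face("face=arial size='+3'"), "\n";   // arial
echo unquoted_face("face = arial size='+3'"), "\n"; // arial
```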