1

Topic: Regex performance problem

Sorry, I previously posted this issue here:
https://github.com/kesar/HTMLawed/issues/30

There is a regex that performs very slowly in certain scenarios (I'm using HTMLawed indirectly through GLPI). I experience the issue with PHP 7, but not with PHP 8.

I'm not sure how to work around it.

2

Re: Regex performance problem

Thanks for noting this issue. I used some online PHP code runners and did not see any performance difference between various PHP 7 and PHP 8 versions for the test code which you provided at https://github.com/kesar/HTMLawed/issues/30; they all took 20-100 ms. E.g., see https://3v4l.org/WR9Vd

3

Re: Regex performance problem

Apparently, the loop needed a few more iterations for the delay to happen.

I found a new way to trigger the "bug", without requiring long lines.
If you run this with any PHP 7.4.x version up to and including 7.4.11 (7.4.x <= 7.4.11), it will be slow:
https://3v4l.org/1n3Vh#v7.4.0

It already takes 1.5s as is.

If you change the "180" value in the second loop to something slightly higher, like 200 or 300, you will hit the timeout.

Apparently PHP 7.4.12 fixed it by updating to PCRE 10.35, which resolves the problem by enabling JIT optimization.
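
To see which PCRE build and JIT setting a given PHP installation is using, something like the following should work. It is only a sketch: `PCRE_VERSION`, `ini_get()` and `preg_last_error()` are standard PHP, but `time_preg_match()` is a made-up helper name, and `pcre.jit` only shows whether JIT is requested in php.ini, not whether it was actually used for a given pattern.

```php
<?php
// Report the PHP version and the PCRE build that the preg_* functions use.
echo 'PHP version:  ', PHP_VERSION, PHP_EOL;
echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;

// pcre.jit is a php.ini setting; ini_get() returns false if it does not exist.
echo 'pcre.jit:     ', var_export(ini_get('pcre.jit'), true), PHP_EOL;

// Hypothetical helper (not part of htmLawed or PHP): time one preg_match()
// call and return its result, the elapsed seconds, and the PCRE error code.
function time_preg_match(string $pattern, string $subject): array
{
    $start   = microtime(true);
    $result  = preg_match($pattern, $subject);
    $elapsed = microtime(true) - $start;

    return [
        'result'  => $result,            // 1 = match, 0 = no match, false = error
        'seconds' => $elapsed,
        'error'   => preg_last_error(),  // PREG_NO_ERROR when nothing went wrong
    ];
}
```

Calling `time_preg_match()` with the slow pattern and input on two PHP versions and comparing `seconds` and `error` (for example `PREG_BACKTRACK_LIMIT_ERROR`) may help tell a real slowdown from a silent PCRE failure.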

I understand this behaviour is not intrinsic to HTMLawed, and I would understand if you left it as is. In the meantime I'm experimenting with workarounds, like replacing newlines with tabs, and oddly enough that seems to work.
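
For anyone who wants to try the same thing, here is a rough sketch of the newline-to-tab workaround. The function name `sanitize_with_tab_workaround()` and the `$sanitize` callable are placeholders of mine, not part of HTMLawed or GLPI; the callable stands in for whatever actually does the filtering.

```php
<?php
// Hypothetical wrapper: replace every line break with a tab before handing
// the HTML to the sanitizer passed in as $sanitize.
function sanitize_with_tab_workaround(string $html, callable $sanitize): string
{
    // str_replace() applies the pairs in order: CRLF and lone CR are first
    // normalised to LF, then every LF becomes a tab.
    $flattened = str_replace(["\r\n", "\r", "\n"], ["\n", "\n", "\t"], $html);

    return $sanitize($flattened);
}
```

Assuming htmLawed.php is loaded, `sanitize_with_tab_workaround($input, 'htmLawed')` would run it with the default configuration. Note that this changes the content: line breaks inside whitespace-sensitive elements such as `<pre>` or `<textarea>` are lost, so it is a stopgap rather than a proper fix.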

4

Re: Regex performance problem

Thank you for investigating this further and noting your observations here. As you suggested, I won't be changing the htmLawed regex, since it is not really possible to do so without reducing its versatility across the various cases it has to handle.