Your assumptions about why we need to keep bbcodes are correct: mainly ease of use, and backwards compatibility for the thousands of posts we already have.
The reason the text is pre-parsed and stored in the db as parsed HTML is simply performance. There are far more views than inserts, so there is little point in parsing from bbcode on every request. At 100 posts per page, that is 100 parses of possibly fairly large texts, and that is just for one member. With 3k online at any given time, you can see why it is beneficial to keep it parsed in the db and just output it as it is, with no further checks, even if the parsing stage takes a bit longer when data goes in. Overall it is simply more efficient.
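To put rough numbers on that, here is a minimal sketch of parse-on-insert. It is not the actual forum code: parse_bbcode, insert_post and render_page are made-up names, the toy parser only knows [b], and a plain array stands in for the db table.

```php
<?php
// Counts how many times the (expensive) parser runs.
$parses = 0;

// Hypothetical stand-in for the real bbcode parser; it only handles [b].
function parse_bbcode(string $bbcode, int &$parses): string {
    $parses++;
    $safe = htmlspecialchars($bbcode, ENT_QUOTES, 'UTF-8');
    return str_replace(array('[b]', '[/b]'), array('<strong>', '</strong>'), $safe);
}

// Write path: parse once and store the resulting HTML.
function insert_post(array &$db, string $bbcode, int &$parses): void {
    $db[] = parse_bbcode($bbcode, $parses);
}

// Read path: output the stored HTML as is, with no parsing and no checks.
function render_page(array $db): string {
    return implode("\n", $db);
}

$db = array(); // stands in for the posts table
for ($i = 0; $i < 100; $i++) {
    insert_post($db, "[b]post $i[/b]", $parses);
}

// 3000 page views of the same 100 posts.
for ($view = 0; $view < 3000; $view++) {
    $page = render_page($db);
}

echo $parses, "\n"; // 100 (parsing on output would have run 100 * 3000 times)
```

The parser cost is paid once per post instead of once per post per view, which is the whole argument.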
--------
If I were to store it as bbcode, the first stage you mentioned isn't even necessary: an htmlentities call would secure the input completely (the bbcodes aren't parsed yet, and the bbcodes themselves use [] instead of <>). Then on output I would have to parse it to HTML, and then check the HTML produced using this lib or kses. I think it is much better to pre-parse it and spend a bit more processing time during parsing/unparsing. Unparsing is actually very fast with the <!--x--> system I'm using: there are no security checks, since those are done during parsing, and it's just a simple str_replace call as mentioned below. Generally I believe in the idea that you secure while you insert and then just display as is.
A simpler example would be storing a 'name'. It is much more efficient to run htmlentities on the name before you insert it in the db and then just display it as is, instead of securing it each time you display it.
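The 'name' case might look something like this. It is only a sketch: store_name and show_name are hypothetical helpers, and an array stands in for the db.

```php
<?php
// Secure once, on the way in.
function store_name(array &$db, string $name): void {
    $db[] = htmlentities($name, ENT_QUOTES, 'UTF-8');
}

// Display as is: no escaping work at view time.
function show_name(array $db, int $id): string {
    return $db[$id];
}

$db = array(); // stands in for the members table
store_name($db, '<script>alert(1)</script>');

$stored = show_name($db, 0);
echo $stored, "\n"; // &lt;script&gt;alert(1)&lt;/script&gt;
```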
---------
For strong and em, yeah.. but what about more complex tags built from div and span that use classes to define the style?
e.g.:
[box]text[/box] => <div class="box">text</div>
[bbox]text[/bbox] => <div class="bbox">text</div>
With a simple str_replace there is no way of distinguishing the two </div>s; you wouldn't know which one is box and which one is bbox. Of course my unparsing is just simple str_replaces, which is why I put the comments in there. So <div class="box">text<!--box--></div> can easily be translated back with str_replace(array('<div class="box">','<!--box--></div>'), array('[box]','[/box]'), $html); (with $html being the stored text) etc.
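A minimal sketch of that unparse step for the box/bbox pair, nested on purpose so the two closes have to be told apart. unparse_boxes is a made-up name and the stored HTML is invented for the example.

```php
<?php
// Stored HTML with a marker comment right before each closing tag.
$html = '<div class="box">a<div class="bbox">b<!--bbox--></div> c<!--box--></div>';

// Plain string replacement: the marker makes every close string unique,
// so no regex or tag matching is needed.
function unparse_boxes(string $html): string {
    $search = array(
        '<div class="box">',  '<!--box--></div>',
        '<div class="bbox">', '<!--bbox--></div>',
    );
    $replace = array(
        '[box]',  '[/box]',
        '[bbox]', '[/bbox]',
    );
    return str_replace($search, $replace, $html);
}

$bbcode = unparse_boxes($html);
echo $bbcode, "\n"; // [box]a[bbox]b[/bbox] c[/box]
```

Without the markers both closes would be a bare </div> and the replacement could not pick the right bbcode close.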
You could argue that I could just use [/box] for all </div> closes and be done with this issue altogether, but that won't look nice for the end user :P
Plus, the box/bbox example wasn't a good one anyway.
What if you have floaters, [left] and [right]? Both use divs to accomplish it, and you can float anything you want in them: images, quote boxes, text, urls etc. Having [left]asdf[/box] wouldn't make much sense ;)
Or a [ code] box, or [center] to center whatever you put inside it. A similar problem arises with span tags: size, color, underline, dropcaps, strikethrough etc. The bbcode I'm offering is quite rich. HTML as input would be very hard for the members: would they need to memorize class names and learn HTML in order to make a nice looking post?
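With that many div and span tags, one way to keep the marker idea manageable is a single tag table driving both directions. This is a toy sketch only: the $tags table, the class names and the function names are all assumptions, and it skips the real security checks (unbalanced tags, attributes and so on) that the parsing stage has to do.

```php
<?php
// Hypothetical tag table: bbcode name => array(html open, html close).
$tags = array(
    'left'   => array('<div class="left">',   '</div>'),
    'right'  => array('<div class="right">',  '</div>'),
    'center' => array('<div class="center">', '</div>'),
    'u'      => array('<span class="u">',     '</span>'),
);

// Build parallel bbcode/html lists; every close gets its own <!--name--> marker.
function build_pairs(array $tags): array {
    $bb = array();
    $html = array();
    foreach ($tags as $name => $pair) {
        $bb[] = '[' . $name . ']';
        $html[] = $pair[0];
        $bb[] = '[/' . $name . ']';
        $html[] = '<!--' . $name . '-->' . $pair[1];
    }
    return array($bb, $html);
}

// Insert time: escape first, so user-typed < > can never forge a tag or a marker.
function parse(string $text, array $tags): string {
    list($bb, $html) = build_pairs($tags);
    return str_replace($bb, $html, htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
}

// Edit time: the same table, reversed, then undo the escaping.
function unparse(string $stored, array $tags): string {
    list($bb, $html) = build_pairs($tags);
    return htmlspecialchars_decode(str_replace($html, $bb, $stored), ENT_QUOTES);
}

$original = '[left]a < b[/left][right][u]hi[/u][/right]';
$stored = parse($original, $tags);
$back = unparse($stored, $tags);

echo $stored, "\n";
echo $back, "\n"; // same text as $original
```

The member only ever sees [left] and [/left]; the class names and markers stay on the stored side.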
Of course you could use JS to create a nice editor that automates all of this, but then what about members that have JS disabled?