1

Topic: LAMP server-side htmLawed setup

Greetings,

I would like to get htmLawed up and running on my server so that all source code generated by Drupal is cleaned up. The problem is that I found the Usage instructions in the Documentation rather limited. I have also spent hours googling for a basic setup tutorial or instructions and simply can't find anything. It appears to be rather simple, but I am not a seasoned admin or PHP developer, so I am not sure how to accomplish this. I would be very grateful if somebody would kindly provide me with some instructions, such as where in my file system to put htmLawed.php and how to configure php.ini to recognize and use it. As for configuring htmLawed itself, there is plenty of info available for that part.

Cheers
Kevin

2

Re: LAMP server-side htmLawed setup

Hi.

Can you clarify a bit? Is it that you want to use htmLawed with the Drupal software?

Some usage notes regarding htmLawed; pardon me if they are too simple or if you can already surmise them.

(1) The htmLawed code is invoked like any other PHP code. For example, one can have a PHP file named example.php with the following code in a sub-directory of the PHP-enabled web-server's root directory:

<?php
include('htmLawed.php'); // htmLawed.php is in the same directory as example.php
$text = '<em>Hello world!</em>';
echo htmLawed($text);

htmLawed will be invoked when the web-server is called to process example.php (e.g., when a web-site visitor browses to the web-address http://my.web-server.com/example.php).

(2) For Drupal, there is an htmLawed module. Using it involves downloading the module files, placing them at the right location in the Drupal file-system on the web-server, and then configuring htmLawed behavior using Drupal's web-based administrative interface. See https://drupal.org/node/255900.

(3) The htmLawed code cannot handle HTML code that is used in the <head> section of HTML documents. If a full HTML document, such as the entire text of a .html file, is passed to htmLawed, htmLawed will corrupt the document structure. So, htmLawed (just by itself) shouldn't be used to process entire HTML documents, such as the complete output of the Drupal content management system. htmLawed in the Drupal htmLawed module mentioned above only handles text that users type in the Drupal forms for comments, blog posts, etc. online. Such text after various filtering/processing gets displayed in the <body> (and not <head>) section of HTML documents that are created on the fly by Drupal.

(4) There is no need to alter the php.ini file on the web-server in order to use htmLawed. As long as PHP is enabled on the server, htmLawed will work when appropriately invoked through PHP scripts.

(5) It is theoretically possible to have every HTML document generated on a web-server (regardless of what generated them: Drupal, a Wiki software, static HTML files, etc.) filtered through htmLawed before being sent over the internet to a web-site visitor. However, I have not seen such an implementation.
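
For PHP-generated pages, the idea in (5) could perhaps be approximated with PHP's output buffering. The following is only a rough, untested-in-production sketch, not an established htmLawed setup. The file name prepend.php, the helper filter_body_only(), and the location of htmLawed.php are all assumptions made for illustration. Because of the limitation described in (3), only the markup between the body tags is passed to htmLawed. The 'tidy' option is htmLawed's beautify setting as I understand it from the documentation; please verify it there. Also keep in mind that htmLawed does more than reformat (for example, it balances tags and transforms deprecated markup), so the page content may change.

```php
<?php
// prepend.php: hypothetical file name; not part of htmLawed itself.

/**
 * Apply $filter only to the markup between <body ...> and </body>,
 * leaving the doctype, <head>, and the body tags themselves untouched.
 * Returns the input unchanged if no body element is found or the regex fails.
 */
function filter_body_only(string $html, callable $filter): string
{
    $result = preg_replace_callback(
        '~(<body\b[^>]*>)(.*)(</body\s*>)~is',
        function (array $m) use ($filter): string {
            return $m[1] . $filter($m[2]) . $m[3];
        },
        $html,
        1
    );
    // preg_replace_callback() returns null on error (e.g., very large pages).
    return $result ?? $html;
}

// Assumption: htmLawed.php sits in the same directory as this file.
if (is_file(__DIR__ . '/htmLawed.php')) {
    require_once __DIR__ . '/htmLawed.php';

    // Buffer the whole page; the callback runs when the script finishes.
    ob_start(function (string $buffer): string {
        return filter_body_only($buffer, function (string $body): string {
            // 'tidy' => 1 asks htmLawed to beautify the output (verify in the docs).
            return htmLawed($body, ['tidy' => 1]);
        });
    });
}
```

To run this for every PHP script of a site, PHP's auto_prepend_file directive could point to the file, either in php.ini or, with mod_php, per site via a line such as "php_value auto_prepend_file /path/to/prepend.php" in the virtual-host or .htaccess configuration (the path is a placeholder). This is the one case where php.ini might be touched; htmLawed itself still needs no such change. Static .html files that Apache serves directly never pass through PHP, so they would not be affected.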

3

Re: LAMP server-side htmLawed setup

Hi,

Thanks for the reply. I was expecting to receive an e-mail notification when there was a response here and I did not. Glad I decided to check in anyway.

I use Drupal to build websites but I am not really interested in htmLawed for Drupal. I am more interested in having it nicely format the source code of every page generated by Apache and PHP. So I guess I am trying to find out how to install it as a server-side filter, so that the source code of any page generated by my server, whether from Drupal, hand-coded, or anything else, is processed and cleaned up by htmLawed.

Number (1) looks like what I am after. But can I set it up to process the output of every page served by my server (e.g., answer #5)? You say you have not seen such an implementation, so I am curious how the perfectly structured, beautified HTML that I see in the source code of some sites is created. I thought this was the result of using something like Tidy or htmLawed.

I really appreciate your help. I haven't been able to find much online. Perhaps I will write a tutorial on how to do this for others once I solve it.

Cheers
Kevin

4

Re: LAMP server-side htmLawed setup

I should also mention that if I could achieve this on a per-site basis, as opposed to the per-page basis outlined in (1), that would work too. I would just like to achieve really tight, nicely formatted code, which is hard to get when using systems that piece it together piecemeal. Cheers Kevin

5

Re: LAMP server-side htmLawed setup

I think it will be best to use a module in Apache. There is at least one such module, mod_tidy; see http://mod-tidy.sourceforge.net/

It will filter/process Apache's HTML output using the HTML Tidy software.

Like most Apache modules, I think it can be configured on a per-site basis.

I have not used mod_tidy but you should be able to get sufficient information online from its website, some discussion forum (e.g., http://www.apachelounge.com/viewtopic.php?t=174), etc.

6

Re: LAMP server-side htmLawed setup

I don't think Tidy works with HTML 5. Does it?

7

Re: LAMP server-side htmLawed setup

There is a Tidy fork for HTML5: http://w3c.github.io/tidy-html5/.

Perhaps you can replace the Tidy within mod_tidy Apache module download with this before installing the module.

8

Re: LAMP server-side htmLawed setup

I'll give it a try. Thanks a lot for all of your help. Cheers, Kevin