rTxt2htm documentation

1  About
  1.1  License & copyright
  1.2  Formatters
2  Usage
  2.1  Simple
  2.2  Modifying styles and layout
  2.3  Modifying code logic
  2.4  Usage with non-English text
3  Other
  3.1  Upgrade
  3.2  Change-log
  3.3  Support
  3.4  Donate

rTxt2htm_README.txt
rTxt2htm 1.2.1, 22 January 2019
Copyright Santosh Patnaik
GPL 3 license
A PHP Labware internal utility - http://www.bioinformatics.org/phplabware/internal_utilities/index.php 

1  About

rTxt2htm creates standalone, XHTML 1-Strict HTML files from text files with special but simple and unobtrusive markup. It is intended for generating HTML versions of plain-text documentation (like readme files that accompany software distributions; the r in rTxt2htm hints at readme). It can also be used to create simple web pages.

Documentation files are often in plain-text format, which, while versatile, lacks the enhanced functionality of hyperlinks that allow one to jump between sections of the documentation or to resources outside it.

rTxt2htm parses text files written in a specific format for title, URLs, sections, code fragments, styled text, tables of content, etc., and creates the HTML elements needed to present them in the HTML output. The format rTxt2htm uses is somewhat like the reStructuredText (reST) format.

A form interface is provided to upload the plain text files, or paste their content, and to set additional information for use in the generated HTML content.

1.1  License & copyright

rTxt2htm is free software licensed under the GNU General Public License (GPL) version 3, and copyrighted by Santosh Patnaik, MD, PhD.

1.2  Formatters

rTxt2htm looks for specific white-spacing, characters, etc. (formatters) in the plain-text files for creating the necessary HTML elements.

The formatters that rTxt2htm uses are simple and unobtrusive, and yet meaningful inside plain-text files. A comparison of the HTML and plain-text versions of this readme documentation shows this clearly.

Formatters (processing done in the shown order) are:

*  A block of text with a +-----+ line (a +, five or more - characters, and a +) at the top and at the bottom (leading or trailing spaces are okay) is rendered as plain, unformatted, monospaced text, for tables, ASCII diagrams, etc.; the other formatters do not apply to its content. Like:

     +~~~~ ~~~~+
     | *hello* |
     +~~~~ ~~~~+

*  A block of text with == Content == (any number of trailing = characters) at the top and at least one empty line at the bottom is considered a table of content (TOC); of the other formatters, only those for styled text apply to its content. Lines inside the block are made into TOC items, which get auto-linked to the matching sections, etc., if they have the identifiers for those sections.

The section identifiers, which can contain the period (.) character, can be numeric (like 1, 5.4.3 and 2.2.), or alphanumeric but inside round parentheses, like (A), (5i) and (A.5i.1). The HTML ID values generated by rTxt2htm for the identifiers are the same as the identifiers but prefixed with an s and with the parentheses replaced with underscores (_). E.g., s5.4.3 and s_A.5i.1_.

*  A block of text flanked by PHP-style comment markers (/* and */) is shown in a subtle div element. Like:

   Some subtle text
   Like comments

*  Title, keywords, description, encoding and language for use with the HTML version are gleaned from lines like this (the lines are not shown in the output):

    @@title: rTxt2htm documentation
    @@language: en
    @@keywords: rTxt2htm, text to HTML, convert, conversion, PHP, Labware, readme
    @@encoding: utf-8
    @@description: rTxt2htm generates HTML versions of plain text files

The encoding and the language should each be a value accepted by IANA. When such lines are missing, the information provided in the form is used. One may also manually edit the generated HTML files to alter such information.

*  Four or more spaces before a sentence lead to the sentence being shown as code (a tab is considered equal to 4 spaces). Like this:

    <this is some 'code'>

*  Flanking a word or phrase with ' renders it as a special span element (the URL, bold and italics formatters are not applied to it). Flanking a word or phrase with ` italicizes it. Flanking a word or phrase with * makes it appear bold.

*  A word followed by :-, a space, and then another word is rendered as the first word hyperlinked to the location that the second word points to.

E.g.:

   --  for rTxt2htm support, see section 3.3
   --  rTxt2htm was created for documenting htmLawed

*  Words with http:, https:, mailto:, ftp:, file:, and sftp: are rendered with appropriate hyperlinks. Like, http://www.bioinformatics.org/phplabware.

*  A line that is preceded by an empty line and consists of two = characters, optional spaces, some text, and then more = characters indicates the start of a section. The text is shown as an h2 element. Any o's at the end of the line are for div closures. If the text has a leading number like 1 or 3.2.1, the section gets an anchor named the same as the number but prefixed with s, like s1 or s3.2.1.

*  For sub-sections (rendered with an h3 element) and sub-sub-sections (rendered with an h4 element), instead of the = character, the characters - and . respectively are used.

*  Five or more underscores on a line by themselves and preceded by an empty line are rendered as an hr element (horizontal rule); any o's at the end are, like with the formatters for sections, etc., for div closures.
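
The heading formatters above can be illustrated with a short Python sketch. It is not the code of rTxt2htm.php: the function name, the regular expressions and the returned dictionary are assumptions made for illustration, the o's are simply counted, and the requirement of a preceding empty line is not checked.

```python
import re

# Marker character -> heading element, as listed in the formatters above.
HEADING_LEVELS = {"=": "h2", "-": "h3", ".": "h4"}

# Two marker characters, optional spaces, the text, a run of the same marker,
# and then optional o's at the very end of the line.
HEADING_RE = re.compile(r"^([=\-.])\1\s*(.+?)\s*\1+(o*)\s*$")

# A leading section number such as 1 or 3.2.1.
NUMBER_RE = re.compile(r"^(\d+(?:\.\d+)*)")


def parse_heading(line):
    """Return a dict describing a heading line, or None if it is not one."""
    match = HEADING_RE.match(line.strip())
    if match is None:
        return None
    marker, text, closures = match.groups()
    number = NUMBER_RE.match(text)
    return {
        "element": HEADING_LEVELS[marker],
        "text": text,
        # The anchor is the leading number prefixed with s, if there is one.
        "anchor": "s" + number.group(1) if number else None,
        # Assumption: each trailing o stands for one div closure.
        "div_closures": len(closures),
    }
```

For a line such as == 1  About ==========oo, this sketch reports an h2 element with the anchor s1 and two trailing o's.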

Note:

White space is preserved, so any indentation is kept. For bold, italicized or otherwise stylized text, the characters [ and ( at the beginning, and the characters ?, ;, !, :, ,, ., ), and ] at the end of a word/phrase are not stylized. The same is true for hyperlinking.
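
The rule about unstyled leading and trailing characters might be sketched in Python as below. This is an illustration only: the function names are hypothetical, the use of a strong element for bold text is an assumption, and whether rTxt2htm sets aside more than one such character at each end is also an assumption.

```python
# Characters left unstyled at the start and at the end of a word/phrase,
# taken from the note above.
LEADING_UNSTYLED = "[("
TRAILING_UNSTYLED = "?;!:,.)]"


def split_unstyled_edges(phrase):
    """Split a phrase into (unstyled lead, styled core, unstyled trail)."""
    start = 0
    while start < len(phrase) and phrase[start] in LEADING_UNSTYLED:
        start += 1
    end = len(phrase)
    while end > start and phrase[end - 1] in TRAILING_UNSTYLED:
        end -= 1
    return phrase[:start], phrase[start:end], phrase[end:]


def render_bold(phrase):
    """Wrap only the core of the phrase in a strong element (assumed markup)."""
    lead, core, trail = split_unstyled_edges(phrase)
    return f"{lead}<strong>{core}</strong>{trail}"
```

With this sketch, a phrase like (hello). keeps its parentheses and period outside the bold markup.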

Formatters for HTML lists, tables, colored text, etc., are missing as such information either cannot be expressed in plain-text format or is adequately functional in it without a need for a formatter.
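
As an illustration of the HTML ID scheme described for section identifiers earlier in this section, here is a small Python sketch. It is not taken from rTxt2htm.php; the function name and the patterns used to recognize identifiers are assumptions.

```python
import re

# Numeric identifiers such as 1, 5.4.3 or 2.2. (a trailing period is allowed).
NUMERIC_ID = re.compile(r"^\d+(?:\.\d+)*\.?$")

# Alphanumeric identifiers inside round parentheses, such as (A) or (A.5i.1).
PAREN_ID = re.compile(r"^\(([A-Za-z0-9.]+)\)$")


def html_id(identifier):
    """Return the HTML ID for a section identifier, or None if unrecognized."""
    if NUMERIC_ID.match(identifier):
        # Same as the identifier, prefixed with s.
        return "s" + identifier
    paren = PAREN_ID.match(identifier)
    if paren:
        # Each parenthesis is replaced with an underscore.
        return "s_" + paren.group(1) + "_"
    return None
```

The sketch maps 5.4.3 to s5.4.3 and (A.5i.1) to s_A.5i.1_, matching the examples given in the text.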

2  Usage

rTxt2htm should work with PHP 4.3 and higher.

2.1  Simple

Browse to rTxt2htm.php on the server (e.g., http://domain.com/rTxt2htm.php) using a web browser and use the provided form to upload the plain-text file, or paste its content, and to provide additional information. If you are satisfied with the output, save the web-page, appropriately renaming it. The HTML version can now be distributed to others.

Some browsers do not save web-pages as originally authored. In such a case, you may want to save the output of rTxt2htm directly. To do so, check the form option Direct download. rTxt2htm should then prompt you to save the file as a download.

2.2  Modifying styles and layout

Simple editing of the CSS value when submitting the form should generally be enough.

2.3  Modifying code logic

Code inside rTxt2htm.php is reasonably documented with inline comments. One can edit regular expression patterns inside it, e.g., to implement customized formatters.

2.4  Usage with non-English text

rTxt2htm should work well with non-English text. Ensure you have the proper values set for character encoding and language when submitting the form.
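
Before submitting a file, the @@ lines described in section 1.2 can be checked with a few lines of script. The Python sketch below is one possible way to do this and is not part of rTxt2htm; the function names are hypothetical, leading white space before @@ is assumed to be allowed, and Python's codec registry is only a rough stand-in for the IANA list of encodings.

```python
import codecs
import re

# The five @@ keys listed in section 1.2.
META_RE = re.compile(
    r"^\s*@@(title|language|keywords|encoding|description):\s*(.*?)\s*$"
)


def read_metadata(text):
    """Collect the @@key: value lines of a document into a dict."""
    meta = {}
    for line in text.splitlines():
        match = META_RE.match(line)
        if match:
            meta[match.group(1)] = match.group(2)
    return meta


def encoding_is_known(name):
    """Report whether Python recognizes the encoding name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True
```

A file whose @@encoding value fails this check is worth a second look, although a value unknown to Python may still be valid elsewhere.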

3  Other

3.1  Upgrade

Simply replace the rTxt2htm.php file.

3.2  Change-log

v1.2.1 - released Jan 22, 2019

  *  bug-fix for parsing of single-quote formatters

v1.2 - released Dec 15, 2017

  *  rTxt2htm is now compatible with PHP 7

v1.1.1 - released Mar 15, 2008

  *  text content can be pasted/typed in the form

v1.1 - released Sep 28, 2007

  *  form interface
  *  auto-identification of title, keywords, etc.
  *  separate style for numbers of headings, support for non-numeric section identifiers, etc.

v1.0.2 - released Sep 22, 2007

  *  file: URL auto-linking, nesting for some formatters
  *  minor bug-fixes

v1.0.1 - released Sep 21, 2007

  *  text-styling works for items in table of contents
  *  new formatter for unformatted text for displaying tables, ASCII diagrams, etc.

v1.0 - released Sep 13, 2007

3.3  Support

For possible updates, follow up at http://www.bioinformatics.org/phplabware/internal_utilities (which also has a forum). For general PHP issues (not rTxt2htm-specific), check on the internet and at http://php.net.

3.4  Donate

A donation in any currency and amount to appreciate or support this software can be sent by PayPal to this email address: drpatnaik at yahoo dot com.

Thank you!




HTM version of rTxt2htm_README.txt generated on 22 Jan, 2019 using rTxt2htm from PHP Labware