Ltoh: a customizable LaTeX to HTML converter
Version 97e, 31 Mar 1997
Russell W. Quong (http://www.best.com/~quong/ltoh)

----------------------------------------------------------------------------
Introduction

Ltoh is a customizable LaTeX to HTML converter. It handles text, tables,
and hypertext links. ltoh is a large Perl script, and hence is (almost
completely) platform independent.

ltoh is customizable in that you can specify how to translate a given
LaTeX2e macro into HTML, including your own personal macros. In fact, you
must manually specify how to handle your own macros, otherwise ltoh will
give a friendly warning.

See the ltoh web page for documentation, the latest release, and how to
contact the author (see the bottom of the web page).

Naturally, the HTML version of this document was generated using ltoh, and
in my opinion it looks better than the LaTeX2e dvi/PS output, mostly due to
the extra colors. The first public release was version 97e on 3/31/97.

Ltoh has two main restrictions. First, ltoh does not handle math equations,
which in general are difficult to display in HTML. [Some have resorted to
converting the latex equations into PostScript (PS), converting the PS to a
bitmapped figure, and then displaying the figure in HTML. This is all too
difficult for me.] Second, ltoh requires LaTeX or TeX macro parameters to be
delimited by braces; in practice, ltoh might be unsuitable for most existing
TeX code.

Surprisingly, I often preview my LaTeX2e documents via ltoh instead of
running latex, dvips, and ghostview.

----------------------------------------------------------------------------
The distribution

Ltoh is distributed as either a zip file or a gzipped tar file (about 75K).
Both distributions contain the following files.

    ltoh.pl          The perl script that does everything
    ltoh.specs       The default specifications.
    readme.html      Generated by ltoh
    readme.dvi       LaTeX2e output
    readme.ps        Uses Times Roman
    readme.txt       Text version (generated from Netscape)
    README
    rq-ltoh.specs    An example of my specifications
    rq209.sty        Allows use of the new LaTeX2e font commands in old
                     LaTeX 2.09.

----------------------------------------------------------------------------
System Software Requirements

Ltoh version 97e requires the following system software.

1. A unix operating system. I have tested ltoh only under Linux, although
   there is nothing Linux specific in the code. A handful of lines are Unix
   dependent, mostly due to filename conventions. Because ltoh is written in
   Perl, almost all the code is platform-independent.

2. Perl version 5.00x, preferably 5.002 or higher. Run perl -v to see the
   version of Perl you have.

3. LaTeX2e macro parameters (or arguments) must be brace delimited
   (surrounded by braces), because ltoh relies on the braces.

4. The new latex (LaTeX2e) is strongly recommended over the old latex
   (LaTeX 2.09). Additionally, the default ltoh specifications are based on
   standard new latex macros. Finally, to make full use of HTML tables,
   future versions of ltoh are likely to support multiple rows in the table
   packages only found in the new latex.

If you must use LaTeX 2.09 instead of LaTeX2e

ltoh relies on unique matching braces to delimit arguments to the latex
macros. In particular, the font family and size commands in old latex do not
use braces to delimit arguments. Thus, ltoh does not (and probably never
will) handle old latex 2.09 font specifications. Instead, you must use the
LaTeX2e convention, which delimits the text to be affected by a font change
via braces, as shown in the following comparison.

    (Old latex) Normal but switch \bf to {bold \it then italics, back to}
                bold \normalfont then normal.

    (New latex) Normal but switch \textbf{to bold \textit{then italics, back
                to} bold} then normal.

    Produces:   Normal but switch to bold then italics, back to bold then
                normal.
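
The brace-matching idea in the comparison above can be sketched in a few
lines. This is only an illustration written in Python; it is not taken from
the ltoh source (which is Perl), and the names SIMPLE_MACROS,
find_matching_brace and convert are hypothetical. It assumes the input has
balanced braces and ignores escaped braces such as \{.

```python
# Illustrative sketch only: hypothetical names, not ltoh's actual Perl code.
# Maps a brace-delimited font macro to the HTML placed before/after its argument.
SIMPLE_MACROS = {
    r"\textbf": ("<b>", "</b>"),
    r"\textit": ("<i>", "</i>"),
}


def find_matching_brace(text, open_pos):
    """Return the index of the '}' that closes the '{' at open_pos."""
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces")


def convert(text):
    """Translate brace-delimited font macros, recursing into nested ones."""
    out = []
    i = 0
    while i < len(text):
        for name, (pre, post) in SIMPLE_MACROS.items():
            if text.startswith(name + "{", i):
                open_pos = i + len(name)
                close_pos = find_matching_brace(text, open_pos)
                # The argument is exactly the text between the matching braces,
                # so the end of the font change is known.
                out.append(pre + convert(text[open_pos + 1:close_pos]) + post)
                i = close_pos + 1
                break
        else:
            # No macro starts here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


new_style = (r"Normal but switch \textbf{to bold \textit{then italics, "
             r"back to} bold} then normal.")
print(convert(new_style))
# prints: Normal but switch <b>to bold <i>then italics, back to</i> bold</b> then normal.
```

Given the old-style input, the same sketch finds no brace after \bf or \it
and so copies them through untouched, which is the problem the
brace-delimited convention avoids.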
Using the old latex syntax, ltoh cannot determine when the bold and italic
fonts stop being active.

If you have the new latex on your system, use it. If you must use an old
latex file, convert it to look like new latex as much as possible.

1. Convert all font change macros to use the new latex syntax. Namely,
   convert {\XYZ ... } and \XYZ ... \normalfont to \textXYZ{...}.

2. Use the style file rq209.sty (which should be) included with the ltoh
   distribution, which defines the \textXYZ macros for use in the old latex.
   To use this file, put

       \input{rq209.sty}

   in your latex files. The file rq209.sty additionally defines the font
   size macros \fsizeTiny/.../\fsizeHuge, which take a single
   brace-delimited argument. For example, use \fsizesmall{some text} instead
   of { \small some text }. (This author wrote rq209.sty back in 1994
   because the office computer ran the old latex but the home Linux machine
   ran the new latex.)

   Alternatively, write and use your own definitions of the \textXYZ font
   change macros.

(One final note.) The old latex convention is simply a poor technical
choice. The current philosophy for document specifications (and even
programming languages) is that parameters/arguments/blocks are clearly
delimited syntactically. The use of matching braces by latex2e conforms to
the SGML syntax, as does HTML, which ubiquitously uses matching begin and
end tags.

----------------------------------------------------------------------------
Running ltoh

To generate the HTML file xyz.html from the latex file xyz.tex, assuming
ltoh is in your path, run:

    prompt> ltoh xyz.tex

or

    prompt> perl fullpath-of-ltoh.pl xyz.tex

(I have not tested ltoh on a Win32 machine, yet...) On a Win32 machine,
which cannot automatically start Perl to execute ltoh, you would probably
run

    prompt> perl ltoh.pl xyz.tex

----------------------------------------------------------------------------
Specifications

There are five types of ltoh specifications. Please note the names.

1.
[Begin/end pair (b/e)] Specifies how to translate a latex \begin{XYZ} and
matching \end{XYZ} command.

2. [Command (comm)] Specifies how to translate a latex command that does not
   take any parameters, such as \par, \item or \hrule.

3. [Simple-macro ({})] Specifies a translation for a latex macro that takes
   a single brace-delimited argument arg-1, where the corresponding HTML
   consists simply of surrounding the argument with a preamble and
   postamble. The translation is simple as the argument stays put; ltoh
   merely puts stuff before and after it. That is, we expect

       \simplemacro{...} ===> HTML-preamble ... HTML-postamble

   For example, use a simple-macro specification to translate the latex
   macro \textbf{ ... } (switch to bold face) into the HTML <b> ... </b>.

4. [Arg-macro ({N})] Specifies a translation for a latex macro that takes N
   brace-delimited arguments; the corresponding HTML can make arbitrary use
   of the arguments. For example, my latex macro \swallow{arg-1} discards
   its single (possibly long) argument. In the corresponding HTML, we also
   ``use'' the argument, by discarding it.

5. [Assignment (:=)] An assignment sets an ltoh variable which is then used
   later. As of version 97e, only a small number of built-in variables are
   supported. I hope to support setting and getting user-defined variables
   in the future.

The first four specifications are known as translation specifications.

Translation specifications

The four types of translation specifications have the same form. Do not use
leading whitespace. Here is the general form and an example of each type.

:type :latex-macro-name:HTML-start-code:HTML-end-code:reserved/not-used
:b/e :\begin{itemize}:
tags whenever two or more consecutive blank lines are seen.
    (1997)

Version 97c (Mar 11-15 1997).
    Much improved handling of special characters such as {, }, <, > and @.
    In particular, bare braces which mean nothing in latex are stripped from
    the HTML. Improved paragraph detection handling. (OK, OK, "Improved ..."
    really means "fixed bugs in ..."). No longer generates HTML comments for
    latex comments, by default. Version 97c was meant to be the first public
    release, but the tables in this readme.tex document broke ltoh badly.

Version 97d (Mar 19-20 1997).
    Complete rewrite of the table handling code. Latex column alignment
    specifications are understood and passed on to the HTML. Multiple
    columns specified via either \multicolumn or \mc (which is my personal
    abbreviation macro) are handled properly. We try to ignore extraneous
    LaTeX2e specs in the column alignment specifications, such as @. , but
    there is a small chance multiple columns
    In particular, ltoh now handles this file (readme.tex) properly.

Version 97e (Mar 25-31 1997).
    Official release. Clean up source a bit for release. Minor improvements
    on tables (allow end of a row to be on a separate line), paragraphs,
    specification files and handling special characters (allow for multiple
    chars on one line).

----------------------------------------------------------------------------
License for use

You may use ltoh freely, under the following conditions, which are covered
under a BSD-style license.

1. You must keep the ltoh notice at the end of all generated HTML. This
   notice indicates how the document was generated (namely, with ltoh) and
   has a link to the ltoh web page. You may reduce the size of the notice if
   you wish, but it must remain in your document and visible.

2. Ltoh comes with neither a warranty nor a guarantee about its correctness,
   performance, or suitability for any task.

3.
If you modify or redistribute ltoh, you must keep the current disclaimer and
license with the ltoh source which must be visible upon startup, unless you

4. I, Russell Quong, retain the copyright for ltoh.

Official software license

Here's the official license as of 31 Mar 97.

# Copyright (c) 1996, 1997 Russell W Quong.
#
# In the following, the "author" refers to "Russell Quong."
#
# Permission to use, copy, modify, distribute, and sell this software and
# its documentation for any purpose is hereby granted without fee, provided
# that the following conditions are met:
# 1. Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
# 2. All advertising materials mentioning features or use of this software
#    must display the following acknowledgement:
#      This product includes software developed by Russell Quong.
# 3. All HTML generated by ltoh must retain a visible notice that it
#    was generated by ltoh and contain a link to the ltoh web page
#
# Any or all of these provisions can be waived if you have specific,
# prior permission from the author.
#
# THE SOFTWARE IS PROVIDED "AS-IS" AND WITHOUT WARRANTY OF ANY KIND,
# EXPRESS, IMPLIED OR OTHERWISE, INCLUDING WITHOUT LIMITATION, ANY
# WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
#
# IN NO EVENT SHALL RUSSELL QUONG BE LIABLE FOR ANY SPECIAL,
# INCIDENTAL, INDIRECT OR CONSEQUENTIAL DAMAGES OF ANY KIND, OR ANY
# DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
# WHETHER OR NOT ADVISED OF THE POSSIBILITY OF DAMAGE, AND ON ANY
# THEORY OF LIABILITY, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE.

----------------------------------------------------------------------------
Motivation

(The motivation section belongs right after the introduction, but most
people probably just want to get on with using ltoh. So this section has
been relegated here. Ah well...)
Although other LaTeX2e to HTML converters exist, I wrote my own because I
wanted to generate HTML customized to my own liking. In particular, I use my
own custom LaTeX2e definitions (who doesn't?) and wanted to generate HTML
appropriately. Additionally, when using other converters, I was unable to
get them to run properly, or I did not like the way the generated HTML
looked.

Fundamentally, ltoh is a specialized macro processor that reads macro
specifications and generates HTML accordingly. A specification indicates how
to convert a specific LaTeX2e definition into HTML.

My original goals in writing ltoh were

1. To become more proficient in Perl.

2. To write a converter that allowed customized HTML to be generated.

3. To handle arbitrary LaTeX2e macros, including macros that take multiple
   arguments, and arguments that contain nested macros.

4. To generate HTML that resembled the original LaTeX2e, as if someone had
   done the translation by hand. In particular, I wanted to avoid generating
   very long lines of HTML source, which are difficult to read when
   ``viewing the source'' in a browser.

----------------------------------------------------------------------------
Acknowledgements

Thanks to VA Research for letting the author work on ltoh.

----------------------------------------------------------------------------
[Converted LaTeX --> HTML by ltoh]
Russell W. Quong (quong@best.com)
Last modified: Apr 1 1997