Open the Guiguts tools for generating HTML by choosing Fixup > HTML Fixup (or by clicking its toolbar button), which opens the HTML palette.

This palette controls both automatic (whole-document) conversion to HTML, and specific markup of individual elements. Automatic conversion is covered first; then element-by-element markup.

Autogenerating HTML

The top few buttons on the HTML palette control bulk conversion of the entire document to HTML. The most important is Autogenerate HTML; when you click it, Guiguts makes a number of changes to the document, some of which are detailed in the following paragraphs.

Caution: these changes are not undo-able. Save the file before starting this process. Afterward, save it under a new name.

Guiguts tries to leave existing HTML alone during automatic generation, but it does not really parse existing HTML, so it is easily confused by multi-line markup like <table> or even an <img.../> that is pretty-printed across multiple lines. Automatic HTML inserted into such statements can make a mess. For best results, do automatic generation first and do element-by-element markup after.

The Header File

The header.txt file begins with the HTML <head> section which defines, among other things, the type of HTML or XHTML to which the document aspires, the character set it uses, and the <title> element. You need to modify the title text to reflect the book.

The header also defines all the CSS classes on which the generated html depends, for example the poem, stanza, and other classes referenced in the poetry generation. These classes strongly influence the appearance of the etext. You are encouraged to modify the inserted header, for example to change the document type or to change CSS stylings. You should modify it to delete any unused classes.

Generated TOC

The generated chapter table of contents may or may not be useful. For example you may already have the original TOC with page numbers, protected by /$..$/. If so, just delete the generated TOC.

Page Boundary Anchors

The two switches Pg #s as comments and Insert Anchors at Pg #s determine what HTML is produced at each page boundary. Both switches are on by default. Inserting anchors is recommended, provided you have adjusted the page markers beforehand. The etext will then contain an anchor of the form <a name="Page_n" id="Page_n" /> at each page boundary. You can use these anchors to quickly hyperlink the TOC, list of illustrations, and index to the proper places (see below).

The Page Offset field accepts a positive or negative number that reflects a consistent numeric difference between page numbers and folios (as discussed on this page). If the page numbers are consistently 7 greater than the folios, enter -7 (negative 7) in this field. The page numbers in the anchors will be adjusted by that amount, so the anchors correspond to the folios. When you use a negative adjustment, the first few page anchors will contain numbers less than one, for example <a name="Page_-3">. You can delete these. Or, if the original book used roman numerals for these front pages and if there are references to them ("Frontispiece, page iii"), you can manually edit them to the correct values: name="Page_iii" etc.
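
The offset arithmetic can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Guiguts code; the function names page_anchor and is_front_matter are hypothetical.

```python
def page_anchor(page_number: int, offset: int = 0) -> str:
    """Return an anchor name such as 'Page_3' for a page marker.

    `offset` plays the role of the Page Offset field: it is added to the
    page-marker number so that the anchor matches the printed folio.
    """
    return f"Page_{page_number + offset}"


def is_front_matter(page_number: int, offset: int = 0) -> bool:
    """True when the adjusted number is below 1 (delete or renumber by hand)."""
    return page_number + offset < 1


# Page numbers run 7 ahead of the folios, so the offset is -7.
for n in (4, 8, 199):
    print(n, page_anchor(n, -7), is_front_matter(n, -7))
```

With an offset of -7, page marker 4 yields Page_-3, which is one of the anchors you would delete or rename by hand.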

Block Quote Markup

Guiguts processes block quote (/#..#/) sections in one of two ways. If the CSS Blockquote switch is set on, it marks block quotes as <div class="blockquot">..</div>. If the switch is set off, it marks them as <blockquote>..</blockquote>. This choice was added following a debate in the Post-processing forum as to whether or not the old <blockquote> markup was valid in the new world of XHTML. (Current feeling seems to be that it is still supported when used for its intended purpose, setting off a quote.)

Tabular Markup

Guiguts marks up tabular material in an earnest attempt to make it look something like the original.

In a /f..f/ section, Guiguts puts each line in <p class="center">..</p>.

In a /*..*/ section, Guiguts encloses every line in
<span style="...">..</span><br />
where the style sets a left margin of some amount. Lines that start in the left margin are indented to the proper rewrap margin. Indented lines are given margin values that approximate their text indent. Every line in a /*..*/ section receives an indent of some amount. You can use per-block margins to modify the left margins assigned to these blocks.

In a /$..$/ section, Guiguts puts <br /> after every line that begins in the left margin. Lines that are indented are enclosed in
<span style="...">..</span><br />
and given a left margin that approximates the text indent.

In both /*..*/ and /$..$/ blocks, Guiguts replaces runs of spaces with runs of &nbsp; to try to preserve columnar alignment.

If you plan to use element-by-element markup to format tabular data (for example, if you plan to format a table using <table>, or want to convert a TOC or index into an unordered list), the spans and breaks inserted by Guiguts will only get in your way. Before you apply automatic HTML, remove the /*..*/ or /$..$/ flags. Guiguts will put <p>..</p> around the unprotected table where it thinks it sees body paragraphs, but it does not rewrap the text during HTML generation. Paragraph markup is easier to remove than the tabular markup.

Inserting HTML Element by Element

Many of the buttons in the HTML palette allow you to quickly insert HTML code. You select the text you want to mark up, then click one of the buttons to insert the markup.

The following table lists the buttons from left to right, top to bottom. Some are discussed in more detail below. Each of these operations can be undone.

<i> Encloses the current selection in <i>...</i>.
<b> Encloses the current selection in <b>...</b>.
<u> Encloses the current selection in <u>...</u>.
<center> Encloses the current selection in <center>...</center>.
<hn> Encloses the current selection in <hn>...</hn>.
<p> Encloses the current selection in <p>...</p>.
<hr> Inserts <hr style="width:95%;" /> at the insertion point.
<br> Inserts <br /> at the insertion point.
nb space Inserts &nbsp; at the insertion point.
Poetry Marks up the current selection as a poem; see below.
<big> Encloses the current selection in <big>...</big>.
<small> Encloses the current selection in <small>...</small>.
<ol> Encloses the current selection in <ol>...</ol>.
<ul> Encloses the current selection in <ul>...</ul>.
<li> Encloses the current selection in <li>...</li>.
<sup> Encloses the current selection in <sup>...</sup>.
<sub> Encloses the current selection in <sub>...</sub>.
<table> Encloses the current selection in <table>...</table>.
<tr> Encloses the current selection in <tr>...</tr>.
<td> Encloses the current selection in <td>...</td>.
<blockquote> Encloses the current selection in <blockquote> ...</blockquote>.
<code> Encloses the current selection in <code>...</code>.
Named Anchor Inserts an anchor based on the selection; see below.
Image Inserts the HTML to include an image; see below.
External Link Encloses the current selection in a link to another file; see below.
Internal Link Encloses the current selection in a link to an anchor defined in this document; also used to check for duplicate anchors. See below.
Remove markup from selection Strips HTML markup (except for <i> and <b>) from the current selection. This is not an undo: it does not restore earlier formatting.
Find orphaned markup Searches for each possible type of unbalanced HTML markup. The search stops at the first error found; correct it and click the button again to resume. Unlike the general orphaned markup search, this one checks HTML markup only.
Auto List Makes the selection into an ordered or unordered list; see below.
Auto Table Makes the selection into a table; see below.
div, span Enclose the current selection in a div or span with specific styling. Enter any attributes to follow <div or <span, e.g. class="name".
Header Inserts the file header.txt from the guiguts folder at the top of the document, and inserts the </body> and </html> lines at the end of the document.
Link Checker Checks and summarizes all links; see below.
HTML Tidy Passes the document through the Tidy program; see below.

Poetry Markup

Clicking the Poetry button marks up the current selection with one form of HTML poetry markup:

The use of <span>...<br /></span> on each line is intended to make poetry display properly in text-based browsers such as Lynx. (Current thinking in the DP forums is that the shorter <div>...</div> markup would be as good.)

It is the Guiguts convention that poetry is always rewrapped to be indented by four spaces. Thus, lines that are indented by just four spaces are styled class="in0", and the class number increases by 1 for every two additional spaces of indentation. Provided that the proofers and you have been careful and consistent about indenting the lines, the result will look correct in a browser.
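
The indentation rule can be written as a short Python sketch. It follows only the rule stated above (four spaces gives in0, plus 1 for every two further spaces). The function name poetry_line_class is hypothetical, and rounding an odd indent down is an assumption the text does not settle.

```python
def poetry_line_class(line: str, base_indent: int = 4) -> str:
    """Return the CSS class ('in0', 'in1', ...) for one line of poetry.

    The 4-space base indent maps to in0; each further 2 spaces adds 1.
    Assumption: an odd number of extra spaces rounds down.
    """
    indent = len(line) - len(line.lstrip(" "))
    extra = max(indent - base_indent, 0)
    return f"in{extra // 2}"


poem = [
    "    Tiger, tiger, burning bright",
    "      In the forests of the night,",
]
for line in poem:
    print(poetry_line_class(line), line.strip())
```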

Inserting Anchors and Links

Use the Named Anchor button to insert an anchor whose id is based on the current selection. For example, select the text CHAPTER 7 and click Named Anchor. The code <a name="CHAPTER_7" id="CHAPTER_7" /> is inserted preceding the selection.

Guiguts deals properly with spaces (converting to underscores) and special characters when making this substitution. This gives you a quick way to insert an anchor for reference from elsewhere in the book.
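
As a rough illustration of the substitution, here is a minimal Python sketch that builds the anchor shown above from a selection. The function name anchor_from_selection is hypothetical. Converting spaces to underscores follows the text; simply dropping other special characters is an assumption, since the exact Guiguts rules are not given here.

```python
import re


def anchor_from_selection(selection: str) -> str:
    """Build an anchor tag whose name and id are derived from the selection."""
    # Spaces (and other whitespace runs) become underscores, as described above.
    name = re.sub(r"\s+", "_", selection.strip())
    # Assumption: drop anything other than letters, digits, '_', '.', and '-'.
    name = re.sub(r"[^A-Za-z0-9_.\-]", "", name)
    return f'<a name="{name}" id="{name}" />'


print(anchor_from_selection("CHAPTER 7"))
```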

Use the External Link button to create a link to another HTML file, as when you are breaking a large etext down into separate chapter files. A file-open dialog pops up and you browse to select the target file. If it is in the same directory, Guiguts builds a link using a relative pathname.

You can use the Internal Link button for two purposes: linking to an anchor in this file; and checking for duplicate anchors.

To link to an anchor (for example one created by the Named Anchor button, or a page number anchor created by automatic HTML generation) first select some text that will be the link. Click Internal Link; Guiguts pops up a large window listing all named anchors in the file. It tries to put anchors with similar wording to the current selection at the top of the list. You can opt to exclude the numerous page-number and footnote anchors. Double-click the target anchor; Guiguts encloses the selection in a link with href="#Anchorname".

To check for duplicate anchor-names, clear any current selection and click Internal Link. Guiguts builds its list of existing anchors and checks it for duplicates. It displays any duplicates in a warning message.
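
If you want to run a similar check outside Guiguts, the idea can be approximated in Python. This sketch assumes anchors are written as <a name="...">, as in the examples on this page. It uses a simple regular expression rather than a real HTML parser, and the function name duplicate_anchors is hypothetical.

```python
import re
from collections import Counter

# Matches the name attribute of an <a ...> tag; plain links without a
# name attribute (for example <a href="#x">) are not matched.
ANCHOR_RE = re.compile(r'<a\s[^>]*?\bname="([^"]+)"', re.IGNORECASE)


def duplicate_anchors(html: str) -> list[str]:
    """Return the anchor names that occur more than once, sorted."""
    counts = Counter(ANCHOR_RE.findall(html))
    return sorted(name for name, n in counts.items() if n > 1)


sample = (
    '<a name="CHAPTER_7" id="CHAPTER_7" />\n'
    '<a name="Page_1" id="Page_1" />\n'
    '<a name="CHAPTER_7" id="CHAPTER_7" />\n'
    '<a href="#Page_1">1</a>\n'
)
print(duplicate_anchors(sample))
```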

Inserting Image Code

The Auto Illus Search button causes Guiguts to search for the first [Illustration] markup and highlight it in search orange. Alternatively you can select an [Illustration] markup line yourself and click Image.

In either case, a file-open dialog pops up, and you use it to browse to the image file for this illustration, for example images/image01.jpg. Guiguts shows the following dialog:

From bottom to top, you see a thumbnail of the image, a choice of alignment buttons, and the dimensions of the image. The Alt Text field is filled with the text from the [Illustration] markup.

Normally you leave the dimensions as-is, but if you want the browser to compress or stretch the image, you can enter different dimensions. You can change just one of the dimensions and set on the Maintain AR (aspect ratio) button, and Guiguts will adjust the other dimension in proportion.
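
The proportional adjustment is simple arithmetic, sketched here in Python. The function name scale_to_width is hypothetical, and rounding to the nearest whole pixel is an assumption about how the result is displayed.

```python
def scale_to_width(width: int, height: int, new_width: int) -> tuple[int, int]:
    """Return (new_width, new_height) preserving the original aspect ratio."""
    new_height = round(height * new_width / width)
    return new_width, new_height


# An 800 x 600 image shown at half its width keeps the 4:3 ratio.
print(scale_to_width(800, 600, 400))
```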

You can set text for the title attribute (usually, just copy the Alt text and paste it into the Title field). When you click OK, Guiguts replaces the [Illustration] line with the following HTML:

<div class="figcenter/left/right" style="width: widthpx;">
<img src="path-to-image" width="width" height="height"
   alt="Alt text" title="Title text" />
<span class="caption">Alt text.</span>
</div>
The path-to-image is a relative path when the image is in the same folder as the document, or a subfolder. Typically images are located in the subfolder images and the generated code has src="images/imagenn.jpg".

Using Auto List

When you click Auto List, Guiguts converts the current selection into an unordered or ordered list, depending on which switch is set. The <ul>..</ul> or <ol>..</ol> tags are placed at the beginning and end. The lines of the selection are marked up as list elements with <li>..</li>.

When the ML (multi-line) switch is off, each line of the selection is marked as a list item. If ML is set on, list items are made from groups of lines separated by blank lines.
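
The effect of the two switches can be approximated with the following Python sketch. It is an illustration of the behavior described above, not the Guiguts implementation; the function auto_list and its parameters are hypothetical names.

```python
import re


def auto_list(text: str, ordered: bool = False, multiline: bool = False) -> str:
    """Wrap the selection in <ul> or <ol> and mark its items with <li>."""
    tag = "ol" if ordered else "ul"
    if multiline:
        # ML on: each group of lines separated by blank lines is one item.
        items = [block.strip() for block in re.split(r"\n\s*\n", text.strip())]
    else:
        # ML off: every non-blank line is its own item.
        items = [line.strip() for line in text.splitlines() if line.strip()]
    body = "\n".join(f"<li>{item}</li>" for item in items if item)
    return f"<{tag}>\n{body}\n</{tag}>"


print(auto_list("apples\npears"))
print(auto_list("first item,\ncontinued\n\nsecond item", ordered=True, multiline=True))
```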

Using Auto Table

When you click Auto Table, Guiguts converts the current selection into an HTML table. When the ML (multi-line) switch is off, each line of the selection is marked as a table row. If ML is set on, each group of lines separated by a blank line is made into one table row.

Just as with ASCII Table Effects, columns are defined by two or more spaces between elements. Use the Table Effects palette or space the columns manually to put two spaces between column values; otherwise column values will be combined in a single cell.

The alignment switches left, center, and right set the default alignment for table columns. Guiguts inserts align='left' (or right or center) in each <td> markup.

Often, different columns of a table need different alignment; for example, a column of names should be left-aligned, one of numbers right-aligned. You can specify different alignments for each column by putting characters in the text field "Column Fmt" below the Auto Table button. You should place one character for each column in the table, using < for left-aligned, | (vertical bar) for centered, and > for right-aligned. For a three-column table to be aligned left, left, and right, you would enter <<>. Extra characters are ignored; and columns for which there are no characters get the default alignment.
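
The column rules can be made concrete with a small Python sketch covering the case where ML is off. It follows the description above: two or more spaces separate columns, and the format characters <, |, and > choose the alignment. The names auto_table and ALIGN are hypothetical, and the exact HTML Guiguts emits may differ.

```python
import re

# Column Fmt characters and the alignment each one selects.
ALIGN = {"<": "left", "|": "center", ">": "right"}


def auto_table(text: str, column_fmt: str = "", default: str = "left") -> str:
    """Turn each non-blank line into a table row (ML switch off)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Two or more spaces separate columns; single spaces stay in the cell.
        cells = re.split(r" {2,}", line.strip())
        tds = []
        for i, cell in enumerate(cells):
            # Columns beyond the format string get the default alignment.
            if i < len(column_fmt):
                align = ALIGN.get(column_fmt[i], default)
            else:
                align = default
            tds.append(f"<td align='{align}'>{cell}</td>")
        rows.append("<tr>" + "".join(tds) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


print(auto_table("Smith  John  42\nVan Dyke  Ann  7", "<<>"))
```

Note that "Van Dyke" stays in one cell because its words are separated by a single space.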

Using the Link Checker

The Link Checker button at the bottom of the HTML palette invokes a thorough check of all HTML links in the document. It finds all named anchors, internal links, external links, and image links. It opens a report window that lists:

The Link Checker also verifies that every file in the images directory is named in a link, and lists the ones that are not. This may give false error reports if the images are not all contained in a single directory.

Using HTML Tidy

HTML Tidy is a free program that parses HTML for errors, and which can reformat HTML in various ways. If you have installed Tidy as an executable in your system, you can invoke it by clicking the HTML Tidy button in the HTML palette. Guiguts saves the document and runs it through Tidy. It collects Tidy's output and displays it in a report window. From that window you can optionally have the "tidied" version of the file loaded as the current document.

Tidy is a complex program with many options (see here and here). You cannot pass options directly to Tidy through Guiguts, but you can make it use a configuration file. One known way is to set the environment variable HTML_TIDY to the path of that file.

Hyperlinking Page Numbers

Once Guiguts has generated page-number anchors that match the book's folios, you can rather quickly convert a cross-reference, an index, a table of contents, a list of illustrations—anything that contains a page-number reference—to a hyperlink. This greatly increases the value of the etext to the reader.

The primary tool in this is regular expression search and replace, discussed on this page. As a start, set up a search for:

(?<!\d)(\d{1,3})

That is, look for a string of 1 to 3 digits that is not immediately preceded by another digit, and capture the digits so the replacement can refer to them as $1. Set the replacement to:

<a href="#Page_$1">$1</a>

That is, the found number formatted as a link to the page anchor for that number.
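
To try the pattern outside Guiguts, the same search and replacement can be run in Python, which writes the captured group as \1 instead of $1. This sketch replaces every match at once, so it only shows what the pattern does; it does not replace the one-by-one review described below. The function name link_page_numbers is hypothetical.

```python
import re

# Same pattern as above: 1 to 3 digits not preceded by another digit.
PAGE_REF = re.compile(r"(?<!\d)(\d{1,3})")


def link_page_numbers(text: str) -> str:
    """Wrap every 1-3 digit number in a link to its page anchor."""
    return PAGE_REF.sub(r'<a href="#Page_\1">\1</a>', text)


print(link_page_numbers("(see pg. 192)"))
# Caveat: a longer number such as a year matches on its first three digits.
print(link_page_numbers("in 1920"))
```

The second call shows why each match is worth reviewing by hand: in "1920" the pattern matches "192". Appending (?!\d) to the pattern would skip numbers of four or more digits, if that suits the book.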

Now you are set to walk through all page-number candidates in the document, or in a selection. For example, set the insertion point at the top of chapter 1 and click on the title bar of the Search dialog to give it the keyboard focus. Press the Enter key, which is shorthand for Search. The first string of digits is found and displayed. If it does not represent a page number reference, just press Enter again to find the next one. When you find a number that does represent a page number, as in "(see pg. 192)," type Control-Enter, the shorthand for Replace and Search Again. In this way you can stroll through the book turning page references into hyperlinks.