After you remove the page separators you can see and edit the page boundaries using the Adjust Page Markers palette. To open it, right-click (Mac: control-click) in the Page: field of the status bar.
The palette opens and, at the same time, all page markers (which are normally invisible) are shown as bright-yellow insertions in the text. To close the palette and hide the markers, right-click in the Page: field again.
The number of the next page marker after the insertion point is shown. Use the Previous Marker and Next Marker buttons to step through the page markers in sequence. To jump to a particular marker, edit the yellow field at the top to show a page number one higher or lower than the marker you want. Then click Previous Marker or Next Marker. The document scrolls to center the line containing the marker you want.
You can move the current marker in the text by clicking the four move buttons. The marker slides one character left or right, or one line up or down.
To add a new page marker at the end of the book, set the insertion point in the text somewhere after the last existing marker. Then click Add. You can use this to manually add markers to a book that has lost its marker information.
To insert a new page marker between existing ones, set the insertion point where the marker is to go, and click Insert. A new marker is inserted. Following markers are incremented by 1, if necessary to prevent duplicates.
To remove a marker, use the palette to navigate to the marker you want to remove, and click Remove. The marker is taken out of the book and no other changes are made.
If you click Insert Page Markers, Guiguts inserts a text string of the form [Pg 003] into the text following every (invisible) page marker. You can apply a regex replacement to convert these into HTML, either to create anchors or to style visible page numbers in the HTML text.
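As a rough illustration of the kind of replacement described above, here is a minimal Python sketch that turns each inserted [Pg 003] string into an HTML anchor. It runs outside Guiguts. The function name and the `Page_N` id scheme are assumptions made for this example, not the format Guiguts itself generates, and the pattern assumes the markers contain only arabic digits.

```python
import re

# Matches the strings inserted by Insert Page Markers, e.g. "[Pg 003]".
PAGE_MARKER = re.compile(r"\[Pg (\d+)\]")


def markers_to_anchors(text: str) -> str:
    """Replace each [Pg NNN] with an HTML anchor (hypothetical id scheme)."""
    def repl(match: re.Match) -> str:
        number = int(match.group(1))  # drop leading zeros: "003" -> 3
        return f'<a id="Page_{number}"></a>'
    return PAGE_MARKER.sub(repl, text)


sample = "end of one page [Pg 003] start of the next"
converted = markers_to_anchors(sample)
print(converted)
```

The same pattern, `\[Pg (\d+)\]`, should carry over to a search-and-replace dialog that supports regular expressions, with the replacement text adapted to that tool's syntax for captured groups.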
The Renumber button and page offset field are discussed below.
The scanned pages we receive are numbered sequentially from 1 and we casually refer to these as "page numbers." Confusion arises because these numbers often are not the same as the numbers printed on the pages of the book. The technical term for the number that is printed on a book page is folio. If we use these terms separately, we can avoid some confusion:
Page numbers are a simple count of every scanned surface. Folios are assigned using complex and not terribly consistent rules that reflect the logistics of early printing technology. For example, the "front matter" (comprising title pages, preface, contents, etc.) was usually the last part of the book finished, after the body had been set up in type. That meant body folios had to be assigned before the count of front-matter pages was known. Sometimes the front matter was left un-folio'd; sometimes it got its own series of folios, often lowercase roman numerals; then the body started over again with folio "1." When a block of glossy "plates" was inserted in the book, it might be numbered sequentially with the pages, or not. If a plate had a blank reverse, that face might count in the folio sequence or not—and it might have been scanned, giving it a page number, or not.
The net result is that page numbers don't match folios, and the numerical relationship between them can change as you go through the book. However, when Guiguts auto-generates HTML it inserts an anchor at each page boundary with the number of the page. If these anchors reflected the folios, it would be dead easy to link folio references in the index or a cross-reference to the correct page.
Find the first page image in your book with a visible folio in arabic numerals (it might not be "1"). Note the numerical difference between the page number and the folio. Step through the book chapter by chapter, rechecking the difference between folio and page number. If the difference changes, step back by pages until you find the page on which it changes, and note that. On paper make a table in this form:
At folio: | Page number is: | Adjust |
3 | 10 | -7 |
151 | 157 | -6 |
193 | 207 | -14 |
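To make the meaning of the Adjust column concrete, the following Python sketch treats it as folio minus page number and uses it to look up the page number for a given folio. It is only an illustration of the arithmetic. The names `ADJUSTMENTS` and `page_for_folio` are invented for this example, and the data are the folio and Adjust columns of the table above.

```python
from bisect import bisect_right

# (first folio of the range, adjustment), where adjustment = folio - page number.
ADJUSTMENTS = [(3, -7), (151, -6), (193, -14)]


def page_for_folio(folio: int, table=ADJUSTMENTS) -> int:
    """Return the scan page number for a folio, using the last row that applies."""
    starts = [start for start, _ in table]
    row = bisect_right(starts, folio) - 1  # last row whose start is <= folio
    if row < 0:
        raise ValueError(f"folio {folio} comes before the first row of the table")
    return folio - table[row][1]


print(page_for_folio(3))    # first row: 3 - (-7)
print(page_for_folio(193))  # last row: 193 - (-14)
```

Each row applies from its folio up to the folio of the next row, which is why a one-row table can be handled with a single fixed offset.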
In a simple book the table you have written down has only one row. There is a fixed numeric relationship between folios and page numbers. You can deal with this easily when you generate HTML by entering a Page Offset value in the HTML palette; see this page.
When there are multiple rows in your table you need to adjust the page numbers to match the folios. This is best done from the back of the book working toward the front. In general:
When the difference between page number and folio increases, it means that pages without folios were scanned. In the example above, the difference increased by 8 starting at folio 193. You find that a block of 8 unnumbered pages of photos appears in the book between folio 192 and folio 193. They were scanned, and so got page numbers. Using the Adjust Page Markers palette, you navigate to the first page marker for this block of illustrations. You click Remove and Next Marker until you have removed the 8 markers for the unnumbered pages and the current marker is the one that matches the image with folio 192.
Now you can set the Adjust Page Offset value to -8 (negative 8) and click Renumber. Guiguts renumbers all the markers from the current one to the end of the book, deducting 8 from their page numbers. You have effectively erased the last line from your table of differences; now, from folio 151 to the end of the book, the difference is a consistent -6.
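The effect of removing the 8 markers and then renumbering can be modeled in a few lines of Python. This is a sketch of the arithmetic only, not Guiguts code: the markers are represented as a plain list of page numbers, the 220-scan book is hypothetical, and the unnumbered block is taken to occupy page numbers 199 to 206 (the 8 scans just before page 207, which carries folio 193).

```python
def remove_and_renumber(markers, first_removed, count, offset):
    """Drop `count` markers starting at `first_removed`, then add `offset`
    to every marker that followed the removed block."""
    start = markers.index(first_removed)
    before = markers[:start]
    following = markers[start + count:]
    return before + [number + offset for number in following]


markers = list(range(1, 221))  # hypothetical book with 220 scanned pages
adjusted = remove_and_renumber(markers, first_removed=199, count=8, offset=-8)

# The scan that carried page number 207 (folio 193) is now numbered 199,
# so the difference there is 193 - 199.
print(adjusted[198], 193 - adjusted[198])
```

After the operation the markers form an unbroken sequence again, and the folio 193 scan has the same difference of -6 as the pages before it.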
Now navigate to the marker matching folio 151 and investigate why the difference changes there. You know that when the difference between folio and page number decreases, it must be because a page with a folio was not scanned. (Possibly real pages were omitted from the scan; it happens!) But you find this sequence of pages: