Opening the Report

Use Fixup>Run Word Frequency Routine to prepare a report on all words in the book. If the file has been modified, Guiguts saves it. Then Guiguts builds an index of all "words" (all pieces of text set off by white space, including numbers and abbreviations) in the book. This can take several seconds. When the count is complete, it is presented in a report window.
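
The indexing step described above can be pictured with a short Python sketch. It is only an illustration, not Guiguts's own code: the function name count_words and the sample text are hypothetical, and the sketch assumes a plain split on white space with no special handling of punctuation.

```python
from collections import Counter

def count_words(text):
    """Count every whitespace-delimited token, respecting letter case."""
    # str.split() with no argument splits on any run of white space.
    return Counter(text.split())

# Hypothetical sample text; numbers are counted as "words" too.
sample = "It was late and it was 1850"
counts = count_words(sample)

for word, n in sorted(counts.items()):
    print(n, word)
```

Because case is respected, "It" and "it" are counted separately here, which matches the initial state of the report.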

Using the Report Window

The body of the window contains a list of all words with their counts. Initially the list is sorted alphabetically. To change the report between an alphabetic sort and a numeric sort by counts (most-frequent words first), change the Sort Alpha switch and click All Words.

Initially the list respects letter case ("It" and "it" are different). To change from respecting case to ignoring case, change the Ignore Case switch and click ReRun to rerun the census. Also click ReRun to update the list after you edit the document.
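
The effect of the Sort Alpha and Ignore Case switches can be sketched as follows. This is an assumption-laden illustration rather than the actual implementation: build_report and its parameters are hypothetical names, and the alphabetic order shown is Python's default code-point order, which may differ from the collation Guiguts uses.

```python
from collections import Counter

def build_report(text, ignore_case=False, sort_alpha=True):
    """Return (word, count) pairs ordered the way the two switches suggest."""
    words = text.split()
    if ignore_case:
        # Folding to lowercase merges "It" and "it" into one entry.
        words = [w.lower() for w in words]
    counts = Counter(words)
    if sort_alpha:
        # Alphabetic sort (code-point order: capitals before lowercase).
        return sorted(counts.items())
    # Numeric sort: most-frequent first, ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

text = "the It the it the"
default_report = build_report(text)
folded_by_count = build_report(text, ignore_case=True, sort_alpha=False)
```

Changing ignore_case requires recounting from the text, which is why the manual asks for ReRun in that case, whereas changing only the sort order reuses the existing counts.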

When you double-click a word in the list, Guiguts searches for the first or next occurrence of that word in the document and scrolls to it. Keep double-clicking the word to scan all uses of it. Right-click a word in the list (Mac: control-click) to load that word into the Search Text field of the Search & Replace dialog.

In some displays Guiguts identifies "suspects," items that might be errors. These are marked with four asterisks. The Suspects Only switch causes the display to show only suspects.

You can save any word frequency report in either of two forms. With the report window active, key control-s. A standard file-save dialog opens with the suggested name of wordfreq.txt. Click Save to make a file that is a duplicate of the displayed report, including the counts. You can also key control-x (for eXport). A file-save dialog opens with the suggested name of wordlist.txt. Click Save to make a file that contains only the list of words, without counts, "suspect" flags, etc.

Using Report Actions

The Word Frequency window offers several buttons, each giving a different way to process and display the data. In the order in which they appear they are:

1st Harmonic (also ctl-w) One word must be highlighted in the list. The index is searched for words that can be made from the highlighted word by a one-character insertion, deletion, or replacement. The original word and its near-relatives (if any) are displayed in a popup window. Use this window like the main window to search for words in the text.
All Words Re-sorts the full index (based on the Sort Alpha switch) and displays all words. Use this to return to the full list after viewing a subset such as Character Cnts.
ReRun Reruns the indexing and sorting process applying the current Ignore Case and Sort Alpha switch settings. Use this to update the list after you have edited the document.
Check Emdashes Displays all phrases that include an emdash (two hyphens). If an identical phrase having only a single hyphen exists, it is displayed as a suspect.
Check Hyphens Displays all hyphenated phrases. A word that duplicates a hyphenated phrase ("after-thought" and "afterthought") is displayed as a suspect. Use to find unfixed hyphenated words at ends of lines.
Check Alpha/num Displays all words and hyphenated phrases that contain a mix of alphabetic and numeric characters. Use to find one/ell and oh/zero errors.
Check Spelling Invokes the external spell-check program; see this page. Same as Search>Spell Check and the "check" button in the tool bar.
Ital/Bold Words Displays all words and phrases up to four words that are enclosed in italic or bold markup; and all matching words or phrases that are not so marked. Use to find inconsistent markup. Right-click the button to change the maximum number of words in a phrase.
ALL CAPS Displays all words and hyphenated phrases spelled entirely in capital letters.
MiXeD CasE Displays all words and hyphenated phrases that contain both lowercase letters and a capital letter in a non-initial position. Use to find OCR errors that mis-capitalize c/C, o/O, s/S, u/U, v/V.
Initial Caps Displays all words and hyphenated phrases that start with a single capital letter.
Character Cnts Counts all character values in the document and displays the list. If Sort Alpha is checked, the list is sorted by character; otherwise it is sorted by count, most-used first. Use to check for non-ASCII character use and for equal counts of matching brackets and parens.
Check , Upper Displays all the times an uppercase letter follows a comma. Use to find the common error of comma replacing period.
Check . Lower Displays all the times a lowercase letter follows a period. Use to find the common error of period replacing comma.
Check Accents Displays all words that include an accented character or a special Latin-1 character such as the ae ligature. A word that is the same except for the special character is displayed as a suspect. Use to check for inconsistent use of accents and ligatures.
Unicode > FF Displays all words that include a character from the Unicode sets beyond the Latin-1 set (numerically greater than 255, hex FF). When such words exist, the file is saved as a Unicode file with two bytes per character. (Does not display Unicode or Latin-1 letters that are punctuation or standing alone.)
Stealtho Check A different way to apply the same files as used by the Scanno Searches. Brings up a standard file-open dialog; you browse to select one of the scanno files. Guiguts applies all the searches from the file you select against the list of words, and displays a list of the words that match with their counts.
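
Two of the checks in the list above, 1st Harmonic and Check Hyphens, lend themselves to a short Python sketch. The code below is a hedged illustration of the ideas as the descriptions state them, not Guiguts's own logic: the function names, the word index, and its counts are all hypothetical, and the sketch assumes that an emdash appears as two hyphens and should be left to the emdash check.

```python
def is_one_edit_apart(a, b):
    """True if b differs from a by one insertion, deletion, or replacement."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (replacement).
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Lengths differ by one: dropping some single character from the
    # longer word must give the shorter one (insertion or deletion).
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))

def first_harmonic(word, index):
    """List the indexed words that are one edit away from the given word."""
    return sorted(w for w in index if is_one_edit_apart(word, w))

def hyphen_suspects(index):
    """Pair each hyphenated entry with its unhyphenated twin, if indexed."""
    pairs = []
    for w in index:
        # Assumption: "--" marks an emdash, so such entries are skipped.
        if "-" in w and "--" not in w and w.replace("-", "") in index:
            pairs.append((w, w.replace("-", "")))
    return sorted(pairs)

# Hypothetical index mapping each word to its count.
index = {"arid": 3, "and": 120, "aid": 2, "avid": 1, "arids": 1, "bird": 4,
         "after-thought": 1, "afterthought": 2, "well-known": 3}

relatives = first_harmonic("arid", index)
suspects = hyphen_suspects(index)
```

In this sample, "aid", "avid", and "arids" come back as near-relatives of "arid" (a deletion, a replacement, and an insertion), while "after-thought" is paired with "afterthought" as a hyphenation suspect.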