
The software tools used to generate indexes come in many varieties. Which technique is used depends on variables such as budget, eventual re-usability of the source material, translation needs, time constraints, the media used to publish the material, file sizes and file-transfer issues, and individual preferences.
There are essentially six different methodologies for indexing, followed here by three related categories of supporting software:
Standalone tools, usually used for
back-of-the-book indexes, allow indexers to work from page-numbered galleys.
The indexing is completely separate from the published material. For these
tasks, Wright Information uses CINDEX, which allows formatting of RTF files
and the generation of a database of entries for translation purposes.
Embedding tools allow indexing codes to
be embedded in the electronic text of a book or file, and allow the index's
locators to be updated as the text changes. Indexers must work in the same
files as the publishers. Wright Information has expertise in FrameMaker,
PageMaker and Microsoft Word.
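As a rough illustration of why embedded codes keep locators current, here is a minimal Python sketch. The {{idx:...}} marker syntax and the build_index function are hypothetical and invented for this example; they are not the actual marker formats of FrameMaker, PageMaker or Microsoft Word.

```python
import re
from collections import defaultdict

# Hypothetical marker syntax for this sketch only: {{idx:heading}}
MARKER = re.compile(r"\{\{idx:([^}]+)\}\}")

def build_index(pages):
    """Map each embedded heading to the sorted page numbers where it occurs."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for heading in MARKER.findall(text):
            index[heading.strip()].add(page_number)
    return {heading: sorted(nums) for heading, nums in sorted(index.items())}

pages = [
    "Galleys arrive from the printer.{{idx:galleys}}",
    "Locators point to pages.{{idx:locators}} More on galleys.{{idx:galleys}}",
]
before = build_index(pages)

# Simulate a text change: a new page is inserted ahead of the second page.
pages.insert(1, "A newly added page with no markers.")
after = build_index(pages)

print(before)
print(after)
```

Because the markers travel with the text, regenerating the index after the edit moves the affected locators without any re-indexing by hand.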
Tagging tools allow indexing codes to be
embedded in the electronic text after the indexing is complete. The indexer
inserts numbered dummy tags in the files, and then builds the index separately.
The final step uses macros to insert the indexing at each tag in the files.
Many of these tools are developed in-house to fit the publishing group's needs.
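The final macro step described above might look something like the following Python sketch. The <<n>> dummy-tag syntax, the {{idx:...}} code syntax and the insert_index_codes function are all assumptions made for illustration; in-house tools will differ.

```python
import re

# Hypothetical tag and code syntax, chosen for this sketch only.
DUMMY_TAG = re.compile(r"<<(\d+)>>")

def insert_index_codes(text, entries_by_tag):
    """Replace each numbered dummy tag with the index codes built for it."""
    def expand(match):
        tag_number = int(match.group(1))
        entries = entries_by_tag.get(tag_number, [])
        # A tag with no entries is simply removed from the text.
        return "".join(f"{{{{idx:{entry}}}}}" for entry in entries)
    return DUMMY_TAG.sub(expand, text)

# The index is built separately, keyed by tag number.
entries_by_tag = {
    1: ["files:opening"],
    2: ["files:saving", "saving files"],
}
tagged_text = "Open the file.<<1>> Save the file.<<2>>"
result = insert_index_codes(tagged_text, entries_by_tag)
print(result)
```

Keeping the entries in a separate table keyed by tag number is what lets the indexer work outside the publisher's files until the last step.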
Keywording is used primarily in online
help materials. It can be hard-coded jumps, similar to HTML jumps, or it can be
inserted as embedded coding and compiled into a list by the software. Wright
Information uses RoboHELP, RoboHTML, and other tools to keyword help files.
Weighted-text search tools, similar to
the intelligence in agents or Microsoft's Office Assistant, involve building
terminology sets for helping the intelligence work. An example would be helping
an agent identify the difference between a cell in an Excel spreadsheet and a
cell in a jail. Often terminology sets are built specifically for the
information system, outlining all the synonyms and special meanings that a
particular product uses. Indexing thought and practice come into play in the
building of these terminology sets.
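To make the cell example concrete, here is a minimal Python sketch of a terminology set. It assumes a very simple scheme in which each sense of a term lists context words and the sense with the most overlapping words wins; the TERMINOLOGY data and the pick_sense function are hypothetical, and real weighted-text tools are considerably more sophisticated.

```python
# Hypothetical terminology set: each sense of a term lists context words
# that suggest it. The words and senses are invented for illustration.
TERMINOLOGY = {
    "cell": {
        "spreadsheet cell": {"excel", "spreadsheet", "row", "column", "formula"},
        "jail cell": {"jail", "prison", "inmate", "guard"},
    }
}

def pick_sense(term, query):
    """Return the sense whose context words overlap the query most, or None."""
    senses = TERMINOLOGY.get(term.lower(), {})
    words = set(query.lower().split())
    best_sense, best_score = None, 0
    for sense, context in senses.items():
        score = len(words & context)  # simple overlap count, not a true weight
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

print(pick_sense("cell", "format a cell in an excel spreadsheet"))
print(pick_sense("cell", "the inmate returned to his cell"))
```

The quality of the result depends entirely on the context words chosen for each sense, which is where indexing judgment comes in.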
Automated indexing software builds a
concordance, or a word list, from processed files. Although the manufacturers
often claim these packages build indexes, the actual results are a list of
words and phrases, sometimes useful in the beginning stages of building an index.
Usability tests of these packages have shown that the word lists omit many key
ideas and phrases, and cannot fine-tune terminology for easy retrieval, or
build the needed hierarchies of ideas that professional indexing can. Free-text
search, also produced automatically by software, is useful in some
environments, but tests have shown that retrieval rates are much higher with a
human-generated index. Wright Information owns software that will generate
concordances, but doesn't use it for a finished index.
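For readers unfamiliar with concordances, the following Python sketch shows roughly what such software produces. The build_concordance function and the short stop list are assumptions made for illustration and do not represent any particular commercial package.

```python
import re
from collections import defaultdict

# A tiny, illustrative stop list; real tools ship much longer ones.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}

def build_concordance(pages):
    """List every non-stop word with the pages on which it appears."""
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOP_WORDS:
                index[word].add(page_number)
    return {word: sorted(nums) for word, nums in sorted(index.items())}

pages = ["The cell is in the spreadsheet.", "The inmate is in a cell."]
concordance = build_concordance(pages)
for word, locators in concordance.items():
    print(word, locators)
```

Note that the word list files both uses of "cell" under one heading. It records where a word appears, not what it means, which is the limitation described above.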
Abstracting and citation-control software
aids in building abstracts with associated keywords. Wright Information uses
ProCite for abstracting needs.
Thesaurus and controlled-language software
aids in building controlled languages and sets of keywords for metadata and web
sites. Wright Information uses both MultiTes and TermTree software for these
needs.
Web indexing software aids in building
HTML web indexes. Wright Information uses a variety of proprietary tools as
needed by the client to build metadata sets, Web-based indexes, and compiled
scripted Web indexes, such as those found in WebHelp.
Jan C. Wright
Wright Information Indexing Services
For more information, please send email to info@wrightinformation.com