DOC2CHM Tips: Best Settings for Clean CHM Output

Creating Compiled HTML Help (.chm) files from Microsoft Word documents can streamline documentation distribution, offline help systems, and software manuals. DOC2CHM is a popular tool for converting .doc/.docx files into CHM format. This article walks through best practices and settings for clean, professional CHM output, covering document preparation, DOC2CHM configuration, troubleshooting, and final packaging.


Why CHM and DOC2CHM?

CHM files bundle HTML pages, images, and an index into a single compressed file, making them ideal for offline help. DOC2CHM automates conversion from Word, preserving structure while generating the HTML, table of contents (TOC), and search index CHM requires.


1. Prepare your Word document for conversion

A clean source document is the foundation of a good CHM. Before using DOC2CHM:

  • Use built-in Word heading styles (Heading 1, Heading 2, Heading 3) consistently. DOC2CHM maps these to the CHM table of contents and navigation.
  • Keep styles simple. Avoid deeply nested or custom heading styles that the converter may not recognize.
  • Use Word’s built-in lists instead of manually typed bullets or numbers.
  • Insert images inline and use standard formats (PNG, JPEG). Avoid large, high-resolution images; resize each one to the size at which it will be displayed.
  • Separate sections with page breaks where logical; avoid complex section breaks that may create HTML artifacts.
  • Use cross-references and bookmarks in Word sparingly; verify they convert properly.
  • Clean up direct formatting: prefer styles over manual fonts, sizes, and colors.
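
One way to audit heading usage before conversion is to list every paragraph style the document actually uses. The sketch below is not part of DOC2CHM. It assumes a .docx source (not legacy .doc), reads the standard word/document.xml part with the Python standard library, and assumes the English style IDs such as "Heading1". The function names are illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def styled_paragraphs(document_xml):
    """Return (style_id, text) for every paragraph with an explicit style."""
    root = ET.fromstring(document_xml)
    found = []
    for para in root.iter(W + "p"):
        style = para.find(W + "pPr/" + W + "pStyle")
        if style is None:
            continue  # no explicit style: the paragraph uses the default
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        found.append((style.get(W + "val"), text))
    return found


def read_document_xml(docx_path):
    """A .docx file is a ZIP archive; the body is in word/document.xml."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")
```

Calling styled_paragraphs(read_document_xml("manual.docx")) and scanning the result for style IDs other than Heading1 to Heading3 can reveal custom heading styles that the converter might not map to the TOC.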

2. Choose the right DOC2CHM settings

DOC2CHM typically offers several options affecting HTML generation, TOC creation, image handling, and indexing. These settings vary by version but commonly include:

  • Output mode: single-page vs. multi-page HTML.

    • Use multi-page when the document is long or has many images; it improves navigation and reduces memory usage.
    • Single-page may be acceptable for very short docs.
  • Heading level mapping: set which Word heading levels become TOC entries.

    • Map Heading 1 → TOC level 1, Heading 2 → level 2, etc. Avoid mapping too many levels; 3 levels are usually sufficient.
  • CSS usage: enable external CSS rather than inline styles when possible to keep HTML clean. DOC2CHM can generate a stylesheet; review and simplify it.

  • Image handling: choose to export images as separate files and reference them, rather than embedding. Ensure image paths are correct and that images are optimized.

  • Encoding: set UTF-8 for modern character support, especially for non-Latin scripts.

  • Index and search: enable index generation if you need a keyword index, and enable full-text search if your version exposes it as a separate option. Clear, consistent headings and keywords in Word (tagged with Word’s Index feature where possible) improve the generated index.

  • Hyperlinks: keep links between topics relative so they resolve inside the CHM container.
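
To check the relative-link advice on generated output, a small scan of href and src attributes can flag local absolute paths. This is a minimal sketch using only the Python standard library. The rule for what counts as "absolute" (a file: URL, a drive letter, or a leading slash or backslash) is an assumption, and external http(s) links are deliberately left alone.

```python
import re
from html.parser import HTMLParser

# Local absolute paths: file: URLs, drive letters (C:\ or C:/), leading slash or backslash
LOCAL_ABSOLUTE = re.compile(r"^(?:file:|[a-zA-Z]:[\\/]|[\\/])", re.IGNORECASE)


class LinkCollector(HTMLParser):
    """Collect every href and src value in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def absolute_links(html):
    """Return links that point at local absolute paths."""
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links if LOCAL_ABSOLUTE.match(link)]
```

Any link this returns is likely to break once the pages are packed into the .chm and moved to another machine.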


3. Clean HTML output tips

After conversion, inspect the HTML files:

  • Remove unnecessary inline styles and redundant span tags. A simple regex or HTML tidy tool can help.
  • Consolidate CSS rules and remove unused classes.
  • Rename images and files to lowercase, alphanumeric names to avoid path issues.
  • Ensure UTF-8 meta charset is present in each HTML file:
    
    <meta charset="utf-8"> 
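
The "simple regex" cleanup and the charset check mentioned above could look like the following. It is a rough sketch under stated assumptions: the patterns are illustrative, they handle only quoted style attributes, and a regex can be fooled by unusual markup, so review a diff of the result before keeping it.

```python
import re

# A quoted style="..." or style='...' attribute, with its leading whitespace
STYLE_ATTR = re.compile(r"""\s+style\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE)

# Either <meta charset="utf-8"> or the older http-equiv form with charset=utf-8
META_UTF8 = re.compile(r"""<meta[^>]*charset\s*=\s*["']?utf-8""", re.IGNORECASE)


def strip_inline_styles(html):
    """Remove inline style attributes and leave everything else untouched."""
    return STYLE_ATTR.sub("", html)


def declares_utf8(html):
    """Return True if the page contains a meta tag declaring UTF-8."""
    return META_UTF8.search(html) is not None
```

For example, strip_inline_styles('<p style="margin:0" class="a">Hi</p>') leaves the class attribute in place and drops only the style attribute.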

4. Table of Contents and Navigation

  • Review the generated .hhc (TOC) file and edit it to improve hierarchy and labels if needed.
  • Ensure that each TOC entry points to the correct HTML anchor. Fix broken anchors by editing the HTML or .hhc file.
  • Use descriptive titles in headings — CHM displays those in the TOC and navigation panes.
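
Checking TOC targets by hand gets tedious for large manuals. The sketch below assumes the .hhc file uses the usual HTML Help sitemap layout, in which each entry is an OBJECT of type "text/sitemap" with "Name" and "Local" params. It lists the entries and reports those whose target file is missing. It does not verify anchors inside the files, and the function names are illustrative.

```python
from html.parser import HTMLParser
from pathlib import Path


class SitemapParser(HTMLParser):
    """Collect (name, local) pairs from text/sitemap OBJECT entries."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._current = None  # params of the entry being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "object" and (attrs.get("type") or "").lower() == "text/sitemap":
            self._current = {}
        elif tag == "param" and self._current is not None:
            key = (attrs.get("name") or "").lower()
            self._current[key] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "object" and self._current is not None:
            # Folder-only entries have a Name but no Local target; skip them
            if "local" in self._current:
                self.entries.append((self._current.get("name", ""), self._current["local"]))
            self._current = None


def toc_entries(hhc_text):
    parser = SitemapParser()
    parser.feed(hhc_text)
    parser.close()
    return parser.entries


def missing_targets(hhc_text, html_dir):
    """Return TOC entries whose target file does not exist in html_dir."""
    missing = []
    for name, local in toc_entries(hhc_text):
        filename = local.split("#", 1)[0]  # drop the anchor part
        if not (Path(html_dir) / filename).is_file():
            missing.append((name, local))
    return missing
```

The same parser should also work on an .hhk index file, since it shares the sitemap layout.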

5. Index and Search Configuration

  • If DOC2CHM generates an .hhk (index) file, review and organize entries.
  • Use Word’s built-in Index feature to tag important terms before conversion — this produces better index entries.
  • Test CHM search thoroughly. If search results are noisy, refine which sections are indexed by adjusting heading levels or excluding front-matter.

6. Images, Figures, and Multimedia

  • Keep images under 200–300 KB where possible; use PNG for diagrams and JPEG for photos.
  • Use descriptive alt attributes in the HTML for accessibility and better search indexing.
  • For screenshots, crop to the relevant area and add callouts in Word rather than embedding very large unannotated images.
  • Avoid embedding active multimedia (Flash, video) unless your CHM readers support it; many CHM viewers have limited media support.
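
The size and alt-text guidelines above can be spot-checked with a short script. This is a sketch, not a DOC2CHM feature: the 300 KB default comes from the guideline above, the list of image extensions is an assumption, and an empty alt attribute is flagged even though it is legitimate for purely decorative images.

```python
from html.parser import HTMLParser
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def oversized_images(folder, limit_kb=300):
    """Return (file name, size in KB) for image files larger than limit_kb."""
    results = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            size_kb = path.stat().st_size / 1024
            if size_kb > limit_kb:
                results.append((path.name, round(size_kb)))
    return results


class AltChecker(HTMLParser):
    """Record the src of every <img> with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and not (attrs.get("alt") or "").strip():
            self.missing.append(attrs.get("src") or "")


def images_missing_alt(html):
    checker = AltChecker()
    checker.feed(html)
    return checker.missing
```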

7. Fonts, Styles, and Layout Consistency

  • Use web-safe fonts or include a CSS fallback stack to ensure consistent rendering.
  • Normalize paragraph spacing and use CSS for margins/padding instead of Word’s direct formatting.
  • Avoid complex tables spanning pages; if tables are wide, consider splitting or using responsive CSS.

8. Handling Special Characters and Languages

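  • Use UTF-8 consistently for the HTML topic files. Check non-ASCII text in the TOC and index separately, because the Microsoft HTML Help compiler is known to handle Unicode poorly in .hhc and .hhk files.
  • For right-to-left languages, ensure the generated HTML includes dir="rtl" where necessary and that the stylesheet supports it.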
  • Test search and index for language-specific behavior.

9. Troubleshooting common issues

  • Broken links: verify relative paths and update .hhc/.hhk files.
  • Missing images: confirm images were exported and paths match the HTML references.
  • Incorrect TOC levels: reassign Word heading styles and re-run conversion.
  • Large CHM size: compress images, remove unused assets, and split very large docs into smaller modules.
  • Strange characters: confirm that every HTML file is saved as UTF-8 and declares it, and replace smart quotes with straight quotes or HTML entities.
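
For the smart-quote fix, a translation table is usually enough. The sketch below converts the four curly quote characters to their standard HTML named entities. The function name is illustrative, and it assumes the text has already been decoded to a Python string.

```python
# Curly quote characters mapped to their HTML named entities
SMART_QUOTES = {
    "\u2018": "&lsquo;",  # left single quote
    "\u2019": "&rsquo;",  # right single quote or apostrophe
    "\u201c": "&ldquo;",  # left double quote
    "\u201d": "&rdquo;",  # right double quote
}

_TABLE = str.maketrans(SMART_QUOTES)


def smart_quotes_to_entities(text):
    """Replace curly quotes with entities; all other characters pass through."""
    return text.translate(_TABLE)
```

Because entities are plain ASCII, the result displays the same way regardless of which encoding the viewer assumes for the page.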

10. Final packaging and testing

  • Compile the CHM and test with the Windows HTML Help Viewer (hh.exe).
  • Test on target systems and viewers if users may use third-party CHM readers.
  • Validate accessibility: keyboard navigation, alt text, and readable font sizes.
  • Keep the Word document and the generated assets under source control so you can reproduce or update the CHM.
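
DOC2CHM may handle compilation itself. If you recompile by hand after editing the .hhc or .hhk files, and assuming you use the Microsoft HTML Help Workshop compiler, you will need a project (.hhp) file. The sketch below writes a minimal one. The option names are standard HTML Help project options, while the function name, its parameters, and the default file names are assumptions made for illustration.

```python
def build_hhp(title, chm_name, default_topic, topic_files,
              contents_file="toc.hhc", index_file="index.hhk"):
    """Return the text of a minimal HTML Help project (.hhp) file."""
    lines = [
        "[OPTIONS]",
        "Compatibility=1.1 or later",
        f"Compiled file={chm_name}",
        f"Contents file={contents_file}",
        f"Index file={index_file}",
        f"Default topic={default_topic}",
        "Full-text search=Yes",
        f"Title={title}",
        "",
        "[FILES]",
    ]
    lines.extend(topic_files)
    # Windows line endings, since the compiler is a Windows tool
    return "\r\n".join(lines) + "\r\n"
```

Writing the returned text to a file such as manual.hhp in the same folder as the HTML topics gives the compiler a single entry point for rebuilding the CHM.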

Example best-practice workflow

  1. Prepare Word doc with Heading styles, optimized images, and Word Index entries.
  2. Export images and clean up direct formatting.
  3. Run DOC2CHM with multi-page output, external CSS, UTF-8 encoding, and image export enabled.
  4. Inspect and tidy HTML/CSS, edit .hhc/.hhk files for TOC/index tweaks.
  5. Compile CHM, test, and iterate.

Quick checklist

  • Use Word Heading styles (1–3).
  • Prefer multi-page output.
  • Use external CSS and UTF-8.
  • Export and optimize images.
  • Review .hhc/.hhk and clean HTML.
  • Test search, links, and accessibility.

Following these tips will produce cleaner CHM files with better navigation, smaller file size, and more reliable indexing.
