Department of Engineering

IT Services

Producing HTML and PDF files with LaTeX

Introduction

If you intend to mail documents or put them on the WWW, it's important to choose the file format carefully. It's probably not a good idea to use DVI (LaTeX output) or Word format, because many people won't be able to view the files. The main options are

  • HTML - WWW browsers should be able to view the files, but the files won't be faithful in format to the original document. HTML can't currently render maths well (MathML may eventually provide a solution). On the plus side, quite a few Word Processing programs can produce HTML output, and HTML files are easily integrated into a web of other HTML documents and can be scanned by search engines.
  • PDF - Free viewers exist (see Adobe's Acroread page) and many computers have them. The viewers have a print option. Files are likely to be an accurate representation of the original document, and they'll be smaller than the postscript versions. To produce PDF files from a Word Processing program you'll probably need to buy Adobe's Distiller. CUED staff can use a copy on the Computer Operators' PC.
  • Postscript - Free viewers exist but aren't common. People shouldn't have too much trouble printing the files out. Files are likely to be an accurate representation of the original document, but they'll be big. Production of postscript files is usually easy with Word Processing programs - use "print to file" and select "postscript" as the format. You may need to choose your fonts and "Save as ..." options carefully (maximising portability) if you want to minimise font problems. From a LaTeX DVI file you can use dvips -j0 -Ppdf to produce a postscript file that can be converted to PDF.

This document concentrates on PDF and HTML production from LaTeX. For more information on formats see the Common Graphics File Formats page.

Simple Options

The following options don't try to take advantage of the electronic medium, but they're easy to use, requiring little or no change to the LaTeX files.

  • HTML - Various converters exist. latex2html is installed on the Teaching System. It deals with tables of contents, cross-references, colored text and tables by converting them into the HTML equivalents, converts graphics to GIF files and creates GIF files for anything (for example maths) that it can't convert to HTML. Optionally it can produce an HTML file for each section of the latex document.
  • PDF - See the Producing PDF page for tips and examples on simple ways to get PDF output. You can create a postscript file and convert it to PDF using ps2pdf (free with ghostscript, use version 6.0 or later). ps2pdf13 converts to PDF 1.3 (which might lead to smaller files). A simple alternative if you use no graphics is to use pdflatex instead of latex, which produces a PDF file directly. This method works better if you use common postscript fonts (by using \usepackage{mathptmx} for example, or \usepackage{pslatex}, which will use the postscript fonts Times, Courier, Helvetica, etc). If you go via dvips and ps2pdf, you can also include the line \usepackage[dvips]{hyperref} to get better support for references.
    A word of warning: Acrobat5 is more fussy than earlier versions. The PDF produced by the above methods should be viewable whatever version of Acrobat reader you use, but it might only print out with Acrobat4.
    dvipdfm (installed on the Teaching System) produces a PDF file from a DVI file.
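
As a rough sketch of the simple PDF routes described above, the following minimal document should work unchanged with both latex and pdflatex. The file name sample.tex and its contents are illustrative assumptions; the commands in the trailing comments are the ones mentioned in the text.

```latex
% sample.tex (hypothetical file name)
\documentclass{article}
\usepackage{mathptmx}   % Times for text and maths: a common postscript font
\begin{document}
\section{Introduction}\label{sec:intro}
Some text, and some maths: $e^{i\pi} + 1 = 0$.
Section~\ref{sec:intro} is the only section.
\end{document}
% Anything after \end{document} is ignored by LaTeX.
% Route 1 (via postscript):
%   latex sample.tex ; latex sample.tex
%   dvips -j0 -Ppdf sample.dvi -o sample.ps
%   ps2pdf sample.ps
% Route 2 (direct):
%   pdflatex sample.tex ; pdflatex sample.tex
% Route 3 (via dvipdfm):
%   latex sample.tex ; latex sample.tex ; dvipdfm sample.dvi
```

Each route runs LaTeX twice so that the cross-reference is resolved.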

Advanced Features

  • HTML - By using the html package (or you can use the hyperref package with a latex2html option) it's possible to
    • create links to URLs (these can appear as footnotes in the paper version) - use \htmladdnormallinkfoot{text}{URL}.
    • use ~/.latex2html-init to override defaults
    • add conditional text - have text in your latex file that's processed only when you run latex (or only when you run latex2html).
    The LaTeX Maths and Graphics document is an example of HTML documents produced from LaTeX.
  • PDF - By using the hyperref package with pdflatex you automatically get PDF files with bookmarks (if you have a table of contents) and cross-references etc. The standard color and graphicx packages have pdftex options too, which are activated by adding pdftex to the documentclass options. JPEG, TIFF and PNG graphics inclusion is supported. To deal with a document that has eps graphics, choose one of the following:
    • Use the epstopdf package to convert the eps files to pdf files on demand. It needs to be loaded after \usepackage[pdftex]{graphicx}.
    • Use the epstopdf program manually to convert the eps files to pdf. Ensure that you don't mention the filename's suffix in the \includegraphics commands. Add pdftex to the options in the documentclass line when you run pdflatex, remove it when you run latex.
    • Use the unpsfrag command (available from http://www.gts.tsc.uvigo.es/~fiz/unpsfrag) to convert a LaTeX document that uses the psfrag package into one that doesn't, thus letting you use pdflatex.
    So to use pdflatex, begin your document with
     
       \documentclass[pdftex]{article}
       \usepackage[pdftex]{graphicx}
       \usepackage[usenames,dvipsnames]{color}
       \usepackage[pdftex]{hyperref}
    
    and then type pdflatex file.tex ; pdflatex file.tex to produce file.pdf
    To use dvipdfm begin your document with
     
       \documentclass[dvipdfm]{article}
       \usepackage[dvipdfm]{graphicx}
       \usepackage[usenames,dvipsnames]{color}
       \usepackage[dvipdfm]{hyperref}
    
    and then type latex file.tex ; latex file.tex ; dvipdfm file.dvi.
    To control the information that appears when acroread's "Document Information" option is run, you can use something like "\pdfinfo{/Title (Using pdfLaTeX) /Author (Tim Love)}". To create links use "\href{URL}{text}".
    When you install the PDF file on the WWW it's a good idea to give readers a way to download a PDF reader. The following HTML fragment creates an image link (with the alternative text "ACROBAT READER") to the appropriate page.
    <a href="http://www.adobe.com/prodindex/acrobat/readstep.html"
    target="_blank"><img src="http://www.adobe.com/images/getacro.gif" 
    width=88 height=31 border=0 alt="ACROBAT READER"></a>
    

    The locally installed quickrep package by Paul Walmsley facilitates report production with postscript or PDF output.
  • PDF slideshows - If you want to produce presentations from LaTeX, the prosper package is useful.
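
The pdflatex features described above can be pulled together in a sketch like the one below. It is an illustration under stated assumptions, not a definitive template: the file name report.tex, the graphic fig1 and the example.org URL are hypothetical, and the document information is set with hyperref's \hypersetup keys (pdftitle, pdfauthor) rather than the \pdfinfo command quoted earlier.

```latex
% report.tex (hypothetical file name)
\documentclass[pdftex]{article}
\usepackage[pdftex]{graphicx}
\usepackage[pdftex,colorlinks]{hyperref}
% Fields shown by the viewer's "Document Information" option
\hypersetup{pdftitle={Using pdfLaTeX}, pdfauthor={Tim Love}}
\begin{document}
\tableofcontents   % with hyperref, sections also become PDF bookmarks
\section{Results}\label{sec:results}
% No suffix on the graphic's name: pdflatex picks up fig1.pdf
% (made beforehand with "epstopdf fig1.eps"), latex would pick up fig1.eps.
% fig1 is a hypothetical file, so its inclusion is guarded.
\IfFileExists{fig1.pdf}
  {\includegraphics[width=0.5\textwidth]{fig1}}
  {(fig1.pdf not found)}

See \href{http://www.example.org/}{this placeholder page} for a sample link,
and Section~\ref{sec:results} for a sample cross-reference.
\end{document}
```

Run pdflatex twice on the file so that the table of contents, bookmarks and cross-references are up to date.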

Common Problems

  • The resulting font quality can be poor unless you're careful about font selection and conversion options. The Quality of PDF from PostScript page has suggestions.
  • From jpmg@eng - "use the colorlinks option to the hyperref package - it highlights the links by changing colour rather than putting an ugly red box round them (which is the default)"
  • From pjw42@eng - "If pdflatex complains about undefined \BOOKMARK then delete the .out file and try again"
  • From jpmg@eng - References: "By default, LaTeX typesets \ref{my:label} by substituting only the section number, subsection number, figure number, or whatever, where \label{my:label} was declared. It typesets \pageref{my:label} by substituting only the page number where \label{my:label} was declared. [...]

    While this makes perfect sense in a printed document, it tends to result in rather ugly online hyperlinks. I vastly prefer the appearance, both online and when printed that is achieved by using the nameref package and a macro such as

    \newcommand{\myref}[1]{`\nameref{#1}' (see p.\pageref{#1})}
    
    which when applied as follows:
    \section{Random Stuff}\label{randomlabel}
    This is very dependent on \myref{randomlabel}.
    
    results in something along the lines of
            2.4 Random Stuff
            This is very dependent on `Random Stuff' (see p.35)"
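
The macro quoted above can be tried out in a complete document along the following lines. This is a minimal sketch: the section titles and the page break are assumptions added only to make the example self-contained, and it relies on hyperref loading the nameref package.

```latex
\documentclass{article}
\usepackage[colorlinks]{hyperref}   % coloured links instead of boxes; loads nameref
\newcommand{\myref}[1]{`\nameref{#1}' (see p.\pageref{#1})}
\begin{document}
\section{Random Stuff}\label{randomlabel}
Some text.
\newpage
\section{More Stuff}
This is very dependent on \myref{randomlabel}.
\end{document}
```

As with other cross-references, LaTeX needs to be run twice before the section name and page number appear.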

Sources of Information

Much of the material was provided by Patrick Gosling, Paul Walmsley and Robin Fairbairns.