Optimize PDF Files

Summary: We review Acrobat 8 Professional (pre-release) and PDF Enhancer 3.1 for optimizing PDF files. The new Acrobat features faster operations, smaller PDFs, a new interface, and the ability to combine different types of files into one PDF. Learn how to use PDF optimization tools to remove redundancies, subset and outline fonts, and compress text and images for faster downloads and higher user satisfaction.

PDF optimization is often overlooked when creating PDF files for the Web. While PDFs have become quite popular on the Web, many PDFs used in web sites are designed for high quality print output and are not optimized for the Web. Even PDFs designed for Web use can have a wait problem, weighed down with excess fonts, change histories, and unoptimized images and forms. Optimizing PDF files for the Web can significantly shrink their size and boost display speed, saving bandwidth and user frustration.

In this article we'll give you tips and tools to optimize PDFs for minimum file size while still maintaining accessibility and search engine visibility. We review Adobe's PDF Optimizer in Acrobat 8 Professional (pre-release) and Apago's PDF Enhancer 3.1.

What Is a PDF?

Portable Document Format (PDF) is the de facto file format for presenting device-independent documents on and off the Web. PDFs are an efficient way to accurately describe simple to intricate documents for screen or print output. A PDF document is a collection of objects with structural information in a self-contained series of bytes. PDF is a page description language like PostScript, but simplified with restricted functionality (no programming, unlike PostScript) to be more lightweight. PDFs use several compression algorithms to reduce file size, including the CCITT, ZIP, and JBIG2 schemes discussed below.

How well you use these compression techniques, how efficiently the data is described (including image resolution), and the complexity of the document (that is, the number of fonts, forms, images, and multimedia) ultimately determine how large your resulting PDF file will be.

Creating Small PDFs

The main factors in creating small PDFs are image resolution, image type (bitmap or vector), the number of fonts used and how they are embedded, PDF version, and the level of compression. In general, the higher the PDF version number, the smaller the file. Acrobat 5 (PDF version 1.4) added JBIG2 compression, which is superior to the CCITT or ZIP algorithms when compressing scanned monochromatic copy (see Table 1). JBIG2 (named for the Joint Bi-level Image Experts Group) compresses monochrome (1 bit per pixel) image data at ratios from 20:1 to 50:1 for pages full of text. Like other dictionary-based algorithms (LZW, ZIP), JBIG2 creates a table of unique symbols, and when a subsequent symbol matches one in the table, it substitutes a token pointing to the table index. JBIG2 also compresses the entire table.
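The symbol-table idea behind dictionary-based compression can be sketched in a few lines of Python. This is only an illustration of the substitution step described above, not the JBIG2 codec: the function names and the toy "glyph bitmaps" are our own inventions, matching here is exact, and a real encoder would go on to entropy-code both the table and the tokens.

```python
def encode_symbols(symbols):
    """Replace repeated symbols with indices into a table of unique symbols."""
    table = []      # unique symbols, in order of first appearance
    index_of = {}   # symbol -> its position in the table
    tokens = []     # one table index per input symbol
    for sym in symbols:
        if sym not in index_of:
            index_of[sym] = len(table)
            table.append(sym)
        tokens.append(index_of[sym])
    return table, tokens


def decode_symbols(table, tokens):
    """Rebuild the original symbol sequence from the table and the tokens."""
    return [table[t] for t in tokens]


# Hypothetical stand-ins for scanned character shapes on a page of text.
page = ["#.#", "###", "#.#", "..#", "###", "#.#"]
table, tokens = encode_symbols(page)
restored = decode_symbols(table, tokens)
```

The more often the same character shapes repeat on a page, the shorter the table is relative to the token list, which is why pages full of text compress so well.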

Acrobat 6 (PDF version 1.5) added the ability to compress the entire file (Clean Up Settings dialog). However, since over 90% of Acrobat users have version 5.0 or greater, using PDF 1.4 is a safer alternative. Acrobat will usually display (with a warning) a more recent PDF version, but new compression schemes will spawn an error when opened in older versions of Acrobat.
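If you want to check which PDF version a file declares before posting it, a minimal Python sketch follows. It reads the `%PDF-x.y` header at the top of the file; the function names are hypothetical, and the sketch ignores the `/Version` entry that a PDF 1.4 or later document catalog may use to override the header, so treat the result as a first approximation.

```python
import re


def pdf_header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if absent."""
    # Search a small window at the start of the file, since some files
    # carry a few stray bytes before the header.
    match = re.search(rb"%PDF-(\d)\.(\d)", data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def declares_pdf_14_or_lower(data: bytes) -> bool:
    """True if the header declares PDF 1.4 (Acrobat 5) or an earlier version."""
    version = pdf_header_version(data)
    return version is not None and version <= (1, 4)


# Hypothetical first bytes of a file saved as PDF 1.5.
sample = b"%PDF-1.5\n%\xe2\xe3\xcf\xd3\n"
sample_version = pdf_header_version(sample)
```

To test a real file, read its first kilobyte in binary mode and pass the bytes to `declares_pdf_14_or_lower`.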

Table 1: Acrobat Version Information

Acrobat Version | Year Introduced | PDF Version | Major Features Added
3.0 | 1996 | 1.2 | Added interactions, movies/sounds, forms, CJK (Chinese, Japanese, Korean), web (hyperlinks, URLs, etc.), linearization for fast page display
4.0 | 1999 | 1.3 | Added structure, digital signatures, file embedding, JavaScript, RTL (Right to Left), color separations, PostScript Level 3 additions, uses opaque imaging model
5.0 | 2001 | 1.4 | Added JBIG2 compression, transparent imaging model, tagged PDF files (for standardized extraction of objects)
6.0 | 2003 | 1.5 | Added Compress entire file, XFA Forms (XML)
7.0 | 2005 | 1.6 | Added improved PDF Optimizer Interface, Path smoothing, Chopped image repair, Image enhancement (despeckle, etc.), can embed OpenType fonts
8.0 | 2006 | 1.7 | Improved PDF Optimizer Interface again, broke out "Discard User Data" into one pane, flatten form fields, combine and optimize files. Some improvement in areas of 3D, advanced commenting features, and security. Note defaults to saving in Acrobat 1.6 format.

To create the smallest possible PDFs for the Web, minimize the number of fonts and bitmapped images, substituting vector-based graphics where you can. Minimize the number and complexity of forms in your PDF document, flatten form fields, and avoid the use of multimedia.

There are different methods to create PDFs, including outputting to PostScript and distilling, printing through GDI (the Windows Graphics Device Interface), one-click "Direct to PDF," and generating PDFs dynamically on the server side. However you create a PDF, the techniques and tools listed below can help you enhance and optimize your PDFs for the Web.

Avoid Refried Graphics

For graphics that must be inserted as bitmaps, prepare them for maximum compressibility and minimum dimensions. Use the best quality images that you can at the output resolution of the PDF. Inserting compressed JPEGs into PDFs and Distilling them may recompress JPEGs, which can create noticeable artifacts. Use black and white images and text instead of color images to allow the use of the newer JBIG2 standard that excels in monochromatic compression. Be sure to turn off thumbnails when saving PDFs for the Web.

Use Vector Graphics

Use vector-based graphics wherever possible for images that would normally be made into GIFs. Vector images scale perfectly, look marvelous, and their mathematical formulas usually take up less space than bitmapped graphics that describe every pixel (although in some cases bitmap graphics are actually smaller than vector graphics). You can also compress vector image data using ZIP compression, which is built into the PDF format. Acrobat Reader versions 5 and 6 also support the SVG standard.

Minimize Fonts

How you use fonts, especially in smaller PDFs, can have a significant impact on file size. Minimize the number of fonts you use in your documents to minimize their impact on file size. Each additional fully embedded font can easily take 40K in file size, which is why most authors create "subsetted" fonts that only include the glyphs actually used.
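To see why subsetting helps, consider how few distinct characters a short document actually uses. The Python sketch below is a rough illustration under simplifying assumptions: it treats each character as one glyph (real fonts also have ligatures and alternate forms), and the 260-glyph font is a made-up figure. The function names are hypothetical.

```python
def glyphs_needed(text: str) -> set:
    """Characters a subsetted font would have to cover for this text."""
    # Whitespace is skipped; it carries no outline worth embedding.
    return {ch for ch in text if not ch.isspace()}


def subset_fraction(text: str, glyphs_in_font: int) -> float:
    """Share of the font's glyphs that this text actually uses."""
    return len(glyphs_needed(text)) / glyphs_in_font


sample = "Optimize PDF files"
used = glyphs_needed(sample)
# Assumed figure for illustration only: a font containing 260 glyphs.
fraction = subset_fraction(sample, 260)
```

Here only 13 distinct characters are needed, about 5% of the assumed font, so embedding a subset rather than the whole font leaves most of the glyph data out of the PDF.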

Flatten Fat Forms

Acrobat forms can take up a lot of space in your PDFs. New in Acrobat 8 Pro, you can flatten form fields in the Advanced -> PDF Optimizer -> Discard Objects dialog. Flattening makes form fields unusable and merges the form data with the page. You can also use PDF Enhancer from Apago to reduce forms by 50% by removing information that is present in the file but never actually used. Another option is to combine a refried PDF with the old form pages to create a hybrid PDF in Acrobat (see "Refried PDF" section below).

Figure 1: Flatten form fields to save space

Dueling Color Spaces

RGB (Red Green Blue) is an additive coloring system where the addition of each color moves the mixed color towards white. CMYK (Cyan Magenta Yellow Black) is a subtractive system where adding color to each part moves the mixed result towards black.

RGB is used in computer screens and televisions. Every system that emits light uses RGB. CMYK is used in reflective systems where the color is set by the reflection of light on a surface. Printing systems are the prime example.
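The additive and subtractive models can be related with a simple textbook formula, sketched here in Python. This is the naive device conversion, with channel values as floats from 0 to 1; it is not what color-managed tools do, because it ignores ICC profiles and gamut differences. The function names are our own.

```python
def rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK conversion; all channels are floats in 0..1."""
    k = 1.0 - max(r, g, b)          # black is whatever brightness is missing
    if k == 1.0:                    # pure black: avoid dividing by zero
        return 0.0, 0.0, 0.0, 1.0
    c = (1.0 - r - k) / (1.0 - k)
    m = (1.0 - g - k) / (1.0 - k)
    y = (1.0 - b - k) / (1.0 - k)
    return c, m, y, k


def cmyk_to_rgb(c, m, y, k):
    """Naive CMYK -> RGB conversion; all channels are floats in 0..1."""
    return (1.0 - c) * (1.0 - k), (1.0 - m) * (1.0 - k), (1.0 - y) * (1.0 - k)


white = rgb_to_cmyk(1.0, 1.0, 1.0)   # no ink at all
black = rgb_to_cmyk(0.0, 0.0, 0.0)   # black ink only
red = rgb_to_cmyk(1.0, 0.0, 0.0)     # full magenta plus full yellow
```

Full light in every RGB channel maps to no ink, and no light maps to full black, which is the additive versus subtractive relationship in miniature.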

It is important to use the right system for your color-based project. The most important factor is the gamut of each system, that is, the range of colors it can generate. RGB is very good at providing bright reds, blues, and greens. CMYK can't do this, so converting from RGB to CMYK can give unpredictable results. Not many applications do this well, although Adobe Photoshop and PDF Enhancer both map colors perfectly (1,2).

(1) http://www.adobe.com/products/adobemag/archive/pdfs/98auhtbf.pdf RGB versus CMYK gamuts. According to this technical document from Adobe, RGB almost covers the CMYK gamut. Converting from CMYK to RGB is pretty safe, but color changes do occur.

(2) Email from Leonard Rosenthol of Apago. Feb. 10, 2005.

Use the RGB Rather Than the CMYK Color Space

For web-only PDFs, if you have a choice, use the RGB color space rather than the CMYK color space. RGB has one less data channel than CMYK, so files are that much smaller. Also, Microsoft applications all think in RGB, even when importing CMYK images.
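The effect of the extra channel is easy to work out for uncompressed image data. The Python sketch below uses a hypothetical 800 x 600 image at 8 bits per channel; the function name is our own, and savings after compression will vary with the image content.

```python
def raw_image_bytes(width, height, channels, bits_per_channel=8):
    """Uncompressed size of an image's pixel data, in bytes."""
    return width * height * channels * bits_per_channel // 8


# Hypothetical image dimensions, chosen only for illustration.
rgb_size = raw_image_bytes(800, 600, channels=3)    # 1,440,000 bytes
cmyk_size = raw_image_bytes(800, 600, channels=4)   # 1,920,000 bytes
saving = 1 - rgb_size / cmyk_size                   # 0.25, i.e. 25% smaller
```

Before compression, then, dropping from four channels to three removes a quarter of the image data.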

Convert to Grayscale

If color is not required, you can convert your PDF to grayscale. In Acrobat 8, select Advanced -> Print Production -> Convert Colors. Under Document Colors select "Device Gray," and under Destination Space choose "Gray Gamma 1.8" or 2.2. In a test on a color print ad, converting to grayscale (followed by Save As) cut the file size by 54%.

Optimizing Existing PDFs

In many cases you won't have access to the original document, just the resulting PDF file. Many PDFs we've seen are not fully optimized for the Web, using conservative settings more appropriate to high-resolution printers. For computer monitors viewing web-based PDFs, you don't need high-resolution images and exact reproduction of font faces; you just want to convey your information in an efficient way. Using the techniques outlined below, you can shrink your PDFs while still maintaining the textual data for search engines and reasonable quality for print output. Some webmasters offer two versions of their PDFs, one for fast web display and one for printing.

Save As...

Once you're done making changes to your PDF document, choose File -> Save As and overwrite your existing PDF file (see Figure 2). By default, Save As removes changes that are appended to PDFs by the Save command, linearizes the file for fast web viewing, and removes unused objects.

Figure 2: Save As to save space

The result is a compact, linearized PDF that displays the first page (or an arbitrary page) quickly, while the rest of the file downloads in the background. Although linearized PDFs are slightly larger, they also increase perceived speed. Note that optimizing a signed document will invalidate its signature.
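To confirm that a saved file declares itself linearized, you can look for the linearization dictionary near the top of the file. The Python sketch below relies on the PDF convention that this dictionary, with its `/Linearized` key, sits within the first 1024 bytes. The function name and sample bytes are hypothetical, and the check is only a hint: a file that was linearized and later edited with a plain Save may still carry the key.

```python
def looks_linearized(data: bytes) -> bool:
    """True if a /Linearized key appears in the first 1024 bytes of the file."""
    return b"/Linearized" in data[:1024]


# Hypothetical opening bytes of a linearized and a non-linearized file.
fast_web_view = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
plain = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

fast_result = looks_linearized(fast_web_view)
plain_result = looks_linearized(plain)
```

For a real file, read the first kilobyte in binary mode and pass it to `looks_linearized`.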

By website optimization on 25 Sep 2006 AM