Compress a PDF Without Losing Quality (Real Tips)
A 60 MB PDF that should be 4 MB is one of those small, recurring annoyances of modern work. It clogs email attachments, slows down cloud sync, and makes mobile readers crawl. Most online compression tools fix the symptom by smashing all your images down to 72 DPI and calling it a day, which trades file size for a document that looks like a fax from 1998. That's not what most people actually want. The real goal is a smaller file that still reads cleanly on screen, prints sharply at the size you'll actually print it, and keeps the text searchable.
This guide walks through the actual sources of PDF bloat, which compression levels make sense for which use cases, and how to keep the OCR layer intact so you don't lose searchability in exchange for a smaller file. Along the way you'll see a few free tools and command-line options for batch work. If you just need to shrink a single PDF right now, our free PDF compressor is the fastest path. If you have a stack of fifty files to process, the batch section near the end is what you want.
What Actually Makes a PDF Huge
PDFs balloon for a small number of well-known reasons, and once you know what to look for, you can predict which files will benefit from compression and which are already as small as they're going to get. The first and biggest source of bloat is embedded images. A photograph or scan saved at 300 DPI uses roughly four times the data of the same image at 150 DPI, which is still more than enough resolution for screen viewing on a 4K monitor. A 50-page report with one full-page screenshot per page can easily come in at 80 MB if those screenshots were captured on a Retina display and embedded at native resolution.
Embedded fonts are the second culprit. A well-formed PDF includes the fonts it uses so the document renders identically on every machine. The clean way to do this is font subsetting: only the actual glyphs used in the document get embedded. Lazy or older PDF generators embed the entire font file, which can easily add 200 KB per font. A document with five fonts (regular, bold, italic, plus a code font and a heading font) might be carrying a megabyte of font data it doesn't need.
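To see whether a PDF is carrying full font files, the pdffonts utility from poppler-utils lists every font along with its embedding and subsetting status. A small sketch, assuming poppler-utils is installed (the wrapper function name is just for illustration):

```shell
list_pdf_fonts() {
  # pdffonts (poppler-utils) prints one row per font; the "emb" and
  # "sub" columns show embedding and subsetting status. A subsetted
  # font's name carries a six-letter prefix like "ABCDEF+".
  pdffonts "$1"
}

# Usage (requires poppler-utils and a real file):
# list_pdf_fonts report.pdf
```

Full font names without the six-letter prefix are the ones worth investigating: they usually mean the entire font file was embedded.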
Beyond images and fonts, PDFs accumulate cruft over time: redundant metadata from each save, transparency layers that older PDF readers needed but modern ones don't, embedded thumbnails generated automatically, and structural overhead from documents that were exported, edited in another tool, exported again, and re-edited. Scanned pages saved as full-color images instead of bilevel black-and-white add bulk that's invisible on screen but huge in the file. Knowing which of these is doing the damage tells you which compression strategy to apply. Our PDF metadata tool is useful for stripping accumulated metadata before compression, which often shaves a small but real percentage off the size.
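If you'd rather strip metadata from the command line, a minimal sketch using exiftool plus qpdf, assuming both are installed (the function name and the -clean suffix are illustrative):

```shell
strip_pdf_metadata() {
  # exiftool blanks the XMP/Info metadata, but on PDFs it writes an
  # incremental update, so the old data remains recoverable until
  # qpdf rewrites the file from scratch.
  exiftool -all:all= -overwrite_original "$1" &&
  qpdf --linearize "$1" "${1%.pdf}-clean.pdf"
}

# strip_pdf_metadata report.pdf   # writes report-clean.pdf
```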
Image vs Text Compression: Different Animals
The single most important thing to understand about PDF compression is that text and images compress completely differently. Text in a PDF is stored as vector data, which is already compact. The compression you can apply to text is lossless and typically saves only 10 to 30 percent of the file size, but it never affects readability or searchability.
Images are where the savings live, and they're also where quality suffers. Every image in the PDF can be downsampled (reducing the pixel dimensions to a target DPI) and recompressed (changing from PNG to JPEG, or increasing JPEG compression level). Both steps are lossy. Downsampling discards detail you can never get back. Recompression introduces JPEG artifacts that compound with each save. The trick is matching the compression to how the document will be used: if it's only ever going to be read on a phone or laptop screen, you can drop image resolution dramatically without anyone noticing. If it might be printed at letter size, you need to keep at least 200 DPI for image clarity.
For documents that are mostly text with occasional images (reports, contracts, ebooks), the right strategy is aggressive image compression with completely lossless text handling. The text stays crisp and searchable, the images shrink to whatever DPI matches the intended use, and you get a 40 to 70 percent file size reduction without anyone complaining. Our PDF compressor defaults to exactly this profile because it covers the majority of real-world documents.
Compression Levels Explained
Most compression tools offer three or four levels, but the labels (low, medium, high, extreme) don't tell you what's actually happening. Here's the translation, with rough size reductions you can expect on a typical mixed-content business PDF.
Lossless compression strips redundant metadata, dedupes objects, and applies Flate compression to text streams without touching image quality. This is the safest option and typically saves 10 to 30 percent. If you're compressing a contract or a legal filing where preserving every detail matters, lossless is the floor.
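One way to approximate a lossless pass from the command line is qpdf, which rewrites the file structure and recompresses content streams without touching image data. A sketch, assuming qpdf is installed (the function name is illustrative):

```shell
lossless_pass() {
  # qpdf rewrites the file with object streams and recompressed
  # Flate streams; image data passes through untouched, so the
  # result is visually identical to the input.
  qpdf --object-streams=generate --compress-streams=y \
       --recompress-flate --compression-level=9 "$1" "$2"
}

# lossless_pass contract.pdf contract-small.pdf
```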
Light compression downsamples images to around 150 DPI, which prints acceptably at letter size and looks indistinguishable from the original on screen. Combined with lossless text handling, expect a 30 to 50 percent reduction. This is the sweet spot for documents that need to look professional but don't need to be archival quality.
Moderate compression downsamples to around 96 DPI (standard desktop screen density), uses fairly aggressive JPEG compression on photos, and may convert PNGs to JPEGs. Size reduction lands around 50 to 70 percent. The result reads fine on screen and prints acceptably at smaller sizes, but you'll see softness in fine details. This is the right level for emailing a long report someone will skim once and then file.
Aggressive compression drops images to 72 DPI, applies high JPEG compression, and may flatten transparency. Size reduction can hit 70 to 90 percent. The output looks visibly worse: photos lose detail, gradients band, and some thin lines blur. Use this only when the recipient just needs to read the text and the file size is a hard limit (like a 5 MB email attachment cap on the receiving server).
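If you're using Ghostscript, its pdfwrite presets map roughly onto these levels. A hypothetical helper (the mapping is approximate, and gs has no built-in 96 DPI preset, so "moderate" falls back to /ebook here):

```shell
compress_level() {
  # Rough mapping from the informal level names above to
  # Ghostscript's pdfwrite presets; requires Ghostscript.
  case "$1" in
    light)      preset=/ebook  ;;  # 150 DPI images
    moderate)   preset=/ebook  ;;  # no 96 DPI preset in gs
    aggressive) preset=/screen ;;  # 72 DPI images
    *) echo "unknown level: $1" >&2; return 1 ;;
  esac
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
     -dPDFSETTINGS="$preset" -dNOPAUSE -dQUIET -dBATCH \
     -sOutputFile="$3" "$2"
}

# compress_level aggressive in.pdf out.pdf
```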
For more granular control, the PDF optimizer lets you set image DPI and JPEG quality independently from text settings, which matters when you have a document that's mostly photos and you need to dial in the right tradeoff manually.
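Ghostscript offers similar granular control if you pass distiller parameters directly instead of a preset. A sketch, assuming gs is installed (the function name and DPI value are illustrative):

```shell
custom_downsample() {
  # Downsample color and grayscale images to a chosen DPI without
  # using a -dPDFSETTINGS preset; text streams are left alone.
  dpi="$1"
  gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH \
     -dDownsampleColorImages=true -dColorImageResolution="$dpi" \
     -dDownsampleGrayImages=true -dGrayImageResolution="$dpi" \
     -dColorImageDownsampleType=/Bicubic \
     -sOutputFile="$3" "$2"
}

# custom_downsample 96 photos.pdf photos-small.pdf
```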
Keeping Text Searchable After Compression
Searchable text in a scanned PDF lives in the OCR layer: an invisible text layer overlaid on top of the page images. When you full-text search the document, the search hits the OCR layer; the visible image is just there for human eyes. Compression can strip this layer if you're not careful, leaving you with a document that looks the same but can no longer be searched, copied, or fed into a workflow that expects extractable text.
Two compression behaviors specifically destroy the OCR layer. First, some tools rasterize the entire page (collapsing the page image and the text layer into a single flat image) before compressing. The text is baked into the pixels, so the page looks the same but the words are no longer machine-readable. Second, some tools strip "unused" content streams, and a poorly tagged OCR layer can be misidentified as unused.
To preserve searchability, use a compressor that explicitly supports OCR-layer preservation, or run OCR after compression rather than before. Our PDF compressor keeps text layers intact through compression, so a searchable scanned document stays searchable after shrinking.
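For scanned PDFs on the command line, ocrmypdf can compress and keep (or add) the text layer in one pass. A sketch, assuming ocrmypdf is installed (the function name is illustrative):

```shell
ocr_and_shrink() {
  # --optimize 3 applies ocrmypdf's strongest lossy image
  # optimization; --skip-text leaves pages that already have a
  # text layer alone instead of re-running OCR over them.
  ocrmypdf --optimize 3 --skip-text "$1" "$2"
}

# ocr_and_shrink scan.pdf scan-small.pdf
```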
For documents you originate yourself (Word docs, Google Docs, design tools exporting to PDF), the text isn't an OCR layer; it's native PDF text. Compression doesn't threaten this kind of text under any normal settings. The OCR concern only applies to scanned PDFs and PDFs created from images.
Batch Compression for a Whole Folder
When you have one PDF, browser-based tools are the fastest option. When you have fifty, you want batch processing. There are three reasonable approaches.
The first is Ghostscript on the command line. It's free, scriptable, and handles batches well. A typical command compresses a single PDF to screen-quality settings: gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=out.pdf in.pdf. The PDFSETTINGS flag accepts /screen (72 DPI), /ebook (150 DPI), /printer (300 DPI), or /prepress (300 DPI with color preservation). Wrap it in a shell loop and you can process a hundred files unattended.
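The loop might look like this: a sketch assuming Ghostscript is installed, writing results into a compressed/ subdirectory so the originals are untouched:

```shell
batch_compress() {
  # Compress every PDF in the current directory to screen quality,
  # keeping the originals and writing outputs under compressed/.
  mkdir -p compressed
  for f in *.pdf; do
    [ -e "$f" ] || continue   # empty glob: skip the literal "*.pdf"
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH \
       -sOutputFile="compressed/$f" "$f"
  done
}

# cd ~/reports && batch_compress
```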
The second is Adobe Acrobat Pro's batch action runner, which lets you define a sequence of operations (compress, optimize, save with new name) and run it against an entire folder. It requires a paid subscription, but it handles edge cases like password-protected PDFs and form fields gracefully.
The third is browser-based bulk uploads. Several tools, including ours, accept multiple files in one upload and return a zip of compressed outputs. For occasional batch work, this is the lowest-friction option. For repeated batch work as part of a real workflow, Ghostscript wins because you can wire it into a folder watcher or a cron job.
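For scheduled runs, a crontab entry can drive the same shell loop. A hypothetical entry (the script path is a placeholder for a wrapper around the Ghostscript command above):

```shell
# Run the compression script over the drop folder every night at 2:00.
# /usr/local/bin/compress-drop-folder.sh is a hypothetical wrapper
# around the Ghostscript loop; adjust the path to your own script.
0 2 * * * /usr/local/bin/compress-drop-folder.sh
```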
After batch compression, run the PDF metadata tool over the outputs to confirm metadata didn't get reset or corrupted, particularly if the original files had specific creation dates or author tags you want to preserve.
FAQ
Q: Why does my PDF stay the same size after compression?
A: The most common reason is that the file was already optimized when it was created, so there's no fat to trim. The second is that the file is mostly vector text and small images, where compression has limited room to work. The third is that the tool you used compressed images but didn't strip metadata or font duplicates. Try a different tool with explicit "remove unused objects" and "subset fonts" options.
Q: Will compressing a PDF break the digital signature?
A: Yes, almost always. A digital signature includes a cryptographic hash of the document at the moment of signing. Any change to the bytes (including compression) breaks the hash and invalidates the signature. If you need to compress a signed document, do it before signing, not after.
Q: What's the smallest a PDF can reasonably get?
A: For pure text content (no images), you can typically reach 50 to 100 KB per 10 pages. For mixed content with images, expect 200 to 500 KB per 10 pages at moderate quality. For image-heavy documents like scanned books, even aggressive compression usually leaves you at 1 to 3 MB per 10 pages.
Q: Does compressing a PDF lose OCR text?
A: It can, depending on the tool. Avoid tools that rasterize pages before compression, since that flattens the OCR layer into the image and destroys searchability. Use compressors that explicitly preserve text layers, or run OCR after compression instead of before.
Q: Is browser-based compression as good as desktop software?
A: For typical documents, yes. Browser tools use the same underlying libraries (Ghostscript, MuPDF, qpdf) as most desktop compressors. The main differences are file size limits (browser tools usually cap at 50 to 100 MB per file) and batch capacity (desktop tools handle larger batches more gracefully). For one-off compression of a single document, browser tools and desktop apps produce essentially identical results.
Putting It Together
The right compression strategy depends on what the document is and how it'll be used. For a contract going by email, lossless preserves every detail. For a slide-heavy report on Slack, moderate cuts the size in half without anyone noticing. The key is matching the level to the use case.