
How to Remove PDF Metadata Free — Strip Hidden Data Before Sharing

Every PDF you create or export carries invisible data most people never see: the author's full name, their company name, the software version used, the date and time the document was first created, and sometimes revision history going back weeks. That data travels with the file when you share it. This guide explains what's actually in your PDF's metadata, when it matters, and how to strip it completely before the file leaves your hands.

By FusionPDF Team · May 22, 2026 · 9 min read · Updated May 2026

Key Takeaways

63% of PDF users are unaware their documents contain embedded metadata, per a Netwrix Data Risk & Security Report.
A Word-exported PDF typically contains author name, company, software version, and creation timestamp by default.
Some situations legally require metadata removal: FOIA responses, court submissions, and tender/RFP documents.
Stripping metadata has zero effect on visible content - text, images, and layout are unchanged.

What Is PDF Metadata and What Does It Actually Contain?

PDF metadata is structured data stored inside the file that describes the document rather than representing its content. The PDF specification defines two metadata systems: the Document Information Dictionary (the original system) and the XMP metadata packet (the newer, richer system). Both are present in most modern PDFs. 63% of PDF users are unaware their documents contain this embedded data, according to the Netwrix Data Risk and Security Report.

Document Information Dictionary fields

The Document Information Dictionary is a simple key-value store defined in the PDF spec. The spec defines nine standard entries; the eight listed below describe the document and its origin (the ninth, `Trapped`, is a prepress flag). Most PDF creation tools populate at least some of these by default, often without asking the user's permission.

Field	What it contains	Risk level
`Author`	The name of the person who created the document, usually pulled from the OS user account	High
`Creator`	The application that originally created the document (e.g., "Microsoft Word 365")	Medium
`Producer`	The software that converted or saved the final PDF (e.g., "Adobe PDF Library 23.6.0")	Medium
`CreationDate`	Timestamp when the document was first created, including timezone offset	Medium
`ModDate`	Last modification timestamp	Low
`Title`	Document title, often the filename or an internal working title not meant for external eyes	Medium
`Subject`	A subject line, sometimes auto-populated from document properties	Low
`Keywords`	Tags or keywords added by the author, sometimes internal classification labels	Medium
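
The `CreationDate` and `ModDate` values are stored as PDF date strings of the form `D:YYYYMMDDHHmmSSOHH'mm'`, so the timezone offset is readable by anyone who opens the file. The following is a minimal sketch of decoding such a string with the Python standard library; the function name `parse_pdf_date` is made up for this example, and real files may contain malformed dates that this pattern rejects.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date string: D:YYYYMMDDHHmmSSOHH'mm' where O is +, - or Z.
# Everything after the year is optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([+\-Z])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string into a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = match.groups()

    tz = None  # no offset recorded: return a naive datetime
    if sign == "Z":
        tz = timezone.utc
    elif sign in ("+", "-"):
        delta = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        tz = timezone(delta if sign == "+" else -delta)

    # Missing month/day default to 1, missing time parts to 0.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )
```

For example, `parse_pdf_date("D:20260522143000+02'00'").isoformat()` gives `2026-05-22T14:30:00+02:00`: the document was created at 14:30 local time on a machine set two hours ahead of UTC.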

XMP metadata: the richer, harder-to-see layer

XMP (Extensible Metadata Platform) is an Adobe standard embedded as an XML packet inside the PDF. It can contain everything in the Document Information Dictionary, plus additional namespaces for rights management, job processing history, document revision history, and custom application-specific data. XMP is not visible through basic "File Properties" dialogs in most readers - you need a dedicated metadata viewer or Acrobat's full properties panel to see it.
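
Because the XMP packet is plain XML wrapped in `<?xpacket ...?>` markers, it can often be located by scanning the raw bytes of the file. The sketch below assumes the metadata stream is stored uncompressed and unencrypted, which is common but not guaranteed; the function names are illustrative, and a full PDF library is the more reliable route.

```python
import re
import xml.etree.ElementTree as ET

# An XMP packet is delimited by <?xpacket begin=...?> and <?xpacket end=...?>.
XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)
DC = "{http://purl.org/dc/elements/1.1/}"
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def find_xmp_packets(pdf_bytes: bytes) -> list[bytes]:
    """Return the XML body of every uncompressed XMP packet in the file."""
    return [m.group(1).strip() for m in XPACKET.finditer(pdf_bytes)]


def xmp_creators(packet: bytes) -> list[str]:
    """List the dc:creator names (the XMP counterpart of Author)."""
    root = ET.fromstring(packet)
    names = []
    for creator in root.iter(DC + "creator"):
        for item in creator.iter(RDF + "li"):
            if item.text:
                names.append(item.text.strip())
    return names
```

Reading a file with `open(path, "rb").read()` and passing the bytes to `find_xmp_packets` is enough to try this on a real document. An empty result does not prove the file is clean, since a compressed metadata stream would be missed.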

63% of PDF users don't know their documents contain embedded metadata. The Netwrix Data Risk and Security Report found that nearly two-thirds of knowledge workers are unaware that standard PDF exports contain author names, company details, and software version information visible to any recipient.

Why Is PDF Metadata a Privacy and Security Risk?

Real incidents illustrate the problem better than abstract warnings. In 2003, the UK government published a dossier on Iraq as a Microsoft Word file. The file's embedded revision log revealed the names of several officials who had worked on it, and those names fed a major political controversy. The same kind of authorship data carries over when a Word document is exported to PDF. The visible content was the intended disclosure. The metadata was not.

What recipients can learn from your metadata

Anyone who receives your PDF can open it in Adobe Acrobat and check File > Properties to see your name, company, and the software you used. For technical users, tools like ExifTool, pdfinfo, or any PDF library can extract the full XMP packet in seconds. That "CONFIDENTIAL - DO NOT DISTRIBUTE" stamp on the cover page means nothing if the metadata tells the recipient exactly who created it, when, and on which system.

Stack exposure is a particular concern in competitive or legal contexts. The Creator and Producer fields reveal your exact software versions. A tender document that shows "Microsoft Word 2021" and "Adobe Acrobat 23.6" in its metadata tells competitors exactly what tools and versions your team uses - occasionally relevant in procurement contexts where software standardization matters.
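
To show how little effort extraction takes, here is a rough sketch that pulls Document Information Dictionary values straight out of a file's bytes with a regular expression. It is deliberately crude and rests on several assumptions: the dictionary is not inside a compressed object stream, the file is not encrypted, and values are literal `(...)` strings without nested parentheses. It can also pick up unrelated `/Title` entries such as bookmarks. The name `scan_info_fields` is hypothetical.

```python
import re

INFO_KEYS = (
    "Author", "Title", "Subject", "Keywords",
    "Creator", "Producer", "CreationDate", "ModDate",
)

# Matches "/Key (literal string)" where the string may contain
# backslash escapes but no unescaped nested parentheses.
_ENTRY = re.compile(
    rb"/(" + "|".join(INFO_KEYS).encode("ascii") + rb")\s*\(((?:\\.|[^\\()])*)\)"
)


def scan_info_fields(pdf_bytes: bytes) -> dict[str, str]:
    """Best-effort scan for Info dictionary entries in raw PDF bytes."""
    found = {}
    for key, raw in _ENTRY.findall(pdf_bytes):
        # Undo the escapes for parentheses and backslashes only.
        text = re.sub(rb"\\([()\\])", rb"\1", raw)
        if text.startswith(b"\xfe\xff"):
            # A leading byte-order mark signals a UTF-16BE text string.
            value = text[2:].decode("utf-16-be", errors="replace")
        else:
            value = text.decode("latin-1")
        found[key.decode("ascii")] = value  # later entries overwrite earlier ones
    return found
```

Dedicated tools such as ExifTool or pdfinfo handle the cases this sketch skips, which is why they remain the better choice for real checks.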

GPS coordinates in embedded images

This one surprises people. If your PDF includes photos taken with a smartphone - a site inspection photo or a product photograph, for example - those images can carry EXIF metadata including GPS coordinates precise to within a few meters. When a PDF is exported from a Word document containing such images, the EXIF data can travel along, depending on how the exporter re-encodes the images. A location that seemed incidental in the photo can become a disclosure in the PDF.
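
EXIF stores latitude and longitude as three rational numbers (degrees, minutes, seconds) plus a hemisphere reference. The short sketch below converts that representation to the decimal degrees a mapping site accepts; the function name and the sample coordinates in the note are illustrative, not taken from any real file.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF GPS (degrees, minutes, seconds) rationals to decimal degrees.

    dms is three (numerator, denominator) pairs, as EXIF stores them.
    ref is the hemisphere letter: "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return float(-value if ref in ("S", "W") else value)
```

Calling `dms_to_decimal(((51, 1), (30, 1), (2600, 100)), "N")` yields roughly 51.50722. One arc-second of latitude is about 31 meters, and phones usually record fractions of a second, which is where the "few meters" precision comes from.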

"PDF metadata incidents are rarely malicious on the sender's part - they're accidental disclosures. The 2003 UK government dossier incident, the 2007 NSA inadvertent author disclosure in a published report, and dozens of corporate M&A document leaks all share the same pattern: the sender was unaware that invisible structured data accompanied the visible content." Source: Electronic Frontier Foundation, Metadata Anonymization Analysis, 2022

Visible Metadata vs. Hidden XMP Metadata

The distinction matters because different removal steps are needed for each layer. The Document Information Dictionary is what most "Properties" dialogs show. XMP is what stays hidden and survives many basic cleanup attempts. A PDF that looks clean in Acrobat's basic properties view may still carry a full XMP history packet that tools like ExifTool will expose.

Document Information Dictionary

Visible via File > Properties in most readers. Contains Author, Title, Creator, Producer, dates. Easily viewed by any recipient. Most "remove metadata" tools clear this layer.

XMP Metadata Packet

An XML block embedded in the PDF's metadata stream. Not shown in basic Properties dialogs. Can contain revision history, rights metadata, and custom application fields. Requires deeper stripping to remove fully.

There's a third layer worth knowing about: embedded thumbnails. PDFs sometimes store a small preview thumbnail of the first page inside the file itself. In some workflows, this thumbnail was generated from an earlier draft, meaning the thumbnail shows content that was later removed from the main document. Stripping metadata properly includes removing these embedded thumbnail images.

What Does a Typical Word-Exported PDF Actually Reveal?

Microsoft Word is the most common origin for PDF documents shared professionally. A standard Word-to-PDF export populates metadata fields automatically, pulling from the Windows account profile and Office installation settings. Most users never see this happen. Here's what a typical export produces.

The Author field pulls from the name associated with your Microsoft account or local Windows user profile. If your Windows account is registered as "Jane Smith - Acme Corp Legal", that's what appears in every PDF you export. The Company field in the XMP namespace pulls from Office's organization settings. The Creator field names the Word edition that produced the file (for example, "Microsoft Word 365").

What this means in practice: a contract you export as PDF and send to opposing counsel carries your full name, your company name, the exact Microsoft Office version you're running, the date and time you created the original document, and the date and time you exported the PDF. All of that is visible to anyone who checks File > Properties in Acrobat. Some of it is harmless. Some of it - the timestamps, the author chain - is information you may not have intended to share.

The revision history risk: If a PDF was created through a complex workflow involving multiple drafts and re-exports, the XMP metadata can preserve a chain of modification dates and software versions that reconstructs the document's history. For legal documents, M&A materials, or competitive proposals, this history may reveal more about your process and timeline than you intended to disclose.
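
When a history chain is present, it usually lives in the XMP Media Management namespace as an `xmpMM:History` sequence of events, each with an action, a timestamp, and a software agent. The sketch below lists those events from an XMP packet that has already been extracted as XML text. The function name is hypothetical, and whether a given PDF carries this structure at all depends on the tools that produced it.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
XMPMM = "{http://ns.adobe.com/xap/1.0/mm/}"
STEVT = "{http://ns.adobe.com/xap/1.0/sType/ResourceEvent#}"


def history_events(xmp_xml: str) -> list[dict[str, str]]:
    """Return the events recorded under xmpMM:History, oldest first."""
    root = ET.fromstring(xmp_xml)
    events = []
    for history in root.iter(XMPMM + "History"):
        for item in history.iter(RDF + "li"):
            event = {}
            for field in ("action", "when", "softwareAgent"):
                # XMP allows a field to be a child element or an attribute.
                child = item.find(STEVT + field)
                value = child.text if child is not None else item.get(STEVT + field)
                if value:
                    event[field] = value.strip()
            events.append(event)
    return events
```

Each returned dictionary is one step in the document's life, so a list of several "saved" events with different dates and agents is exactly the timeline a recipient could reconstruct.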

When Is Metadata Removal Legally Required?

Several legal and regulatory contexts treat metadata removal as a professional obligation rather than a best practice. The American Bar Association's Formal Opinion 477R (2017) says attorneys must make reasonable efforts to prevent inadvertent or unauthorized disclosure of client information when communicating electronically, and its earlier Formal Opinion 06-442 (2006) addresses metadata in exchanged documents specifically. Together they make metadata a concern whenever documents are shared with opposing parties or courts.

Legal proceedings and court filings

Many federal courts and state bar associations have issued guidance on metadata in electronically filed documents. The concern is that metadata can reveal attorney work product: draft versions, internal notes embedded in revision history, or timing information about when legal arguments were formulated. Most court e-filing systems do not strip metadata automatically - that's the filer's responsibility.

Freedom of Information Act (FOIA) responses

Government agencies responding to FOIA requests are required to redact personally identifying information. Metadata often contains personally identifying information - the name of the civil servant who created or modified the document, their department, their workstation details. Proper FOIA compliance requires metadata cleaning alongside content redaction. High-profile failures in this area have led to embarrassing disclosures of official identities.

Procurement and tender submissions

Many procurement frameworks require blind submission - the evaluating body shouldn't know which bidder produced which document until after scoring. A tender document with the submitting company's name in the Author or Company metadata field violates that blind requirement. Some procurement systems explicitly require metadata-clean PDFs, and some bids have been disqualified for non-compliance.

"Attorneys have a duty of competence that encompasses understanding the risks of metadata disclosure. Sending a document with embedded metadata that reveals privileged information can constitute an inadvertent waiver of attorney-client privilege in some jurisdictions. Metadata review and cleaning should be part of any document review workflow before external distribution." Source: American Bar Association, Formal Opinion 477R, 2017

How to Remove PDF Metadata with FusionPDF

FusionPDF's metadata removal tool clears both the Document Information Dictionary and the XMP metadata packet in a single pass, entirely in your browser. No file is sent to any server. The process completes in seconds for most documents. Your original file is never modified - the tool produces a new, clean output file.

Open the metadata removal tool

Go to fusionpdf.pro/remove-metadata. No account or sign-up required. The tool loads in your browser.

Load your PDF

Drag your file onto the drop zone or click to select it. The file loads into browser memory only - nothing is sent to any server at any point.

Review metadata (optional)

The tool shows you which metadata fields are present before stripping, so you can confirm what will be removed. This is useful for verifying the Author and Company fields exist as expected.

Strip metadata and download

Click "Remove Metadata." The tool uses pdf-lib, a JavaScript PDF library that runs in your browser, to write a clean PDF with all metadata structures cleared. The download triggers immediately. Open the output in Acrobat and check File > Properties to confirm the fields are empty.

What Does FusionPDF's Metadata Removal Actually Strip?

Being specific here matters because "remove metadata" means different things across different tools. Some tools only clear visible Document Information Dictionary fields and leave the XMP packet intact. Others clear both but preserve embedded thumbnails. Here is exactly what FusionPDF's tool removes.

Removed: Document Info Dictionary

All eight standard fields: Author, Title, Subject, Keywords, Creator, Producer, CreationDate, ModDate. These will appear blank in Acrobat's File > Properties dialog after processing.

Removed: XMP Metadata Packet

The full XMP metadata stream embedded in the PDF, including Dublin Core, PDF namespace, XMP Basic, and any custom application namespaces. This is the layer most basic tools miss.

Not affected: visible content

All text, images, fonts, and layout remain exactly as in the original. Removing metadata has zero effect on the document content or visual appearance.

Limitation: embedded image EXIF

EXIF data inside embedded JPEG or PNG images is not separately stripped - that would require reprocessing each image. For GPS-sensitive content, use the redact tool to remove sensitive images before export.

Verify after stripping: open the cleaned PDF in Adobe Acrobat Reader (free) and go to File > Properties. The Description tab should show all fields as blank. For a deeper check, run `exiftool filename.pdf` from the command line, or use an online metadata viewer. After FusionPDF processing, the output should list no XMP fields.
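
For a quick before-and-after comparison without extra tools, a byte-level scan can report which metadata layers still leave visible traces in a file. This is only a rough sketch: the marker list and the name `residual_markers` are choices made for this example, and a scan of raw bytes cannot see inside compressed object streams, so an empty report is a good sign rather than proof.

```python
# Byte sequences that typically indicate each metadata layer.
MARKERS = {
    "info_dictionary": (
        b"/Author", b"/Creator", b"/Producer", b"/CreationDate", b"/ModDate",
    ),
    "xmp_packet": (b"<?xpacket", b"<x:xmpmeta"),
    "thumbnail": (b"/Thumb",),
}


def residual_markers(pdf_bytes: bytes) -> dict[str, list[str]]:
    """Report which metadata markers appear in the raw bytes, grouped by layer."""
    report = {}
    for layer, tokens in MARKERS.items():
        hits = [token.decode("ascii") for token in tokens if token in pdf_bytes]
        if hits:
            report[layer] = hits
    return report
```

Running it on the original and on the cleaned file, each read with `open(path, "rb").read()`, should show the layers listed for the first and ideally an empty dictionary for the second. ExifTool remains the more thorough check.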

For documents that require both metadata removal and content redaction - such as FOIA responses or legal productions - the recommended workflow is to redact sensitive content first, then strip metadata as the final step before submission. Read more about privacy-focused PDF handling in our PDF privacy guide and our guide to permanent redaction.

82% of business documents shared externally are PDFs. Per AIIM research, PDF is the dominant format for external document exchange - making metadata hygiene a standard practice for any organization sharing files with clients, courts, regulators, or procurement bodies.

Frequently Asked Questions

What metadata does FusionPDF remove from a PDF?

FusionPDF strips the standard Document Information Dictionary fields (Author, Title, Subject, Keywords, Creator, Producer, CreationDate, ModDate) and the full XMP metadata packet embedded in the PDF's metadata stream. This covers the fields most commonly exposed by Word, LibreOffice, Google Docs, and Adobe exports. It does not modify visible content. Embedded image EXIF data is not separately stripped - for GPS-sensitive images, remove or redact those images before export.

Does removing metadata change the visible content of the PDF?

No. Metadata is stored in a separate dictionary structure from the page content streams. Removing it has zero effect on text, images, layout, fonts, or any element visible when the document is opened. The resulting file looks identical to the original in any PDF reader - only the hidden properties (viewable under File > Properties in Acrobat) will be cleared.

Can I see what metadata is in my PDF before removing it?

Yes. In Adobe Acrobat, go to File > Properties > Description tab to see Document Information Dictionary fields. For XMP metadata, use File > Properties > Additional Metadata. On Mac, Preview shows basic metadata under Tools > Show Inspector. FusionPDF's metadata removal tool also displays the fields it finds before stripping, so you can confirm what's present before committing to the removal.

Do scanned PDFs contain metadata?

Yes, in two ways. First, the PDF container itself carries Document Information Dictionary fields set by the scanning software - scanner model, creation date, software version. Second, if the scanned images inside the PDF retain their EXIF data, that can include GPS coordinates from a smartphone scan, camera model, and timestamp. Both layers are worth clearing before sharing a scanned document externally, particularly in legal or compliance contexts.

Is metadata removal enough for legal compliance, or do I also need redaction?

They address different risks. Metadata removal clears invisible structured data about the document's origin and history. Redaction removes visible content from the page - names, account numbers, addresses, signatures. Most compliance scenarios (FOIA responses, legal productions, medical records) require both: redact the sensitive visible content, then strip the metadata before the final file is distributed. Use FusionPDF's redaction tool for the content layer and the metadata removal tool for the invisible layer.
