How to Remove PDF Metadata Free — Strip Hidden Data Before Sharing
Every PDF you create or export carries invisible data most people never see: the author's full name, their company name, the software version used, the date and time the document was first created, and sometimes revision history going back weeks. That data travels with the file when you share it. This guide explains what's actually in your PDF's metadata, when it matters, and how to strip it completely before the file leaves your hands.
- 63% of PDF users are unaware their documents contain embedded metadata, per a Netwrix Data Risk & Security Report.
- A Word-exported PDF typically contains author name, company, software version, and creation timestamp by default.
- Some situations legally require metadata removal: FOIA responses, court submissions, and tender/RFP documents.
- Stripping metadata has zero effect on visible content - text, images, and layout are unchanged.
What Is PDF Metadata and What Does It Actually Contain?
PDF metadata is structured data stored inside the file that describes the document rather than representing its content. The PDF specification defines two metadata systems: the Document Information Dictionary (the original system) and the XMP metadata packet (the newer, richer system). Both are present in most modern PDFs. 63% of PDF users are unaware their documents contain this embedded data, according to the Netwrix Data Risk and Security Report.
Document Information Dictionary fields
The Document Information Dictionary is a simple key-value store defined in the PDF spec. The eight fields below are the ones creation tools commonly fill in (the spec also defines a rarely used Trapped flag). Every modern PDF creation tool populates at least some of these by default, often without asking the user's permission.
| Field | What it contains | Risk level |
|---|---|---|
| Author | The name of the person who created the document, usually pulled from the OS user account | High |
| Creator | The application that originally created the document (e.g., "Microsoft Word 365") | Medium |
| Producer | The software that converted or saved the final PDF (e.g., "Adobe PDF Library 23.6.0") | Medium |
| CreationDate | Timestamp when the document was first created, including timezone offset | Medium |
| ModDate | Last modification timestamp | Low |
| Title | Document title, often the filename or an internal working title not meant for external eyes | Medium |
| Subject | A subject line, sometimes auto-populated from document properties | Low |
| Keywords | Tags or keywords added by the author, sometimes internal classification labels | Medium |
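The CreationDate and ModDate values in the table are stored as PDF date strings such as D:20240314091502+01'00'. As a minimal sketch (the function name parse_pdf_date is our own, and treating a missing offset as UTC is an assumption), this Python snippet turns one into a timezone-aware datetime. It shows how much the timestamp gives away, including the author's UTC offset.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date strings look like D:YYYYMMDDHHmmSSOHH'mm'.
# Everything after the four-digit year is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+\-])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a timezone-aware datetime."""
    m = _PDF_DATE.match(value.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, sign, tz_h, tz_m = m.groups()
    if sign in (None, "Z", "z"):
        # Assumption: a missing offset is treated as UTC in this sketch.
        tz = timezone.utc
    else:
        offset = timedelta(hours=int(tz_h or 0), minutes=int(tz_m or 0))
        tz = timezone(-offset if sign == "-" else offset)
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


created = parse_pdf_date("D:20240314091502+01'00'")
print(created.isoformat())  # 2024-03-14T09:15:02+01:00
```

The +01:00 at the end is the part people tend to overlook: it narrows down where the author was working when the file was created.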
XMP metadata: the richer, harder-to-see layer
XMP (Extensible Metadata Platform) is an Adobe standard embedded as an XML packet inside the PDF. It can contain everything in the Document Information Dictionary, plus additional namespaces for rights management, job processing history, document revision history, and custom application-specific data. XMP is not visible through basic "File Properties" dialogs in most readers - you need a dedicated metadata viewer or Acrobat's full properties panel to see it.
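To make the XMP layer less abstract, here is a rough Python sketch that pulls the XMP packet out of a PDF's raw bytes and reads the Dublin Core creator list from it. The helper names find_xmp_packets and dc_creators and the sample bytes are illustrative only. The approach assumes the metadata stream is stored uncompressed, which is common but not guaranteed, so treat it as a way to see what XMP looks like rather than a complete extractor.

```python
import re
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?>.
_PACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packets(pdf_bytes: bytes) -> list[bytes]:
    """Return the XML body of every uncompressed XMP packet in the file."""
    return [m.group(1).strip() for m in _PACKET.finditer(pdf_bytes)]


def dc_creators(packet: bytes) -> list[str]:
    """List the dc:creator entries (author names) found in one packet."""
    root = ET.fromstring(packet)
    names = []
    for creator in root.iter(f"{{{DC}}}creator"):
        for li in creator.iter(f"{{{RDF}}}li"):
            names.append(li.text or "")
    return names


# Hypothetical sample bytes standing in for a real PDF file.
sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Metadata /Subtype /XML >>\nstream\n"
    b'<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<dc:creator><rdf:Seq><rdf:li>Jane Smith</rdf:li></rdf:Seq></dc:creator>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>\n"
    b'<?xpacket end="w"?>\nendstream\nendobj\n'
)

for packet in find_xmp_packets(sample):
    print(dc_creators(packet))
```

With a real file you would read the bytes from disk (for example with pathlib.Path("file.pdf").read_bytes()) instead of using the sample.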
Why Is PDF Metadata a Privacy and Security Risk?
Real incidents illustrate the problem better than abstract warnings. In 2003, the UK government published a dossier on Iraq on its website as a Microsoft Word file. The file's revision log still held the usernames of officials who had worked on it, and once those names were made public they fed the political controversy over how the dossier had been produced. The visible content was the intended disclosure. The metadata was not. A PDF exported from an office document can carry the same kind of trail.
What recipients can learn from your metadata
Anyone who receives your PDF can open it in Adobe Acrobat and check File > Properties to see your name, company, and the software you used. For technical users, tools like ExifTool, pdfinfo, or any PDF library can extract the full XMP packet in seconds. That "CONFIDENTIAL - DO NOT DISTRIBUTE" stamp on the cover page means nothing if the metadata tells the recipient exactly who created it, when, and on which system.
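To show how little effort this takes, here is a deliberately naive Python sketch that scans a PDF's raw bytes for Document Information Dictionary entries. The function name scan_info_fields and the sample bytes are our own. It only handles simple literal strings in uncompressed objects and ignores hex strings, nested parentheses, UTF-16 text, and encrypted files. Tools like ExifTool and pdfinfo parse the file structure properly and handle those cases.

```python
import re

INFO_KEYS = (
    b"Title", b"Author", b"Subject", b"Keywords",
    b"Creator", b"Producer", b"CreationDate", b"ModDate",
)

# Matches "/Key (literal string)", allowing backslash escapes inside the string.
_ENTRY = re.compile(
    rb"/(" + b"|".join(INFO_KEYS) + rb")\s*\(((?:\\.|[^\\()])*)\)",
    re.DOTALL,
)


def scan_info_fields(pdf_bytes: bytes) -> dict[str, str]:
    """Return Info-dictionary style entries found in uncompressed PDF bytes."""
    found = {}
    for key, raw in _ENTRY.findall(pdf_bytes):
        # latin-1 never fails to decode; good enough for a quick look.
        found[key.decode("ascii")] = raw.decode("latin-1")
    return found


# Hypothetical sample standing in for the bytes of a real PDF.
sample = (
    b"%PDF-1.7\n5 0 obj\n<< /Author (Jane Smith) /Creator (Microsoft Word)\n"
    b"/Producer (Adobe PDF Library 23.6.0)\n"
    b"/CreationDate (D:20240314091502+01'00') >>\nendobj\n"
)

for field, value in scan_info_fields(sample).items():
    print(f"{field}: {value}")
```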
Stack exposure is a particular concern in competitive or legal contexts. The Creator and Producer fields reveal your exact software versions. A tender document that shows "Microsoft Word 2021" and "Adobe Acrobat 23.6" in its metadata tells competitors exactly what tools and versions your team uses - occasionally relevant in procurement contexts where software standardization matters.
GPS coordinates in embedded images
This one surprises people. If your PDF includes photos taken with a smartphone - a site inspection photo, a product photograph, a screenshot from a mobile device - those images can carry EXIF metadata including GPS coordinates precise to within a few meters. When a PDF is exported from a Word document containing such images, the EXIF data often travels along. A location that seemed incidental in the photo can become a disclosure in the PDF.
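EXIF stores GPS positions as degrees, minutes, and seconds, each written as a numerator/denominator pair, plus a hemisphere letter. The short Python sketch below converts such a value to decimal degrees. The function name dms_to_decimal and the sample coordinates are made up for illustration.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert EXIF-style (degrees, minutes, seconds) rationals to decimal degrees.

    dms is three (numerator, denominator) pairs; ref is "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return float(-value if ref in ("S", "W") else value)


# Hypothetical values of the kind a phone camera writes into a photo.
lat = dms_to_decimal(((51, 1), (30, 1), (2646, 100)), "N")
lon = dms_to_decimal(((0, 1), (7, 1), (3954, 100)), "W")
print(lat, lon)
```

One arcsecond of latitude is roughly 31 meters on the ground, and phones typically record fractions of an arcsecond, which is why a single embedded photo can pin down a building.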
Visible Metadata vs. Hidden XMP Metadata
The distinction matters because different removal steps are needed for each layer. The Document Information Dictionary is what most "Properties" dialogs show. XMP is what stays hidden and survives many basic cleanup attempts. A PDF that looks clean in Acrobat's basic properties view may still carry a full XMP history packet that tools like ExifTool will expose.
There's a third layer worth knowing about: embedded thumbnails. PDFs sometimes store a small preview thumbnail of the first page inside the file itself. In some workflows, this thumbnail was generated from an earlier draft, meaning the thumbnail shows content that was later removed from the main document. Stripping metadata properly includes removing these embedded thumbnail images.
What Does a Typical Word-Exported PDF Actually Reveal?
Microsoft Word is the most common origin for PDF documents shared professionally. A standard Word-to-PDF export populates metadata fields automatically, pulling from the Windows account profile and Office installation settings. Most users never see this happen. Here's what a typical export produces.
The Author field pulls from the name associated with your Microsoft account or local Windows user profile. If your Windows account is registered as "Jane Smith - Acme Corp Legal", that's what appears in every PDF you export. The Company field in the XMP namespace pulls from Office's organization settings. The Creator field shows the exact Word build number.
What this means in practice: a contract you export as PDF and send to opposing counsel carries your full name, your company name, the exact Microsoft Office version you're running, the date and time you created the original document, and the date and time you exported the PDF. All of that is visible to anyone who checks File > Properties in Acrobat. Some of it is harmless. Some of it - the timestamps, the author chain - is information you may not have intended to share.
The revision history risk: If a PDF was created through a complex workflow involving multiple drafts and re-exports, the XMP metadata can preserve a chain of modification dates and software versions that reconstructs the document's history. For legal documents, M&A materials, or competitive proposals, this history may reveal more about your process and timeline than you intended to disclose.
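Where a history chain exists, it usually lives in the xmpMM:History property of the XMP packet, as a sequence of events with an action, a timestamp, and the software that performed it. Not every PDF has one. The sketch below, in Python, lists those events from an XMP packet you have already extracted. The function name history_events and the sample packet (including the "Example Editor" software names) are invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XMPMM = "http://ns.adobe.com/xap/1.0/mm/"
STEVT = "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#"


def history_events(xmp_xml: str) -> list[dict[str, str]]:
    """Return one dict per xmpMM:History event (action, when, softwareAgent, ...)."""
    root = ET.fromstring(xmp_xml)
    events = []
    for history in root.iter(f"{{{XMPMM}}}History"):
        for li in history.iter(f"{{{RDF}}}li"):
            event = {}
            # Event fields may be written as attributes on rdf:li ...
            for name, value in li.attrib.items():
                if name.startswith(f"{{{STEVT}}}"):
                    event[name.split("}", 1)[1]] = value
            # ... or as child elements of rdf:li.
            for child in li:
                if child.tag.startswith(f"{{{STEVT}}}"):
                    event[child.tag.split("}", 1)[1]] = (child.text or "").strip()
            events.append(event)
    return events


# Hypothetical packet showing both serialization styles.
sample = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#">
   <xmpMM:History>
    <rdf:Seq>
     <rdf:li stEvt:action="created" stEvt:when="2024-03-01T10:12:00+01:00"
             stEvt:softwareAgent="Example Editor 1.0"/>
     <rdf:li rdf:parseType="Resource">
      <stEvt:action>saved</stEvt:action>
      <stEvt:when>2024-03-14T09:15:02+01:00</stEvt:when>
      <stEvt:softwareAgent>Example Editor 1.2</stEvt:softwareAgent>
     </rdf:li>
    </rdf:Seq>
   </xmpMM:History>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

for event in history_events(sample):
    print(event.get("when"), event.get("action"), event.get("softwareAgent"))
```

Read top to bottom, the output is a timeline of the document's life, which is exactly what you may not want opposing counsel or a competitor to have.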
When Is Metadata Removal Legally Required?
Several legal and regulatory contexts treat metadata removal as a professional obligation rather than a best practice. The American Bar Association's Formal Opinion 477R (2017) says attorneys must make reasonable efforts to prevent inadvertent or unauthorized disclosure of client information when communicating electronically, and its earlier Formal Opinion 06-442 (2006) deals specifically with metadata in documents exchanged between lawyers. Metadata in a file sent to opposing parties or a court falls squarely within that concern.
Legal proceedings and court filings
Many federal courts and state bar associations have issued guidance on metadata in electronically filed documents. The concern is that metadata can reveal attorney work product: draft versions, internal notes embedded in revision history, or timing information about when legal arguments were formulated. Most court e-filing systems do not strip metadata automatically - that's the filer's responsibility.
Freedom of Information Act (FOIA) responses
Government agencies responding to FOIA requests are required to redact personally identifying information. Metadata often contains personally identifying information - the name of the civil servant who created or modified the document, their department, their workstation details. Proper FOIA compliance requires metadata cleaning alongside content redaction. High-profile failures in this area have led to embarrassing disclosures of official identities.
Procurement and tender submissions
Many procurement frameworks require blind submission - the evaluating body shouldn't know which bidder produced which document until after scoring. A tender document with the submitting company's name in the Author or Company metadata field violates that blind requirement. Some procurement systems explicitly require metadata-clean PDFs, and some bids have been disqualified for non-compliance.
How to Remove PDF Metadata with FusionPDF
FusionPDF's metadata removal tool clears both the Document Information Dictionary and the XMP metadata packet in a single pass, entirely in your browser. No file is sent to any server. The process completes in seconds for most documents. Your original file is never modified - the tool produces a new, clean output file.
1. Go to fusionpdf.pro/remove-metadata. No account or sign-up is required. The tool loads in your browser.
2. Drag your file onto the drop zone or click to select it. The file loads into browser memory only; nothing is sent to any server at any point.
3. Review the metadata fields the tool lists before stripping, so you can confirm what will be removed. This is a good moment to check that the Author and Company fields hold what you expect.
4. Click "Remove Metadata." The tool uses the pdf-lib library to write a clean PDF with all metadata structures cleared, and the download starts immediately. Open the output in Acrobat and check File > Properties to confirm the fields are empty.
What Does FusionPDF's Metadata Removal Actually Strip?
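Being specific here matters because "remove metadata" means different things across different tools. Some tools only clear visible Document Information Dictionary fields and leave the XMP packet intact. Others clear both but preserve embedded thumbnails. FusionPDF's tool removes the standard Document Information Dictionary fields (Author, Title, Subject, Keywords, Creator, Producer, CreationDate, ModDate) and the full XMP metadata packet. It does not separately strip EXIF data inside embedded images.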
Verify after stripping: open the cleaned PDF in Adobe Acrobat Reader (free) and go to File > Properties. The Description tab should show all fields as blank. For a deeper check, run exiftool filename.pdf from the command line, or use an online metadata viewer. After FusionPDF processing, the report should no longer list any XMP tags.
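If you check files regularly, the ExifTool step can be scripted. ExifTool can print its report as JSON with the -j option and prefix each tag with its group using -G. The Python sketch below takes that JSON text and returns whatever identifying tags are still present. The function name leftover_metadata and the sample report are our own, and the tag names in SENSITIVE_PDF_TAGS reflect how ExifTool labels the Info dictionary fields as far as we know (for example CreateDate for CreationDate), so confirm them against your own ExifTool output.

```python
import json

# Assumed ExifTool names for the Document Information Dictionary fields.
SENSITIVE_PDF_TAGS = {
    "Author", "Title", "Subject", "Keywords",
    "Creator", "Producer", "CreateDate", "ModifyDate",
}


def leftover_metadata(exiftool_json: str) -> dict[str, object]:
    """Return tags that should be gone after stripping.

    Expects the text printed by: exiftool -j -G cleaned.pdf
    """
    record = json.loads(exiftool_json)[0]  # JSON array, one object per file
    leftovers = {}
    for key, value in record.items():
        group, _, tag = key.partition(":")
        is_xmp = group == "XMP"
        is_info_field = group == "PDF" and tag in SENSITIVE_PDF_TAGS
        if (is_xmp or is_info_field) and value not in ("", None):
            leftovers[key] = value
    return leftovers


# Hypothetical report for a file that was only partly cleaned.
report = json.dumps([{
    "SourceFile": "cleaned.pdf",
    "File:FileType": "PDF",
    "PDF:PDFVersion": 1.7,
    "PDF:PageCount": 3,
    "PDF:Producer": "Example PDF Library",
    "XMP:Creator": "Jane Smith",
}])

print(leftover_metadata(report))
```

Structural tags such as the PDF version and page count are ignored on purpose; they describe the file format, not the author. An empty result means neither layer reported identifying fields.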
For documents that require both metadata removal and content redaction - such as FOIA responses or legal productions - the recommended workflow is to redact sensitive content first, then strip metadata as the final step before submission. Read more about privacy-focused PDF handling in our PDF privacy guide and our guide to permanent redaction.
Frequently Asked Questions
What metadata does FusionPDF remove from a PDF?
FusionPDF strips the standard Document Information Dictionary fields (Author, Title, Subject, Keywords, Creator, Producer, CreationDate, ModDate) and the full XMP metadata packet embedded in the PDF's metadata stream. This covers the fields most commonly exposed by Word, LibreOffice, Google Docs, and Adobe exports. It does not modify visible content. Embedded image EXIF data is not separately stripped - for GPS-sensitive images, remove or redact those images before export.
Does removing metadata change the visible content of the PDF?
No. Metadata is stored in a separate dictionary structure from the page content streams. Removing it has zero effect on text, images, layout, fonts, or any element visible when the document is opened. The resulting file looks identical to the original in any PDF reader - only the hidden properties (viewable under File > Properties in Acrobat) will be cleared.
Can I see what metadata is in my PDF before removing it?
Yes. In Adobe Acrobat, go to File > Properties > Description tab to see Document Information Dictionary fields. For XMP metadata, use File > Properties > Additional Metadata. On Mac, Preview shows basic metadata under Tools > Show Inspector. FusionPDF's metadata removal tool also displays the fields it finds before stripping, so you can confirm what's present before committing to the removal.
Do scanned PDFs contain metadata?
Yes, in two ways. First, the PDF container itself carries Document Information Dictionary fields set by the scanning software - scanner model, creation date, software version. Second, if the scanned images inside the PDF retain their EXIF data, that can include GPS coordinates from a smartphone scan, camera model, and timestamp. Both layers are worth clearing before sharing a scanned document externally, particularly in legal or compliance contexts.
Is metadata removal enough for legal compliance, or do I also need redaction?
They address different risks. Metadata removal clears invisible structured data about the document's origin and history. Redaction removes visible content from the page - names, account numbers, addresses, signatures. Most compliance scenarios (FOIA responses, legal productions, medical records) require both: redact the sensitive visible content, then strip the metadata before the final file is distributed. Use FusionPDF's redaction tool for the content layer and the metadata removal tool for the invisible layer.