Metadata Awareness
See what your documents actually carry beyond the rendered page, and sanitize before any external share.
What Is Metadata Awareness?
Documents are the most underestimated data leak vector in corporate communications. A polished PDF that looks clean on the rendered page routinely carries tracked changes from the legal redline; embedded reviewer comments from internal stakeholders; author, last-modified-by, and company metadata; hidden white-on-white text; the cropped-out portions of inserted images; and 'redactions' that are just black rectangles drawn on top of fully selectable text.
In this exercise you push out an embargoed Phase 3 press release for Coastveil Therapeutics' lead oncology candidate, hit Send, and watch Ledgermark quote the redacted financial projections back at you three days later. You then open Coastveil's internal DocSentry forensics view to see exactly what Ledgermark's reporter extracted from your file, learn why 'Save as PDF' preserves the document's object stream by default, and apply the proper sanitization workflow to the next release before it leaves your laptop.
The exercise reinforces three rules: every external document gets sanitized through the approved internal tool before sending; 'Save as PDF' is a layout format, not a sanitization step; and a black rectangle drawn over text is decoration, not redaction.
What You'll Learn in Metadata Awareness
- Identify the categories of hidden data that routinely live inside corporate documents: tracked changes, embedded reviewer comments, author and document-properties metadata, white-on-white hidden text, cropped-out image regions, and annotation rectangles layered over text
- Understand why "Save as PDF" is a layout-preservation format that, by default, carries the source document's object stream into the PDF rather than sanitizing it
- Distinguish a real redaction (text replaced with [REDACTED] and the file flattened before export) from a drawn black rectangle that leaves the underlying characters fully selectable in any modern PDF reader
- Apply the sanitize-before-share workflow for any external document — press, regulators, vendors, partners, contractors, public bug reports, screenshots into a support chat, attachments into an investor data-room
- Use an approved internal document hygiene tool with a clearance signature the email-gateway DLP recognizes, and recognize that consumer LLMs and free online "PDF metadata cleaners" are out-of-policy substitutes that move the leak to a third-party server
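To make the first objective concrete, here is a minimal sketch of how some of those hidden-data categories can be counted in a Word file without rendering it. It relies only on the fact that a .docx is a zip archive of XML parts. The function names (`hidden_data_report`, `build_sample_docx`) and the sample content are hypothetical, the sample archive is a stripped-down stand-in rather than a complete Word file, and this is not how DocSentry or any approved tool works. It illustrates the idea and does not replace the sanitize-before-share workflow.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by Office Open XML word-processing documents.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"


def hidden_data_report(docx_bytes: bytes) -> dict:
    """Count a few hidden-data categories in a .docx without rendering it."""
    report = {"insertions": 0, "deletions": 0, "comments": 0,
              "author": None, "last_modified_by": None}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as z:
        names = set(z.namelist())
        if "word/document.xml" in names:
            body = ET.fromstring(z.read("word/document.xml"))
            # Tracked changes are stored as <w:ins> and <w:del> elements.
            report["insertions"] = len(list(body.iter(f"{{{W}}}ins")))
            report["deletions"] = len(list(body.iter(f"{{{W}}}del")))
        if "word/comments.xml" in names:
            comments = ET.fromstring(z.read("word/comments.xml"))
            report["comments"] = len(list(comments.iter(f"{{{W}}}comment")))
        if "docProps/core.xml" in names:
            core = ET.fromstring(z.read("docProps/core.xml"))
            report["author"] = core.findtext(f"{{{DC}}}creator")
            report["last_modified_by"] = core.findtext(f"{{{CP}}}lastModifiedBy")
    return report


def build_sample_docx() -> bytes:
    """Build a tiny, hypothetical .docx-shaped archive for the demo."""
    document = (
        f'<w:document xmlns:w="{W}"><w:body><w:p>'
        '<w:r><w:t>Visible sentence.</w:t></w:r>'
        '<w:ins w:id="1" w:author="Reviewer A">'
        '<w:r><w:t> Added later.</w:t></w:r></w:ins>'
        '<w:del w:id="2" w:author="Reviewer B">'
        '<w:r><w:delText>Removed draft text.</w:delText></w:r></w:del>'
        '</w:p></w:body></w:document>'
    )
    comments = (
        f'<w:comments xmlns:w="{W}">'
        '<w:comment w:id="0" w:author="Reviewer A"><w:p/></w:comment>'
        '</w:comments>'
    )
    core = (
        f'<cp:coreProperties xmlns:cp="{CP}" xmlns:dc="{DC}">'
        '<dc:creator>Example Author</dc:creator>'
        '<cp:lastModifiedBy>Example Editor</cp:lastModifiedBy>'
        '</cp:coreProperties>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("word/document.xml", document)
        z.writestr("word/comments.xml", comments)
        z.writestr("docProps/core.xml", core)
    return buf.getvalue()


report = hidden_data_report(build_sample_docx())
print(report)
```

A non-zero count or a populated author field means the file carries more than the rendered page shows. Running a check like this locally is a way to see the problem; the approved internal tool remains the only way to clear a document for external sharing.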
Metadata Awareness — Training Steps
-
An Embargoed Press Release on a Tight Deadline
Coastveil Therapeutics has been waiting on Phase 3 readouts for Verymyl-12 — your lead oncology candidate — for almost two years. The numbers came back strong last Friday. Legal, Clinical, the CFO's office, and Investor Relations have all signed off on the public messaging, and the embargo lifts at 09:30 ET tomorrow when the U.S. market opens. Marcus, your VP of Communications, owns the redline. Priya, on press-ops, owns the distribution wire. Your job is to hand the locked-final to Priya so she can push it to the ~60-outlet financial-press list tomorrow morning.
-
Priya Asks for the Final
Priya Iyer runs press-ops at Coastveil — she owns the distribution wire that pushes releases out to the financial-press list. She just dropped the locked-final into your inbox with a request to confirm send before she queues the wire.
-
Download the Press Release PDF
Pull the attached PDF down to your Downloads folder so you can open it and give it a final look before you reply.
-
Open the Press Release
The PDF is in your Downloads folder. Open it from the file manager and give it a final look before you reply to Priya.
-
Looks Polished
The release reads exactly as Legal cleared it. The lede is the trial result. The body cites the trial design and the safety profile. The financial projections paragraph shows clean redacted blocks where the embargoed numbers used to be. The author byline is Coastveil Investor Relations. Visually, there is nothing on the page that should not be on the page. That is exactly the problem. What your eyes see in a polished PDF is not what is actually inside the file.
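The gap between what is drawn and what is stored can be shown with a toy model. The sketch below uses a hand-written, uncompressed PDF-style content stream with placeholder text: one operator shows a string, and a later operator paints a filled black rectangle over the same spot. Real PDFs usually compress their content streams and use font encodings, so these regular expressions are an illustration only and would not work as an extraction tool. All names (`CONTENT_STREAM`, `extract_shown_text`, `count_filled_rectangles`, `replace_text`) are hypothetical.

```python
import re

# Hypothetical, uncompressed page content stream.
# The text is shown first; a filled black rectangle is then painted on top.
CONTENT_STREAM = b"""
BT /F1 12 Tf 72 700 Td (PLACEHOLDER PROJECTION TEXT) Tj ET
0 0 0 rg
70 695 260 16 re f
"""

# "(...) Tj" shows a literal string; "x y w h re f" fills a rectangle.
TEXT_SHOW = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")
RECT_FILL = re.compile(rb"[-\d.]+\s+[-\d.]+\s+[-\d.]+\s+[-\d.]+\s+re\s+f\b")


def extract_shown_text(stream: bytes) -> list[str]:
    """Return every literal string passed to a Tj (show text) operator."""
    return [m.group(1).decode("latin-1") for m in TEXT_SHOW.finditer(stream)]


def count_filled_rectangles(stream: bytes) -> int:
    """Count filled rectangles, i.e. the 'black box' overlays."""
    return len(RECT_FILL.findall(stream))


def replace_text(stream: bytes) -> bytes:
    """Model a real redaction: the characters themselves are replaced."""
    return TEXT_SHOW.sub(rb"([REDACTED]) Tj", stream)


# The rectangle is present, yet the original characters are still in the file.
print(count_filled_rectangles(CONTENT_STREAM))
print(extract_shown_text(CONTENT_STREAM))

# After replacement, the sensitive characters no longer exist to extract.
print(extract_shown_text(replace_text(CONTENT_STREAM)))
```

The rectangle changes what is painted on the page, not what is stored in it. Only replacing the underlying characters removes them from the file.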
-
Reply to Priya With the PDF Attached
Reply to Priya on the same thread with the locked-final PDF attached. Your reply is the formal handoff that triggers her press-ops wire — she will push the same file to the ~60-outlet financial-press list at 09:00 ET tomorrow, ahead of the embargo lift.
-
Ledgermark Breaks the Story
The Verymyl-12 release went out Tuesday morning via Priya's wire. The trade press picked it up cleanly. By Friday morning, Ledgermark has a different story.
-
Read the Ledgermark Article
Before you go investigate the leak, see what is actually in print. Open the Ledgermark Financial article from the SOC email — the same story Wall Street has been reading for the last hour.
-
Open DocSentry Forensics
DocSentry runs entirely on Coastveil's network. The SOC has already pulled your outbound PDF from the email gateway logs and queued it in the forensics view. You just need to sign in.
-
Read the Internal-Tool Notice
Before you do anything on the page, read the notice at the top. The same banner is on every approved document hygiene surface at Coastveil because skipping the internal tool — by uploading a sensitive draft to a consumer AI or a free online PDF utility — is the most common shadow-IT incident the SOC chases.