Key Takeaways
- Audits get postponed because they have no obvious endpoint. Lock the scope first: which folders are in, which are out, and which deliverables (spreadsheet, heat map, kill list) close the work.
- The asset list is the easy layer. Metadata, usage, and ownership are where the real work hides; most "content audit" articles skip them.
- Match scope strategy to library size: full inventory under 5,000 files, statistical sample for very large libraries, team-by-team in between. Full audits on 50,000-file libraries do not finish.
- The inventory is one row per asset and twelve columns: filename, path, format, size, dimensions, creator, last-modified, last-accessed, brand-restricted, licence status, used-where, duplicate-of. The decision goes in the thirteenth column at the end.
- You don’t need a DAM to run the audit. Free tools handle the layers: Sheets/Airtable/Notion for the inventory, Adobe Bridge or DigiKam for cataloguing, ExifTool for metadata, czkawka or dupeguru for duplicates, WinDirStat or ncdu for the disk-usage heat map.
- Duplicates split into exact (caught by file hash), near-duplicate (same image, different format or resolution), and version chaos (drafts and finals). Each category gets handled differently.
- Every audited asset gets one of four decisions: keep, update, archive, delete. "Think about it later" is not an option if the audit is going to finish.
- Some cases break the matrix: legal hold, expired licences, brand-restricted, orphans. Each needs its own handling rather than passing through the four-way classification.
- Feed the findings forward into DAM intake rules. Single-master logos, naming conventions enforced at upload, required metadata fields, and mandatory ownership prevent the same mess from accumulating again under a new tool.
Glossary
- Asset audit: The inventory pass that lists every image, video, design file, and brand asset in a creative library, classifies each one, and assigns a decision (keep, update, archive, or delete) before any reorganisation, migration, or new tool gets picked.
- Inventory record: The audit’s central artifact: one row per asset with twelve fields (filename, path, format, size, dimensions, creator, last-modified, last-accessed, brand-restricted flag, licence status, used-where, duplicate-of pointer) plus the decision in the thirteenth column.
- Decision matrix: The four-way classification applied to every audited row: Keep (in active use, migrates as-is), Update (in active use but needs work first), Archive (not active but legally or historically valuable), Delete (orphan, duplicate, or obsolete with all four removal criteria met).
- Kill list: The explicit set of assets marked for deletion or deep archive, with a named approver per batch. Without a named approver, deletion sits in the queue indefinitely while everyone waits for someone else to take responsibility.
- Heat map: A visual summary of how a library distributes by type, age, or audit decision. Often a stacked-bar chart (e.g. "5,200 keep / 1,800 update / 8,400 archive / 14,000 delete") that converts the audit’s effort into a one-slide stakeholder briefing.
- EXIF / IPTC: The two metadata standards that ship inside image files. EXIF (Exchangeable Image File Format) carries camera-derived data — exposure, lens, GPS, timestamp. IPTC (International Press Telecommunications Council) carries editorial fields — creator, caption, copyright, keywords. Both survive most exports and are what ExifTool reads and writes.
- Orphan file: An asset with no named owner, no recorded usage, and no metadata flagging it as referenced by any active campaign. Default audit action: archive for one quarter, delete if unclaimed.
- Near-duplicate: Files representing the same asset in different formats or resolutions (logo.png, logo@2x.png, logo-final.png). Different from exact duplicates (identical bytes); requires filename-pattern matching plus visual sampling to identify the canonical master.
- Statistical sample: A scope strategy for very large libraries: pull a random 5–10% of files, audit that subset, extrapolate the proportions. Answers "how much of the library is duplicates / in current use / stale" without producing a per-file kill list.
- Intake rules: The constraints a DAM enforces at upload — required metadata fields, naming conventions, ownership assignment, deduplication checks. The audit’s findings become these rules so the same chaos does not accumulate in the new system.
Editor's note: This is a creative-asset audit. The subject is the images, videos, brand files, and design files sitting on a shared drive waiting to be organised. For the brand-identity side of the work — perception, positioning, brand strengths — see our brand management piece.
A creative asset audit is the inventory pass that puts every image, video, design file, and brand asset on the shared drive into one record — filename, location, owner, licence status, last use, plus a decision: keep, update, archive, or delete. Without one, expired stock licences keep shipping in live campaigns, the freelancer who shot last quarter’s hero takes the canonical files to a dead account when she leaves, the brand kit fragments across seventeen folders, the storage bill keeps climbing on files nobody opens, and the team rebuilds assets that already exist because nobody can find the canonical version. Past a few thousand mixed assets and a handful of people producing them, you need one.
The audit has no obvious deliverable and nobody waiting on it, so it sits in the backlog. The framework below is what closes it.

Why Audits Get Postponed
In the stalled audits I’ve sat with, the root cause was always the same: nobody had locked the folder list before the spreadsheet was opened. The team started cataloguing whatever was in front of them; three or four weeks in, they were still adding rows from drives that probably shouldn’t have been in scope, and the file gradually stopped getting opened because nobody could say when it would be finished.
The way out is to lock the scope before any spreadsheet opens. Which folders are in: active project archives, the brand kit, current campaigns. Which are out: personal drives, finished-and-archived work, anything older than the cutoff date. What the deliverable is: the spreadsheet, heat map, and kill list described later in the article. With those locked, the audit takes one or two weeks of intermittent work for a typical creative team.
The Four Layers: Assets, Metadata, Usage, Ownership
The list of files is the easy layer. The other three are where the real work hides, and naming them up front turns the audit from “list everything” into four separate, scoped questions.
- Assets. What files exist, where, in what format, what size. The simplest layer; usually a directory listing produced by the file system in an afternoon.
- Metadata. What the files claim about themselves: filenames, embedded EXIF and IPTC fields (the camera and editorial metadata that ships inside image files), tags, descriptions, captions. Usually inconsistent or missing.
- Usage. Which assets are actually used by which channels, and which are orphans nobody references anymore. The hardest layer to measure quickly because it requires looking at where assets ship to (web pages, campaigns, decks, social) rather than where they sit.
- Ownership. Who owns the file, who can authorise its deletion, who has rights to use it. Hard to reconstruct after the fact when the people who created the files have left the organisation or never logged the rights in the first place.
The audit’s value comes from the second, third, and fourth layers. The asset list alone tells you nothing actionable. Design the metadata schema before the audit produces data, and the audit’s output is already in the taxonomy’s shape rather than needing translation afterwards. The taxonomy you build later runs on the metadata you find now.
Scope Decision: Full vs Sampled vs Team-by-Team
In the audits I’ve sat with, full inventory passes on libraries above about 50,000 files don’t finish. The team gets four weeks in, the spreadsheet has a couple of thousand rows, the question of “is this still relevant?” gets harder per row, and the team stops opening the spreadsheet. Pick the scope strategy first.
Full inventory. Realistic only on small libraries (under 5,000 files) or after a deduplication pre-pass that cuts the volume in half. Worth doing when the asset count is bounded and the migration deadline is real.
Statistical sample. Pull a random 5–10% of files, audit that subset, extrapolate the proportions. The statistical sample answers “should we even bother” questions: how much of the library is duplicates, what proportion is in current use, how stale is the median asset. It does not produce a kill list, only a decision about whether building one is worth the effort.
Team-by-team. Audit one team’s folder fully, ship the process, then roll out to the next. Takes the longest on the calendar but actually completes, because each team-level pass produces a finished artefact. Best for organisations where the asset library is genuinely siloed by team rather than fully shared.
The thresholds: under 5,000 files, full. Over 50,000, statistical sample first then team-by-team. Between, team-by-team unless the migration deadline is short enough to demand a sampled approach.
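The thresholds and the sampling step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function names, the default 5% fraction, and the fixed seed are assumptions chosen for the example, and the handling of exactly 5,000 or 50,000 files (treated as "between") is a guess at the intent.

```python
import random


def choose_scope(file_count: int, tight_deadline: bool = False) -> str:
    """Map library size to the scope strategies named in the text."""
    if file_count < 5_000:
        return "full"
    if file_count > 50_000:
        return "sample-then-team-by-team"
    # In between: team-by-team unless the deadline forces a sampled approach.
    return "sample" if tight_deadline else "team-by-team"


def draw_sample(paths: list[str], fraction: float = 0.05, seed: int = 0) -> list[str]:
    """Pull a reproducible random subset of the library for auditing."""
    k = max(1, round(len(paths) * fraction))
    return random.Random(seed).sample(paths, k)


def extrapolate(sample_hits: int, sample_size: int, library_size: int) -> int:
    """Scale a proportion observed in the sample up to the whole library."""
    return round(library_size * sample_hits / sample_size)
```

If 12 of 50 sampled files turn out to be duplicates in a 1,000-file library, `extrapolate(12, 50, 1000)` suggests roughly 240 duplicates overall. That is an estimate for the "should we even bother" conversation, not a kill list.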
The Inventory Record (What to Capture Per Asset)
The inventory is a spreadsheet with one row per asset. The columns determine whether the spreadsheet is useful or just busywork. Twelve is the working number I keep coming back to; more than that and the rows stop getting filled.
| Field | What good data looks like |
|---|---|
| Filename | The original, exactly as it lives on disk. Don’t rename during audit; rename after decisions land. |
| Full path | Folder hierarchy from the root. Becomes searchable metadata after migration. |
| Format | JPEG, PNG, MP4, PSD, AI, INDD, RAW, PDF. From the file extension; verify on a sample. |
| Size | Bytes. Useful for storage planning and detecting format anomalies (a 50 KB “PSD” isn’t a PSD). |
| Dimensions | Pixels (images), duration (video), pages (PDF). Determines what channels the asset can ship to. |
| Creator | From file metadata, IPTC byline, or institutional knowledge. Often empty — flag rather than guess. |
| Last modified | From the file system. Proxy for “is this still being touched.” |
| Last accessed | If the file system tracks it. Proxy for “is anyone still using this.” |
| Brand-restricted flag | True if the asset is part of the canonical brand library; false otherwise. The flag separates “treat as immutable” from “treat as project work.” |
| Licence status | Owned, licensed (with expiry), public domain, unknown. The unknown count is the audit’s most important finding. |
| Used where | Channel, page, campaign, deck. Empty cells are candidates for orphan classification. |
| Duplicate-of pointer | Filename of the canonical version if this row is a near-duplicate. Resolves to one row per asset after dedup. |
The thirteenth field is the decision (keep, update, archive, or delete). It comes last, after the rest of the row is populated, because the decision depends on the data above it.
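The file-system columns can be pre-filled by a script so that people only type the fields that need judgement. The sketch below is one possible way to do it in Python; the column names are my own spellings of the twelve fields plus the decision, and it assumes the output CSV is written outside the folder being audited. Last-accessed times depend on how the volume is mounted, so treat that column as a hint.

```python
import csv
import os
from datetime import datetime, timezone
from pathlib import Path

# Twelve inventory fields plus the decision column (names are illustrative).
COLUMNS = [
    "filename", "path", "format", "size_bytes", "dimensions", "creator",
    "last_modified", "last_accessed", "brand_restricted", "licence_status",
    "used_where", "duplicate_of", "decision",
]


def _iso(ts: float) -> str:
    """Convert a file-system timestamp to a YYYY-MM-DD string (UTC)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()


def inventory_rows(root: str):
    """Yield one dict per file under root with the file-system fields filled in."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            full = Path(dirpath) / name
            st = full.stat()
            row = dict.fromkeys(COLUMNS, "")  # human-filled columns start empty
            row.update(
                filename=name,  # exactly as it lives on disk, no renaming
                path=str(full.relative_to(root)),
                format=full.suffix.lstrip(".").upper(),
                size_bytes=st.st_size,
                last_modified=_iso(st.st_mtime),
                last_accessed=_iso(st.st_atime),
            )
            yield row


def write_inventory(root: str, out_csv: str) -> int:
    """Write the inventory CSV and return the number of rows written."""
    count = 0
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        for row in inventory_rows(root):
            writer.writerow(row)
            count += 1
    return count
```

The format column here comes from the file extension only, so the "verify on a sample" advice in the table still applies.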
Tools to Run the Audit
The audit is a metadata exercise: list files, classify them, capture decisions. The tooling splits into four free-or-cheap layers (spreadsheet, file cataloguing, duplicate finder, disk-usage analyser), plus a fifth option for teams that want to skip the spreadsheet and run the audit inside an open-source DAM. None of these are DAMs in the buy-this-product-then-do-the-audit sense — the DAM comes after, as the destination the cleaned inventory migrates into.
Spreadsheet & Database
Where the inventory rows live and where the decision column actually gets ticked.
Free or Bundled
- Google Sheets: free with any Google account. Fine for libraries under a few thousand rows. Filtering by multiple criteria (decision status × approver × format) gets clunky above that.
- Microsoft Excel: from $9.99/mo (M365 Personal) or $12.50/user/mo (M365 Business Standard). Handles bigger sheets than Google Sheets and stays usable above 50,000 rows. Excel is included in every M365 tier.
- SQLite + a custom schema: free. Worth considering above 20,000 assets where joins against external usage logs (CDN, web analytics, e-commerce attribution) start to matter.
Relational / Database-Style
- Airtable: free tier exists; Team $20/user/mo, Business $45/user/mo (annual billing only). Wins on relational links between assets, campaigns, and approvers. Read-only collaborators and form submitters are not charged; only edit-permission seats count.
- Notion: free for solo use; Plus $10/user/mo, Business $20/user/mo. Better for teams that want notes and meeting context attached to rows. Sync databases and row-level permissions are Business-tier only.
File Cataloguing & Metadata
Most of these have a local database (usually SQLite) under the hood, so once a tool has indexed a folder it can answer inventory questions much faster than walking the file tree on every query.
Free / Open Source
- Adobe Bridge: free with any Adobe ID (no Creative Cloud subscription required; sign-in re-validates every 30 days). Browses PSD, AI, INDD, RAW in place across the folder tree, with batch-tagging and renaming without re-importing.
- DigiKam: free, open source, cross-platform. SQLite catalogue, face recognition, GPS, hierarchical tags. The strongest free option for photo and video libraries.
- ExifTool: free, open source, command line. Reads and writes EXIF and IPTC fields across thousands of files in one pass. Indispensable for finding the metadata holes the audit needs to flag.
Paid
- Adobe Lightroom Classic: from $9.99/mo (Photography Plan 20 GB, closed to new subscribers since Jan 2025) or $11.99/mo (Lightroom plan 1 TB, now also includes Classic). The catalogue is a SQLite database; metadata exports as IPTC.
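Finding metadata holes with ExifTool can be scripted. The sketch below is an assumption-laden example rather than a recommended pipeline: it assumes `exiftool` is installed and on the PATH, asks for four tags I picked as creator and rights indicators (IPTC By-line and CopyrightNotice, EXIF Artist and Copyright), and uses hypothetical function names. The `-json` and `-r` options are ExifTool's JSON-output and recursive flags.

```python
import json
import subprocess

# Tags used here as creator/rights indicators (an assumption for this sketch).
REQUIRED_TAGS = ("By-line", "Artist", "CopyrightNotice", "Copyright")


def read_metadata(folder: str) -> list[dict]:
    """Run ExifTool recursively and return its JSON output, one dict per file."""
    cmd = ["exiftool", "-json", "-r", *[f"-{tag}" for tag in REQUIRED_TAGS], folder]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(result.stdout or "[]")


def metadata_holes(records: list[dict]) -> dict[str, list[str]]:
    """Map each file to the fields it lacks, so the audit flags rather than guesses."""
    holes = {}
    for rec in records:
        checks = (
            ("creator", rec.get("By-line") or rec.get("Artist")),
            ("copyright", rec.get("CopyrightNotice") or rec.get("Copyright")),
        )
        missing = [label for label, value in checks if not value]
        if missing:
            holes[rec["SourceFile"]] = missing
    return holes
```

Calling `metadata_holes(read_metadata("brand-kit"))` on a folder (the folder name is made up) would return only the files with an empty creator or copyright field, which is the list the Creator and Licence status columns need flagged.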
Duplicate Finders
Run one of these as a pre-pass before the audit, not during. The duplicate count drops your row count meaningfully — in the libraries I’ve audited, somewhere between 15 and 30%.
Free / Open Source
- Czkawka: free, open source (Rust). Fast, cross-platform, handles exact matches, near-duplicates, and visually similar images.
- dupeguru: free, open source. Older but maintained, GUI-driven, three modes (standard, music, picture). Strongest of the three on near-duplicate image detection.
- fdupes / jdupes: free, open source, command line. Hash-based exact-match detection, ideal for scripting and large libraries on Linux/macOS.
Disk-Usage Analysers (the Heat Map)
The audit’s heat-map deliverable starts here: where the bytes actually live before you start counting files row by row.
Windows
- WinDirStat: free, open source. Treemap visualisation of folder sizes; the canonical free option on Windows.
- TreeSize: Free edition exists (commercial use disallowed); Personal €40 one-time; Professional €3.40/user/mo subscription (perpetual licences quote-based). Pro adds duplicate finder, scheduled scans, and CLI/exports the Free edition omits.
macOS
- GrandPerspective: free, open source on SourceForge (small paid version on the Mac App Store supports the developer).
- DaisyDisk: $9.99 one-time, covers up to 5 personal Macs. Polished sunburst visualisation; arguably the best-looking option on macOS.
Linux / Cross-Platform CLI
- ncdu: free, open source. Ncurses disk-usage analyser; the standard answer on Linux, also runs on macOS via Homebrew.
Open-Source DAMs as Audit Workspace
For teams that want the audit’s output to land directly in a running DAM rather than a separate spreadsheet, three open-source options can serve as both destination and audit workspace. Heavier setup than a Sheet, but no re-import step at the end.
Self-Hostable
- ResourceSpace: self-hosted Community version is free and open source. Montala-hosted managed plans are quote-based across four tiers (Team / Business / Enterprise / Platinum Cloud), all GBP and contact-only.
- Phraseanet: self-hosted, free, open source (GPL v3) by Alchemy. Commercial support, training, and hosting are quote-based via Alchemy.
- Pimcore: Community Edition free and open source. Paid editions: Professional $9,900/yr, Enterprise $29,900/yr, Cloud-hosted PaaS from $39,900/yr. Heavier setup than ResourceSpace or Phraseanet, with PIM and commerce-framework features the others don’t carry.
Duplicates and Version Chaos
The hardest part of a real audit is the duplicate problem. It splits into three categories, and each one needs its own handling.
Exact duplicates. Identical bytes, different paths. Detected with a file-hash pass (MD5 or SHA-1 across the library). Easy to resolve: keep one canonical copy in the most-recently-modified location, archive the others. In the libraries I’ve audited the rate has typically run somewhere between 15 and 30% of files after a few years on a shared drive; budget on the lower end and treat anything above as a useful surprise.
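For teams that prefer a script to a GUI tool, a file-hash pass looks roughly like the sketch below. It is a minimal example under my own assumptions: SHA-1 from the standard library, a size pre-filter so files with a unique size are never hashed, and each group sorted newest-first so the first entry is the candidate to keep. The function names are illustrative.

```python
import hashlib
import os
from collections import defaultdict


def file_sha1(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos do not need to fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def exact_duplicates(root: str) -> list[list[str]]:
    """Return groups of byte-identical files, most recently modified first."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) > 1:  # a unique size cannot have an exact duplicate
            for path in paths:
                by_hash[file_sha1(path)].append(path)

    return [
        sorted(group, key=os.path.getmtime, reverse=True)
        for group in by_hash.values()
        if len(group) > 1
    ]
```

The remaining entries in each group are the ones that would get the duplicate-of pointer in the inventory and move to archive.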
Near-duplicates. Same image, different format or resolution. logo.png, logo@2x.png, logo-final.png, logo-FINAL-v2.png. The decision per cluster is which one is the master, usually the highest-resolution lossless source. Detected by filename pattern matching plus visual sampling on the borderline cases.
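The filename-pattern half of that detection can be approximated by stripping the usual suffix ladder and grouping on what is left. The pattern below is an assumption covering only the markers seen in the examples here (`@2x`, `final`, `copy`, `v2`); a real library will need its own list.

```python
import re
from collections import defaultdict
from pathlib import PurePath

# Trailing markers that usually signal a variant rather than a new asset.
_VARIANT_SUFFIX = re.compile(r"(@\d+x|[-_ ](final|copy|v\d+))+$", re.IGNORECASE)


def cluster_key(filename: str) -> str:
    """Reduce a filename to a base name by dropping extension and variant suffixes."""
    stem = PurePath(filename).stem.lower()
    return _VARIANT_SUFFIX.sub("", stem)


def near_duplicate_clusters(filenames: list[str]) -> dict[str, list[str]]:
    """Group filenames sharing a base name; single-member groups are dropped."""
    clusters = defaultdict(list)
    for name in filenames:
        clusters[cluster_key(name)].append(name)
    return {key: names for key, names in clusters.items() if len(names) > 1}
```

The four logo files above all reduce to `logo` and land in one cluster. Two files with unrelated names that show the same image will not, which is why the visual sampling step still matters.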
Version chaos. Drafts, semi-finals, real-finals, post-launch edits. Hardest because the “right” version depends on usage context: the version on the website is not the version sent to print. The pattern that works: keep the version that shipped, archive the rest, accept that future versions of the asset go through a real version-control discipline rather than the filename ladder. For what that discipline looks like, see version control for designers.
The Decision Matrix: Keep / Update / Archive / Delete
Every audited asset gets one of four decisions. The matrix removes the per-asset judgement call and replaces it with rules.
| Decision | Triage criteria | Where it goes |
|---|---|---|
| Keep | In active use, current, well-named, licence in order. Last accessed within 12 months OR brand-restricted regardless of access. | Migrates as-is to the DAM. The canonical asset. |
| Update | In active use but needs work before migration: rename, re-export, add metadata, fix licence record, replace with a higher-resolution master. | Goes through a fix queue, then migrates. The fix queue is its own time budget — usually 10–20% of the keep count. |
| Archive | Not in active use but legally or historically valuable. Old campaign assets, brand-history references, anything tied to a contractual retention requirement. | Cold storage. Not the live DAM. A separate archive bucket on cheaper storage, indexed for retrieval but not surfaced in normal search. |
| Delete | Orphan, duplicate, or obsolete. Not accessed in 18 months, no inbound usage, no brand value, no licence retention requirement. All four conditions, not any. | Goes away. Confirmed by the named owner of the parent folder, then deleted in batches with a 30-day soft-delete window before permanent removal. |
The point of the matrix is to remove judgement from the per-row decision. A row that hits the delete criteria gets deleted. A borderline row goes to update or archive, never “think about it later.”
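Written as code, the matrix is a short function. This is one reading of the criteria, not the only one: the field names are hypothetical, special cases are assumed to have been pulled out beforehand, and a borderline row (neither clearly active nor meeting all four delete conditions) is sent to archive here, although the rule above would equally allow update.

```python
from dataclasses import dataclass


@dataclass
class Row:
    months_since_access: int
    in_active_use: bool
    brand_restricted: bool = False
    needs_work: bool = False          # rename, re-export, missing metadata, licence fix
    has_brand_value: bool = False
    retention_required: bool = False  # contractual or licence retention


def decide(row: Row) -> str:
    """Apply the four-way matrix; never returns a 'think about it later' value."""
    active = row.brand_restricted or (
        row.in_active_use and row.months_since_access <= 12
    )
    if active:
        return "update" if row.needs_work else "keep"

    # Delete requires all four conditions, not any one of them.
    deletable = (
        row.months_since_access >= 18
        and not row.in_active_use
        and not row.has_brand_value
        and not row.retention_required
    )
    return "delete" if deletable else "archive"
```

The useful property is that the function has exactly four return values, so every row leaves with a decision.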
Six rows from a representative library, showing what populated entries actually look like. The schema from the inventory record section produces rows like these; the matrix above produces the decision in the last column. The fifth row is the one that breaks the matrix and gets handled in the next section.
| Filename | Last accessed | Used where | Licence | Notes | Decision |
|---|---|---|---|---|---|
| brand-mark-master.ai | 2 weeks ago | Brand kit canonical, deck templates, web header | Owned | Brand-restricted; single master | Keep |
| Q3-2025-hero-FINAL_v3.psd | 8 days ago | Q3 campaign landing, paid social | Owned | Filename ladder; three near-duplicates in same folder | Update |
| summer-2023-banner-set.zip | 14 months ago | Retired Summer 2023 launch | Owned | Brand-history reference; not in active use | Archive |
| logo-final-v2.png | 7 months ago | Nowhere current | Owned | Lossy export of brand-mark-master.ai | Delete |
| shutterstock-1487293.jpg | 3 weeks ago | About-page hero | Stock, expired Mar 2025 | Re-licence or replace before migration | Special: licence |
| team-portrait-marina.cr2 | 11 months ago | Author bio, team page | Owned | Creator field empty; no IPTC byline | Update |
Special Cases (Legal Hold, Licences, Orphans)
Some cases break the matrix and need their own handling in the spreadsheet.
Legal hold. Assets that cannot be deleted due to ongoing litigation, regulatory retention rules, or industry-specific obligations (pharma, finance, healthcare). Flag and exclude from the matrix entirely. These always migrate or always archive, regardless of the other criteria.
Expired licences. Stock photos, fonts, or music whose use rights have lapsed. These need replacement, not relocation. Archiving an expired-licence asset is still a violation if it ships from the archive. The right action is to remove the asset from any active distribution and tag it as “awaiting re-licence or replacement” in the audit.
Brand-restricted. Assets that should never have been on the public drive in the first place. Logos in their original Adobe Illustrator source, executive headshots from internal sessions, draft strategy documents. The audit catches these before the migration scatters them further. The action is to consolidate them into the brand-restricted area of the DAM with appropriate access controls.
Orphan files. No owner, no usage, no metadata. Default action: archive for one quarter, delete if nobody claims them. The 90-day window is enough to catch the case where a folder gets accessed for the first time in years.
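A pre-filter that pulls these rows out before the matrix runs might look like the sketch below. The parameter names, the order of the checks, and the `misplaced_restricted` flag (a restricted asset found outside the restricted area) are all assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def special_case(
    *,
    legal_hold: bool,
    licence_expiry: Optional[date],
    misplaced_restricted: bool,
    owner: str,
    used_where: str,
    today: date,
) -> Optional[str]:
    """Return a special-case label, or None if the row goes through the matrix."""
    if legal_hold:
        return "legal-hold"        # excluded from the matrix entirely
    if licence_expiry is not None and licence_expiry < today:
        return "licence"           # re-licence or replace, do not just relocate
    if misplaced_restricted:
        return "brand-restricted"  # consolidate under access controls
    if not owner and not used_where:
        return "orphan"            # archive for a quarter, then delete if unclaimed
    return None


def orphan_delete_after(archived_on: date) -> date:
    """End of the 90-day claim window for an archived orphan."""
    return archived_on + timedelta(days=90)
```

A stock image whose licence lapsed in March 2025 comes back as `"licence"` when checked later that year, regardless of how recently it was accessed.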
What an Audit Produces
The audit is finished when it has produced three artefacts. Define them at the start; they’re the deliverable that closes the project.
The spreadsheet. One row per asset, twelve inventory columns and the decision in the thirteenth. The full record. The point of putting it somewhere persistent is to keep it as an audit trail after the migration; Notion, Airtable, or a shared Google Sheet all work.
The heat map. A visual summary of where assets live, how they distribute by type/age/decision. Useful for the stakeholder briefing that justifies the migration scope. A simple stacked-bar chart showing “5,200 keep / 1,800 update / 8,400 archive / 14,000 delete” converts the audit’s effort into a one-slide justification for everything that comes after.
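The numbers behind that chart come straight from the decision column. A minimal sketch, assuming the decisions have been read out of the spreadsheet into a list of strings (the function names are illustrative):

```python
from collections import Counter

ORDER = ("keep", "update", "archive", "delete")


def decision_summary(decisions: list[str]) -> str:
    """One-line summary of the decision column for the stakeholder slide."""
    counts = Counter(d.strip().lower() for d in decisions)
    return " / ".join(f"{counts[d]:,} {d}" for d in ORDER)


def text_bars(decisions: list[str], width: int = 40) -> list[str]:
    """Rough text bar per decision, proportional to its share of the library."""
    counts = Counter(d.strip().lower() for d in decisions)
    total = sum(counts[d] for d in ORDER) or 1
    return [
        f"{d:<8}{'#' * round(width * counts[d] / total)} {counts[d]:,}"
        for d in ORDER
    ]
```

The same counts can feed whatever charting tool the team already uses; the text bars are only a quick check before the slide gets built.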
The kill list. The explicit set of assets marked for deletion or deep archive, with the named approver who signed off on each batch. Without a named approver per batch, deletion sits in the queue indefinitely while everyone waits for someone else to take responsibility.
Frequency: One-Shot vs Recurring
Once the first audit ships, the choice is what comes next.
One-shot pre-migration. The most common case. Audit, decide, migrate, never audit again. The risk is backslide within a year as new chaos accumulates. Without intake rules in the DAM (covered in the next section), the team’s drive habits import themselves into the new tool and the same audit is needed again before long.
Recurring quarterly. Lighter audits scoped to “what’s been added since the last audit.” Pairs with intake rules so new assets land already tagged. The quarterly pass is then usually twenty minutes per team to confirm the auto-applied metadata is right and flag anomalies.
Most teams should plan for one-shot then quarterly thereafter, even if the quarterly is twenty minutes per team.
Audit → DAM Intake (Feed the Findings Forward)
The audit is wasted if its findings do not become the DAM’s intake rules. Every category of mess the audit found should become a rule in the new system that prevents the same mess from accumulating again.
- Duplicate logos. The audit found 12 versions of the company logo across the drive. The DAM routes any logo-tagged upload through a brand-team review queue, so a new variant lands in moderation rather than directly in the brand kit.
- Inconsistent naming. The audit found `logo_final_v2_FINAL.psd` and `brand-mark_v3.psd` as the same asset. The DAM takes the filename out of the uploader’s hands: the file is stored against an asset record and renamed to a canonical pattern derived from the metadata fields (brand, type, version) rather than whatever was typed at upload.
- Missing metadata. The audit found half the brand-asset library had no creator, no licence, no expiry. The DAM makes those fields required at upload. No row in the future audit ever has them empty again.
- Orphan files. The audit found 14,000 files with no owner. The DAM enforces ownership assignment at upload: every asset has a named owner and a workspace, and orphan creation is structurally prevented.
Each rule above corresponds to a structural change in the new system. If the audit is part of a Drive migration, the rules ship with the new system on day one. If it’s a standalone discipline pass before the migration is even scoped, the rules become the brief for the DAM you eventually pick.
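Two of those rules, required fields and metadata-derived filenames, are simple enough to sketch as an upload check. Nothing here is a real DAM API: the field names, the function names, and the `brand_type_vNN.ext` pattern are hypothetical stand-ins for whatever the chosen system exposes.

```python
import re

# Fields the audit found empty; hypothetical names for the DAM's schema.
REQUIRED_FIELDS = ("creator", "licence", "licence_expiry", "owner")


def intake_errors(metadata: dict) -> list[str]:
    """List the required fields that are missing or blank; empty list means accept."""
    return [
        field for field in REQUIRED_FIELDS
        if not str(metadata.get(field, "")).strip()
    ]


def _slug(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def canonical_name(brand: str, asset_type: str, version: int, extension: str) -> str:
    """Build the stored filename from metadata rather than from what was typed."""
    ext = extension.lower().lstrip(".")
    return f"{_slug(brand)}_{_slug(asset_type)}_v{version:02d}.{ext}"
```

An upload with only a creator and a licence would be rejected with `["licence_expiry", "owner"]`, and a file typed in as anything at all would be stored under a name like `acme_brand-mark_v03.psd` (brand name invented for the example).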








