Key Takeaways — a 30-second read
  • Audits get postponed because they have no obvious endpoint. The fix is a scoped audit with an explicit deliverable: spreadsheet, heat map, kill list. Define those three at the start.
  • The four layers are assets, metadata, usage, and ownership. Most “content audit” articles cover only the first; the other three are where the real work hides.
  • Pick the scope strategy before you start: full inventory under 5,000 files, statistical sample over 50,000, team-by-team in between. Full audits on 50k libraries don’t finish.
  • Twelve fields per asset row: filename, path, format, size, dimensions, creator, last-modified, last-accessed, brand-restricted, licence status, used-where, duplicate-of. The decision goes in the thirteenth column at the end.
  • Three duplicate categories — exact (file hash), near-duplicate (same image, different format), version chaos (drafts and finals). Each gets handled differently.
  • Four-way decision: keep / update / archive / delete. The matrix removes per-asset judgement and replaces it with rules. “Think about it later” is not an option if the audit is going to finish.
  • Special cases that break the matrix: legal hold, expired licences, brand-restricted, orphans. Each gets its own handling.
  • The audit feeds forward into the DAM’s intake rules. Every category of mess the audit found becomes a rule that prevents it from accumulating again — single-master logos, naming conventions at upload, required metadata fields, mandatory ownership.

Editor's note: This is a creative-asset audit, not a marketing-content audit. If you came here for what to do with old blog posts and SEO content, you want a different article. This one is about the images, videos, brand files, and design files sitting on a shared drive waiting to be organised. If you wanted the brand-identity audit instead — evaluating brand strengths, perception, positioning — that’s the work that lives in our brand management piece.

The migration project has been agreed in principle. The DAM has been picked. The team is ready. Then the kickoff meeting hits the question nobody wanted to ask first: what are we actually moving? The inventory doesn’t exist. The folder structure is whatever it grew into over four years. Half the assets live with people who left. The decision “we’ll just audit first” sounds reasonable on the Tuesday and is still in the queue eighteen weeks later because nobody knows where the audit ends.

Audits feel limitless. This one won’t.

[Image: a clean spreadsheet on a laptop screen, suggesting the inventory artefact at the heart of an audit]
The audit’s deliverable is a spreadsheet, a heat map, and a kill list. Define them up front and the work has an endpoint.

Why Audits Get Postponed#

The audit is the prerequisite for almost everything: migration, taxonomy, brand-kit consolidation, archive cleanup. It also has no clear endpoint, no obvious deliverable, and no user demanding it tomorrow — so it gets postponed indefinitely. The team agrees a DAM migration is needed, can’t start because nobody knows what’s in the existing folder, decides “we’ll audit first,” then the audit never happens because it feels limitless.

The fix is a scoped audit with an explicit endpoint, not a perfect one. Three things define the scope before any spreadsheet opens: which folders are in (active project archives, brand kit, current campaigns), which are out (personal drives, finished-and-archived work, anything older than the cutoff date), and what the deliverable is (the next section names three artefacts that close the audit). With those three locked, the audit takes one or two weeks of intermittent work for a typical creative team, not the indefinite quarter it tends to feel like.
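If it helps to make the scoping concrete, the scope can be written down as data before any spreadsheet opens. The sketch below is illustrative only: the folder names, cutoff date, and the is_in_scope helper are placeholders, not a prescription.

```python
from datetime import date
from pathlib import Path

# Hypothetical scope definition; folder names and the cutoff date are examples only.
AUDIT_SCOPE = {
    "in_scope": ["Brand Kit", "Campaigns/2024", "Project Archives/Active"],
    "out_of_scope": ["Personal Drives", "Archived - Final"],
    "cutoff": date(2021, 1, 1),   # anything last modified before this is out of scope
    "deliverables": ["inventory spreadsheet", "heat map", "kill list"],
}

def is_in_scope(path: Path) -> bool:
    """True if the path sits under an in-scope folder and not under an excluded one."""
    text = str(path)
    if any(folder in text for folder in AUDIT_SCOPE["out_of_scope"]):
        return False
    return any(folder in text for folder in AUDIT_SCOPE["in_scope"])
```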

The Four Layers: Assets, Metadata, Usage, Ownership#

Most “content audit” articles cover only the first layer — the list of files. The other three are where the real work hides. Naming them up front turns the audit from “list everything” into four separate, scoped questions.

  • Assets. What files exist, where, in what format, what size. The simplest layer; usually a directory listing produced by the file system in an afternoon.
  • Metadata. What the files claim about themselves: filenames, embedded EXIF/IPTC, tags, descriptions, captions. Usually inconsistent or missing. The metadata that matters is what survives outside the original folder.
  • Usage. Which assets are actually used by which channels, and which are orphans nobody references anymore. The hardest layer to measure quickly because it requires looking at where assets ship to (web pages, campaigns, decks, social) rather than where they sit.
  • Ownership. Who owns the file, who can authorise its deletion, who has rights to use it. The hardest layer to reconstruct after the fact — especially when the people who created the files have left the organisation or never logged the rights in the first place.

The audit’s value comes from the second, third, and fourth layers. The asset list alone tells you nothing actionable. The taxonomy you build later runs on the metadata you find now — design the schema before the audit produces data, and the audit’s output is already in the taxonomy’s shape rather than needing translation afterwards.

Scope Decision: Full vs Sampled vs Team-by-Team#

A full audit on a 50,000-file shared drive will not finish. The team gets four weeks in, the spreadsheet has 1,200 rows, the question of “is this still relevant?” gets harder per row, and the project quietly stalls. Pick the scope strategy first.

Full inventory. Realistic only on small libraries (under 5,000 files) or after a deduplication pre-pass that cuts the volume in half. Worth doing when the asset count is bounded and the migration deadline is real.

Statistical sample. Pull a random 5–10% of files, audit that subset, extrapolate the proportions. The statistical sample answers “should we even bother” questions: how much of the library is duplicates, what proportion is in current use, how stale is the median asset. It does not produce a kill list — it produces a decision about whether the kill list is worth building.
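Pulling the sample is a few lines of scripting if the library is a plain directory tree. The sketch below assumes local filesystem access and a hypothetical sample_library helper name; cloud drives need their own listing API.

```python
import random
from pathlib import Path

def sample_library(root: str, fraction: float = 0.05, seed: int = 42) -> list[Path]:
    """Pull a random fraction of files under root for a sampled audit."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    random.seed(seed)                      # fixed seed so the sample is reproducible
    k = max(1, int(len(files) * fraction))
    return random.sample(files, k)

# Example: audit a 5% sample and extrapolate proportions from it.
# sample = sample_library("/shared/creative", fraction=0.05)
```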

Team-by-team. Audit one team’s folder fully, ship the process, then roll out to the next. Slowest in calendar time, fastest in actual completion because each team-level pass produces a finished artefact. Best for organisations where the asset library is genuinely siloed by team rather than fully shared.

The thresholds: under 5,000 files, do the full inventory. Over 50,000, run a statistical sample first, then go team-by-team. In between, go team-by-team unless the migration deadline is short enough to demand a sampled approach.

The Inventory Record (What to Capture Per Asset)#

The inventory is a spreadsheet with one row per asset. The columns determine whether the spreadsheet is useful or just busywork. Twelve fields cover almost every operational case; more than that and the rows stop getting filled.

Field | What good data looks like
Filename | The original, exactly as it lives on disk. Don’t rename during audit; rename after decisions land.
Full path | Folder hierarchy from the root. Becomes searchable metadata after migration.
Format | JPEG, PNG, MP4, PSD, AI, INDD, RAW, PDF. From the file extension; verify on a sample.
Size | Bytes. Useful for storage planning and detecting format anomalies (a 50 KB “PSD” isn’t a PSD).
Dimensions | Pixels (images), duration (video), pages (PDF). Determines what channels the asset can ship to.
Creator | From file metadata, IPTC byline, or institutional knowledge. Often empty — flag rather than guess.
Last modified | From the file system. Proxy for “is this still being touched.”
Last accessed | If the file system tracks it. Proxy for “is anyone still using this.”
Brand-restricted flag | True if the asset is part of the canonical brand library; false otherwise. The flag separates “treat as immutable” from “treat as project work.”
Licence status | Owned, licensed (with expiry), public domain, unknown. The unknown count is the audit’s most important finding.
Used where | Channel, page, campaign, deck. Empty cells are candidates for orphan classification.
Duplicate-of pointer | Filename of the canonical version if this row is a near-duplicate. Resolves to one row per asset after dedup.

The thirteenth field is the decision — the four-way classification covered in §6. It comes last, after the rest of the row is populated, because the decision depends on the data above it.
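Seeding the spreadsheet is the part a script can do. The sketch below walks a directory tree and fills the columns the file system can answer (filename, path, format, size, modified, accessed), leaving the rest blank for humans. The column names mirror the table above but are otherwise an assumption about how the sheet is laid out.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

COLUMNS = [
    "filename", "path", "format", "size_bytes", "dimensions", "creator",
    "last_modified", "last_accessed", "brand_restricted", "licence_status",
    "used_where", "duplicate_of", "decision",
]

def seed_inventory(root: str, out_csv: str) -> None:
    """Write one row per file with the fields the file system can answer.

    Dimensions, creator, licence and usage need other sources, so they start blank.
    """
    with open(out_csv, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=COLUMNS)
        writer.writeheader()
        for p in Path(root).rglob("*"):
            if not p.is_file():
                continue
            stat = p.stat()
            writer.writerow({
                "filename": p.name,
                "path": str(p.parent),
                "format": p.suffix.lstrip(".").upper(),
                "size_bytes": stat.st_size,
                "last_modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).date().isoformat(),
                "last_accessed": datetime.fromtimestamp(stat.st_atime, tz=timezone.utc).date().isoformat(),
            })
```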

Duplicates and Version Chaos#

The hardest part of a real audit is the duplicate problem. Most articles wave their hands at it. There are three categories, and each one gets handled differently.

Exact duplicates. Identical bytes, different paths. Detected with a file-hash pass (md5 or sha1 across the library). Easy to delete — keep one canonical copy in the most-recently-modified location, archive the others. A typical creative team’s drive has 15–30% exact duplicates by file count after a few years.
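The hash pass is a short script on a local drive. The sketch below uses SHA-1 over file contents, read in chunks so large video files stay out of memory; it is a minimal version, not a tuned deduplication tool.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-1 of the file contents, read in chunks so large videos don't load into memory."""
    h = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_exact_duplicates(root: str) -> dict[str, list[Path]]:
    """Group files by content hash; any group with more than one path is an exact-duplicate set."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            groups[file_digest(p)].append(p)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}

# Keep the most recently modified copy in each group as the canonical one, archive the rest:
# canonical = max(paths, key=lambda p: p.stat().st_mtime)
```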

Near-duplicates. Same image, different format or resolution. logo.png, logo@2x.png, logo-final.png, logo-FINAL-v2.png. The decision per cluster is which one is the master — usually the highest-resolution lossless source. Detected by filename pattern matching plus visual sampling on the borderline cases.
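The filename pattern matching can be roughed out with a regular expression that strips the usual -final / -v2 / @2x suffixes before grouping. The pattern below is an illustrative starting point, not a complete rule set; borderline clusters still need visual checking.

```python
import re
from collections import defaultdict
from pathlib import Path

# Strip the suffixes creative teams bolt onto filenames: -final, -FINAL-v2, @2x, _v3 ...
VARIANT_PATTERN = re.compile(r"([-_ ]?(final|draft|copy|v\d+)|@\dx)+$", re.IGNORECASE)

def cluster_by_stem(paths: list[Path]) -> dict[str, list[Path]]:
    """Group files whose names collapse to the same base stem; candidates for near-duplicates."""
    clusters: dict[str, list[Path]] = defaultdict(list)
    for p in paths:
        stem = VARIANT_PATTERN.sub("", p.stem).lower()
        clusters[stem].append(p)
    return {stem: group for stem, group in clusters.items() if len(group) > 1}
```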

Version chaos. Drafts, semi-finals, real-finals, post-launch edits. Hardest because the “right” version depends on usage context: the version on the website is not the version sent to print. The pattern that works: keep the version that shipped, archive the rest, accept that future versions of the asset go through a real version-control discipline rather than the filename ladder. Versioning conventions are the bit that turns this section’s mess into the next article’s solution.

The Decision Matrix: Keep / Update / Archive / Delete#

Every audited asset gets one of four decisions. The matrix removes the per-asset judgement call and replaces it with rules.

Decision | Triage criteria | Where it goes
Keep | In active use, current, well-named, licence in order. Last accessed within 12 months OR brand-restricted regardless of access. | Migrates as-is to the DAM. The canonical asset.
Update | In active use but needs work before migration: rename, re-export, add metadata, fix licence record, replace with a higher-resolution master. | Goes through a fix queue, then migrates. The fix queue is its own time budget — usually 10–20% of the keep count.
Archive | Not in active use but legally or historically valuable. Old campaign assets, brand-history references, anything tied to a contractual retention requirement. | Cold storage. Not the live DAM. A separate archive bucket on cheaper storage, indexed for retrieval but not surfaced in normal search.
Delete | Orphan, duplicate, obsolete. Last accessed > 18 months AND no inbound usage AND no brand value AND no licence retention requirement. | Goes away. Confirmed by the named owner of the parent folder, then deleted in batches with a 30-day soft-delete window before permanent removal.

The point of the matrix is to remove judgement from the per-row decision. A row that hits the delete criteria gets deleted. A row that’s borderline goes to update or archive, never “think about it later.” The audit only finishes if “think about it later” isn’t one of the options.
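The rules translate almost directly into code. The sketch below is one possible encoding of the matrix, not the definitive one; the row keys (last_accessed, used_where, needs_fix, retention_hold and so on) are assumed column names, and the order of checks mirrors the table: delete first, then keep, update, and archive as the default.

```python
from datetime import date

def classify(row: dict, today: date | None = None) -> str:
    """One possible encoding of the keep / update / archive / delete matrix.

    Assumed row keys: last_accessed (a date), used_where, brand_restricted,
    needs_fix, retention_hold; all optional except last_accessed.
    """
    today = today or date.today()
    months_idle = (today - row["last_accessed"]).days / 30
    in_use = bool(row.get("used_where"))

    # Delete: stale, unused, no brand value, no retention obligation.
    if (months_idle > 18 and not in_use
            and not row.get("brand_restricted")
            and not row.get("retention_hold")):
        return "delete"
    # Keep: brand-restricted regardless of access, or current and in order.
    if row.get("brand_restricted") or (in_use and months_idle <= 12 and not row.get("needs_fix")):
        return "keep"
    # Update: still in use but needs renaming, re-export, metadata or licence work first.
    if in_use and row.get("needs_fix"):
        return "update"
    # Everything else parks in the archive; "think about it later" is not a branch.
    return "archive"
```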

Special Cases (Legal Hold, Licences, Orphans)#

Four cases break the matrix and need their own row in the spreadsheet.

Legal hold. Assets that cannot be deleted due to ongoing litigation, regulatory retention rules, or industry-specific obligations (pharma, finance, healthcare). Flag and exclude from the matrix entirely — these always migrate or always archive, regardless of the other criteria.

Expired licences. Stock photos, fonts, or music whose use rights have lapsed. Need replacement, not relocation — archiving an expired-licence asset is still a violation if it ships from the archive. The right action is to remove the asset from any active distribution and tag it as “awaiting re-licence or replacement” in the audit.

Brand-restricted. Assets that should never have been on the public drive in the first place. Logos in their original Adobe Illustrator source, executive headshots from internal sessions, draft strategy documents. The audit catches these before the migration scatters them further. The action is to consolidate them into the brand-restricted area of the DAM with appropriate access controls.

Orphan files. No owner, no usage, no metadata. The archaeology layer of any creative library. Default action: archive for one quarter, delete if nobody claims them. The 90-day window is enough to catch the case where a folder gets accessed for the first time in years.
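These four cases can sit in front of the matrix as a pre-filter: if a row trips one of them, it gets the special handling and never reaches the four-way decision. The helper below is a sketch under assumed column names (legal_hold, licence_status, owner, in_public_drive), not a fixed schema.

```python
def special_case(row: dict) -> str | None:
    """Return a special-case tag that bypasses the matrix, or None if the matrix applies."""
    if row.get("legal_hold"):
        return "legal hold: migrate or archive, never delete"
    if row.get("licence_status") == "expired":
        return "expired licence: pull from distribution, await re-licence or replacement"
    if row.get("brand_restricted") and row.get("in_public_drive"):
        return "brand-restricted: consolidate into the controlled brand area"
    if not row.get("owner") and not row.get("used_where"):
        return "orphan: archive for 90 days, delete if unclaimed"
    return None
```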

What an Audit Produces#

The audit is finished when it has produced three artefacts. Define them at the start; they’re the deliverable that closes the project.

The spreadsheet. One row per asset with the inventory columns from §4 and the decision from §6. The full record. Lives somewhere persistent (Notion, Airtable, or a Google Sheet shared with the team) so it survives the migration as an audit trail.

The heat map. A visual summary of where assets live, how they distribute by type/age/decision. Useful for the stakeholder briefing that justifies the migration scope. A simple stacked-bar chart showing “5,200 keep / 1,800 update / 8,400 archive / 14,000 delete” converts the audit’s effort into a one-slide justification for everything that comes after.
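Those counts turn into the one-slide chart with a few lines of matplotlib. The numbers below are the illustrative ones from the paragraph above; in practice they come from the decision column of the inventory.

```python
import matplotlib.pyplot as plt

# Illustrative counts; in practice these come from the inventory's decision column.
decisions = {"keep": 5200, "update": 1800, "archive": 8400, "delete": 14000}

fig, ax = plt.subplots(figsize=(8, 1.8))
left = 0
for label, count in decisions.items():
    ax.barh(0, count, left=left, label=f"{label} ({count:,})")  # one stacked horizontal bar
    left += count
ax.set_yticks([])
ax.set_xlabel("assets")
ax.set_title("Audit decisions by asset count")
ax.legend(ncol=4, loc="upper center", bbox_to_anchor=(0.5, -0.4))
plt.tight_layout()
plt.savefig("audit_heatmap.png", dpi=150)
```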

The kill list. The explicit set of assets marked for deletion or deep archive, with the named approver who signed off on each batch. The kill list is what turns deletion from something agreed to in principle into something that actually happens. Without a named approver per batch, the deletion sits in the queue indefinitely while everyone waits for someone else to take responsibility.

Frequency: One-Shot vs Recurring#

Two patterns for ongoing audit work after the first one ships.

One-shot pre-migration. The most common case. Audit, decide, migrate, never audit again. The risk is backslide within a year as new chaos accumulates — without intake rules in the DAM (covered in §10), the team’s drive habits import themselves into the DAM and the audit problem returns in eighteen months under a new tool.

Recurring quarterly. Lighter audits scoped to “what’s been added since the last audit.” Less heroic, more maintainable. Pairs with intake rules so new assets land tagged and the quarterly audit is trivial — usually twenty minutes per team to confirm the auto-applied metadata is right and flag anomalies.
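A quarterly pass scoped to “what’s new since last time” is mostly a modified-since filter. The sketch below assumes a plain directory tree and uses modification time as the proxy for “added or touched since the last audit”.

```python
from datetime import datetime, timezone
from pathlib import Path

def added_since(root: str, last_audit: datetime) -> list[Path]:
    """List files touched since the previous audit; the scope of a quarterly pass."""
    cutoff = last_audit.timestamp()
    return [
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime > cutoff
    ]

# new_assets = added_since("/shared/creative", datetime(2024, 1, 15, tzinfo=timezone.utc))
```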

Most teams should plan for one-shot then quarterly thereafter, even if the quarterly is “twenty minutes per team.”

Audit → DAM Intake (Feed the Findings Forward)#

The audit is wasted if its findings do not become the DAM’s intake rules. Every category of mess the audit found should become a rule in the new system that prevents the same mess from accumulating again.

  • Duplicate logos. The audit found 12 versions of the company logo across the drive. The DAM intake enforces a single-master in the brand kit; future logo uploads either match an existing canonical or get flagged for brand-team review before landing.
  • Inconsistent naming. The audit found logo_final_v2_FINAL.psd and brand-mark_v3.psd as the same asset. The DAM intake form enforces a naming convention at upload; filenames that don’t match the pattern get rejected.
  • Missing metadata. The audit found half the brand-asset library had no creator, no licence, no expiry. The DAM makes those fields required at upload. No row in the future audit ever has them empty again.
  • Orphan files. The audit found 14,000 files with no owner. The DAM enforces ownership assignment at upload — every asset has a named owner and a workspace; orphan creation is structurally prevented.

The audit produces a list of pain points; the DAM intake is where each pain point becomes “can’t do that anymore.” The spreadsheet’s long-term value lives in the rules it generates, not the rows it lists. If the audit is part of a Drive migration, the rules ship with the new system on day one. If it’s a standalone discipline pass before the migration is even scoped, the rules are the brief for the DAM you eventually pick.
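As a rough sketch of what intake enforcement can look like, the helper below checks a filename against a hypothetical naming convention and rejects uploads with missing required metadata. The regex and field names are examples, not the convention a particular DAM will actually enforce.

```python
import re

REQUIRED_FIELDS = ("creator", "licence_status", "owner", "workspace")
# Hypothetical convention: project_asset-description_v01.ext, lowercase, no spaces.
NAMING_RULE = re.compile(r"^[a-z0-9]+_[a-z0-9-]+_v\d{2}\.[a-z0-9]+$")

def validate_intake(filename: str, metadata: dict) -> list[str]:
    """Return the reasons an upload should be rejected; an empty list means it can land."""
    problems = []
    if not NAMING_RULE.match(filename):
        problems.append("filename does not match the naming convention")
    for field in REQUIRED_FIELDS:
        if not metadata.get(field):
            problems.append(f"missing required field: {field}")
    return problems

# Example: validate_intake("summer24_hero-banner_v01.psd", {"creator": "JD"})
```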
