GVS is now part of Acquia.

Content inventory and content audit with Views

lisa's picture

Do you know the exact state of the content in your large Drupal site? Thinking of revamping, redesigning or upgrading your site? If you answered 'no!' to the first question or 'yes!' to the second question, it's time for a content audit.

In this blog post, I cover the what, why, who, when, where and how's of content audits. I've conducted a few small content audits, and I'm leading a much bigger one, on Drupal.org.

What the heck is a content audit?

The idea behind a content inventory is to determine what content you have and where it lives (the quantitative survey). The content audit is qualitative, where you assess whether it's any good or not, and what needs to change to improve it.

Traditionally, content inventories are compiled manually, one page of your site at a time, in some sort of spreadsheet. The content inventory should also include PDFs, images, videos, and utility pages such as checkout and log in pages. Content should be inventoried regardless of whether it's hosted on your site or externally. If it's seen or heard within your content, it needs to be held accountable.

Sites with 5000+ nodes could be audited using a sample of nodes that represents each content type, but what if you miss some glaring errors? In those instances, you could provide a small link that lets any site visitor flag the page as needing work (this can be done easily enough in Drupal, using a variety of approaches).
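
One way to draw such a sample, sketched in Python rather than Drupal code: group the nodes by content type and pick a fixed number at random from each group. The input shape, the function name sample_by_type and the per-type sample size are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def sample_by_type(nodes, per_type=5, seed=None):
    """Return up to `per_type` randomly chosen node IDs for each content type.

    `nodes` is an iterable of (nid, content_type) pairs, for example taken
    from an export of the site. This input shape is hypothetical.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for nid, content_type in nodes:
        by_type[content_type].append(nid)

    sample = {}
    for content_type, nids in by_type.items():
        # Small content types are taken whole; large ones are sampled.
        k = min(per_type, len(nids))
        sample[content_type] = sorted(rng.sample(nids, k))
    return sample


# Made-up data: 100 'story' nodes and 3 'page' nodes.
nodes = [(n, "story") for n in range(1, 101)] + [(n, "page") for n in range(101, 104)]
picked = sample_by_type(nodes, per_type=5, seed=1)
print(picked)
```

Sampling per type, rather than across the whole site, keeps rarely used content types from being drowned out by the big ones.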

Newsflash! Content audits aren't fun and exciting (unless you're a content geek like I am). They can be really boring.

Disc Inventory X for Mac audits the contents and space of your hard drive, displaying the contents as colored squares to represent each file type, and how much space those file types consume. Wouldn't it be great to get this sort of clarity on your site content?

Hard disc inventory depicted as colored squares

Why you should audit your content

Primarily, because it's impossible to be sure of the content quality in large sites, especially those with multiple contributors. When a site is audited, all sorts of oddities will be found, such as unfinished blog posts, unpublished nodes, outdated or inaccurate content, redundancies, content in the wrong place, or content that doesn't meet the organization's style standards/guidelines. (You do have style guidelines, right?) You might also find gaps in your content, e.g. discover that Product A, Product B and Product D were covered extensively, but Product C's listing doesn't include a target audience and Product E's description was never written.

In Content Strategy for the Web, Kristina Halvorson writes:

"If you don't know what content you have now, you can't make smart decisions about what needs to happen next."

If you want even more reasons why you should conduct a content inventory/audit, the books listed at the bottom of the post give reasons, and more in-depth advice.

Who should do a content audit?

A person who cares about the quality of your site's content, or a content strategist. Sometimes the person who cares becomes the content strategist.

It's tempting to ask an intern or a random person in the organization who has free time to do it, but then the job may never get done, or may be done haphazardly, especially if they think they know the content really well. As with QA and user testing, content auditing is often better done with fresh eyes: someone outside the organization.

When to do a content audit

Ideally, you'll complete the audit before:

  • Reorganizing your current site structure
  • Upgrading to a new version of Drupal or (oh noes!) migrating to a different CMS
  • Adding new content types
  • Making content decisions based on your site's SEO performance

If a site were migrating to Drupal, a content audit could be performed after the migration, but before key decisions about site architecture, IA/hierarchy, navigation/menus and publishing workflows are finalized.

Like any website project, the more organized and informed you are, the more likely your project will be a success.

Where to do a content audit

I don't mean, "Should I be in the 3rd floor conference room, or should I be at my desk?" I mean, where should you store the content audit's data?

The problem with constructing your content audit in a spreadsheet is that when the content audit is revisited 6 months later, it's going to be out of date. What a drag!

The standard Drupal administrative content overview page at admin/content/node/overview doesn't allow you to add more columns or sort by column heading; the overview page alone isn't sufficient.

Roll your own inventory with Views

Here's a recipe, using Views 2 in Drupal 6, to (mostly) painlessly create your content inventory.

Screenshot of a content inventory

The content inventory will be like a spreadsheet, with columns listing information for each of your nodes.

Preparation for the content audit

  1. Ensure Views 2 is installed and enabled
  2. Enable core Statistics module (optional)
  3. You may want an administrative theme enabled; the table view of the View is going to get wide, and you don't want to try to decipher squished columns. Otherwise, just ensure the blocks or other page elements don't display on the inventory page.
  4. You can use the Annotate module to store editorial or "what needs doing" notes about each node.
  5. You can use the Flag module to flag nodes as needing a particular type of work.

I'll cover annotations and flags in depth, in a future blog post about the Content Audit module, which will package a content audit View and other tools for quicker inventories.

Building the view

First, create a new View and call it something like 'node_content_inventory'.

Basic settings

Title: [Your site name] Node Content Inventory
Style: Table
Access: administer nodes (or perhaps you want to use your Admin role for access)

Long pages are desirable once you start re-sorting and filtering. If you only have a couple hundred nodes, you can skip the pager: set Use pager to 'No' and Items per page to Unlimited.

Otherwise, display 200 items or so per page and enable the full pager.

In the Table style options, ensure every field is marked as sortable. Unfortunately, you can't sort by path.

(If you're using Book content, you should include the depth too! But unfortunately, there isn't a default way to display the parent of each book node.)

Screenshot of view edit screen

Fields to display

Field                           Label
Node: Nid                       Nid
Node: Title                     Title
Node: Path                      Path
Node: Type                      Type
Node: Published                 Published
Node: Updated date              Updated date
Node: Post date                 Post date
User: Name                      Author
Node statistics: Total views    Total views
Book: Depth                     Depth

These will produce the site's basic content inventory. The above are just suggestions; feel free to add more if you want to evaluate different fields.
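
To make the columns concrete, here is a rough sketch of the kind of join this inventory represents. It uses Python's sqlite3 module with a cut-down copy of the Drupal 6 node, users, node_counter and url_alias tables. This is not the SQL that Views generates: the tables are reduced to the columns used here, Book depth is left out, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for Drupal 6 tables (only the columns used below).
CREATE TABLE node (nid INTEGER, title TEXT, type TEXT, status INTEGER,
                   created INTEGER, changed INTEGER, uid INTEGER);
CREATE TABLE users (uid INTEGER, name TEXT);
CREATE TABLE node_counter (nid INTEGER, totalcount INTEGER);
CREATE TABLE url_alias (src TEXT, dst TEXT);

-- Made-up sample content.
INSERT INTO node VALUES (1, 'About us', 'page', 1, 1262304000, 1293840000, 2);
INSERT INTO node VALUES (2, 'Draft post', 'blog', 0, 1280000000, 1280000000, 3);
INSERT INTO users VALUES (2, 'alice');
INSERT INTO users VALUES (3, 'bob');
INSERT INTO node_counter VALUES (1, 420);
INSERT INTO url_alias VALUES ('node/1', 'about');
""")

INVENTORY_SQL = """
SELECT n.nid,
       n.title,
       COALESCE(a.dst, 'node/' || n.nid) AS path,   -- alias if one exists
       n.type,
       n.status AS published,
       n.changed AS updated,
       n.created AS posted,
       u.name AS author,
       COALESCE(c.totalcount, 0) AS total_views     -- needs Statistics data
FROM node n
JOIN users u ON u.uid = n.uid
LEFT JOIN node_counter c ON c.nid = n.nid
LEFT JOIN url_alias a ON a.src = 'node/' || n.nid
ORDER BY n.changed DESC
"""

rows = conn.execute(INVENTORY_SQL).fetchall()
for row in rows:
    print(row)
```

Each row corresponds to one line of the table the View displays, with one value per column of the inventory.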

Create a Page display and give it a path such as 'node-content-inventory'.

Filters

Next, you'll want to filter out content that isn't relevant to your content inventory. Let's say you want to audit everything on your conference site apart from Room and Time slot nodes.

Create a Node: Type filter, only including the content you want to audit. Then expose the node Type filter. You may want to create and expose other filters, depending on your needs.
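
Outside Drupal, the effect of the Node: Type filter is just an include list. Here is a minimal Python sketch of the conference example; the machine names session, page, room and time_slot and the row format are assumptions.

```python
def filter_by_type(rows, included_types):
    """Keep only inventory rows whose content type is in `included_types`."""
    allowed = set(included_types)
    return [row for row in rows if row["type"] in allowed]


# Made-up inventory rows for a conference site.
inventory = [
    {"nid": 1, "title": "Keynote", "type": "session"},
    {"nid": 2, "title": "Room 101", "type": "room"},
    {"nid": 3, "title": "9:00-10:00", "type": "time_slot"},
    {"nid": 4, "title": "Venue info", "type": "page"},
]

# Audit everything apart from Room and Time slot nodes.
audited = filter_by_type(inventory, ["session", "page"])
print([row["title"] for row in audited])
```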

Views edit screen

A manually-constructed content audit would include which section of the site each page is in. So, if your site's sections are generated by taxonomy term, you should also include this taxonomy field in its own sortable column in the View.

Ideally, the node paths would be sortable with an exposed filter, but that will require writing a Views handler. It's on the To Do list!

In the header text, include the date and month the content will be reviewed, and update it each time you re-evaluate your content. (Every 6 months, right?)

Finally, save the view (actually, you should probably be saving as you're constructing it).

Evaluating the content

Look at your page at node-content-inventory. It's a giant list of all your nodes.

Now what? You have options.

  • Option A (Export): export this view to CSV and manage your content audit in spreadsheet software or Google Docs. The Views Bonus Pack module will let you export the view as a CSV file.
    Pro: Less data to store in your database
    Con: Your inventory is almost immediately outdated; will be extra time-consuming to update on a regular basis
  • Option B (Flags): 'Flag' content that needs to be updated using the Flag module. You can adjust the flag settings so that only certain roles can use this flag, and then set which content types can be flagged (or set it as global). You can then add the Flag to the content audit view, and/or create a separate view of nodes that have been flagged. However, marking something as "needs work" doesn't provide enough detail, so you'll want to record notes somehow (see Option C). You can create additional flags for each type of action to be taken, e.g. "needs style review" or "redundant", without the casual visitor even knowing, while keeping your content team in the loop.
  • Option C: Keep an 'audit notes' field on each content type in the Drupal site database, either with the Annotate module or by creating a field on the nodes.
  • Option D: Use the Content Audit feature module, which constructs the View, and will soon have configured Annotation (this hasn't been built yet).
  • Option E: Create another node type called Audit notes, and once filled in, then nodereference the node it covers. Then, it's easy to build a view of the Audit notes (I'm also wondering whether this could be a custom bit of code to automagically do this, then include in the Content Audit module).

Options C and D are great, because you're not reinventing the wheel every time you review your content inventory.

Content inventory for files and other non-node content

For files, such as images or PDFs attached to nodes, you might want to create a separate View and call it file_content_inventory. You could combine it with your primary node content audit, but the headings are different.

Other non-node content, such as landing pages created with Panels, Views or tpl.php files, and things like login pages, needs to be audited manually.

1, 2, 3... Audit!

Now the fun begins. Read existing content and check the following (and you may want to include extra instructions for your reviewers, if it isn't you doing the work).

  • Does it contain redundant info (is this content covered elsewhere? note down the node ID where it is covered)
  • Is it in the wrong place (does another page make more sense?)
  • Does the content need restructuring? Provide examples.
  • Is it inaccurate? Don't spend time researching the inaccuracies, but if you think the content is inaccurate, say so.
  • Is it useful? Does it add value? Or is it just wasting people's time? Just because it exists doesn't mean it should be kept around.
  • Does it need language (grammar, spelling etc) improvements?
  • Does the tone and style meet the guidelines? http://drupal.org/contribute/documentation/guide/style
  • Is it written for the right audience?
  • Is there any key information that's missing?
  • Does the URL path not make sense? As a top level or important page, does it need its own path?
  • Could the content be supported by an image or illustration? If yes, what?
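
However you store the results (a spreadsheet, Annotate, or a field on the node), it helps to record the same things for every node. Here is a sketch of one such record in Python. The class name, field names and issue labels are assumptions drawn from the checklist above, not the schema of any Drupal module.

```python
from dataclasses import dataclass, field
from typing import Optional

# One label per checklist question (hypothetical names).
ISSUES = {
    "redundant", "wrong_place", "needs_restructuring", "inaccurate",
    "not_useful", "language", "tone_style", "wrong_audience",
    "missing_info", "bad_path", "needs_image",
}


@dataclass
class AuditNote:
    nid: int
    issues: set = field(default_factory=set)
    notes: str = ""
    duplicate_of: Optional[int] = None  # nid where redundant content is covered

    def flag(self, issue, note=""):
        """Record an issue from the checklist, with an optional free-text note."""
        if issue not in ISSUES:
            raise ValueError(f"unknown issue: {issue}")
        self.issues.add(issue)
        if note:
            self.notes = (self.notes + "\n" + note).strip()

    @property
    def needs_work(self):
        return bool(self.issues)


note = AuditNote(nid=42)
note.flag("redundant", "Covered in node 17")
note.duplicate_of = 17
print(note)
```

Restricting the issues to a fixed set keeps reviewers' labels consistent, which makes the results easier to sort and count later.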

This process could take weeks or months, depending on many factors.

Summary

  • Content audits. They're important. Do them.
  • If you don't want to do the audit yourself, find someone who can. Hire an information architect or content strategist.
  • Take the time to improve your content based on the audit. People come to your site for the great content.

Content audit process: Start > Decide what needs auditing > Configure tools for auditing > Evaluate content > ??? > Profit!!

The '???' is where you (or your content strategist and writers) do the magic of identifying and fixing content problems. After that, your site will reap the benefits.

Now, I'm interested to hear whether anyone has done content audits using Views or other modules, and what your experiences were! Please drop some knowledge in the comments below.

Additional Reading

Content Strategy for the Web by Kristina Halvorson, particularly Chapters 4 and 12

The Elements of Content Strategy by Erin Kissane

Information Architecture for the World Wide Web by Peter Morville and Louis Rosenfeld

A Project Guide to UX Design by Russ Unger and Carolyn Chandler

Comments

I meant to link this must-read blog post from A List Apart, A Checklist for Content Work.

Wow, great write up and very useful. I can think of how this can come in handy for SEO work, for example.