In my role as Support Manager at Parse.ly, I spend a lot of time thinking about and talking about metadata. It’s a key component of our integration, as well as an area where content creators constantly seek improvement.
But what is metadata, exactly? Where do you start? The very words “metadata” and “taxonomy” can make you feel like the very model of a modern Major-General from Gilbert & Sullivan’s “The Pirates of Penzance”:
“I am the very model of a modern Major-General, I’ve information vegetable, animal, and mineral, I know the kings of England, and I quote the fights historical, from Marathon to Waterloo, in order categorical…”
The lyrics alone are enough to make your head spin. But all that information, in order categorical, is the very infrastructure of your content analytics. And like the modern Major-General, content analytics is very well-acquainted with matters mathematical.
What is metadata, anyway?
Metadata is literally data about data, or information about data. It can be a difficult concept to wrap your head around because there are seemingly unlimited options for what it can describe or define.
The most common types of metadata include:
- Descriptive: information that is identifiable and searchable (think subject, author, keywords)
- Structural: information on how the data is organized
- Administrative: information about the source of the data
- Reference: information about the contents and quality of the data
- Statistical: information about what produces the data
When it comes to content, though, descriptive metadata is the most useful information you have for understanding your audience. While other types of metadata are certainly important, descriptive metadata helps us answer questions like: What are we writing? How are we writing it? Who are we writing it for? Descriptive metadata provides a sharper lens through which to view your audience, one that can answer questions about your audience’s attention and uncover trends that you might otherwise miss.
Perhaps because metadata is largely hidden from view in HTML, we don’t associate it with the data that we use every day. Did you know your headline, author, and top-level navigation categories all qualify as descriptive metadata? Even the URL is descriptive, telling web crawlers, “Hey web crawler! This is a story! A single, unique story!” Descriptive metadata, especially in the form of a JSON-LD tag, is simply a way to organize all of this information in one place for the web crawler to read.
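To make the idea of “all of this information in one place” concrete, here is a minimal sketch that gathers a headline, URL, authors, section, and keywords into a single JSON-LD script tag. The property names come from the schema.org vocabulary; the function name, the choice of fields, and the sample values are assumptions for illustration, so check your analytics provider’s documentation for the fields it actually expects.

```python
import json


def build_json_ld(headline, url, authors, section, keywords):
    """Return a <script> tag holding descriptive metadata as JSON-LD.

    Hypothetical helper: the set of fields is an assumption, not a
    required schema for any particular crawler.
    """
    payload = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        "url": url,
        "author": [{"@type": "Person", "name": name} for name in authors],
        "articleSection": section,
        "keywords": keywords,
    }
    # Escape "</" so a stray "</script>" in a value cannot close the tag early.
    body = json.dumps(payload, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Made-up example values.
tag = build_json_ld(
    headline="A Well-Crafted Metadata Strategy",
    url="https://example.com/blog/metadata-strategy",
    authors=["John Smith"],
    section="Blog",
    keywords=["metadata", "taxonomy"],
)
print(tag)
```

In practice a CMS integration or tag manager usually generates this tag for you; the sketch only shows what ends up on the page.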
The ability to use descriptive metadata like this is already within our control, made easy by CMS integrations and tag managers. But because metadata as a concept can encompass so much, many content producers put off implementing their descriptive metadata, or forget about it entirely. In the 2014 New York Times Innovation Report, John O’Donovan, then CTO of The Financial Times, highlighted the importance of metadata:
“Everyone forgets about metadata. They think they can just make stuff and then forget about how it is organized [. . .] But all your assets are useless to you unless you have metadata — your archive is full of stuff that is of no value because you can’t find it and don’t know what it’s about.”
So how do you describe your content? How do you organize it? What do you want to learn about your content and the people engaging with it? The magic of metadata lies in unlocking the means to visualize and investigate these questions. The challenge content makers face is how to harness that magic through strategy and product applications such as CMS integrations, auto-tagging, etc. Let’s walk through the considerations behind a well-crafted metadata strategy.
For strategy: keep it simple
At this point, you might be curious or even excited about the seemingly infinite amount of information you can include. While it’s important to include information such as “headline,” “author,” and “section,” it’s easy to go overboard in fields that allow multiple values. Take topic tags, for example. Just because you can include up to 100 keywords per post doesn’t mean you should.
A taxonomy with thousands of keywords is messy and confusing. It’s not going to answer the all-important questions about your audience that will help you reach your goals. Like anything, it’s about striking a balance. On top of “What are you writing?”, “How are you writing it?”, and “Who are you writing it for?”, consider asking questions like “What topics do readers care the most about?” and “What is the sweet spot for word count?”
So how do you decide what to include and what to leave out? Keep it simple. Consider starting with three T’s: topic, type, and template. If you’re looking for some inspiration, check out our post, 4 Content Strategy Questions You Can Answer with Tagging. It’s full of great tips from publishers that keep their taxonomy simple and focused on the information they’re most interested in.
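One rough way to check whether a taxonomy is still simple is to count how often each topic tag is used and flag the ones that appear only once or twice. The sketch below assumes posts are available as dictionaries with a "tags" list; that structure, the function name, and the threshold are all hypothetical.

```python
from collections import Counter


def audit_tags(posts, min_uses=2):
    """Count tag usage and list tags used fewer than `min_uses` times."""
    counts = Counter(tag for post in posts for tag in post["tags"])
    rare = sorted(tag for tag, uses in counts.items() if uses < min_uses)
    return counts, rare


# Made-up sample posts.
posts = [
    {"headline": "A", "tags": ["elections", "politics"]},
    {"headline": "B", "tags": ["elections", "economy"]},
    {"headline": "C", "tags": ["politics", "election-night-2020-recap"]},
]

counts, rare = audit_tags(posts)
print(counts.most_common(2))
print(rare)
```

Tags that land in the rare list are candidates for merging into a broader topic or dropping altogether.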
For product integrations: keep it consistent
Once you’ve settled on a strategy, it’s time to implement it in your product applications. The technical aspect of metadata is important to keep everyone in your organization on the same page and maintain consistency. Inconsistencies as simple as a section with and without capital letters (“Blog” vs. “blog”), or an author name with and without a middle initial (“John Smith” vs. “John K. Smith”) can completely throw off a web crawler. This can create multiple entries in your audience data and make it difficult to get a true sense of how content is performing.
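As a minimal sketch of what “consistent” can mean in practice, the snippet below normalizes section names by case and whitespace, and maps known author-name variants to one canonical spelling. The alias table is hypothetical and has to be maintained by hand, since code cannot tell on its own whether “John Smith” and “John K. Smith” are the same person.

```python
import re

# Hypothetical alias table: lowercase variant -> canonical author name.
AUTHOR_ALIASES = {
    "john k. smith": "John Smith",
}


def normalize_section(section):
    """Collapse whitespace and lowercase so "Blog" and "blog" match."""
    return re.sub(r"\s+", " ", section).strip().lower()


def normalize_author(name):
    """Collapse whitespace, then map known variants to one spelling."""
    cleaned = re.sub(r"\s+", " ", name).strip()
    return AUTHOR_ALIASES.get(cleaned.lower(), cleaned)


print(normalize_section("Blog"), normalize_section(" blog "))
print(normalize_author("John K. Smith"), normalize_author("John  Smith"))
```

Running a step like this before metadata is published keeps one section or one author from showing up as several separate entries in your audience data.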
Be sure to use CMS integrations, auto-tagging, and any other helpful shortcuts. Almost every CMS has the capability to add metadata because, in addition to your audience insights, search engines and social media platforms use metadata to offer content previews. Some add it automatically, while others ask you to install a plug-in.
Every webpage has metadata. Search engines use this metadata to decide what searches the web page relates to. Social media sites and messaging platforms like Slack, Twitter, and Facebook use the metadata to show informative previews when the webpage is shared. Plus, your audience insights platform uses this metadata to organize and analyze your content in order to help you understand your audience better.
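To see the metadata a page already carries, you can read its meta tags directly. The sketch below uses Python’s standard-library HTML parser on a made-up page; the sample markup and the class name are assumptions, and the og: properties follow the Open Graph convention that many social previews rely on.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> tags into a dict keyed by `property` or `name`."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        key = attributes.get("property") or attributes.get("name")
        content = attributes.get("content")
        if key and content is not None:
            self.meta[key] = content


# Made-up page for illustration.
SAMPLE_HTML = """
<html><head>
<title>Example story</title>
<meta name="description" content="A short summary of the story.">
<meta property="og:title" content="Example story">
<meta property="og:type" content="article">
</head><body></body></html>
"""

collector = MetaTagCollector()
collector.feed(SAMPLE_HTML)
for key, value in collector.meta.items():
    print(f"{key}: {value}")
```

Pointing a collector like this at your own pages is a quick way to spot missing or inconsistent fields before a crawler does.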
You may not be the very model of a modern Major-General, but a well-crafted metadata strategy might make you feel like one! By sticking to a streamlined metadata strategy and making use of products at your disposal, you’re well on your way to obtaining actionable insights about your audience.
Key takeaways for a well-crafted strategy
- Focus on descriptive metadata
- Keep it simple: start with topic tags, then consider type tags and template tags
- Sync up your CMS or tag manager to keep your metadata consistent