This essay was originally published on January 14, 2021, with the email subject line CT No. 69: "Foundations for a content-driven data strategy," alongside a review of cookie management tool Iubenda.

Congratulations! You’re regularly publishing content. Whether you’re a one-person show launching your newsletter or running a 50-FTE-deep enterprise brand content marketing operation, you’ve gotten past the first hurdle: getting something out the door on a semi-regular basis.

Now you have to take care of your audience, ensuring that they’ll stay with you for your publishing endeavors. And the first big step in ensuring that they grow with you? Having a sustainable data collection strategy.

“But I don’t want to collect or sell data,” you say. “I hate when ads follow me around on the internet, and I wanted to create content for an audience, not for robots.”

Lloyd Dobler, played by John Cusack, says, "I don't want to sell anything, buy anything or process anything as a career."
In high school and college I felt the same way as Lloyd Dobler.

Darling, you’ve built your career on a computer. You are creating and distributing data already, so you should understand how to collect, use and protect data responsibly.

Most content folks don’t start their careers thinking about the data they create and collect when they are producing and distributing content. For some reason we’ve built a world where numbers and words are considered wholly separate endeavors for two different types of people, and “data” connotes numbers.

Real talk: All the content we make with our screens and our keyboards becomes data, and interacting with an audience requires data management.

If you’re using a content management system, you’re creating structured data.

If you’re gathering email addresses in a newsletter, you’re collecting audience data.

Whether you think about it intentionally or willfully ignore it, data management is a part of your life as a content professional.

Legacy media companies have audience development departments (at least I hope they still do), filled with the information management pros who topped mainframes off with subscriber deets decades before databases were cool. (They’re still explaining their jobs to their families at the holidays, I assure you.)

Most brands and agencies leave data management and security to operations or have a separate analytics department that often doesn’t collaborate or integrate with content or publishing departments. Data given to marketers is rarely content-focused, or it’s so reductive that it doesn’t help make nuanced decisions (and that’s probably why so many editors still rely on pageviews, the worst metric ever).

Even now, 20 years after the rise of search engines, PPC, email newsletters, social channels and other tools, very few businesses devote resources to honing their digital first-party data collection strategy and approach.

And almost no one — except the most forward-thinking digital businesses — considers audience data collection as a facet of content strategy and website structure.

Data from Star Trek pats his head and rubs his stomach at the same time. [gif]
You can be an editor or content manager and understand audience data at the same time.

Why do you need audience data?

You need data about your audience so you can connect with existing subscribers and attract new ones. You need to be able to reach them on their preferred media, whether that’s via email or web or tv or social channels.

You also need data to maintain your audience’s interest and connection with your content. Unless you’re one of the lucky folks who can attract followers with a snap of your fingers or a bluster of opinions, you need to know what content your audience will read in the future. Good data significantly improves your audience's experience with your brand.

Most importantly, unique audience data is one of the most monetizable assets in a content operation. Audience data seduces advertisers, investors, your sales team, and even your audience itself. (Communities love to know what they have in common.)

Content marketers who have a hard time proving their value to the higher-ups can point to the data editorial gathers:

  • What do you know about your customers that you wouldn’t know without your content marketing?
  • How large can that dataset be?
  • How can you expand that data?
  • Will that data eventually lead to higher revenue?

Finally, you need your own (first-party) data because it is extraordinarily likely that in the future — maybe in two years, maybe in ten — the data we use for third-party digital advertising today will be significantly limited if not cut off entirely. Third-party cookies that connect personal audience data across websites are already going the way of the dodo (and, frankly, that data’s lost much of its effectiveness now that the novelty’s worn off).

Collecting your own data with the help of content ensures that you can reach your audiences, clients, and customers for years to come.

Before we get too far, let’s talk privacy

It’s a hard line to walk: your audience is entitled to their privacy. They’re entitled to you not spamming them or selling their information directly to data brokers. They’re entitled to having their data treated seriously and securely.

In The Office, Darryl and Andy close the door on Erin as she tries to peek through. [gif]
In this essay I'm using gifs from popular tv episodes I've never actually seen.

If you’re a media or content marketing company, or even a newsletter, promising “We will never sell your data” isn’t particularly realistic phrasing. If you run any programmatic advertising or use Facebook/Google/remarketing/floodlight pixels on your website, or you plan to sell your company one day, or even if you merge lists with three other newsletter authors in a new collective, you are technically selling your audience’s data.

Promising to “never sell data” and then doing any of the above… well, I’m not a lawyer, but that sounds like fraud. We could all use a little less fraud in our lives. So let’s be as responsible and honest as possible in the often-opaque world of digital content monetization.

As creators/publishers/editors/marketers serving our audience, however, we are also running businesses and are entitled to grow revenue. That means data is bought and sold: through advertising, by publishing research, by connecting subscribers with each other, or through some other creative usage of data that no one’s thought of yet.

If you’ve worked with a company that has prepped for GDPR or CCPA compliance, or any healthcare company that has to comply with HIPAA or any other privacy-focused legislation, you’ve probably been through some review of data management processes or privacy training (and you probably slept through it because let’s face it: data management training ain’t sexy). I’m aiming to make the topic a little simpler here, and hoping to empower you to collect good data with your content, instead of just relying on big tech pixels to do it for you.

What data do you need?

Writing for an audience means, at some level, you need to know who your audience is. You can continue marching to your own beat and say, “Our audience is everyone who likes our content,” but that’s not a particularly sustainable or monetizable business model.

Before you embark on your content-driven data acquisition strategy, you should define what you want to know about your audience first. It’s like setting a hypothesis in an experiment: if you know what you want ahead of time, you won’t be overbroad in your data collection and you’ll be able to package your results far more tidily.
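To make the "hypothesis first" idea concrete, here is a minimal Python sketch. It assumes you write down the questions you want answered, list the fields each question needs, and throw away anything else before it is stored. Every name in it (COLLECTION_PLAN, trim_to_plan, the field names) is hypothetical and for illustration only; it is not tied to any particular analytics tool.

```python
# Hypothetical collection plan: each question we want answered maps to the
# fields needed to answer it. All names are illustrative assumptions.
COLLECTION_PLAN = {
    "preferred_content_type": {"content_type"},
    "how_they_found_us": {"referrer_category"},
    "prior_knowledge": {"self_reported_level"},
}

# The only fields we ever store are the ones some question actually needs.
ALLOWED_FIELDS = set().union(*COLLECTION_PLAN.values())


def trim_to_plan(raw_event: dict) -> dict:
    """Keep only the fields the plan calls for; drop everything else."""
    return {k: v for k, v in raw_event.items() if k in ALLOWED_FIELDS}


raw = {
    "content_type": "video",
    "referrer_category": "newsletter",
    "email": "reader@example.com",  # not in the plan, so it is dropped
    "birth_year": 1985,             # demographic, not in the plan
}
trimmed = trim_to_plan(raw)
print(trimmed)
```

The design choice here is an allowlist rather than a blocklist: a new field is collected only after someone adds a question that justifies it, which keeps collection from quietly becoming overbroad.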

Common audience data that I’ve found to be the most helpful in defining content strategies for clients:

  • Preferred content types (video, text, interactive, social)
  • Preferred information architectures (where do you summarize vs. where do you go into detail)
  • How they use your product, if in content marketing
  • Triggers to conversion action (not just of your own product but the products of your advertisers)
  • Biggest points of confusion or knowledge gaps (aka pain points)
  • Interest in exploring new topics/trends vs. clearly defined pillars
  • Current customer/reader needs versus prospective customer/reader needs
  • Prior knowledge of a topic (beginner, intermediate, expert)
  • Objectives/jobs to be done (i.e., how did they get to your content in the first place and where is that along their consumer journey?)
  • Behavioral patterns like level of engagement, reading frequency and user path patterns
  • Other related content/brands they like and enjoy
  • How they originally found your content
  • Job seniority or title

You’ll notice that demographic data isn’t on here. In my experience, demographic data is far more misleading than helpful, especially when content creators get an idea stuck in their head of who their target audience is and aren’t able to move beyond that vision.

Jon Stewart flails his arms in the air, saying "Stereotype alert!" [gif]
Audience stereotyping happens far more than we want to admit.

Demographic data can be helpful in showing stakeholders that their audience is more diverse than they originally believed. Demo data is also helpful if minors comprise your audience because, well, there are rules about marketing to folks under 18 and under 13. But otherwise, it’s more often used poorly and drives poor results.

Psychographic and behavioral data, however, is far more telling of how your audience interacts with and cares about your content. Knowledge patterns, gaps and objectives can define exactly how you build your process of creating, optimizing and distributing content. All the above data can be gleaned with the help of a solid content analytics platform and trained eyes — no personally identifiable information (PII) or third-party aggregate data platforms needed.
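As a rough illustration of behavioral data without PII, the Python sketch below rolls up a handful of reading events into per-reader engagement, reading frequency and preferred content type. It assumes each reader is represented by a random first-party ID rather than a name or email address. The ReadEvent structure and the summarize function are hypothetical, not the output format of any real analytics platform.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadEvent:
    reader_id: str        # assumed: a random first-party ID, never an email or name
    content_type: str     # e.g. "text", "video", "interactive"
    seconds_engaged: int  # time actively spent with the piece


def summarize(events):
    """Per reader: number of reads, total engaged seconds, most-read content type."""
    by_reader = defaultdict(list)
    for event in events:
        by_reader[event.reader_id].append(event)

    summary = {}
    for reader_id, reader_events in by_reader.items():
        types = Counter(e.content_type for e in reader_events)
        summary[reader_id] = {
            "reads": len(reader_events),
            "seconds_engaged": sum(e.seconds_engaged for e in reader_events),
            "top_content_type": types.most_common(1)[0][0],
        }
    return summary


events = [
    ReadEvent("r-01", "text", 240),
    ReadEvent("r-01", "video", 60),
    ReadEvent("r-01", "text", 180),
    ReadEvent("r-02", "video", 300),
]
profiles = summarize(events)
print(profiles["r-01"])
```

Nothing in the summary identifies a person, yet it already answers two items from the list above: preferred content type and level of engagement.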

Every item in the list above concerns multifaceted aspects of audience data, and it will take time to collect, especially if you’re also in charge of creating and distributing content. All good data takes time to collect, so set up your content operation now to make sense of its data before third-party data becomes more limited.
