CT No.194: Free will v. automated content recommendations

This week hit me like a Hummer striking a cougar on the interstate. But the resilience I've practiced is coming in handy, and, unlike the ill-fated adventuring mountain lion, I'm still moving.

This newsletter supports the striking staff of the Washington Post and is frustrated in solidarity with layoffs in the past few weeks that reflect the sad state of creative labor and knowledge work in 2023.

Also, a correction: I misspelled Shane MacGowan's name in the last issue and regret the error. I've been listening to The Pogues as penance.

–DC

In this issue:

The first of a two-part essay that has been in the hopper for a while
The return of links! No, nothing to do with OpenAI or the "existential harms" of artificial generative intelligence — surely you can find that elsewhere — just a few good reads
Did You Read? Revisiting one of my favorite sci-fi novels

Sponsors and promotions

Learn sustainable, algorithm-proof content strategy

Join our 2024 cohort to learn methodologies and measurement that improve digital experiences and boost your business


When science isn't: The over-STEMification of digital content experiences

by Deborah Carver

If I take a hammer to my laptop to see if it survives a few taps, does that make me a physicist?

My friend works with ghost hearts.

A pig's heart, washed translucent with detergent, is injected with a patient's stem cells, enabling the ghost heart to be transplanted into the patient without their body rejecting the foreign organ, at least theoretically. A brief Google search tells me the technique hasn't worked in practice yet, but the fact that we're even at the point where we can bioengineer hearts blows my mind. Science like this makes my work feel minuscule. As far as I'm concerned, ghost hearts, if they work, are nothing short of a miracle.

"Yeah, but I wish I knew more women scientists," my friend told me over beers. "So I can talk about my work without someone hitting on me or worrying that I'm hitting on them." She was new in town, and the struggle to make friends as a working adult is real.

"I work in science…kind of," I said, with beer-backed confidence. After all, I'd recently completed my certification in an A/B testing tool and was running experiments to see which landing page variation would perform better. It wasn't saving lives, sure, but it was empirical, right? Technically, when push comes to shove, I'm a woman in STEM.

She stared at me point-blank and deadpanned. "You work in math."

I took another sip and nodded humbly. She was right.

The other, more questionable scientific method

Although I was now "certified" to run "content experiments," and had to answer questions about "hypotheses" and "variables" on the company's certification test, the whole kit and caboodle of certification and experimentation was another complex marketing technique concocted to sell software.

That conversation was the first time it registered that "computer science," once you get past the hardware and exit the university physics department, doesn't much resemble science, at least the kind that directly relates to life on Earth. "Data science" and the 12-week professional certificate programs that supplement its popularity are even less scientific.

From my perspective, learning to code is more like learning another language. Or attending logic class (historically stationed in the philosophy department). Teasing out the organic structures of proteins in a laboratory environment is something else entirely. Understanding predictive modeling can help you play moneyball temporarily, but it doesn’t scale as anticipated. And the scientific method — the process of isolating a variable in a controlled setting to prove or disprove a hypothesis — completely falls apart when applied to the chaos of the internet.

But it’s called science because, well, STEM gets funded. Humanities do not. I can attest that when I shifted my career from pure creative production to the discipline known as digital strategy, I made significantly more money and had far more job security.

What are data science and predictive analytics?

Because I work in digital strategy, intelligent people on the creative production side occasionally imply that my work is somewhat scientific. When I collaborate with editorial professionals, some say, "You're the science and I'm the art!" I always gently correct them: We are neither. We are in business.

Technically, my work in digital research relates to the practice of data science, the mostly 21st-century discipline of using statistics to understand human behavior via unstructured data. "Data science" is the university-enabled brand for what's more accurately known as "predictive analytics," or the idea that we can calculate ourselves into knowing the future. You can read the Wikipedia entry, but the gist behind common applications of data science is that if you look at enough data on how people have behaved in the past, you can predict how most people will behave in most situations.

Predictive analytics rests on the assumption that humans will act predictably in every situation, whether moved by logic or emotion. If we can simply measure all the behaviors, data science says, we can statistically determine what humans will do next. It maths aside the concept of free will, resembles the Calvinist concept of predestination, and deeply contradicts my own lived experience, which is that whether in business or in life, people rarely act as one expects.*

The assumptions of data science work most convincingly in homogeneous situations and populations. Data science can predict what will likely work for most people in the population of “everybody online,” but not for every individual. From personalization to autocorrect to recommender systems, most applications calculate, quite literally, the lowest common denominator of online behavior.
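To make the "lowest common denominator" point concrete, here is a minimal sketch in Python. The audience, the content formats, and the helper names are all invented for illustration. It predicts the single most common past choice for everyone, then checks how that prediction fares overall and for the people who did not follow the majority.

```python
from collections import Counter

# Hypothetical click history: the content format each past visitor chose.
# The numbers are made up; 80 of 100 visitors picked "listicle".
history = ["listicle"] * 80 + ["longread"] * 12 + ["podcast"] * 8

def most_common_choice(choices):
    """Return the single most frequent choice in the history."""
    return Counter(choices).most_common(1)[0][0]

def hit_rate(prediction, choices):
    """Share of people for whom a one-size-fits-all prediction was right."""
    return sum(choice == prediction for choice in choices) / len(choices)

prediction = most_common_choice(history)

# Accuracy across the whole audience.
overall = hit_rate(prediction, history)

# Accuracy for everyone who did not behave like the majority.
minority = [choice for choice in history if choice != prediction]
minority_rate = hit_rate(prediction, minority)

print(prediction, overall, minority_rate)
```

In this toy audience the majority guess looks respectable on average and is wrong for every person outside the majority, which is the trade-off the paragraph above describes.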

The assumptions of data science — that most people who do one thing are likely to do another — replicate what’s called the Hypodermic Needle theory in communication studies.

Similar to the concept of technological determinism, the Hypodermic Needle theory assumes that audiences have no agency or critical thinking, that if they see a message on television, they will believe it. To some extent, it's true. We know that advertising sells products when you blast an audience with repeated messaging. We know that people who only consume one source of media tend to parrot that media’s point of view.

But as an anti-authority contrarian who rarely sees her perspective reflected in mainstream media, I have never given the Hypodermic Needle much weight, especially after I learned the intricacies of polling, quant/qual research methods, and communications theories in graduate school. (Bring on the Raymond Williams!)

I'm more of a Uses and Gratifications girl: people use media in a variety of ways to satisfy different needs. We do not all operate on the same wiring. Accommodating diversity of thought and behavioral preferences is not only ethical but also necessary in 21st-century business.

*It's also helpful to understand that predictive analytics was popularized in mass culture by the midcentury science fiction writer Isaac Asimov. Asimov’s fans include many tech industry bigwigs who, having learned that Jules Verne predicted submarines, saw an economic opportunity and placed business bets on another sci-fi writer. I am being facetious and glib here, but while Asimov may have been a biochemistry professor, it’s important to remember that he wrote fiction. The Foundation series has as much to do with the material circumstances of the real world as The Devil Wears Prada and The Pelican Brief, two best-sellers also based on professional experience.

When 20th-century methodologies fall apart

Inconsistencies like cultural nuances, value differences, and uneven technological adoption make predictive analytics far less effective. Accurately sampling diverse populations remains extremely difficult. Most existing media measurement systems account only for demographic variations, not behavioral differences. The shift from demographic to behavioral profiling has been a challenge for advertising, data science, and mass media publishing alike.

For example, electoral polling was considered accurate in the 20th century when everyone had a phone line, watched the same three major network TV stations, and read the same local paper. Information production was codified in the news publishing industry’s relatively homogeneous adoption of journalism ethics. Niche media might have influenced a few people, but most information about political candidates came from the same places and could be accessed the same way. Polling scientists could make reasonable assumptions about people based only on demographic and class differences, and technological adoption didn't matter.

With the introduction of cable networks, then cell phones, and finally social media, people bypassed landlines entirely and had a reasonable alternative to watching TV. While some shifts in media consumption correlated with demographic and class categories, the fragmentation made the dataset much less predictable.

There could be many reasons for this change. Pollsters couldn't access all populations via landline. The general population's willingness to participate in polling was marred by the feeling that posting online had the same impact as answering a survey call. And pollsters considered race, class, age, and gender to be the only indicators of diversity. The homogeneity of 20th-century institutions never accounted for the diversity of behavior introduced by multiple media sources and rapid, unprecedented technological adoption.

That’s why polling in the 2016 U.S. election was so far off from the actual election results. People with landlines or listed phone numbers were over-indexed in the samples. Media sources that knew polling used to be accurate were convinced people would behave the same way in 2016 as they had in 1996 or even 2008. Online audiences were represented, but their behaviors were misunderstood. Not only had the demographics shifted, but device behavior and media consumption had changed so radically that the old sampling methods rooted in media stability were no longer reliable.
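The over-indexing problem can be sketched with a few lines of arithmetic. This is an illustration under invented assumptions, not a reconstruction of any real poll: the two groups, their sizes, and their candidate preferences are hypothetical, as are the names in the code.

```python
# Hypothetical electorate, split by how reachable people are by phone.
# Every number here is invented for illustration, not real polling data.
population = {
    "landline":    {"share": 0.40, "support_a": 0.60},
    "mobile_only": {"share": 0.60, "support_a": 0.40},
}

def support_for_a(group_shares):
    """Weighted support for candidate A, given each group's share of a sample."""
    return sum(
        group_shares[name] * group["support_a"]
        for name, group in population.items()
    )

# What the whole electorate thinks, using each group's true share.
true_support = support_for_a(
    {name: group["share"] for name, group in population.items()}
)

# A sample that over-indexes landline households (80% instead of 40%).
skewed_sample = {"landline": 0.80, "mobile_only": 0.20}
polled_support = support_for_a(skewed_sample)

print(f"actual: {true_support:.2f}, polled: {polled_support:.2f}")
```

With these made-up figures the skewed sample shows candidate A comfortably ahead while the full electorate has A just under half. Nothing about anyone's opinion changed, only who was reachable.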

**That’s also why I’m a big fan of basing content and media buying strategies on online behavioral patterns and contextual advertising, instead of simple demographic targeting, which is usually stereotypical, if not flat-out racist.

Is it data science or is it software marketing?

Despite many instances of misapplication of data science across culture, the discipline’s positivist believers still shout from the rooftops of progress and profits. For the past decade, I've watched digital strategists declare the winners of a single marketing A/B test as if they've discovered the law of gravity. Even when the data show that 99.5% of people rejected both the A and B variations.
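Here is a rough sketch of how that can happen, using a standard two-proportion z-test written with only the Python standard library. The visitor and conversion counts are hypothetical, chosen so that 99.5% of visitors convert on neither variation, and the function name is my own.

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test. Returns (z, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the assumption that A and B do not differ.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical test: 20,000 visitors saw each landing page variation.
visitors = 20_000
conversions_a, conversions_b = 80, 120  # 0.4% versus 0.6%

z, p = two_proportion_z(conversions_a, visitors, conversions_b, visitors)

# Share of all visitors who converted on neither variation.
rejected_both = 1 - (conversions_a + conversions_b) / (2 * visitors)

print(f"z = {z:.2f}, p = {p:.4f}, rejected both = {rejected_both:.1%}")
```

With these invented counts, variation B clears the conventional p < 0.05 bar and gets declared the winner, even though the test says nothing about why nearly everyone ignored both pages.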

Many prominent SEO experts announce results from “empirical” tests on Google organic search algorithms as if they are discovering a life-saving synthetic protein. In reality, the search engine is just a Google product that is adjusted, often, by Google, whenever Google wants.

In the world of AI testing and critique, software developers and internet journalists purposely attempt to break the product and call it “science.” While I understand the motivations — ChatGPT has many flaws — I hesitate to call any software testing scientific. If I take a hammer to my laptop to see if it survives a few taps, does that make me a physicist?

And if the software changes all the time because of “agile” and “minimum viable product” business methodologies or because of “machine learning,” is the scientific method even remotely relevant?

Keep it small, scientists: When statistical modeling works for content recommendation

It's not that the entire practice of data science is snake oil — although I'd argue that we need to question the label of "science" along with the aggressive university marketing budgets and certificate programs that have powered its rise. Statistical modeling still has its place. Machine learning can find patterns and trends that humans never would.

But when it comes to digital strategy today, “informed guessing to affect business results” or "editorial judgment based on various inputs" is more accurate than “data science.”

In small populations with similar cultural affinities (B2B software buyers, movie fans, etc.), data science can be extremely accurate in determining which content or idea might resonate the most. If you have a critical mass of users who share similar behaviors, you can determine which movie they might like best (Netflix accurately being able to predict your movie taste on a scale of 1-100) or what content they might need to select an SEO software vendor (why account-based marketing resonates with B2B buyers).
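As a toy illustration of why a small group with overlapping tastes is friendly territory, here is a minimal user-based collaborative filtering sketch. It is not how Netflix or any vendor actually does it. The people, titles, and ratings are hypothetical, and the similarity measure (plain cosine similarity over shared titles) is just one simple choice among many.

```python
import math

# Hypothetical 1-5 ratings from a small group of movie fans.
ratings = {
    "ana":   {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben":   {"film_a": 5, "film_b": 5, "film_c": 1, "film_d": 5},
    "chloe": {"film_a": 1, "film_b": 2, "film_c": 5, "film_d": 1},
}

def cosine(a, b):
    """Cosine similarity over the titles both people rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(b[t] ** 2 for t in shared))
    return dot / (norm_a * norm_b)

def predict(user, title):
    """Similarity-weighted average of other people's ratings for a title."""
    weighted, total = 0.0, 0.0
    for other, theirs in ratings.items():
        if other == user or title not in theirs:
            continue
        sim = cosine(ratings[user], theirs)
        weighted += sim * theirs[title]
        total += sim
    return weighted / total if total else None

# Ana has not rated film_d; estimate her rating from the others.
estimate = predict("ana", "film_d")
print(round(estimate, 2))
```

Ana's estimate lands closer to Ben's rating than to Chloe's because her past ratings resemble his. Raw cosine similarity on all-positive ratings tends to overstate how alike people are, so the pull toward Ben is milder than intuition suggests. Production systems usually adjust for each person's average rating first.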

As I’ve written before, content personalization works when businesses invest in a wide variety of content designed to accommodate the diversity of the human experience and consumer preferences. But results evaporate when we boil preferences down to a few limited types (say, OCEAN personality traits or Berkeley, Burlington, and Cambridge) or optimize toward mass adoption (Netflix scaling down from 5-star ratings to thumbs up/thumbs down to encourage more users to rate).

When predictive analytics that originate in online behaviors are extrapolated to apply to much larger populations, and the people who are telling stories about the data science understand neither the mechanics of the theory nor the finer points of what they are dealing with, the applications get, well, fuckity. Enshittified. Frustratingly inaccurate for those of us who enjoyed the nuanced user experience that niche datasets accommodated.

In next week’s exciting conclusion, we’ll discuss some better ways to use data to inform digital content strategy.

Content tech links of the week

On February 27, I'm speaking at the virtual TBD Conference. Join us for big ideas and future-forward thinking!
The other bad "science" contemporary tech overuses: Behavioral economics. This NYT op-ed critiques the "nudge," a shaky but commercially palatable idea that's backed many content projects I've worked on and even more digital practices I absolutely despise (incessant push notifications! the idea that everything needs to be a "habit"! health "advice" from my insurance company without consent!). Like predictive analytics, behavioral economics creates the illusion that we can manipulate human actions through automation, but it's not that simple.
The end of the authentic moment: Aboard agency with some big thinking on why "authenticity" almost never is, at least when it's advertising.
Our friends at Storythings ran a remarkable series about how content discovery is changing, and we absolutely love the final chapter: Build your own archive.

The Content Technologist is a company based in Minneapolis, Los Angeles, and around the world. We publish weekly and monthly newsletters.

Publisher: Deborah Carver
Managing editor: Wyatt Coday


Affiliate referrals: Ghost publishing system | Bonsai contract/invoicing | The Sample newsletter exchange referral | Writer AI Writing Assistant

Did you read? is the assorted content at the very bottom of the email. Cultural recommendations, off-kilter thoughts, and quotes from foundational works of media theory we first read in college—all fair game for this section.

I'm game for science fiction, but I prefer to read the more cynical kind. Y'know, the tomes that lean heavily on the fi (it bears repeating: all sci-fi is fiction). This month I decided to reread Kurt Vonnegut's Breakfast of Champions, a book that deftly links mass publishing of pornography to sci-fi's pervasiveness in the minds of midcentury American men. While some portions are out of date (all-night porno theaters are certainly a thing of the past), much of this 50-year-old novel aligns with today's progressive sensibilities. Take this paragraph from the novel's opening history of America:

"Actually, the sea pirates who had the most to do with the creation of the new government owned human slaves. They used human beings for machinery, and, even after slavery was eliminated, because it was so embarrassing, they and their descendants continued to think of ordinary human beings as machines."

It's glib, of course (it's Vonnegut). But also, I can't read the phrase "artificial general intelligence" without thinking about it.
