
Why your mailer sounds like a 2008 attack ad (and how to know)

Political language is a solved corpus. Training language models on decades of speeches, ads, and mail lets Civitas tell a first-time candidate — before they print 40,000 pieces — exactly how their message will land.

ConstituencyData Research · 10 min read
Tags: LLMs · messaging · rhetoric · methodology

A first-time candidate — we’ll call her Maria — is running for a state assembly seat. Her consultant drafts a mailer. Maria reads it, thinks it sounds sharp, signs off. The print run is 42,000 pieces. Three days after the mailer hits, her opponent’s surrogate posts it on Facebook with the caption: “This is who we are dealing with.” Within forty-eight hours, Maria is losing three news cycles defending language she didn’t realize she’d used.

The words in the mailer weren’t lies. They weren’t even particularly aggressive. The problem was tonal: the mailer used a phrase structure that pattern-matched, in the collective political memory of her district, to a notorious attack ad from 2008. Nobody on Maria’s team remembered that ad. The voters didn’t consciously remember it either. But the resonance was there, it fired, and the mailer landed wrong.

This is the failure mode our message stress test is built to catch.

The premise: political language is a solved corpus

Almost nothing about American political rhetoric is new. The structures, frames, appeals, and evasions that work — and the ones that detonate — have been performed in public for two centuries. Stump speeches, campaign ads, debate soundbites, mailers, press releases, attack ads, concession speeches, inaugural addresses. All of it sits in the public record.

Crucially, a huge fraction of it sits in the public record with outcomes attached. We know which ads moved numbers. We know which phrases appeared in the 72 hours before a candidate’s polling collapsed. We know which speech frames correlate with turnout lifts and which correlate with backlash.

What we didn’t have, until recently, was a way to process that corpus at scale. Large language models changed that. Training or fine-tuning a model on political rhetoric labeled with outcome data produces a system that can read a draft mailer and tell you — with evidence — that it echoes the tonal structure of a message that blew up in 2008.

What we study (and why it includes propaganda)

People sometimes flinch when we say we study propaganda. We understand why. The word has a specific weight, and nobody wants to be associated with it.

We study it because it works. Propaganda is the applied science of moving a population’s opinion at scale, and decades of political-communication scholarship have catalogued its techniques with enormous rigor — from Lasswell’s 1927 work on WWI messaging to contemporary studies of state-media framing in multiple countries. The techniques aren’t secret. They’re documented.

Understanding them matters for two reasons. First, detection: a candidate needs to know when their own draft has accidentally drifted into techniques that will be read as manipulative, because that read is lethal in modern media. Second, resistance: when an opponent runs a classic manipulation frame — fear appeal, scapegoat pivot, false-equivalence gambit — the candidate’s response needs to neutralize the technique specifically, not just contradict the content.

The Civitas model knows all the classical techniques. It flags them in your drafts. It flags them in opponents’ drafts. It tells you what’s happening. What you do with that information is up to you.

How the stress test actually runs

A campaign pastes in an asset — mailer, email, script, social post, op-ed, press release, whatever. The model runs it through several passes:

Tonal fit. Does the language register as credible, condescending, aggressive, defensive, stilted, or organic? Against what baseline? Against the tonal pattern of messages that have historically performed well in this specific constituency.

Historical resonance. Does this phrase structure pattern-match to a message with a known outcome? Both positive (“this echoes a turnout-lifting frame used in 2018 by comparable campaigns”) and negative (“this phrase structure appeared in an attack ad that collapsed in 2014”).

Rhetorical-technique flags. Which classical persuasion techniques are in use — appeal to authority, scapegoat, in-group/out-group, false dilemma, etc.? Most uses are fine. Some are tonally dangerous for this candidate’s brand.

Constituency fit. How does the language register against the actual linguistic patterns of the district? A phrase that works in urban Northern California reads as cold in rural Nebraska, and vice versa. We have decades of local political speech data for most regions.

Legal and compliance flags. Any language that creates legal exposure (unapproved claims about an opponent, FEC disclaimer issues, state-specific election-code exposures) gets flagged — not as legal advice, but as a “stop and check this with a lawyer” pointer.

The output is a single score card: letter grade per dimension, flagged phrases inline, and generated variants showing what the piece would look like with the riskiest phrases rewritten.
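The shape of that pipeline can be sketched in a few lines. Everything below is illustrative — the class names, the toy siege-marker heuristic, and the single `tonal_fit` pass stand in for the actual model, which runs all five passes described above:

```python
from dataclasses import dataclass, field

@dataclass
class Flag:
    phrase: str      # the exact span flagged in the asset
    pass_name: str   # which pass raised it
    note: str        # human-readable explanation

@dataclass
class ScoreCard:
    grades: dict[str, str] = field(default_factory=dict)  # dimension -> letter grade
    flags: list[Flag] = field(default_factory=list)
    variants: list[str] = field(default_factory=list)     # rewrites of risky phrases

def tonal_fit(text: str) -> tuple[str, list[Flag]]:
    # Toy heuristic standing in for the real model: absolutist siege
    # phrases drag the grade down and get flagged inline.
    siege_markers = ["only ones", "take away", "stop them"]
    hits = [m for m in siege_markers if m in text.lower()]
    grade = {0: "A", 1: "B"}.get(len(hits), "C-")
    flags = [Flag(m, "tonal_fit", "reads as siege framing") for m in hits]
    return grade, flags

def run_stress_test(text: str) -> ScoreCard:
    card = ScoreCard()
    # The real system runs all five passes; one shown here.
    for name, pass_fn in [("tonal_fit", tonal_fit)]:
        grade, flags = pass_fn(text)
        card.grades[name] = grade
        card.flags.extend(flags)
    return card

card = run_stress_test("We're the only ones who can stop them.")
```

The key design point is that every pass returns the same two things — a letter grade and a list of flagged phrases — so dimensions can be added or swapped without changing the score-card format.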

An example we can show

Here’s a redacted, real-ish example. A mailer opens with:

“They want to take away your schools, your community, and your voice. We’re the only ones who can stop them.”

The stress test returns:

  • Historical resonance: D. This phrase structure — “they want to take away…we’re the only ones who can stop them” — matches the tonal pattern of several ads that backfired in 2008 and 2014. The “only ones” framing reads as siege mentality to moderate voters. The “they want to take away” triptych reads as paranoid to voters under 45.
  • Tonal fit: C−. Aggressive siege frame. Will energize the committed base. Will alienate the persuadable middle, which is your stated target.
  • Constituency fit: C. This district trends pragmatic-moderate. Siege framing is 18 points lower on persuasion than pragmatic-hope framing in 2022 Civitas testing data (proxy: comparable districts).
  • Rewrite suggestion: “Our schools are strong because this community built them. I’m running to make sure the next generation gets the same shot — not to take anyone’s voice away, but to add more voices to the table.”

The candidate may disagree with the rewrite. That’s fine. But the original, now, is a conscious choice rather than a default. And the candidate is not going to be surprised when their mailer drops and a political scientist tweets “this is textbook siege framing, straight out of the 2008 playbook.”

What the Meta tribe model will add

Meta has announced a forthcoming audience-simulation model (“tribe”) designed to predict how specific audience segments react to ads, based on anonymized behavioral signals across their platforms. When it ships, we plan to integrate it as an additional layer in the stress test — specifically, for campaigns that will be spending on digital ads, simulating how each creative performs against real audience archetypes before spending media dollars.

Until then, the in-house model handles this, with explicit caveats about what it can and can’t know. The in-house model is trained on language outcomes, not ad-platform outcomes; it’s strong on rhetoric and weak on creative. The tribe-model integration will fill the creative gap.

What it doesn’t do

A few boundaries we’re explicit about:

  • It doesn’t write your campaign for you. Generative suggestions are offered as starting points, not finished copy. A real human voice — yours — is still the point.
  • It doesn’t predict election outcomes. A well-stress-tested mailer still has to run into a real electorate. The stress test raises the floor, not the ceiling.
  • It doesn’t tell you what your values should be. Tonal flags are tonal. A candidate with a sharp, confrontational brand may want to keep a flagged phrase. The model informs the decision; it doesn’t make it.
  • It doesn’t replace a communications director. If you have a good one, the stress test gives them a faster feedback loop. If you don’t, it’s not a substitute for experienced judgment.

Why this sits inside the data platform

We built the stress test inside the same platform as the voter data and the donor models because the three decisions — who to talk to, what to say, how to fund saying it — are one strategic problem, not three. A campaign that stress-tests its message in isolation from its audience is stress-testing against an abstraction. A campaign that stress-tests its message against its specific district’s rhetorical history gets a different, better answer.

The corpus is public. The techniques are documented. The computing power is affordable. The only missing piece was someone willing to sit down, build the thing, and ship it to candidates who don’t have a congressional-campaign comms budget.

That’s what we built. It’s live in the beta. We’d love to know what it flags on your next mailer.

Keep going

Turn this into action.

Everything discussed above is queryable inside Civitas. Free account, 50 queries a month, no credit card.