Not so anonymous: how forensic linguists figure out your identity

Subtitle
12 min read · Aug 27, 2020


Photo: Marco Verch/Flickr Creative Commons

This is a transcript of a Subtitle podcast episode

Kavita Pillay: It’s one of the biggest mysteries in the US right now.

NPR report: “This was someone who said, ‘I am working in the Trump White House, but I’m secretly part of the Resistance, and I’m trying to foil him.’”

Kavita Pillay: As we’re recording this, we’re a couple of weeks away from the publication of a book by Anonymous. It’s called A Warning. The book is an expansion of an op-ed piece from last year by the same author, which sent people in the White House and beyond on a frenzied search. So, who is it?!

The Late Show with Stephen Colbert: “Sources say Trump is still obsessed with finding the person.”

NBC report: “The White House and reporters trying to decode phrases in the call, looking for patterns, without success.”

NPR report: “We’ve got denials from the vice president’s office, secretary of state, director of National Intelligence, Ben Carson, Jim Mattis, Steven Mnuchin, Rick Perry, Nikki Haley, Mick Mulvaney.”

Kavita Pillay: The op-ed was 965 words long. Armchair sleuths of all types took it upon themselves to analyze every single one.

The Late Show with Stephen Colbert: “The mystery of the Op-Ed writer’s identity has been cracked by Wikileaks, who tweeted, ‘Based upon our statistical analysis of the language used in the NYT anonymous op-ed, the author is likely to be an older, conservative male.’ Really? Really?”

Kavita Pillay: To many people, one word stood out.

Mike Pence: “That’s going to continue to be a lodestar…”

“…to our lodestar.”

“...It really was the lodestar.”

Kavita Pillay: Vice President Mike Pence really likes saying that! Lodestar is an unusual word, and if you’re like me, you may not remember what it means out of context. It’s a person or thing that serves as an inspiration or guide, and Pence has used it enough over the years for it to become part of his linguistic DNA. Others in the administration have come under suspicion. All of them, including Pence, have denied writing the op-ed.

At least for now, Anonymous remains anonymous. But on November 19th, 272 pages of their handiwork will be available to the world. You can bet that the linguistic Sherlocks will be analyzing every word, every phrase — even every comma — to try to solve America’s biggest linguistic whodunnit.

Patrick Cox: From Quiet Juice and the Linguistic Society of America, this is Subtitle. Stories about languages and the people who speak them. I’m Patrick Cox.

Kavita Pillay: And I’m Kavita Pillay. The way you, and I, and everyone you know, speak and write is unique. Like a fingerprint. And the people who read those fingerprints are upping their game.

Patrick Cox: Including one guy who walked away from musical stardom. Instead, he became one of America’s leading forensic linguists.

Patrick Cox: Rob Leonard grew up in Queens, New York. He came of age in the late sixties, at the height of the counterculture when Americans were almost at war with each other.

Rob Leonard: Very reminiscent of now where we had rightists and leftists and then a whole bunch of people in the middle. And there were fistfights all over.

Patrick Cox: Especially on campus. Rob was a student at Columbia and there were two things he loved: language and music. He studied languages. And he hung out with musicians. He and his friends started a group, but they didn’t sing the revolutionary hippie stuff of the time.

Rob Leonard: We really liked harmony and we liked fifties doo-wop music.

Music: Rock ’n Roll is Here to Stay

Patrick Cox: Rob’s group was small-time, just a side project. They played tiny, odd venues, like the psych ward of St Luke’s Hospital, a block from campus. But a weird thing happened. The hippies and the revolutionaries — they loved this retro music. Some liked the feel-good sound. Others thought it was some kind of performance art. Either way, the group — Sha Na Na — took off.

Music: Teenager in Love

Patrick Cox: Sha Na Na’s big break came when Jimi Hendrix arranged for them to play at Woodstock.

Rob Leonard: And then we performed right before Jimi.

Patrick Cox: In gold lamé suits.

Music: At the Hop.

Patrick Cox: The rest is doo-wop history. Sha Na Na played all over the world, had their own TV show, performed in the movie Grease with John Travolta and Olivia Newton-John. And they’re still performing today, at least some version of the band is. But Rob Leonard didn’t go along for the ride. That, of course, is why I am telling you this story.

Rob Leonard: We were very busy. We were very popular, and we had such a touring schedule every week and we were all in school.

Patrick Cox: And Rob was a serious student. He wanted to study another language. But he wasn’t quite ready to quit the band, which was getting booked for gigs every weekend, often out of town. He knew the only way he could take up another language was if the classes weren’t on travel days, Mondays or Fridays.

Rob Leonard: And out of the 55 languages taught at Columbia, guess which one was not taught Monday or Friday? Swahili.

Patrick Cox: So, Swahili it was, which changed his life. Rob moved to Kenya to expand his studies. He learned several dialects of Swahili, as well as a handful of other languages. He eventually became a professor of linguistics. And over time, he became drawn to the new discipline of forensic linguistics, the scientific study of language as it relates to the law, and often to verbal or written evidence. For years, Rob kept track of this emerging area of inquiry.

Rob Leonard: But it wasn’t until the Pennsylvania State Police came to me out of the blue one day and said to me we’d like you to take a look at these two letters.

Patrick Cox: More on those letters, and how Rob interpreted them, in a couple minutes.

Kavita Pillay: All of us have practiced some form of forensic linguistics, whether we think of it that way or not. Is this note really from the Tooth Fairy? Did a child forge this report card? Is this will and testament authentic? But the science of forensic linguistics became prominent in the mid-90s because of one case.

(From The Oprah Winfrey Show) Winfrey: “How did you, David, come to suspect that your brother was the Unabomber?”

David Kaczynski: “It actually began with my wife, Linda. She did at that point say, ‘Look, the Unabomber has sent this manifesto to the Washington Post. Would you read it and tell me what you think?’”

Kavita Pillay: The Unabomber had mailed 16 bombs over the course of 17 years. He killed three people and maimed many others. He took great care to leave no fingerprints or DNA, but his 35,000-word manifesto became the strongest evidence against him.

(From the New Yorker) Reporter Jack Hitt: “He referred to women as ‘broads.’ He referred to black people as ‘negroes.’ So it obviously put him, sort of, coming of age before the civil rights movement.”

David Kaczynski: There was a particular phrase where he had called modern philosophers, ‘cool headed logicians,’ and I recalled a similar phrase in a letter he had once sent me.

Kavita Pillay: A judge granted an arrest warrant based primarily on linguistic evidence. This was a first.

ABC News report: “Late this afternoon, Ted Kaczynski was taken from his backwoods cabin, leaving people in the nearby town of Lincoln stunned.”

Kavita Pillay: Kaczynski is a Harvard grad with a PhD in math, but his calculated attempt to hide that fact backfired.

(From ABC News report) FBI Detective James Fitzgerald: “And it says there, ‘People with advanced degrees aren’t as smart as they think they are.’ Huh, so he doesn’t have an advanced degree? Or he does and he’s lying.”

Kavita Pillay: It makes me think again about that word from the Anonymous New York Times op-ed:

Mike Pence: That’s going to continue to be a lodestar.

Kavita Pillay: If you want to stay in the shadows and you’re not Mike Pence, but you know he likes saying lodestar, wouldn’t you be tempted to throw in such a shiny word and watch the world pounce?

Patrick Cox: After the Unabomber case, investigators turned to linguists more often to help solve crimes. That was what the Pennsylvania State Police were doing in 2006 when they asked Rob Leonard to look at two letters. The first letter appeared to be from a stalker. A man named Brian Hummert told police he found it on his windshield.

Rob Leonard: And it said, “This is the proof your wife is terrible,” all sorts of nasty sexual things, and “I had an affair one night with your wife, and I’m back in town now. And now is the time for payback.”

Patrick Cox: Not long after, Hummert’s wife Charlene was found strangled to death in her car.

Rob Leonard: Another letter was sent to the press and to the police from a self-professed serial killer who said, “You’re looking in the wrong place, I killed her. This is the fifth woman I’ve killed. She was having an affair with me. She wanted to break it off so I broke her neck.”

Patrick Cox: So, here were the cops asking Rob to look at these two letters.

Rob Leonard voiceover: On the surface they look wildly different, different styles. One was typed. One was sort of scrawled. One had mistakes, which we ascertained were probably disinformation mistakes because they weren’t of a piece with the rest of the style of writing.

Patrick Cox: And there was something else: the underlying narrative frameworks of both letters were complicated but structurally smooth. Effortless.

Rob Leonard: There were flashbacks. There were flash forwards. The author stepped aside for a moment to comment on what was going on and he did it so well that you weren’t even aware of it. I mean, that’s good writing.

Patrick Cox: There were other clues. The phrase: “She wanted to break it off, but I broke her neck.” Versions of that little-used rhetorical device — ironic repetition — appear in both letters. The final clue was the use of contractions. The writer never contracted positive verbs; he never said “I’m.” He said “I am” — though he often contracted negative verbs: “I’m not”.

Rob Leonard: And that was the same pattern we found in the stalker letter and the serial killer letter.
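For readers who want to see what a contraction pattern like this looks like in practice, here is a minimal Python sketch. It is not Rob Leonard’s method, which the episode does not spell out. The regular expressions, the four category names and the sample sentences are all assumptions made for illustration. “Negative” is taken to mean either an n’t form or a contraction followed by “not”, and anything else counts as positive.

```python
import re

# Hypothetical building blocks, chosen only for this sketch.
PRONOUN = r"(?:i|you|he|she|it|we|they)"
NOT_NEXT = r"(?!\s+not\b)"  # the verb must not be followed by "not"

PATTERNS = {
    # I'm, we've, they'll ... when no "not" follows
    "pos_contracted": re.compile(
        rf"\b{PRONOUN}'(?:m|re|s|ve|ll|d)\b{NOT_NEXT}", re.I),
    # I am, we have, they will ... when no "not" follows
    "pos_full": re.compile(
        rf"\b{PRONOUN}\s+(?:am|are|is|have|has|had|will|would)\b{NOT_NEXT}", re.I),
    # don't, can't ... or I'm not, we're not
    "neg_contracted": re.compile(
        r"\b\w+n't\b|\b\w+'(?:m|re|s|ve|ll|d)\s+not\b", re.I),
    # do not, is not, cannot ...
    "neg_full": re.compile(
        r"\b(?:am|are|is|was|were|do|does|did|have|has|had|will|would|"
        r"can|could|should)\s+not\b|\bcannot\b", re.I),
}


def contraction_profile(text):
    """Count contracted and uncontracted verbs, split by polarity."""
    text = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    return {name: len(rx.findall(text)) for name, rx in PATTERNS.items()}


def contraction_rate(contracted, full):
    """Share of forms that are contracted, or None if there are no forms."""
    total = contracted + full
    return contracted / total if total else None


# Invented sample text, not taken from any real letter.
sample = "I am back. I don't know. I can't stay. We are done. I do not care."
profile = contraction_profile(sample)
print(profile)
print(contraction_rate(profile["pos_contracted"], profile["pos_full"]))
print(contraction_rate(profile["neg_contracted"], profile["neg_full"]))
```

Two documents with a similar pair of rates would share the habit described above. A crude count like this says nothing on its own about who wrote them.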

Patrick Cox: And they found it in the writing of Charlene Hummert’s husband. In a case like this, Rob Leonard is careful never to point the finger at someone and say, “They wrote this thing.” But his expert testimony linking Brian Hummert’s writing style to the two letters helped convict Hummert.

Patrick Cox: Hiding behind words, concealing our identity: it’s more difficult than most of us think.

Kavita Pillay: Joe Klein learned that lesson the hard way.

Joe Klein: My name is Joe Klein and I wrote ‘Primary Colors.’ I did it by myself with no help, no secret sources.

Kavita Pillay: It’s July 1996 and the Unabomber has been arrested based on his manifesto just two months before. Klein, on the other hand, was a Newsweek reporter who had published a best-selling fictional account of Bill Clinton’s first presidential campaign — and he’d done it anonymously.

Joe Klein: And I want to tell you, it was great, it was a lot of fun, in fact it was the most fun I’ve ever had with a keyboard.

Kavita Pillay: Joe Klein was unmasked by a literature professor named Donald Foster. Up until that point, Foster had gained notice in his field for working on mysteries around 17th century sonnets: trying to figure out whether Shakespeare did or did not write them. But when Primary Colors came out, he applied a basic forensic linguistic technique called “word frequency” to show that Klein was the most likely author. Soon after, Foster was pulled into an infamous case that centered on a ransom note.
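The episode does not say how Foster’s “word frequency” comparison was carried out, so the following is only a rough Python sketch of the general idea, not his procedure. It assumes a short, hypothetical list of function words, turns each text into a vector of relative frequencies, and ranks candidate authors by cosine similarity to the questioned text. The word list, function names and sample sentences are all invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical list of common function words; real studies use far more.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that",
                  "it", "but", "for", "with", "as", "not", "on"]


def profile(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(questioned, candidates):
    """Sort candidate authors from most to least similar to the questioned text."""
    target = profile(questioned)
    scores = {name: cosine(target, profile(sample))
              for name, sample in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy texts; real comparisons need thousands of words per author.
questioned = "the cat and the dog and the bird"
candidates = {
    "Author A": "the fox and the hen and the owl",
    "Author B": "to be or not to be in it",
}
for name, score in rank_candidates(questioned, candidates):
    print(name, round(score, 3))
```

A high score only means two texts use common words at similar rates. As the rest of Foster’s story shows, that is a lead to investigate, not an identification.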

ABC News report: “It’s been more than a week since the murder of JonBenét Ramsey, the six-year-old beauty queen, shocked the town of Boulder. The detectives enlisted the help of this man, Prof. Donald Foster of Vassar College.” Reporter: “You look at something and you figure out who wrote it, in essence.” Foster: “Yes, that’s what I do best.”

Kavita Pillay: Foster is not a forensic linguist, but he went on to name a suspect in the JonBenét case: someone who’d already been cleared. After Sept. 11th, he wrongly accused a bioweapons expert of sending anthrax-laced letters around the US. And his claim about a Shakespeare poem? Other Shakespeare experts disagreed.

If there’s a lesson in Professor Foster’s rise and fall, it seems that the power of forensic linguistics must be balanced by caution and humility. A literature professor at the heart of a modern morality tale. Imagine that.

Rob Leonard: People have seat of the pants knowledge. Lawyers and judges are fabulous self-taught linguists.

Patrick Cox: This is Rob Leonard again. Rob, by the way, is the founder of the first forensic linguistics graduate program in the United States, at Hofstra University on Long Island, New York.

Rob Leonard: You don’t learn scientific linguistics just by being immersed in language any more than you could consciously explain to somebody as a native speaker of English when you do and do not use the word “do.” “I do like ice cream,” and “He did do all his homework,” and “Do I?” We just don’t have conscious knowledge of this stuff. We have a different toolkit than everybody else.

Patrick Cox: Rob recalls a case last year when he had a team of assistants working on some written evidence where one piece of punctuation was key.

Rob Leonard: And they came up with twenty-three different categories of comma use that they found.

Patrick Cox: That’s so funny. I mean you, you think of all the Twitter debates and people getting really hot under their collar about things like the Oxford comma. And yet here you’ve now come up with an example of comma use that is completely neutral in its judgment of them but just sees patterns and as a result is able to help justice.

Rob Leonard: Yes, further the cause of justice. I often sort of blush at least internally when I say we are trying to analyze language to further the cause of justice. But it’s true.
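The episode does not list the twenty-three categories, so the Python sketch below uses just three made-up ones to show what a judgment-free tally of comma use might look like. The category names, the rules behind them and the sample sentences are assumptions for illustration only, not the scheme Rob’s team used.

```python
import re
from collections import Counter

COORDINATORS = {"and", "or", "but", "nor", "so", "yet", "for"}
SENTENCE_ENDERS = {".", "!", "?"}


def comma_profile(text):
    """Tally commas into three hypothetical categories."""
    # Split into words and punctuation marks, keeping their order.
    tokens = re.findall(r"[A-Za-z']+|[,.!?;:]", text)
    tally = Counter()
    for i, token in enumerate(tokens):
        if token != ",":
            continue
        next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
        # True when exactly one word sits between the sentence start and the comma.
        after_first_word = i == 1 or (i >= 2 and tokens[i - 2] in SENTENCE_ENDERS)
        if next_word in COORDINATORS:
            tally["before_coordinator"] += 1   # "We ate, and we slept."
        elif after_first_word:
            tally["after_opener"] += 1         # "However, I left."
        else:
            tally["other"] += 1
    return tally


# Invented sample text.
sample = ("However, I left. We ate, and we slept. "
          "She bought apples, pears and plums.")
print(comma_profile(sample))
```

The tally describes what a writer does with commas without ruling on whether any of it is correct, which is the neutrality Patrick points to above.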

Patrick Cox: But before anyone gets too carried away, it’s worth noting that linguists don’t yet know much about how language works or how to accurately analyze it. Forensic linguistics is still in its infancy.

And that toolkit that Rob talks about? Not all linguists agree on what should be in it. There’s an especially wide variety of opinion on how much linguists should draw on the big data of real-life conversations and writing. The databanks are larger than ever, and the tools to analyze them more sophisticated. One camp of linguists relies mainly on these repositories. The other camp, put simply, are the human analysts, people who rely on their own ability to perceive patterns in the language. Rob sometimes finds himself placed in this second camp. But he actually thinks progress lies in marrying the two. Drawing on a huge body of data, and then contextualizing it down to a specific set of circumstances, all the facts surrounding the crime: that’s something that may require a team of humans.

Criminal law is only a part of the law of course; forensic linguistics can be used in all parts of it. And cracking cases for the prosecution is only a small part of criminal law. Helping disprove a case is just as important as helping prove one.

Rob Leonard: With a distinguished professor of constitutional law here at Hofstra, a man named Eric Freedman, we founded the Hofstra Forensic Linguistics Capital Case Innocence Project. We re-analyze language data, language evidence that may have falsely put people on death row or in jail for extended periods. And my students get to (not just get to, they are required to) work on these live cases either of exoneration or at the moment like some of these death penalty cases we will be working on.

Patrick Cox: And we’re following one of those cases, hoping to bring you an episode on it, later in the season.

Kavita Pillay: Forensic linguistics is also entering all of our lives, whether we like it or not, or are even aware of it. Like right now, there’s a piece of legislation called The RESPONSE Act, where schools across the country would be required to use software to look at the emails and social media and class assignments of kids to see if there are trigger words that might indicate that someone is planning a mass shooting.

Are you concerned that there are privacy issues there for kids? Or is your workplace using this kind of software to monitor your email? What about on Facebook, are you seeing conversations monitored for hate speech that are turning up false positives? We want to hear from you! Tell us about it. Our email address is subtitlepod@gmail.com. We’re also on Twitter at lingopod.

Patrick Cox: And if you want to hear more about forensic linguistics, there’s a podcast about it. It’s called En Clair, that’s E-N, new word, C-L-A-I-R. French, nice! It’s the work of Claire Hardaker, who’s a linguist at Lancaster University in the UK. Each episode looks at a case with a strong forensic linguistics dimension. It’s a great listen.

This episode of Subtitle was reported by Patrick Cox and Kavita Pillay. The podcast version of the story is available here and on Apple podcasts.

Subtitle’s sound designer is Tina Tobey. Thanks to Alyson Reed, Tracey Strain, Carol Zall, Nina Porzucki, Alina Simone, Barbara Bullock, Jacqueline Toribio, Tammy Gales, April Kalix-Cattell, Kirk Chao, Jackie Mow, Nola Cox, Sauli Pillay and our partners at the Linguistic Society of America and the Hub & Spoke audio collective.

Subtitle is funded by the National Endowment for the Humanities.

Written by Subtitle

A podcast about languages and the people who speak them. Co-hosted by @patricox and @kbpillay. Twitter: @lingopod
