so i just googled the phrase “toeing out of his shoes” to make sure it was an actual thing

and the results were:


it’s all fanfiction

which reminds me that i’ve only ever seen the phrase “carding fingers through his hair” and people describing things like “he’s tall, all lean muscle and long fingers,” like that formula of “they’re ____, all ___ and ____” or whatever in fic

idk i just find it interesting that there are certain phrases that just sort of evolve in fandom and become prevalent in fic bc everyone reads each other’s works and then writes their own and certain phrases stick

i wish i knew more about linguistics so i could actually talk about it in an intelligent manner, but yeah i thought that was kinda cool

Ha! Love it!

One of my fave authors from ages ago used the phrase “a little helplessly” (like “he reached his arms out, a little helplessly”) in EVERY fic she wrote. She never pointed it out—there just came a point where I noticed it like an Easter egg. So I literally *just* wrote it into my in-progress fic this weekend as an homage only I would notice. ❤

To me it’s still the quintessential “two dudes doing each other” phrase.

I think different fic communities develop different phrases too! You can (usually) date a mid 00s lj fic (or someone who came of age in that style) by the way questions are posed and answered in the narration, e.g. “And Patrick? Is not okay with this.” and by the way sex scenes are peppered with “and, yeah.” I remember one Frerard fic that did this so much that it became grating, but overall I loved the lj style because it sounded so much like how real people talk.

Another classic phrase: wondering how far down the _ goes. I’ve seen it mostly with freckles, but also with scars, tattoos, and on one memorable occasion, body glitter at a club. Often paired with the realization during sexy times that “yeah, the __ went all the way down.” I’ve seen this SO much in fic and never anywhere else.

whoa, i remember reading lj fics with all of those phrases! i also remember a similar thing in teen wolf fics in particular – they often say “and derek was covered in dirt, which. fantastic.” like using “which” as a sentence-ender or at least like sprinkling it throughout the story in ways published books just don’t.


I love this. Though I don’t think of myself as a fantastic writer, by any means, I know the way I write was shaped more by fanfiction than by actual novels.

I think so much of it has to do with how fanfiction is written in a way that feels real. conversations carry in a way that doesn’t feel forced and is like actual interactions. Thoughts stop in the middle of sentences.

The coherency isn’t lost, it just marries itself to the reader in a different way. A way that shapes that reader/writer and I find that so beautiful. 


and it poses an intellectual question of whether the value we assign to fanfic conversational prose would translate at all to someone who reads predominantly contemporary literature. as writers who grew up on the internet find their way into publishing houses, what does this mean for the future of contemporary literature? how much bleed over will there be?

we’ve already seen this phenomenon begin with hot garbage like 50 shades, and the mainstream public took to its shitty overuse of conversational prose like it was a refreshing drink of water. what will this mean for more wide-reaching fiction?



I’m sure someone could start researching this even now, with writers like Rainbow Rowell and Naomi Novik who have roots in fandom. (If anyone does this project please tell me!) It would be interesting to compare, say, a corpus of a writer’s fanfic with their published fiction (and maybe with a body of their nonfiction, such as their tweets or emails), using the types of author-identification techniques that were used to determine that J.K. Rowling was Robert Galbraith.
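For anyone curious what the simplest version of that kind of comparison might look like, here is a minimal sketch in Python. To be clear about the assumptions: this is not the method used in the Rowling/Galbraith analysis, the ten-word `FUNCTION_WORDS` list is a tiny illustrative sample, and the names `profile`, `cosine_similarity`, and `closest_corpus` are made up for this example. It just counts how often a few very common words appear in each text and checks which known corpus an unknown text most resembles.

```python
import math
import re
from collections import Counter

# A tiny, illustrative list of function words (an assumption for this sketch;
# a real study would use a much longer list and far more text).
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "was", "which"]


def profile(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(p, q):
    """Cosine similarity between two frequency profiles (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def closest_corpus(unknown_text, known_corpora):
    """Return the name of the known corpus most similar to `unknown_text`.

    `known_corpora` maps a label (e.g. "fanfic") to one long string of text.
    """
    target = profile(unknown_text)
    return max(
        known_corpora,
        key=lambda name: cosine_similarity(target, profile(known_corpora[name])),
    )
```

You would call it with something like `closest_corpus(mystery_text, {"fanfic": fanfic_text, "published": novel_text})`. With only a handful of words and short samples the result would mean very little, so treat this as a way to see the shape of the idea rather than as a usable tool.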

One thing that we do know is that written English has gotten less formal over the past few centuries, and in particular that the word “the” has gotten much less frequent over time.

In an earlier discussion, Is French fanfic more like written or spoken French?, people mentioned that French fanfic is a bit more literary than one might expect (it generally uses the written-only tense called the passé simple, rather than the spoken-only tense called the passé composé). So it’s not clear to what extent the same would hold for English fic as well – is it just a couple phrases, like “toeing out of his shoes”? Are the google results influenced by the fact that most published books aren’t available in full text online? Or is there broader stuff going on? Sounds like a good thesis project for someone! 

See also: the gay fanfiction pronoun problem, ship names, and the rest of my fanguistics tag.

Interesting stuff about fanfiction language patterns!


How the movie Arrival made the linguist’s office 

When the trailer for the xenolinguistic movie Arrival first came out, I mentioned that I knew the linguists who consulted for the film. Ben Zimmer followed up with them for Language Log:

For the film, the linguistic consultants included Jessica Coon, Lisa deMena Travis, and Morgan Sonderegger, all from McGill. The set designers spent time with Travis and Coon in their offices and ended up borrowing many of their books, as well as reproducing other items from their offices, in order to create the office set.

Via e-mail, Coon writes:

The set crew came to my office first and took a lot of pictures (they liked my tea kettle and plants, and they wanted to know what kind of bag I carry). They needed to rent a certain number of feet of books, but I didn’t have enough, so we went up to Lisa’s office. I keep a fairly tidy office… they liked Lisa’s much better. […]

But Travis told me that the set designers were less interested in titles than colors: they were particularly interested in borrowing blue and beige books. Fortunately, she had plenty of both. Many of the blue ones are in the Linguistik Aktuell series from John Benjamins (Travis serves on the advisory editorial board). And she had lots of beige-colored journals (e.g., Language, Natural Language and Linguistic Theory, Linguistic Inquiry, Oceanic Linguistics) and conference proceedings (e.g., NELS, short for the North East Linguistic Society).

An article for McGill News interviews Jessica Coon about what she wrote on the whiteboard: 

Sequences that take place in the code-breaking tent where military cryptographers struggle to crack the aliens’ language, feature words that Coon wrote on whiteboards to lend that process a more authentic air. Words like “articulators.”

The articulators that humans use to form speech include our tongue, teeth and lips. “[The aliens] don’t look human at all, their vocal tracts and their mouths or whatever they’re using to make language is nothing like ours,” says Coon. In a situation like that, the military experts would probably be thinking a lot about articulators.


US government plans to use drones to fire vaccine-laced M&Ms near endangered ferrets.

Yet another entry in “linguists are not kidding when they say that your command of English enables you to understand sentences that have never occurred before in the entire history of the human species.” (via allthingslinguistic)

“A’ghailleann”: On Language-Learning and the Decolonisation of the Mind – The Toast


A beautiful article on The Toast by Iona Sharma about heritage language learning and decolonization. It’s worth reading the whole thing, but here’s the beginning: 

Here are the things you need to know first. I am thirty years old. I am Indian. My parents arrived in Scotland as newly minted immigrants in the eighties, thinking they’d go home after I was born. Decades later, we’re still here.

My parents, grandparents, cousins, aunts and uncles, their friends and their community, speak Hindi as a first or joint first language. I do not. I stopped being a fluent Hindi speaker at the age of six, perhaps earlier. The school didn’t like it. Too confusing to educate a bilingual child. If you don’t speak to her in English at home, she’ll never learn.

Gaelic, sometimes referred to as Scottish Gaelic to differentiate it from Irish and known to its own speakers as Gàidhlig, is a Celtic language spoken by just over 58,000 people. It has been in decline for centuries. Anglicisation, colonisation and the Highland clearances all had a role in destroying its traditional heartlands, driving it to the far northwest of Scotland and the islands. In the nineteenth century schoolchildren were forbidden to use it in the classroom; by the 1970s the last monolingual speakers were gone. To speak Gaelic now is a political act.

I am not very good at languages.

Read the whole thing

Stunningly beautiful writing from Iona Sharma about belonging, decolonization, and linguistic identity. Unquestionably worth reading the whole thing. 


Translating Gender: Ancillary Justice in Five Languages


A really interesting article interviewing five different translators of a book that does interesting things with gender, and how each of them dealt with that: 

In Ann Leckie’s novel Ancillary Justice (Orbit Books: 2013), the imperial Radch rules over much of human-inhabited space. Its culture – and its language – does not identify people on the basis of their gender: it is irrelevant to them. In the novel, written in English, Leckie represents this linguistic reality by using the female pronoun ‘she’ throughout, regardless of any information supplied about a Radchaai (and, often, a non-Radchaai) person’s perceived gender. This pronoun choice has two effects. Firstly, it successfully erases grammatical difference in the novel and makes moot the question of the characters’ genders. But secondly, it exists in a context of continuing discussions around the gendering of science fiction, the place of men and women and people of other genders within the genre, as characters in fiction and as professional/fans, and beyond the pages of the book it is profoundly political. It is a female pronoun.

When translating Ancillary Justice into other languages, the relationship between those two effects is vital to the work.

After reading a comment by the Hungarian translator, Csilla Kleinheincz, posted on Cheryl Morgan’s blog, we wanted to know more about this. We invited the translators of the novel into Bulgarian, German, Hebrew, Hungarian and Japanese to discuss the process, with particular interest in the translation of gender. What emerges is an insight into the work of translators and the rigidity and versatility of grammatical gender in the face of non-standard demands. Where necessary, translators turned to innovative and even inventive ways to write their languages.

(Read the whole thing.)

Previously about gender in Ancillary Justice

This is SUPER COOL! I’ve been giving a lot of thought lately to the challenge of gender-neutral pronouns (and speech more generally) in languages like French or German which are just crammed with grammatical gender, and this addresses the issue in a way I never even considered. I wish my German was up to the task of reading this book in more than one language (actually, I wish I could read Bulgarian, because of all the approaches, that one seems the most exciting to me)!



Book update #2: I have a (very rough) draft!

Here’s a celebratory screencap! I thought of printing off the pages and taking a photo of them all stacked up for the sake of ~aesthetic, but 248 pages is an awful lot of paper just for a photo, especially since I’m at a stage where the next step is more typing, not writing stuff in red ink in the margins. So you’re getting a screencap and some thoughts about writing instead.

(Previously: Book update #1: I’m writing a book about internet language!)

To put this achievement into context, my target wordcount is 80k words, which is a typical length for a book in the pop science genre (and in fact, pretty typical for fiction as well). For example, that’s halfway between the lengths of the first and second Harry Potter books. So having 100k words in this draft is exciting because it means I definitely have at least enough to say about internet language to fill a book! I mean, I thought I did, that’s why I signed a contract to write it, and Penguin thought I did, or they wouldn’t have offered it to me. But it’s nice to know for sure.

However, these are not final words; I still have a lot of work to do on them. The way I’m going about drafting is that I first created a document with all the chapter divisions that were in my book proposal, and then I started throwing rough thoughts and freewriting and snippets from various blog posts and my research document in under their appropriate chapters, making further subheadings as I went along. This let me see how much material there was in each chapter, so I ended up splitting one chapter in half, recombining a couple others, and changing the ordering several times. I had a daily wordcount goal, NaNoWriMo-style, to keep myself focussed on getting ideas down on the page, and so I wouldn’t get distracted by things like precise wording or capitalization and punctuation.

The next step, what I’m working on now, is to do the part where I take all these assorted thoughts and sort them out into real paragraphs that follow each other in a logical sequence — in the process hopefully cutting at least 20k words. After that, I can start showing the draft to my editor and various beta readers, since at the moment there’s no point in someone telling me “um this is not how you paragraph.” It also doesn’t make sense to use wordcount as a target anymore, so I’ve switched to keeping track of pomodoros and sections instead.

Content-wise, I can’t say much yet, but a big overarching issue that I’ve been working on is how to write a book about the internet that won’t be out of date before it’s even published. One way I’m addressing this is by making the chapters about the themes and the problems we’re trying to solve in internet language, rather than the particular ways that we’re currently solving them.

For example, I’ve talked a lot about emoji, so several people have (very reasonably) asked if there’s going to be an emoji chapter. But just like emoji are currently displacing emoticons, emoji themselves might get replaced by some newer thing in a few years. So instead of an emoji chapter, I have an emotions chapter. Of course it’s going to include emoji, but it’s going to put them in a broader context of other ways that we convey emotion online. Emoji might just be a trend, but emotions have been around for all of recorded history and presumably earlier — I feel like they’re a pretty safe bet.

Here’s a distinctly uninformative ~sneak peek~ of some things that will definitely not be in the next draft:

“idk” 32 times
“lol” 122 times
“wtf” 25 times

“so maybe i’m just swapping the order of chapters 9 and 10? idk [cut for spoilery discussion] hm k i buy that for now, done, moved. k battery officially dead”

(Some of the idk/lol/wtf instances are examples and will stay in, but since one of the ways I deal with writer’s block is by codeswitching into internet slang, well…there are definitely more of these than necessary at the moment. 10/10 would recommend codeswitching as a writer’s block strategy though.)

The writing advice I’ve found most useful as I’ve been working on this is from a quote that I saw on tumblr but of course I can’t find it now (it might have been by Neil Gaiman?). Anyway, it goes something like: “How do you write a book? Well, I don’t sit down each day and think ‘I need to write a book.’ I think ‘I need to finish chapter three’ or ‘I need to figure out what’s going on in this section.’ And when I add all those tasks up together, I’ve written a book.” If anyone can find the original version of this quote, do let me know! In the meantime, perhaps this paraphrased version will be helpful to someone.

I’ve also made a book update email list, so you can put your email here for very occasional book updates if you want to make sure that you don’t miss it on social media. (The signup link is embedded on the All Things Linguistic facebook page, but that was just the easiest way to host a MailChimp signup form if I didn’t want to inflict annoying popups on you all — it’s not actually a facebook thing.) Also, I won’t spam you or do other nefarious things with your email, it’s just for a couple book updates.

While we’re at it, the regular kinds of updates, as usual, can be had as daily blog posts via rss, tumblr, twitter, facebook, or google+; as monthly summaries via wordpress/email; and as mostly but not entirely linguistic thoughts at unpredictable intervals on my personal twitter (I made a great garden path joke there yesterday, so you should definitely check that out).


Good continued booking!

Simon Snow, Good Omens, and Stylometrics


So we’ve had a couple of questions regarding our interview with Lisa Pearl and what she had to say about textual analysis and writeprints, the ways in which we signal who we are by how we use language. Like, for example, Gretchen McCulloch on All Things Linguistic asked about telling apart the different writing done by different characters in Rainbow Rowell’s Fangirl vs. Carry On. And in the YouTube comments, Valdagast asked about Good Omens, a book co-written by Neil Gaiman and Terry Pratchett, regarding whether we could use this kind of analysis to figure out which parts were written by which author.

I went ahead and ran these questions by Dr. Pearl, and I’ve got her answers below here! And I’m glad she answered them, because wow, this is not my area of expertise.


This is incredibly cool and I’m excited I got to be (a tiny) part of the process of making it happen. Thank you Dr. Pearl! ^_^



We’re really excited to have gotten to interview Lisa Pearl recently! Dr. Pearl is an Associate Professor at the University of California, Irvine, and the director of their Computation of Language Laboratory. She’s published numerous articles on how to use statistical models to check different hypotheses about what kids do to learn language, as well as about natural language processing and textual analysis. We’ve been talking about her work since our very first episode!

We got to ask her about a lot of great topics, including:
– what statistical models can tell us about how kids acquire language
– what’s under our control in our writing, and what we unconsciously show as our write-print
– why computers are so bad at detecting tone and picking out the right meanings of words
– how statistical models and Universal Grammar interact
– a question from one of our viewers about how to approach modeling for second language acquisition

And more! Hope you all enjoy it, and thanks to Dr. Lisa Travis and the Department of Linguistics at McGill University for letting us film there.

This discussion of “write-prints” (like a fingerprint, but for writing) and epistolary novels makes me wonder if Lisa Pearl could be the linguist who could do the study I’ve always wanted to see, which would compare the linguistic characteristics of the three types of text in Rainbow Rowell’s Fangirl: the Gemma T. Leslie Simon Snow text, Cath’s fanfic of the Simon Snow books, and the body text. Which, for example, would be most similar to Carry On, the not-quite-fanfic of the Simon Snow portions of Fangirl?

ommggg Moti can we make this happen?



Bruce Banner in Avengers: Age of Ultron (2015): It’s a word in an African dialect meaning ‘thief’… in a much less friendly way.

Phil Coulson in Thor (2011): Get somebody from linguistics down here.


As excited as I was back in 2011 to learn that S.H.I.E.L.D. has a linguistics division, I was equally upset in 2015 to learn that Marvel does not. So here we go: Wakandan may be fictional, but it is not an “African dialect.” That’s because there’s no such thing as an African dialect! Dialects are minor variations of a common language, and as Africa is a huge continent with many diverse peoples, nations, and cultures, there is no single African language that they all share. Rather, there are thousands of different African languages that are not mutually intelligible with one another.

Africa is home to six or more language families, and each of those families contains as much linguistic diversity as the Indo-European family that English, Spanish, Russian, Sanskrit, and Greek (among many others) are all a part of. Based on Wakanda’s supposed location in the Marvel Cinematic Universe near real-life Ethiopia, Somalia, and Kenya, the Wakandan language is probably in the Afroasiatic language family. But that’s still a family with over 300 distinct languages in it.

Some Afroasiatic languages have multiple dialects, but Age of Ultron didn’t call Wakandan a dialect of a real language like Oromo (a plausible candidate, given the region). It didn’t even call it Afroasiatic. Instead, this line in a blockbuster with a budget of over two-hundred-million dollars called Wakandan “an African dialect.”

Why does this matter? Because referring to a dialect of a continent implies that the continent is home to a single common language, which Africa most certainly is not. Because Africa is not monolithic, although it’s often treated that way in Western cinema. Because Marvel is owned by Disney, who spent hundreds of millions of dollars perfecting this film, but didn’t think it was a priority to spend any of that money on a consultant who knew anything about Africa. Because Africa itself was so obviously not a priority here.

This was a small line in a major motion picture, mainly included to set up the connection to the fictional country of Wakanda for future Marvel projects like Captain America: Civil War (2016) and Black Panther (2018). But I really hope that Marvel is taking more care with how it discusses Africa in those properties than it did here.

These are excellent points, but I would like to submit a proposal that the major language spoken in Wakanda should be a Bantu language, rather than an Afro-Asiatic one. First of all, “Wakanda” certainly sounds like it fits Bantu phonology: most Bantu languages have only open syllables, and prenasalized stops are common. Secondly, this would give us a few candidate words in the language already: for example, the name of that language would be Kikanda or Sekanda following regular Bantu language naming conventions (see for example Kiswahili, Kinyarwanda, Kirundi, Kikongo and Setswana, Isixhosa, Isizulu, Sesotho). Similarly, a person from this group would probably be Akanda, several people Bakanda. 
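The prefix pattern above is regular enough to sketch in a few lines of Python. Big caveats: the prefix table contains only the forms proposed in this post (Ki-/Se- for the language, A- for one person, Ba- for several people), the idea that “Wa-” is a separable prefix on a stem “kanda” is the assumption this whole proposal rests on, and the names `strip_country_prefix` and `candidate_names` are invented for illustration. Real Bantu noun class systems are far richer than this.

```python
# Prefixes taken only from the examples proposed in the post above; this is an
# illustrative toy table, not a description of any real Bantu language.
LANGUAGE_PREFIXES = ["Ki", "Se"]  # as in Kiswahili, Setswana
PERSON_PREFIX = "A"               # one person, as proposed above
PEOPLE_PREFIX = "Ba"              # several people, as proposed above


def strip_country_prefix(country, prefix="Wa"):
    """Remove the assumed prefix to expose the stem, e.g. 'Wakanda' -> 'kanda'."""
    if not country.startswith(prefix):
        raise ValueError(f"{country!r} does not start with the prefix {prefix!r}")
    return country[len(prefix):]


def candidate_names(country):
    """Build the candidate language and people names for a country name."""
    stem = strip_country_prefix(country)
    return {
        "languages": [p + stem for p in LANGUAGE_PREFIXES],
        "person": PERSON_PREFIX + stem,
        "people": PEOPLE_PREFIX + stem,
    }
```

Calling `candidate_names("Wakanda")` gives back the same candidates listed above: Kikanda or Sekanda for the language, Akanda for one person, and Bakanda for several.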

I am willing to entertain a compromise that there are both Afro-Asiatic languages and Bantu languages spoken in Wakanda, since linguistic diversity is a thing, but I still maintain that the name of the country comes from one of its Bantu languages. Well, okay, actually, w-k-n could also be a triconsonantal root in a fictional Semitic language spoken in Wakanda (it can’t be k-n-d because triconsonantal roots don’t contain consonants with the same place of articulation). There is already history to both Bantu languages reanalyzing borrowed words as if they have noun class prefixes and to Semitic languages reanalyzing borrowed words as if they’re composed of triconsonantal roots, so you could assume the borrowing happened in either direction. (But nd- sequences are more common in Bantu than in Semitic, so that’s my vote.)

Anyway, I hope there are some conlanging Marvel fans who are going to make these languages now, even if Marvel itself can’t be bothered to figure out the difference between a language and a dialect, let alone hire an actual conlanger. 

I’m writing a book about internet language!


I’m very excited to announce that I’m writing a pop linguistics book about internet language!

It’s still in early stages, so stay tuned for updates when I have an official publication date, cover design, and so on, but it’s a real thing that’s really happening so I’m Officially Allowed to talk about it now! I have an editor and a publisher, Courtney Young at Riverhead Books – Riverhead is a division of Penguin, and Courtney may be familiar to you as the editor of Randall Munroe’s What If?

(Excuse me while I search for my ability to even. I seem to have misplaced it somewhere…)

If you like the kinds of things I’ve posted in my internet language tag, then this is definitely the book for you, and if you’ve read all 17 pages (!) of posts in that tag, don’t worry, there will be plenty of new material! Other internettish phenomena! Deeper but still highly accessible linguistics! Broader themes than you can fit into a thousand-word article! And more that I haven’t even written yet!

The blog will keep running like normal, since I can’t imagine removing myself from the internet in order to write about it, and, as always, feel free to continue pointing me at things that I should analyze and/or cite along the way. 

This is extremely exciting!! I will be all over that book. Congrats!