Has AI Spoiled the Use of the Em Dash and Semicolon?

A client recently gave me some feedback on an article I wrote, and they picked up on my use of en dashes (in this piece, mostly to offset parenthetical information) and thought it best to avoid them, as these dashes are now associated with AI-generated content.

It’s true – dashes like the one I just used there, or really the longer em dash, are used liberally by AI tools like ChatGPT. The reason AI uses them is that, well, humans used them first, and these large language models have been trained on extremely large amounts of human writing. As Brent Csutoras writes on Medium, “Turns out, em dashes are absolutely everywhere in the training data. In books, articles, essays, humans used them so often that AIs learned them as a default natural flow. It’s like asking a bird not to chirp.”

I often use the en dash, and have done so long before ChatGPT emerged. (The en dash – longer than a hyphen and shorter than an em dash – can function the same as the em dash, even though it’s also used in number and date ranges, like 1–10 and 2001–2010, or in connected pairs, such as London–Paris. I’ve just ended up using the en dash as a break, like the em dash, out of stylistic habit. Apparently, this is more common in the UK as well.) I use the en dash not just as parentheses, to interject an explanation or tangential thought, but also for emphasis, rhythm, changes in thought, or afterthoughts. In these cases, a single en dash at the end of a complete sentence can also offset additional information – like this.

In a piece for The Guardian, bemoaning AI’s use of the em dash, James Shackell writes:

And here I was being told the humble em dash—friend to poorly paid internet hacks everywhere—was now considered a sign not of genuine intelligence, but the other sort. The artificial sort. To the extent that I have to go through and manually remove them one by one, like nits. The absolute cheek. Not only am I losing my livelihood to AI—I’m losing grammar too.

However, he goes on to say, “Why couldn’t machines have embraced the semicolon? No one gives a fig about those.” Unfortunately, AI has fully embraced the semicolon, too (although it doesn’t use it as frequently as the em dash). Unlike Shackell, I do care about the semicolon. Along with the en/em dash, it can be a highly effective punctuation mark, helping to join connected ideas together, acting as a stronger pause than a comma, or separating items in a list that already include commas. One thing I appreciate about the writing I value is the tasteful use of semicolons and em dashes. It’s also something I’ve enjoyed trying to introduce into my own writing.

Skilful uses of these punctuation marks can elevate writing; they can alter the flow, pacing, and impact of the written word, which, as a purely non-fiction writer, I find useful for making writing more engaging and enjoyable to read. The semicolon and em dash are also useful for translating the natural rhythms of thought. Sometimes, we think in afterthoughts, or we think parenthetically, or in terms of points we want to emphasise, or how we want to make an argument, and semicolons and em dashes can help represent these different patterns of thinking.

Yet, if the semicolon and em dash were once thought to be distinctly human, they’re now widely associated with the non-human – the artificial, the automatic, the thoughtless, and the generic. It’s a real shame that writers, such as myself, are being encouraged to drop (or at least massively reduce) the use of certain punctuation marks. It restricts options for expression. It also means (and this is something else I’ve seen) that someone can be accused of using AI to generate content, even when they haven’t, and this may be purely because of how they’re using semicolons and em dashes.

If AI is being trained on human writing patterns and consistently uses those patterns to appear like a human writer, then when people use the em dash and semicolon as they always have, they run the risk of being accused of using AI-generated content. In the age of ChatGPT, false positives exist: AI checkers can scan genuinely human-generated content and flag it as AI-generated. It can sometimes be hard to distinguish real human writing from that generated by ChatGPT.

(On the other hand, there are now many clear signs when AI has been used, as noted in a Reddit post. The two I find most annoying are “contrast framing everywhere, it is not x, it is y, repeated over and over” and “fragmented, pseudo profound sentences. short. isolated. trying to feel reflective”. A lot of AI-generated writing reads like a LinkedIn post, with a marketing kind of tone, using vacuous, stock phrases like “And honestly? That’s rare.”)

To avoid being accused of using AI-generated content, people may try to restrict their natural use of (and genuine fondness for) em dashes and semicolons. Or they may be asked to do so, as in my case. This isn’t an issue for me in my professional work: I’m used to adapting my writing. It’s part of the job. But I detest the idea that this would, or should, become something I habitually do in my writing in general, out of fear that others will think I’ve used ChatGPT or some other AI tool to do my writing for me. For clarification, I never use AI to write articles, and I’m completely against doing so for various reasons, although I still appreciate AI’s uses as a writing assistant, not as a ghostwriter.

When we become vigilant and critical of AI-generated content (justifiably), we will, in many cases, spot actual cases of it; but the inundation of em dashes and semicolons may have the unfortunate side effect of causing us to dislike, be sceptical of, or lose trust in human writing that features these punctuation marks. When I’m writing in the way that feels natural to me, I still use en dashes and semicolons, and I find I can further develop as a writer when I think about how these punctuation marks can be more skilfully used. And I still appreciate reading writers who I perceive as using them to create more beautiful and impactful writing.

Yet, AI’s use of em dashes and semicolons sometimes makes me wonder whether I’m inadvertently sounding like AI when I use them myself. There’s always the risk that our overreliance on and constant exposure to AI content could unconsciously affect how we write. After all, how we write is, to some extent, influenced by what we read. The influence is often unconscious; a style of writing appeals to us aesthetically and psychologically, and it’s natural for that to sometimes come through in our own writing. Of course, every writer wants to (and often does) develop their own voice and style, but this voice and style never occur in a vacuum. So, perhaps some writers have inadvertently had AI patterns of writing creep into their work, despite not using AI to generate it.

I understand why my client wanted me to avoid en dashes. If there’s any hint that AI was used to generate the content being read, the reader can instantly lose trust. I certainly would. Obvious AI-generated content makes me cringe, and I dismiss it as lazy, uninteresting, and useless. But if there’s uncertainty about whether AI was used – perhaps if a mix of AI and human writing was used, or a person edited AI content – this also devalues the content. This is because we should be able to read, enjoy, and benefit from writing without distraction, scepticism, or wariness about whether the content is human-generated or not. But this is the reality of the situation that AI has created.

It’s worth calling attention to obvious or likely cases of AI-generated content when someone tries to pass it off as their own writing or thoughts; I think that’s a small way to try to maintain the value of human creativity (and the disvalue of deception). But what about those cases, which I’ve described, where AI hasn’t been used, but the use of em dashes and semicolons makes people suspicious and associate the writing with AI? It’s a hard question to answer, but instinctively, I want to say that we should resist letting AI take ownership of the em/en dash and semicolon.

I understand the hesitancy about using them in the case of promoting content on behalf of a company, which has a brand and reputation to protect. In my own writing, like on this blog, on the other hand, I don’t plan on abandoning or restricting my use of them. While I may try to be more alert to whether AI has influenced my writing at all, or if some writing could be perceived as AI-generated, it would be a loss to human expression and spontaneity to make a real effort to avoid or cut down on certain punctuation marks. There’s a separate argument about overusing em dashes and semicolons in writing; that was a potential problem before AI content emerged. That’s something I’d also want to pay more attention to and take note of, as well as other overuses of punctuation marks (I know that I’m very fond of using brackets myself, although I don’t know if or to what extent that habit negatively impacts my writing). What I don’t want to do, however, is let AI content dictate the punctuation I use.

Maybe optimistically, I do think we can retain our natural uses of em dashes and semicolons, as we’ve always used them, without risking accusations or loss of readership based on AI patterns of writing. Respected and reputable writers continue to use these punctuation marks without sacrificing their respectability or reputation.

There are so many more important (and more annoying and cringeworthy) signs of AI writing, which have become generically AI, that distinguish it from human writing. It wouldn’t be a massive loss if human writers avoided these other tell-tale signs; in fact, it could be a boon to human writing, helping us focus on what sounds most genuine, rather than generic, self-helpy, pseudo-profound phrasing and patterns of writing. The same doesn’t apply to em/en dashes and semicolons. These are valuable tools in writing, each with distinct uses; we gain much by using them skilfully – and lose much without them.
