An analysis of nearly one billion Tweets maps the emergence of new words across the USA in unprecedented detail.
Feeling scute with your on fleek eyebrows or with your new balayage? Or are you rekt and baeless?
The English language is forever in flux, as new words are born and old ones die. But where do these terms come from and what determines whether they survive?
Charting linguistic change was once a painstakingly slow task, but a new analysis of nearly one billion Tweets – presented on 17 April at the Evolang International Conference on Language Evolution in Torun, Poland – now offers us an unprecedented glimpse of this process in action.
According to this new research, most of the more recent coinages will have originated in one of five distinct hotspots that are driving American English through continual change.
Candids and gainz
Compared to historical linguistic corpuses, the data available online is staggering. More than 20% of Americans were using Twitter at the time of the study – and each Tweet is timestamped and geocoded, offering precise information on the time and place that particular terms entered conversations.
The researcher behind the study, Jack Grieve at the University of Birmingham, UK, analysed more than 980 million Tweets in total – consisting of 8.9 billion words – posted between October 2013 and November 2014, and spanning 3,075 of the 3,108 US counties.
From this huge dataset, Grieve first identified any terms that were rare at the beginning of the study (occurring less than once per billion words in the last quarter of 2013) but which then rose steadily in popularity over the following year. He then filtered out proper nouns (such as Timehop) and terms appearing in commercial adverts, and he also removed any words that were already in the Merriam-Webster dictionary. Acronyms, however, were included.
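For readers who like to see the logic spelled out, the selection step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Grieve's actual procedure: the article does not say how "steadily risen" was measured, so the sketch simply requires the rate to increase in every period. The function names, the exclusion set and all the counts are made up for the example; only the threshold of one occurrence per billion words comes from the study as described here.

```python
from typing import Dict, List, Set

# From the article: a term counts as "rare" if it occurs less than
# once per billion words in the first period (last quarter of 2013).
RARE_THRESHOLD_PER_BILLION = 1.0


def per_billion(count: int, total_words: int) -> float:
    """Relative frequency of a term, expressed per billion words."""
    return count * 1_000_000_000 / total_words


def is_emerging(counts: List[int], totals: List[int]) -> bool:
    """True if a term starts out rare and then rises in every later period.

    counts[i] is how often the term occurred in period i and totals[i] is
    the total number of words in that period. Requiring a rise in every
    period is an illustrative assumption, not the study's own criterion.
    """
    rates = [per_billion(c, t) for c, t in zip(counts, totals)]
    starts_rare = rates[0] < RARE_THRESHOLD_PER_BILLION
    rises_steadily = all(b > a for a, b in zip(rates, rates[1:]))
    return starts_rare and rises_steadily


def find_emerging(term_counts: Dict[str, List[int]],
                  totals: List[int],
                  excluded: Set[str]) -> List[str]:
    """Return emerging terms, skipping proper nouns, advert terms and
    dictionary words (all lumped into the hypothetical `excluded` set)."""
    return sorted(
        term for term, counts in term_counts.items()
        if term not in excluded and is_emerging(counts, totals)
    )


# Made-up counts over four periods of two billion words each.
totals = [2_000_000_000] * 4
term_counts = {
    "baeless": [1, 40, 90, 200],         # rare at first, then rising
    "selfie": [5000, 5200, 5100, 5300],  # already common at the start
    "timehop": [0, 50, 100, 300],        # rising, but a proper noun
    "fadword": [1, 80, 20, 5],           # spikes and then fades
}
print(find_emerging(term_counts, totals, excluded={"timehop"}))
```

With these invented numbers, only "baeless" survives: "selfie" is too common at the outset, "timehop" is excluded as a proper noun, and the hypothetical "fadword" does not keep rising.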
The result was a list of 54 terms, which covered everything from sex and relationships (such as “baeless” – a synonym for single) and people’s appearance (“gainz” to describe the increased muscle mass from bulking up at the gym) to technology (“celfie” – an alternative spelling of selfie). Others reflected the infiltration of Japanese culture (such as “senpai”, which means teacher or master). They also described general feelings, like “litt” (or “litty”, which means impressive or good), or affirmations such as “yaaaas” (as an alternative to yes). Interestingly, some of these terms, such as “candids” (a noun describing photos taken without the subject’s knowledge), have been around for years, but were extremely rare until seeing a sudden rise in popularity.
Having compiled this new lexicon, Grieve next used Twitter’s geocoded data to track its origins and spread across the USA. Baeless, for instance, appeared to crop up in a few different counties across the south, before building in popularity and then spreading north and west.
In total, Grieve identified five hubs driving linguistic change. In order of importance, they were:
The West Coast
Encompassing Los Angeles, San Francisco, San Diego and Las Vegas, the West Coast primarily introduced concepts such as cosplay, technical jargon like “faved” (favourited) and gaming-language such as “rekt” (defeated). Terms from the West Coast often spread far and wide and were frequently picked up by people in the North East.
Notable terms: amirite (Am I right?); baeritto (a lover you’d like to wrap your arms around like a burrito); figgity (intoxicated/very); slayin (looking great) and waifu (wife).
The Deep South
Grieve found that three distinct southern hubs were responsible for a large amount of the lexical innovation on Twitter. The most important of these centred on the Atlanta metropolitan area. Grieve suggests that the lexical innovation may result from its large African American population (Atlanta is often considered the centre of African American culture), which is bringing its colourful expressions online.
Notable terms: baeless (single); boolin’ (chilling); famo (family and friends); traphouse (drug house).
North East
With its dense population, it’s little surprise that New York would be a centre of linguistic innovation. Surprisingly, though, these words did not have a large geographical reach and tended to be contained within neighbouring states. New Yorkers would nevertheless often follow West Coast trends, and vice versa.
Notable terms: balayage (a hairstyle); litt or litty (good); lituation (a ‘litt’ situation).
Mid-Atlantic
This was the second southern region to emerge as a hot spot, centred on Washington DC and Baltimore and extending into Virginia. Once again, the greatest creativity appeared to arise in the regions with the largest African American populations.
Notable terms: on fleek (on point/flawlessly styled); shordy (short); wce (woman crush everyday).
Gulf Coast
The third (and final) southern region to feature in Grieve’s analysis, this hub centred around New Orleans, extending across Louisiana and into eastern and coastal Texas and along the Mississippi to Memphis. One of the region’s most noteworthy contributions – idgt (I don’t get tired) – became a catchphrase of the rapper Kevin Gates, who grew up in Baton Rouge, the state capital of Louisiana, and released a single of the same name in 2014.
Notable terms: bruuh (bro’); idgt (I don’t get tired); lordt (Oh Lord!).
Grieve says he was surprised by the results. According to linguistic theory, you would expect new words to arise in areas with the highest population density – but this could not explain all the variation. Grieve’s data confirms that the cultural (and linguistic) importance of a region is only loosely connected to its actual size.
Grieve’s latest studies have examined the characteristics of the new terms themselves and the reasons they thrive. One factor was length – as you might expect, we prefer shorter words and acronyms that are more efficient to type and to read. And in the same way that living creatures are more likely to propagate in ecosystems with few competing species, the most popular coinages were also more likely to occupy their own “semantic niche”: they defined an entirely new concept with no direct synonym. Balayage, for instance, which describes a very particular hairstyle, fared better than baeless, which had to compete with the existing term “single”.
Besides these particular results, Grieve’s report was an important proof of principle, and he told the Evolang conference that it may be the first of many similar analyses to probe the secret lives of words online. To which the only appropriate response is yaaaas!