What happened to the average length of tweets?

The recent doubling of the maximum tweet length provides an interesting opportunity to investigate the effects of a relaxation of length constraints on linguistic messaging. More notably, how did the character limit change (CLC) affect the structure and word usage in tweets?

The need for an economy of expression decreased post-CLC. Therefore, the first hypothesis states that post-CLC tweets contain relatively fewer textisms, such as abbreviations, contractions, symbols, or other 'space-savers'. In addition, we hypothesize that the CLC affected the part-of-speech (POS) structure of tweets, which would contain relatively more adjectives, adverbs, articles, conjunctions, and prepositions. These POS categories carry additional information about the situation being described, the referential situation, such as features of entities, the temporal order of events, locations of events or objects, and causal connections between events (Zwaan and Radvansky, 1998). This structural change also entails that sentences would be longer, with more words per sentence.

Gligoric et al. (2018) compared pre- and post-CLC tweets with a length of approximately 140 characters. They found that pre-CLC tweets in this character range contained relatively more abbreviations and contractions, and fewer definite articles. In the current study, we used a different approach that adds complementary value to the previous findings: we performed a content analysis on a dataset of approximately 1.5 million Dutch tweets covering all ranges (i.e., 1–140 and 1–280), instead of selecting tweets within a specific character range. The dataset comprises Dutch tweets that were created in the 2 weeks before and the 2 weeks after the CLC.

We performed a general analysis to investigate changes in the number of characters, words, sentences, emojis, punctuation marks, digits, and URLs. To test the first hypothesis, we performed token and bigram analyses to detect all changes in the relative frequencies of tokens (i.e., individual words, punctuation marks, numbers, special characters, and symbols) and bigrams (i.e., two-word sequences). These changes in relative frequencies could then be used to extract the tokens that were especially affected by the CLC. In addition, a POS analysis was performed to test the second hypothesis; that is, whether the CLC affected the POS structure of sentences. An example of each investigated POS category is presented in Table 1.
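The token and bigram comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used R packages), and it omits any statistical testing. The tokenizer regex, the function names `tokenize`, `relative_frequencies`, and `frequency_change`, and the toy English tweets are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of word characters, or any single non-space symbol.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Lowercase a tweet and split it into word and symbol tokens."""
    return TOKEN_RE.findall(text.lower())


def relative_frequencies(tweets, n=1):
    """Relative frequency of every n-gram (as a tuple) over a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        # zip over n shifted copies yields the n-grams within one tweet
        counts.update(zip(*(tokens[i:] for i in range(n))))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def frequency_change(pre, post, n=1):
    """Post-minus-pre difference in relative frequency for each n-gram."""
    pre_rf = relative_frequencies(pre, n)
    post_rf = relative_frequencies(post, n)
    grams = set(pre_rf) | set(post_rf)
    return {g: post_rf.get(g, 0.0) - pre_rf.get(g, 0.0) for g in grams}


# Hypothetical toy data, not taken from the study's dataset.
pre_tweets = ["u r gr8", "c u l8r"]
post_tweets = ["you are great", "see you later"]

token_change = frequency_change(pre_tweets, post_tweets, n=1)
bigram_change = frequency_change(pre_tweets, post_tweets, n=2)
```

In this sketch, n-grams with a strongly negative change are candidates for tokens that became relatively less frequent post-CLC, and those with a strongly positive change are candidates for tokens that became relatively more frequent.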


The data collection, pre-processing, quantitative analysis, figures, token analysis, bigram analysis, and POS analysis were performed using RStudio (RStudio Team, 2016). The R packages that were used are: 'BSDA', 'dplyr', 'ggplot2', 'grid', 'kableExtra', 'knitr', 'lubridate', 'NLP', 'openNLP', 'quanteda', 'R-base', 'rtweet', 'stringr', 'tidytext', 'tm' (Arnholt and Evans, 2017; Benoit, 2018; Feinerer and Hornik, 2017; Grolemund and Wickham, 2011; Hornik, 2016; Hornik, 2017; Kearney, 2017; R Core Team, 2018; Silge and Robinson, 2016; Wickham, 2016; Wickham, 2017; Xie, 2018; Zhu, 2018).

Period of interest

The CLC took place on … at … a.m. (UTC). The dataset comprises Dutch tweets that were created within 2 weeks pre-CLC and 2 weeks post-CLC (i.e., from 10-25-2017 to 11-21-2017). This period was subdivided into week 1, week 2, week 3, and week 4 (see Fig. 1). To analyze the effect of the CLC, we compared the language usage in 'week 1 and week 2' with the language usage in 'week 3 and week 4'. To distinguish the CLC effect from natural-event effects, a control comparison was devised: the difference in language usage between week 1 and week 2, referred to as Baseline-split I. Moreover, the CLC might have initiated a trend in language usage that evolved as more users became familiar with the new limit. This trend would be revealed by comparing week 3 with week 4, referred to as Baseline-split II.
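As a rough illustration of the three comparisons, the following Python sketch assigns tweets to weeks and splits them into the two sides of a comparison. It assumes that the four weeks are consecutive 7-day blocks starting at 00:00 UTC on 10-25-2017; the text gives the period's start and end dates but not the exact boundaries. The names `week_of`, `COMPARISONS`, and `split_for` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the period starts at 00:00 UTC on its first day (10-25-2017).
PERIOD_START = datetime(2017, 10, 25, tzinfo=timezone.utc)
WEEK = timedelta(days=7)

# Week sets on each side of the three comparisons described in the text.
COMPARISONS = {
    "Baseline-split I": ({1}, {2}),
    "Baseline-split II": ({3}, {4}),
    "CLC": ({1, 2}, {3, 4}),
}


def week_of(created_at):
    """Return the week number (1-4) of a timestamp, or None if out of range."""
    index = (created_at - PERIOD_START) // WEEK  # floor division gives an int
    return index + 1 if 0 <= index < 4 else None


def split_for(comparison, tweets):
    """Split (created_at, text) pairs into the two sides of a comparison."""
    first_weeks, second_weeks = COMPARISONS[comparison]
    first = [text for ts, text in tweets if week_of(ts) in first_weeks]
    second = [text for ts, text in tweets if week_of(ts) in second_weeks]
    return first, second


# Hypothetical example tweets, one per week.
utc = timezone.utc
tweets = [
    (datetime(2017, 10, 26, 9, 0, tzinfo=utc), "week one"),
    (datetime(2017, 11, 2, 9, 0, tzinfo=utc), "week two"),
    (datetime(2017, 11, 9, 9, 0, tzinfo=utc), "week three"),
    (datetime(2017, 11, 20, 9, 0, tzinfo=utc), "week four"),
]
pre, post = split_for("CLC", tweets)
```

Keeping the week sets in one table makes it easy to run the same analysis function over the CLC comparison and both baseline splits.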

Moving average and standard error of the character usage over time, which shows an increase in character usage post-CLC and an additional increase between week 3 and 4. Each tick marks the absolute beginning of the day (i.e., … a.m.). The time frames indicate the comparative analyses: week 1 with week 2 (Baseline-split I), week 3 with week 4 (Baseline-split II), and week 1 and 2 with week 3 and 4 (CLC)
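A moving average with its standard error, as plotted in the figure, could be computed along the following lines. This is only a sketch: the window size and the exact way the standard error was derived are not stated in the text, so the window of 3 and the function name `moving_mean_and_se` are assumptions, and the character counts are made up.

```python
from math import sqrt
from statistics import mean, stdev


def moving_mean_and_se(values, window):
    """Return (mean, standard error) for each full sliding window of values."""
    results = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        # Standard error of the mean: sample standard deviation / sqrt(n)
        se = stdev(chunk) / sqrt(window) if window > 1 else 0.0
        results.append((mean(chunk), se))
    return results


# Hypothetical character counts per time step, not data from the study.
lengths = [100, 110, 120, 130, 140, 150]
smoothed = moving_mean_and_se(lengths, window=3)
```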

