Whose Fingerprint Does the News Show? Developing Machine Learning Classifiers for Automatically Identifying Russian State-Funded News in Serbia

Ognjan Denkovski, Damian Trilling


Democratic nations around the globe are facing increasing levels of false and misleading information circulating on social media and news websites, propagating alternative sociopolitical realities. One of the most innovative actors in this process has been the Russian state, whose disinformation campaigns have influenced elections and shaped political discourse globally. A key element of these campaigns is the content produced by state-funded outlets like RT and Sputnik, whose articles are republished by underfunded or sympathetic local media, as well as coordinated groups that attempt to shape mainstream political narratives. Using a tailored text-as-data approach, we examine the thematic and linguistic differences in articles produced by U.S. and Russian state-funded and mainstream outlets in Serbia. We use 11 features (frames and in-text characteristics) to construct an article country-source classifier with a high degree of accuracy. The article contributes toward an understanding of the structural characteristics of Russian state-funded news in the Western Balkans, enhances the application of computational text analysis in Serbian, and provides suggestions for the application of text-as-data methods to the study of online disinformation.


Keywords: disinformation, computational text analysis, text classification, automated frame identification, fragmented audience, Western Balkans
