It inquires into the current tensions between supranational EU governance and popular mobilisation at the national level, critically questioning EU-driven policies and EU legitimacy. It proposes remedial actions based on sound empirical research on the relationship between public opinion and national and supranational political elites. The project is financed by the European Commission.
MRC’s role in the EUENGAGE project was to carry out a systematic mass media content analysis (WP7) of the 30 most prominent news outlets in 10 European countries. By means of automated text analysis, the study uncovered how the press in different national contexts prioritises and frames controversial European-level topics like Brexit, immigration, economy and security, and how the EU is depicted when the media approach these topics.
The archive contains:
a) A main dataset which comprises metadata about news articles in 30 different online media outlets from the 10 countries studied in the project, on 4 topics: Brexit, Economy, Immigration and Security.
The articles covered were published between January 1, 2016 and October 31, 2016. They were collected either by developing crawlers adapted to the specifics of each individual website, or directly from the LexisNexis archive.
The dataset was used as input for two working papers on cross-national differences in the media coverage of the project’s main topics, as well as on how the constellation of actors, topics, events, frames and media characteristics is related to attitudes towards the EU.
The format of the final dataset is xlsx and it comprises 121,170 cases (news articles) and 11 variables which are described in the Codebook accompanying it. The dataset does not include the full text content of the news articles collected, which is subject to copyright laws, but it includes information that can be used to check and replicate the data collection process, like the Article Headlines, Outlet Name, Publication Date and, in some cases (for the websites that we scraped), the article URLs.
We are publishing this information to increase the transparency of our research and to compensate for the copyright restrictions that do not allow the public sharing of additional, valuable text data. In addition, the data we provide is a useful snapshot of the overall coverage in different countries that mass-media scholars can use, and the news headlines available in the dataset can be used as a corpus for further analyses.
How to cite
Popescu, Marina, Toka, Gabor, Marincea, Adina, Schoonvelde, Martijn, Medzihorsky, Juraj, De Vries, Erik. (2018). Online media dataset EUENGAGE. Bucharest: Median Research Centre. Available at: http://medianresearch.ro/en/projects/eu-engage-project/
b) A list of six csv-format datasets with main outputs of the text analysis that are referred to in the two working papers we produced within EUENGAGE:
- Number of articles downloaded from each of the 30 news outlets on each of the four topics (Brexit, immigration, economy, security) – “outlets_topics_N.csv”
- Percentage of articles downloaded for each topic out of all the four topics for each of the 30 news outlets – “outlets_topics_perc.csv”
- Number of references to main EU representatives and institutions in each of the 10 countries’ media on each of the four topics – “EU_actors_N.csv”
- Percentage of references to main EU representatives and institutions in each of the 10 countries’ media on each of the four topics (calculated for each EU actor out of all EU actors) – “EU_actors.csv”
- References to EU actors in article texts compared to article headlines (in raw numbers and percentage) – “EU_actors_comp.csv”
- Sentiment towards EU in each of the 30 media outlets for each of the four topics – “sentiment.csv”
c) A technical report comprising information about the selection of the 30 media outlets analysed, the search queries (keywords) used for selecting the corpus, for each of the four topics and each of the ten languages, and codebooks for the six csv datasets together with citation instructions.
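The relationship between the count and percentage files listed under b) (for example “outlets_topics_N.csv” and “outlets_topics_perc.csv”) can be illustrated with a minimal Python sketch. It assumes that each percentage is the outlet's article count for one topic divided by its total across the four topics; the function name, the topic keys and the sample counts are hypothetical and are not taken from the actual files, whose layout is documented in the codebooks.

```python
from typing import Dict

# The four topics covered by the project. The exact column names used
# in the csv files are an assumption here; check the codebooks.
TOPICS = ("Brexit", "Economy", "Immigration", "Security")


def topic_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Convert one outlet's per-topic article counts into percentages
    of that outlet's total across the four topics."""
    total = sum(counts[topic] for topic in TOPICS)
    if total == 0:
        # An outlet with no articles gets 0% for every topic.
        return {topic: 0.0 for topic in TOPICS}
    return {topic: round(100.0 * counts[topic] / total, 2) for topic in TOPICS}


# Made-up counts for a hypothetical outlet (not real project data).
example_counts = {"Brexit": 120, "Economy": 300, "Immigration": 180, "Security": 200}
example_perc = topic_percentages(example_counts)
```

The same row-wise normalisation would apply to the EU actor files, where each actor's references are expressed as a share of all EU actor references, provided the files follow the assumed layout.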
Martijn Schoonvelde (Vrije Universiteit Amsterdam)
Juraj Medzihorsky (University of Gothenburg)
Erik De Vries (University of Stavanger)