Repository for Master's Thesis / Markus Rottmann, 97-919-294

This repository contains the R and NetLogo code for my Master's thesis "Simulation of Policies against Information-Pollution on Twitter".

  • All input and output data are referenced in the respective code files.
  • For easy viewing, click the link below each description, or download the .html version and open it in your browser.
  • The following .rds data sets, which contain tweets, are not included for privacy reasons:
    • corp_all_de.rds: corpus (collection of texts) containing tweets
    • df_tweets_all.rds: data frame containing all tweets, non-clean
    • df_tweets_clean_all.rds: data frame containing all tweets, clean
    • df_tweets_clean_de.rds: data frame containing all tweets, non-Italian, non-French, clean
    • df_tweets_clean_de_w_ipind.rds: data frame containing all tweets, non-Italian, non-French, clean, with an indicator of information pollution
  • The following .csv data sets, which contain personal information on candidates in the Swiss 2019 election, are not included for privacy reasons:
    • 2019_chvote_councilofstates.csv: data frame containing personal information on candidates for the Council of States
    • 2019_chvote_nationalcouncil.csv: data frame containing personal information on candidates for the National Council



Code: MT_scraping_05.rmd / MT_scraping_05.html

  • This code scrapes the tweets of all candidates for the 2019 elections and combines them into a data frame that also contains properties such as gender, age, party, etc. For the sake of privacy, all Twitter API credentials are set to “YYY”. All code chunks are set to eval = FALSE because the code takes a very long time to run.
  • MT_scraping_05.html


Code: MT_corpus_analysis_08.rmd / MT_corpus_analysis_08.html

  • This code identifies tweets that contain information pollution on corona in two steps. First, it identifies all corona tweets (including stratifying the collection of tweets). Second, it identifies information pollution within the corona tweets.
  • MT_corpus_analysis_08.html


Code: MT_networkanalysis_07.rmd / MT_networkanalysis_07.html

  • This code builds and analyzes the network from the downloaded tweets in two steps. First, it determines who mentions whose Twitter handle (e.g. @markus_rottmann). Second, it transforms this information into an adjacency matrix.
  • MT_networkanalysis_07.html
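
The two steps described above can be sketched in base R as follows. This is a minimal illustration, not the code from MT_networkanalysis_07.rmd: the toy tweets, the handles, the regular expression for a handle, and all variable names are assumptions.

```r
# Toy data: hypothetical handles and tweet texts, not taken from the thesis data.
tweets <- data.frame(
  author = c("alice", "bob", "carol", "alice"),
  text   = c("Agree with @bob and @carol", "Thanks @alice!",
             "No mentions here", "@bob again"),
  stringsAsFactors = FALSE
)

# Step 1: extract every @handle mentioned in each tweet.
mentions <- regmatches(tweets$text, gregexpr("@[A-Za-z0-9_]+", tweets$text))

# One row per (author, mentioned handle) pair.
edges <- data.frame(
  from = rep(tweets$author, lengths(mentions)),
  to   = tolower(sub("^@", "", unlist(mentions))),
  stringsAsFactors = FALSE
)

# Step 2: turn the edge list into a square adjacency matrix.
# Sharing the factor levels between rows and columns keeps the matrix square.
handles <- sort(unique(c(tweets$author, edges$to)))
counts <- table(
  factor(edges$from, levels = handles),
  factor(edges$to,   levels = handles)
)
adj <- matrix(as.integer(counts), nrow = length(handles),
              dimnames = list(from = handles, to = handles))

print(adj)
```

Rows are the mentioning accounts and columns the mentioned accounts, so `adj["alice", "bob"]` counts how often the hypothetical account alice mentions bob.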


Code: MT_probabilities_prediction_02.rmd / MT_probabilities_prediction_02.html

  • This code predicts, for every candidate in the network, the probability of tweeting a certain type of information pollution. These results are the input for the MABS simulation. This is achieved in three steps. First, a multinomial regression is run on all manually identified information pollution tweets among the corona tweets; the independent variables are gender and party. Second, the regression results are applied to all candidates in the network, which yields predicted probabilities. Third, since the probabilities of issuing an information pollution tweet are very low (less than 0.1%), the probabilities are enhanced, resulting in more pronounced probabilities.
  • MT_probabilities_prediction_02.html
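
The first two steps described above could look roughly like the sketch below, which uses `multinom()` from the nnet package (a recommended package that ships with standard R installations). Everything in the data is invented for illustration: the category labels `type_a` to `type_c`, the party labels, the candidate handles, and the column names are assumptions, and the actual model in MT_probabilities_prediction_02.rmd may be specified differently. The third step is left out because this README does not say how the probabilities are enhanced.

```r
library(nnet)

# Toy data, invented for illustration: one row per manually coded tweet.
coded <- data.frame(
  ip_type = factor(c("type_a", "type_a", "type_b", "type_c", "type_a", "type_b",
                     "type_c", "type_a", "type_a", "type_b", "type_c", "type_a")),
  gender  = factor(c("f", "m", "m", "m", "f", "f", "m", "f", "m", "m", "f", "f")),
  party   = factor(c("A", "A", "B", "B", "A", "B", "B", "A", "A", "B", "A", "B"))
)

# Step 1: multinomial regression of tweet type on gender and party.
fit <- multinom(ip_type ~ gender + party, data = coded, trace = FALSE)

# Hypothetical candidates in the network, with the same predictor columns.
candidates <- data.frame(
  handle = c("cand_1", "cand_2", "cand_3"),
  gender = factor(c("f", "m", "m"), levels = levels(coded$gender)),
  party  = factor(c("A", "A", "B"), levels = levels(coded$party))
)

# Step 2: apply the fitted model to every candidate.
# type = "probs" returns one row per candidate and one column per tweet type.
probs <- predict(fit, newdata = candidates, type = "probs")
rownames(probs) <- candidates$handle

print(round(probs, 3))
```

Each row of `probs` sums to 1, so it can be read as one candidate's distribution over the tweet types.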


Code: MT_resultanalysis_09.rmd / MT_resultanalysis_09.html

  • This code analyzes the results of simulating each countermeasure (in NetLogo). It calculates descriptive statistics, renders plots, runs OLS regressions, and performs the hypothesis tests.
  • MT_resultanalysis_09.html


NOTE Replication R
To replicate the R code:

  • Obtain the data sets that are not included, as listed in the first section.
  • Copy the folder R_Code_Replication, including sub-folders, to your computer and open the R project R_Code_Replication.Rproj.
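
Before running the code, it can help to check which of the non-included data sets are still missing. The sketch below is one way to do that; the file names are the ones listed in this README, while the helper `missing_data_files()` and the "Data" sub-folder are hypothetical, since the actual paths are set inside the individual .rmd files.

```r
# File names as listed in this README.
required_files <- c(
  "corp_all_de.rds",
  "df_tweets_all.rds",
  "df_tweets_clean_all.rds",
  "df_tweets_clean_de.rds",
  "df_tweets_clean_de_w_ipind.rds",
  "2019_chvote_councilofstates.csv",
  "2019_chvote_nationalcouncil.csv"
)

# Return the names of the required files that are not present in `data_dir`.
missing_data_files <- function(data_dir, files = required_files) {
  files[!file.exists(file.path(data_dir, files))]
}

# Example: check a hypothetical "Data" sub-folder of the project.
missing <- missing_data_files("Data")
if (length(missing) > 0) {
  message("Still missing: ", paste(missing, collapse = ", "))
}
```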



Code: network_sim_warn_x02_04.nlogo / network_sim_warn_x02_04.html


Code: network_sim_susp_x02_03.nlogo / network_sim_susp_x02_03.html


Code: network_sim_ban_x02_03_universal.nlogo / network_sim_ban_x02_03_universal.html


NOTE Replication NetLogo
To replicate the NetLogo code, copy the folder, including the sub-folder Data, to your computer and adjust the line set-current-directory "D:\Studium\MT_MasterThesis\MT_Code\NetLogo_Code" in each model accordingly.