Web Scraping Political Content from Social Media Sites: An Exploratory Data Analysis Approach
  • Asfandyar Khan, Hazara University Mansehra
  • Azam Jan, Hazara University Mansehra
  • Dilawar Shah, Bacha Khan University Charsadda
  • Faizan Ullah, Bacha Khan University Charsadda
  • Muhammad Haris Khan, Bacha Khan University Charsadda
  • Shujaat Ali, Bacha Khan University Charsadda
  • Muhamad Tahir, Kardan University

Corresponding Author: [email protected]

Abstract

In today’s rapidly evolving digital landscape, social media platforms such as Twitter and Facebook are among the most popular microblogging applications and play an important role in quickly disseminating up-to-date information to a large user base. In addition to being sources of entertainment and platforms for business campaigns, social media applications significantly influence political activity in developing democracies. However, social media networks often become channels for the rapid spread of fake news, viral videos, hate speech, and false articles, which feeds political propaganda. Existing studies have yet to address how Pakistan’s three major political parties use social media platforms for this purpose. In this study, we use exploratory data analysis (EDA) to explore and analyze the initial content of social networks in order to understand the data and gain insights for further analysis. We developed a web scraper, a tool widely used in data science, to extract unstructured content from the parties’ official Twitter and Facebook accounts, which are primarily used to spread political propaganda publicly. From Facebook posts, the scraper automatically extracts likes, shares, comments, and views; from tweets, it extracts likes, comments, and retweets. The collected data is then processed and analyzed using statistical methods to gain knowledge and insights from social media sites. One month of data suggests that Pakistan Tehreek-e-Insaf (PTI) posted 79.37% more content on Facebook, while Pakistan Muslim League Nawaz (PML (N)) tweeted 89.30% more on Twitter, compared to the other parties. This activity is part of their political propaganda to build a narrative and shape public opinion among their followers and voters in Pakistan.
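
The kind of per-party comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' scraper or analysis code: the record layout, the function names `post_share_by_party` and `mean_engagement`, and the sample values are all hypothetical placeholders and do not reproduce the study's data.

```python
from collections import Counter, defaultdict

# Hypothetical scraped records. Party labels and numbers are placeholders,
# not figures from the study.
posts = [
    {"party": "Party A", "platform": "facebook", "likes": 120, "shares": 30, "comments": 12},
    {"party": "Party A", "platform": "facebook", "likes": 80, "shares": 10, "comments": 4},
    {"party": "Party B", "platform": "facebook", "likes": 200, "shares": 50, "comments": 20},
    {"party": "Party B", "platform": "twitter", "likes": 40, "shares": 5, "comments": 1},
]


def post_share_by_party(records, platform):
    """Return each party's percentage share of all posts on one platform."""
    counts = Counter(r["party"] for r in records if r["platform"] == platform)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {party: 100.0 * n / total for party, n in counts.items()}


def mean_engagement(records, platform, field):
    """Return the per-party mean of one engagement field (e.g. 'likes')."""
    values = defaultdict(list)
    for r in records:
        if r["platform"] == platform:
            values[r["party"]].append(r[field])
    return {party: sum(v) / len(v) for party, v in values.items()}


if __name__ == "__main__":
    print(post_share_by_party(posts, "facebook"))
    print(mean_engagement(posts, "facebook", "likes"))
```

Summaries like these (posting share per platform and average engagement per post) are a common first step in EDA before any deeper statistical analysis.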