Prof. Dr. Sebastian Haunss

PAPEA: A modular pipeline for the automation of protest event analysis

Protest event analysis (PEA) is the core method for understanding the spatial patterns and temporal dynamics of protest. We show how large language models (LLMs) can be used to automate the classification of protest events, and of political event data more broadly, with levels of accuracy comparable to those of human coders, while reducing the required annotation time by several orders of magnitude.

We propose a modular pipeline for the automation of PEA (PAPEA) based on fine-tuned LLMs, and we provide publicly available models and tools that can be easily adapted and extended. PAPEA makes it possible to get from newspaper articles to PEA datasets with high levels of precision and without human intervention. A use case based on a large German news corpus illustrates the potential of PAPEA.

  • Haunss, Sebastian, Priska Daphi, Jan Matti Dollbaum, Lidiya Hristova, Pál Susánszky, and Elias Steinhilper. 2025. ‘PAPEA: A Modular Pipeline for the Automation of Protest Event Analysis’. Political Science Research and Methods, published online, 23 June 2025, https://doi.org/10.1017/psrm.2025.10013.
