Research

My research is organized around two main axes, one substantive and one methodological.

Substantive Axis

Public Policy, Science and Technology

(Evidence-informed policymaking, impact of AI on policy processes, science-policy interface, expert influence, policy advisory systems)

Methodological Axis

Machine Learning and LLMs for Social Science Research

(Text-as-data, NLP pipelines, research infrastructure, collaborative platforms)


Public Policy, Science and Technology

(Evidence-informed policymaking, impact of AI on policy processes, science-policy interface, expert influence, policy advisory systems)

2026

Computational analysis of the relationship between scientific credibility and political influence of public health agencies in Canada.

Abstract

This study examines the relationship between scientific credibility and political influence of public health research agencies in Canada. Using advanced computational methods, we analyze how the epistemic authority of these institutions translates (or fails to translate) into influence on public policy decisions. The study focuses on several key agencies, including INSPQ, Health Canada, and the Public Health Agency of Canada, examining their role during recent health crises. The results reveal complex dynamics between scientific expertise, institutional legitimacy, and political influence.

Public health, Epistemic authority, Scientific advisory, Canada, INSPQ, Computational analysis

Sweden's collaborative science-policy interface proved more resilient than Quebec's politically centralized model during the pandemic.

Abstract

This study examines how Science-Policy Interface (SPI) configurations shaped their institutional resilience during the COVID-19 pandemic through a comparative analysis of Quebec and Sweden. While both jurisdictions faced similar epidemiological conditions, their SPIs differed fundamentally: Quebec's politically subordinated model concentrated decision-making in government, while Sweden's scientifically autonomous model granted its Public Health Agency substantial independence. Using comparable public opinion data on trust, adherence to health measures, and crisis optimism, this study assesses which SPI configuration better sustained public support throughout the pandemic. Results show that Sweden's collaborative SPI demonstrated superior resilience, maintaining trust and optimism despite deteriorating epidemiological indicators. Quebec's politically centralized model proved more vulnerable, with rising death rates significantly eroding government trust.

COVID-19, Resilience, Public trust, Science-Policy Interface, Quebec, Sweden, ENDURE project

Framing affects COVID-19 policy decisions in Quebec and Sweden, often independently of available evidence.

Abstract

This study investigates how framing, evidence, and the respective roles of scientists and political decision-makers influenced public health policy decisions during the COVID-19 pandemic in Quebec and Sweden. Using a comprehensive dataset of press conference transcripts, we apply natural language processing (NLP) to assess the impact of different framings on suppression and mitigation policies. Our analysis reveals that framing affects policy decisions, often independently of evidence. In Quebec, where political decision-makers were central, a Dangerous framing, which emphasizes the severe health threats of COVID-19, is associated with an increase in stringent suppression policies, even in the absence of strong evidence. In contrast, Sweden's policy process, characterized by scientific autonomy, required high levels of evidence for the Dangerous framing to affect suppression policies.

COVID-19, Framing, Public Policy, NLP, Public Health, Comparative Analysis
2025

Policy decisions during the pandemic were shaped by a blend of scientific evidence and human judgment, not by science alone.

Abstract

This chapter explores the relationship between science, human judgment, and public policy during the COVID-19 pandemic. It highlights that, contrary to a prevailing rhetoric that policy decisions can be strictly science-based, the reality is far more complex. Political decisions are influenced by a blend of scientific evidence and human judgment, which includes instincts, beliefs, and attitudes. Key examples from the COVID-19 crisis illustrate this interplay.

Science and policy, Human judgment, COVID-19, Evidence-based policy

Worst-case scientific projections initially drive strict policies, but prolonged catastrophism erodes public support for both the policies and the experts themselves.

Abstract

In the face of protracted crises like climate change or pandemics, the influence of expert scientific projections on public policy is crucial yet evolves over time. This study offers an empirical demonstration of a previously fragmented theory: the diminishing influence of scientific projections on policy over time. Using a mixed-method analysis, we trace the interplay between expert projections, policy stringency, and public support during the COVID-19 pandemic. Scientific projections that put forward worst-case scenarios have a considerable impact on policies made in the early stages of a crisis. However, as the crisis wears on, these catastrophic projections instill a sense of fatalism and inadvertently diminish public support for both the policies and the scientific projections themselves.

COVID-19, Expert influence, Scientific projections, Policy stringency, Public support
2024

Network analysis of how density, centralization, and openness of advisory networks shaped the flow of expert advice across four jurisdictions.

Abstract

This study presents a dual-method approach to systematically analyze public health advisory networks during the COVID-19 pandemic across four jurisdictions: Belgium, Quebec, Sweden, and Switzerland. Using network analysis inspired by egocentric analysis and a subsystems approach adapted to public health, the research investigates network structures and their openness to new actors and ideas. The findings reveal significant variations in network configurations, with differences in density, centralization, and the role of central actors. The study also uncovers a relation between network openness and its structural attributes.
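As a rough illustration of the structural measures mentioned above, the following sketch computes density and Freeman degree centralization for two toy undirected networks. The actor names, the edge lists, and the choice of degree-based centralization are assumptions made for this example; the study's own operationalization may differ.

```python
from itertools import combinations


def degree_map(nodes, edges):
    """Count undirected ties per actor."""
    deg = {n: 0 for n in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def density(nodes, edges):
    """Share of possible undirected ties that are actually present."""
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1))


def degree_centralization(nodes, edges):
    """Freeman degree centralization: 1.0 for a star, 0.0 when all degrees are equal."""
    n = len(nodes)
    if n < 3:
        raise ValueError("centralization needs at least three actors")
    deg = degree_map(nodes, edges)
    top = max(deg.values())
    return sum(top - d for d in deg.values()) / ((n - 1) * (n - 2))


# Hypothetical advisory network A: one agency connected to four advisers (a star).
star_nodes = ["agency", "adviser_1", "adviser_2", "adviser_3", "adviser_4"]
star_edges = [("agency", other) for other in star_nodes[1:]]

# Hypothetical advisory network B: four actors who all advise each other.
clique_nodes = ["a", "b", "c", "d"]
clique_edges = list(combinations(clique_nodes, 2))

for name, nodes, edges in [
    ("star", star_nodes, star_edges),
    ("clique", clique_nodes, clique_edges),
]:
    print(name, density(nodes, edges), degree_centralization(nodes, edges))
```

The star is sparse but fully centralized, while the clique is dense and not centralized at all, which is why the two measures are usually read together.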

Policy advisory systems, Network analysis, COVID-19, Scientific advice, Comparative analysis

NLP-derived indices show that uncertainty drove stringent policies in Quebec, while scientific discourse encouraged their relaxation.

Abstract

This article examines the interplay between uncertainty, emotions, and scientific discourse in shaping COVID-19 policies in Quebec, Canada. Through the application of natural language processing (NLP) techniques, indices were developed to measure sentiments of uncertainty among policymakers, their negative sentiments, and the prevalence of scientific statements. The study reveals that while sentiments of uncertainty led to the adoption of stringent policies, scientific statements and the evidence they conveyed were associated with a relaxation of such policies, as they offered reassurance and mitigated negative sentiments.

Uncertainty, Emotions, Scientific discourse, COVID-19, NLP, Policy analysis
2022

Sweden favoured WHO information while Quebec relied on foreign experiences, leading to radically different school closure decisions.

Abstract

The COVID-19 pandemic has given science, scientific information, and health experts a prominent role in public policy design. However, political decisions are rarely automatically derived from scientific information or expertise. In this chapter, we focus on the role of information selection and processing in policy design by studying school closure decisions during the early days of the COVID-19 pandemic. Faced with similar infection curves, Sweden and Quebec made radically different decisions about school closures. We argue that the Swedish authorities favoured information from the World Health Organization, whereas Quebec relied more on foreign experiences.

Policy design, Information selection, COVID-19, School closures, Quebec, Sweden
2026

Nearly five decades of Canadian climate coverage, 9.2 million sentences annotated across 65 categories by BERT and CamemBERT classifiers.

Abstract

The Canadian Climate Framing (CCF) Database is a comprehensive, machine-learning-annotated corpus designed to enable large-scale analysis of climate discourse in Canadian print media. It comprises 266,271 articles from 20 major Canadian newspapers spanning nearly five decades (1978-2024), processed into 9,198,158 bilingual sentences (82.9% English, 17.1% French). Each sentence is annotated with 65 hierarchical categories using transformer-based classifiers (BERT for English, CamemBERT for French), trained on over 4,000 expert-coded sentences.

Climate change, Media analysis, Machine learning, BERT, CamemBERT, Canada, Framing
2025

Over four decades, 78% of causal links between climate events and media coverage have come to involve a change of frame. Science has lost its dominant position.

Abstract

In 1988, Canadian media covered climate change as a scientific problem. By 2023, the worst wildfire season in recorded history produced a political debate about carbon taxes. Analysing 266,271 articles from 20 Canadian newspapers (1978-2024), we show that climate events are increasingly reinterpreted through political and economic frames. Over four decades, 78% of causal links between events and media coverage have come to involve a change of frame. The Political-Economic configuration that dominates Canadian climate discourse forms a paradigm that does not need to be actively defended. Meanwhile, Science has lost its dominant position (from 81% to 23% of days), its share of media attention (25% to 11%), and the voice of its own messengers (39% to 21%).

Climate change, Paradigm, Media framing, Canada, Political economy, Science communication

Using 250,000+ annotated articles to test whether political climate discourse follows scientific appeals in Canadian media.

Abstract

This study uses a machine-learning-annotated corpus of over 250,000 Canadian newspaper articles (1988 to present) to examine whether policymakers' climate-related media interventions follow scientific appeals. As part of the CCF-Canadian-Climate-Framing Project led by Alizée Pillod, this research investigates the temporal dynamics between scientific discourse and political responses in the Canadian context of climate change. Using advanced NLP techniques, we analyze patterns of responsiveness, lag times, and the evolution of climate framing in both scientific and political spheres.

Climate change, Media framing, Political discourse, Science communication, Canada, NLP
2026

18,062 sentences from 5,842 articles across five languages, annotated along six dimensions of digital sovereignty through a hybrid LLM pipeline.

Abstract

We introduce AI.SOVEREIGNTY, a large-scale multilingual dataset of 18,062 sentences drawn from 5,842 news articles published in 39 media outlets across five languages (English, French, German, Spanish, Portuguese) between 2012 and 2024. Each sentence is annotated along six thematic dimensions of digital sovereignty (Authority, Economy, Ethics & Rights, Regulation, Security, and Technosolutionism) through a four-stage hybrid annotation pipeline combining iterative prompt engineering, a fine-tuned XLM-RoBERTa classifier for corpus filtering, generative annotation with GPT-5.2, and manual validation. Manual validation on 556 sentences yields a global Micro F1 of 72.7%.
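For readers unfamiliar with the metric, here is a minimal sketch of how a micro-averaged F1 can be computed when each sentence may carry several of the six dimensions. The multi-label setup, the function name `micro_f1`, and the toy gold and predicted labels are assumptions for illustration, not the project's validation code or data.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all labels."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels both the annotator and the model assigned
        fp += len(p - g)   # labels only the model assigned
        fn += len(g - p)   # labels only the annotator assigned
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Three hypothetical sentences, each with a set of manually validated labels
# and a set of labels produced by the pipeline.
gold_labels = [{"Security"}, {"Economy", "Regulation"}, {"Authority"}]
pred_labels = [{"Security"}, {"Economy"}, {"Authority", "Security"}]

score = micro_f1(gold_labels, pred_labels)
print(f"micro F1 = {score:.3f}")
```

Because counts are pooled before averaging, frequent dimensions weigh more heavily in a micro F1 than rare ones.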

Digital sovereignty, Multilingual corpus, AI governance, LLM annotation, Computational social science

Canonical policy process theories rest on five informational premises that AI and LLMs now render untenable. Proposes an Epistemic Policy Process (EPP) framework.

Abstract

The Multiple Streams Framework, Punctuated Equilibrium Theory, the Advocacy Coalition Framework, and the Narrative Policy Framework rest on five tacit premises about political information so self-evident when the theories were formulated that they were never explicitly stated: that political information is human-produced, filtered by identifiable gatekeepers, reflective of authentic public opinion, distinguishable from noise, and anchored in verifiable events. However, a radical transformation of the informational ecosystem completely upends these premises: machines now produce political information indistinguishable from that produced by humans. This article proposes a new Epistemic Policy Process (EPP) theory based on three dimensions to circumscribe the conditions under which the classical theories can still hold.

Policy Process Theories, Agenda-Setting, Generative AI, Large Language Models, Epistemological Crisis
2026

A continuously updated observatory of political video content, with 24,678 videos, 2.95 million annotated sentences, and 7.6 million comments.

Abstract

We present YouPol (YouTube and TikTok Political Observatory and Longitudinal database), a continuously updated research infrastructure that captures what political content creators actually say on video platforms. As of April 2026, the still-expanding corpus comprises 24,678 videos from 64 channels across France and Quebec, with full speaker-diarized transcripts (605,134 segments, 2.95 million annotated sentences) and 7.6 million archived comments. The infrastructure includes an independent transcription pipeline that produces high-quality transcripts regardless of platform-provided captions, and an LLM-in-the-loop annotation framework built on the open-source LLM Tool platform.

YouTube, TikTok, Political discourse, Far right, Research infrastructure, NLP
2025

An NLP-based Far-Right Ideological Score reveals a steady rise of far-right ideas in French Prime Ministers' speeches since the 1970s, driven by center and right-wing parties.

Abstract

This study analyzes the diffusion of far-right ideas in the general policy statements of French Prime Ministers (1959-2024) using Natural Language Processing methods. It develops and introduces the Far-Right Ideological Score (SIED), a quantitative indicator designed to measure the proportion of far-right ideas within political discourse. The results highlight three significant periods: a peak during the Algerian War (1959-1961), a decline following May 1968, and a steady rise from the 1970s onward, intensifying after 2005. The study reveals a cross-party diffusion of far-right ideas, particularly driven by center and right-wing forces.

Far-right ideology, French politics, NLP, Political discourse, Longitudinal analysis

Machine Learning and LLMs for Social Science Research

(Text-as-data, NLP pipelines, research infrastructure, collaborative platforms)

2026

18,062 sentences from 5,842 articles across five languages, annotated along six dimensions of digital sovereignty through a hybrid LLM pipeline.

Abstract

We introduce AI.SOVEREIGNTY, a large-scale multilingual dataset of 18,062 sentences drawn from 5,842 news articles published in 39 media outlets across five languages (English, French, German, Spanish, Portuguese) between 2012 and 2024. Each sentence is annotated along six thematic dimensions of digital sovereignty through a four-stage hybrid annotation pipeline.

Digital sovereignty, Multilingual corpus, AI governance, LLM annotation

24,678 transcribed videos, 2.95 million annotated sentences, and a collaborative computing network for real-time updates.

Abstract

We present YouPol (YouTube and TikTok Political Observatory and Longitudinal database), a continuously updated research infrastructure that captures what political content creators actually say on video platforms. As of April 2026, the still-expanding corpus comprises 24,678 videos from 64 channels across France and Quebec, with full speaker-diarized transcripts (605,134 segments, 2.95 million annotated sentences) and 7.6 million archived comments. The infrastructure includes an independent transcription pipeline and an LLM-in-the-loop annotation framework built on the open-source LLM Tool platform. YouPol also introduces the YouPol Collaborative Computing Network (YCCN), which allows any collaborating researcher to contribute processing capacity from their own machine.

YouTube, TikTok, Research infrastructure, Transcription, NLP, Collaborative computing

9.2 million bilingual sentences from 20 newspapers, annotated across 65 categories with BERT and CamemBERT classifiers (macro F1 = 0.866).

Abstract

The Canadian Climate Framing (CCF) Database is a comprehensive, machine-learning-annotated corpus designed to enable large-scale analysis of climate discourse in Canadian print media. It comprises 266,271 articles from 20 major Canadian newspapers spanning nearly five decades (1978-2024), processed into 9,198,158 bilingual sentences (82.9% English, 17.1% French). Each sentence is annotated with 65 hierarchical categories using transformer-based classifiers (BERT for English, CamemBERT for French), trained on over 4,000 expert-coded sentences. The models achieve a macro F1 score of 0.866 against an independent gold standard with confirmed intercoder reliability.

Climate change, Media analysis, Machine learning, BERT, CamemBERT, Canada, Framing
2025

An NLP-based Far-Right Ideological Score applied to 65 years of French Prime Ministers' speeches.

Abstract

This study analyzes the diffusion of far-right ideas in the general policy statements of French Prime Ministers (1959-2024) using Natural Language Processing methods. It develops and introduces the Far-Right Ideological Score (SIED), a quantitative indicator designed to measure the proportion of far-right ideas within political discourse.

Far-right ideology, French politics, NLP, Political discourse, Longitudinal analysis
2026

An open-source pipeline that combines LLM-generated labels with BERT distillation, achieving 109-395x speedup over direct LLM annotation.

Abstract

Large language models now routinely annotate text in computational social science, but they do not scale to corpora of several million sentences. Proprietary LLMs are financially prohibitive, local open-weight LLMs take days or weeks of computation, and both remain opaque and hard to reproduce. Using LLMs to train dedicated classifiers addresses these problems, yet the approach itself remains sparsely tested in the social sciences and largely inaccessible to researchers without engineering support. We present LLM Tool, an open-source Python package that runs the full hybrid workflow from a command line. On a bilingual corpus of 38,451 Canadian parliamentary debates and news media texts coded across four dimensions, classifiers trained on the best LLM labels reach a mean Micro F1 of 68.9%. Open-weight models such as GPT-OSS match one of the best proprietary models available, GPT-5, and deliver a 109-395x inference speedup over direct LLM annotation on standard workstations.
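The hybrid workflow can be pictured with a toy sketch: a costly "teacher" labels a small sample once, and a cheap "student" classifier trained on those labels handles the rest of the corpus. Everything below is hypothetical and is not the LLM Tool API: `fake_llm_label` is a keyword rule standing in for an LLM call, and a small naive Bayes model stands in for the BERT classifier so that the example runs with the standard library alone.

```python
import math
from collections import Counter, defaultdict


def fake_llm_label(text):
    """Stand-in for an expensive LLM call (a hypothetical keyword rule, not a real model)."""
    lowered = text.lower()
    return "climate" if "carbon" in lowered or "emissions" in lowered else "other"


def tokenize(text):
    return text.lower().split()


def train_student(texts, labels):
    """Fit a tiny multinomial naive Bayes model on the teacher's labels."""
    word_counts = defaultdict(Counter)
    label_counts = Counter(labels)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            # Laplace smoothing keeps unseen words from zeroing out a label.
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Step 1: the teacher labels a small sample (the slow, costly step).
sample = [
    "the carbon tax debate continues in parliament",
    "new emissions targets were announced today",
    "carbon pricing and emissions rules face criticism",
    "the hockey season opens this weekend",
    "the budget includes funding for hockey arenas",
    "housing prices rose again this month",
]
teacher_labels = [fake_llm_label(text) for text in sample]

# Step 2: the student learns from those labels and annotates new text cheaply.
student = train_student(sample, teacher_labels)
print(predict(student, "ministers discuss carbon emissions"))
print(predict(student, "hockey fans celebrate this weekend"))
```

The design point is that the teacher is called only on the training sample, so annotation cost no longer grows with corpus size; a real pipeline would also validate the student against human-coded data before trusting its output.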

LLM, BERT, Automated annotation, Computational social science, NLP, Machine learning, Open source

Platforms & Infrastructures

CCF Project

266,000+ Canadian climate articles annotated by 60+ ML models, spanning 1978-2024.

YouPol

Transcribed political videos from YouTube and TikTok, with speaker diarization and annotation.