Large Language Models and the 5W1H: Strengths, Weaknesses, and Open Questions
Large language models (LLMs) are increasingly employed in social science research as tools for analyzing text, generating insights, and assisting with theory development. Yet their contributions remain uneven, with strengths in some domains and significant limitations in others. This talk evaluates LLMs’ potential and constraints through an adaptation of the classic “5W1H” framework: What, Who, Whom, Where, When, and How. I show that LLMs perform reliably in identifying topics, actors, and explicit contexts, but are less effective at discerning audiences, temporal dynamics, and explanatory mechanisms. These asymmetries underscore both the promise and the pitfalls of integrating LLMs into social science research. By distinguishing among what LLMs can reliably do, where they underperform, and what remains uncertain, the talk provides a roadmap for using LLMs as complementary tools rather than substitutes for theory-driven inquiry. The goal is to highlight how LLMs can advance social science when their capacities are critically assessed, strategically applied, and embedded within established methodological frameworks.