Language Analysis Across Contexts: A Dialogue Between Forensic Linguistics and Psychology — A Rejoinder to Hunter & Grant (2025)

Before I dive into my thoughts, I want to openly acknowledge my longstanding involvement with the Linguistic Inquiry and Word Count (LIWC) software. I’ve been deeply engaged in its development and daily operations for well over a decade at this point in my life, ranging from (but hardly — and I mean hardly — limited to) continual development/improvement of the LIWC dictionary to driving major updates like LIWC-22. Although I use a fairly wide variety of NLP methods and don’t particularly consider myself a “LIWCian,” my close association with LIWC does color my perspectives. So, as you read on, please consider my insights with an understanding of this background.


Hunter and Grant’s recent critique raises some good questions about LIWC’s applicability in forensic and security contexts. Their observations highlight areas where care and context are necessary when using LIWC and, really, any type of computational tool. While I believe some of their conclusions overstate, or perhaps misplace, LIWC’s limitations[1], their paper opens an opportunity to reflect on how we approach language analysis, particularly in high-stakes domains like security and extremism. I have not had any dialogue with the authors of the critique. My hope is that this post will come across as constructive rather than critical, collaborative rather than defensive. But, as we all know, language is complicated, and your interpretation may differ drastically from my intent.

1. The Challenge of Small Word Samples

One of the key features of Hunter and Grant’s analysis is their focus on a small number of (relatively) short text samples — ranging from 266 to 405 words per individual. For (what I am presuming to be) their forensic linguistics framework, which often seeks deep, precise, and case-specific insights into the language of a small number of individuals, this approach makes sense. When studying a single person or a small group, the risks of misunderstanding that person or group’s language and intentions are far too high to rely solely on computational tools like LIWC. Here, a multifaceted approach — combining LIWC with manual coding, qualitative analysis, and other converging computational methods like part-of-speech tagging — is essential.
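There is also a purely statistical dimension to the small-sample concern. LIWC outputs are percentages, and percentage estimates computed from a few hundred words are inherently noisy. The toy simulation below (my own illustration, not anything from LIWC itself) assumes each word independently has a fixed chance of belonging to a category and compares how much per-text rates fluctuate at roughly 300 words versus 5,000:

```python
import random
import statistics

def simulate_rates(true_rate: float, words_per_text: int,
                   n_texts: int, seed: int = 0) -> list[float]:
    """Simulate per-text category percentages under a toy model where each
    word independently belongs to the category with probability true_rate."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_texts):
        hits = sum(rng.random() < true_rate for _ in range(words_per_text))
        rates.append(100 * hits / words_per_text)
    return rates

# A ~2% category base rate, in the ballpark of many LIWC categories.
short = simulate_rates(0.02, words_per_text=300, n_texts=1000)
long_ = simulate_rates(0.02, words_per_text=5000, n_texts=1000, seed=1)

print(statistics.stdev(short))  # noisy per-text estimates at ~300 words
print(statistics.stdev(long_))  # several times more stable at 5,000 words
```

Under these assumptions, a category with a true rate of 2% can easily swing by a full percentage point in a 300-word sample through chance alone, which is worth keeping in mind before interpreting differences between a handful of short texts.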

However, LIWC’s roots lie in scientific psychological research, where the goals are often radically different. Rather than understanding one person in granular detail, LIWC is typically used to analyze larger samples of language to identify patterns and measure group-level differences, uncover reliable statistical associations, detect within-person changes over time in a sample, and so on. In psychological research, the focus is less on the precise interpretation of individual texts and more on what aggregated language data can reveal about psychological constructs like emotional traits/states, cognition, or social orientation.

This is a fundamentally different mindset from the one often adopted in discourse analysis or forensic linguistics. Instead of focusing on the fine-grained meaning of a specific text or word, psychological science leverages tools like LIWC to explore generalizable patterns that hold across individuals and contexts. That’s not to say one approach is better than the other — they serve different purposes. For researchers interested in understanding one or two people deeply (e.g., in security or extremism contexts), LIWC is just one tool in a broader methodological toolkit, and it must be complemented by qualitative and manual approaches precisely because the stakes of misinterpreting language in such contexts are so high.

However, when applied to larger datasets or group-level analyses, LIWC’s strengths come to the fore. By focusing on high-frequency, high-reliability indicators of various psychological constructs, LIWC provides insights into broad patterns that might otherwise be invisible in smaller-scale or purely qualitative approaches. This tradeoff — precision over recall — is intentional and reflects LIWC’s design as a scalable psychometric tool rather than a forensic scalpel.

This difference in goals — understanding individuals deeply versus identifying patterns across many people — underscores the importance of matching methods to research questions. Both perspectives have value, and appreciating the strengths and limitations of each approach allows us to build a more nuanced and effective toolkit for studying language and psychology across a multitude of human contexts.

2. Understanding LIWC’s Design and Its Tradeoffs

Hunter and Grant critique LIWC for its limited recall in certain categories, such as negative emotion words. This observation is valid but reflects a deliberate tradeoff in LIWC’s design. LIWC prioritizes precision over recall, meaning it aims to capture the most common and reliable indicators of a psychological construct rather than attempting exhaustive coverage.

Consider an example: let’s say, for the sake of argument, that the most frequently used negative emotion words are bad, hate, and hurt, and these account for roughly 80% of all negative emotion words people use. LIWC’s strategy focuses on capturing this smaller subset of words reliably rather than including a much broader array of words that might reflect negative emotion in niche or context-specific ways. This focus reduces misclassifications and improves the tool’s reliability across a wide range of applications.

Why is this approach valuable? Imagine if LIWC tried to include every possible word or phrase that could indicate a negative emotion. We’d have to account for thousands of potential expressions, including idioms, regional slang, or community-specific jargon like “not a vibe” or “throwing shade.” While this might seem more comprehensive, it would actually increase the likelihood of errors. Words often have multiple meanings depending on context, and trying to capture every edge case would inevitably lead to more false positives — cases where the tool incorrectly classifies neutral or even positive expressions as negative. For instance, consider a word like “sick.” Depending on the context, it could mean feeling ill (negative), a synonym for something amazing (positive), or just be a filler word with no emotional tone at all.
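The counting logic behind this design choice is simple to sketch. The word list below is illustrative only: it is not LIWC’s actual negative-emotion dictionary, and real LIWC matching is richer (wildcard stems, multiword entries, and so on). The point is that a small, high-frequency core is matched reliably, while a polysemous word like “sick” is deliberately left out:

```python
import re

# Illustrative (NOT LIWC's actual) negative-emotion core: a small set of
# high-frequency, high-confidence words rather than an exhaustive inventory.
NEGEMO_CORE = {"bad", "hate", "hurt", "sad", "angry"}

def negemo_percent(text: str) -> float:
    """Percent of tokens matching the category, LIWC-style."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(t in NEGEMO_CORE for t in tokens)
    return 100 * hits / len(tokens)

print(negemo_percent("I hate this, it hurt so bad"))  # core words caught
print(negemo_percent("that concert was sick"))        # prints 0.0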

By focusing on high-frequency, high-confidence words, LIWC ensures that its categories reflect constructs in a way that’s consistent and reliable across diverse contexts. This approach is particularly important because LIWC isn’t designed for one specific use case — it’s applied in everything from psychological research to health communication, business analytics, and even education. Trying to capture every possible nuance would make the tool less generalizable and less effective in its core purpose: providing scalable insights into broad psychological patterns. If we can be relatively confident that the LIWC categories that are intended to measure affective state show convergent validity with other types of assessment (see, e.g., Hartnagel et al., 2024), we’re on the right track. (More on this below, when we discuss the nomological network.)

Of course, this doesn’t mean that niche or context-specific expressions are unimportant — far from it. In specific fields like forensic linguistics or sociolinguistics, capturing these nuances might be crucial. However, the tools for these contexts often need to be highly tailored and developed in collaboration with domain experts who understand the unique linguistic dynamics of the group being studied. LIWC’s generalist approach is designed to be a starting point, offering a reliable foundation for measuring psychological constructs that can be complemented with more context-specific methods as needed.

This design philosophy — prioritizing reliability and scalability — also aligns with LIWC’s roots in psychology, where researchers are often dealing with large datasets and need to ensure that they’re using the same measures in a way that is (at least conceptually) consistent across different samples and studies. In these cases, being able to reliably capture 80% of a construct with confidence is often far more useful than chasing the final 20%, especially if doing so has the potential to introduce a large amount of noise or misclassification into the analysis.

Ultimately, LIWC’s focus on capturing the “core” of a construct rather than its fringes reflects a deliberate tradeoff. It’s not trying to be all things to all people — it’s trying to do one thing well: help researchers explore how language reflects psychological processes in a way that’s efficient, replicable, and applicable across a wide range of settings.

This tradeoff means LIWC isn’t perfect for capturing the nuances of language in specific contexts, like extremist forums with specialized vocabularies. But its generalizability across diverse datasets is why it has become such a widely used tool.

3. The Value of a Nomological Network

Hunter and Grant express skepticism about what they perceive as circular reasoning in LIWC’s validation process, suggesting that many researchers simply assume it works because others have used it. While this critique raises important questions about transparency and rigor, it overlooks a foundational concept in psychological science: the nomological network.

A nomological network refers to the web of evidence supporting the validity of a measure through its consistent relationships with other constructs, behaviors, and outcomes. In psychology, many constructs — like depression, stress, or social connectedness — are inherently abstract and cannot be observed directly. Instead, they’re inferred from patterns of behavior, including language use, that are consistently tied to these constructs across studies and contexts.

Take, for example, the well-documented relationship between I-words (e.g., I, me, my) and depression. Many studies have found that people who use I-words more frequently tend to show higher levels of depression or other forms of psychological distress, likely reflecting an inward, self-focused attention pattern often associated with depressive states. This relationship has been observed in a range of contexts, from diary entries to social media posts, forming part of a larger nomological network connecting language to psychological phenomena. However, it’s equally important to acknowledge that this relationship doesn’t always hold. Some studies don’t find a strong link between I-words and depression, and that’s not necessarily a failure of the measure. Instead, it prompts us to ask deeper questions: When does this relationship appear? Under what circumstances does it weaken or disappear? What other factors — personal, cultural, situational, or methodological, for example — might influence the connection?
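For readers less familiar with this style of measure, an I-word rate is nothing more than a percentage of first-person-singular pronouns per text, aggregated across many texts. The pronoun list below is a common first-person-singular set used for illustration, not necessarily LIWC’s exact category:

```python
import re

# Illustrative first-person-singular pronoun set (not LIWC's exact list).
I_WORDS = {"i", "me", "my", "mine", "myself", "i'm", "i've", "i'll", "i'd"}

def i_word_rate(text: str) -> float:
    """Percent of tokens that are first-person-singular pronouns."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return 100 * sum(t in I_WORDS for t in tokens) / max(len(tokens), 1)

# The psychological use case is aggregate: compare means across groups or
# time points, not a verdict on any single text.
diary = ["I felt like I couldn't get out of bed.", "My day was fine."]
print(sum(i_word_rate(t) for t in diary) / len(diary))
```

The interesting scientific work then happens at the level of associations: whether such rates correlate with validated depression measures across samples, and under what conditions.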

This variability is not a flaw; it’s an opportunity to better understand the “how,” “when,” and “why” of the phenomena we’re studying. Just as a thermometer might sometimes give unexpected readings due to external conditions (like extreme humidity or altitude), variations in findings push us to refine our theories and methods. They help us identify boundary conditions, explore moderating variables, and deepen our understanding of the constructs themselves.

My colleague Dave Markowitz and I (along with many others in the field, such as Andy Schwartz) have written extensively about how tools like LIWC play a pivotal role in building and testing such networks of understanding. The strength of a tool like LIWC lies not in its ability to perfectly capture every nuance of a construct in every context, but in its ability to consistently reveal meaningful patterns that align with broader theoretical frameworks. For instance, even if the rate at which people use I-words isn’t always linked to depression in every dataset, the numerous studies that do find this connection contribute to a reliable and replicable foundation for understanding how self-focus relates to mental health.

This perspective differs from the forensic linguistics approach that Hunter and Grant appear to adopt, which often emphasizes dissecting specific texts or cases in granular detail. While this approach is valuable for understanding language in a particular context, it represents a different mindset from the norm in scientific, psychological research, where the focus is often on identifying generalizable patterns across larger datasets or populations. Neither approach is inherently “better”; they serve different purposes. Forensic linguistics excels at capturing nuance and context in specific cases, while LIWC’s strength lies in its ability to support replicable, theory-driven insights into psychosocial phenomena.

Ultimately, the value of a tool like LIWC lies in its integration into a larger nomological network. It’s not about capturing every edge case or delivering perfect results in every study — it’s about consistently contributing to a broader understanding of how language reflects the mind. The fact that not every study finds the same result isn’t a failure of the tool; it’s a reminder of the complexity of human behavior and the importance of refining our theories to account for that complexity.

4. Bridging Perspectives

Ultimately, I agree with Hunter and Grant on an essential point: tools like LIWC should not be used in isolation, especially in sensitive contexts like security or extremism. Language is complex, and its meanings shift depending on context, culture, and usage. Combining LIWC with qualitative methods and domain expertise ensures a more nuanced understanding of the phenomena under study.

At the same time, this discussion highlights a much broader and more exciting opportunity for the social sciences. Tools like LIWC, when thoughtfully integrated into multi-method approaches, remind us of the incredible potential language holds for unlocking insights into human thought, emotion, and behavior. Language is both universal and deeply personal — an intersection of the individual mind and the broader social world. By leveraging advancements in natural language processing and computational methods, while grounding our work in rigorous theory and domain expertise, we are poised to ask — and answer — questions that were once unthinkable.

This is an exciting moment for the social sciences. The integration of computational tools and, in particular, emerging AI methods like LLMs with traditional methods opens doors to studying the human condition at both larger scales and deeper levels of nuance that were previously unimaginable. We can investigate patterns across millions of voices, track societal shifts in real time, and uncover links between language and mental health, culture, or identity that deepen our understanding of ourselves and others.

Equally important, this convergence of methods fosters collaboration across disciplines — psychologists working with linguists, computational scientists partnering with sociologists, anthropologists joining forces with ethicists. These partnerships push us to think more creatively about how we study human behavior, ensuring that our methods are not only innovative but also thoughtful, inclusive, and responsible.

The takeaway here is much larger than any single tool: it’s a call to embrace the complexity of language and human experience while working toward solutions that are rigorous, ethical, and impactful. Tools like LIWC are just one part of a much larger toolkit, one that grows more powerful and promising as we combine perspectives and approaches. By fostering dialogue between disciplines and integrating diverse methods, the social sciences are not only evolving — they’re thriving, offering us new ways to understand and navigate our ever-changing world.


  1. Before jumping right into it, I’d like to point out that their critique is spot-on in identifying that a particular misspelling of a racial epithet was in the LIWC-22 “swear words” dictionary, but the “correct” spelling of that same word was not categorized in the same way. This was a forehead-slappingly, undeniably embarrassing oversight on our part — one that has since been corrected and will be included in a forthcoming LIWC-22 update. After nearly 18 months of painstaking work redeveloping the LIWC-22 dictionary, it’s clear that even the most meticulous processes can miss something critical when fatigue sets in. It’s a humbling reminder that attention to detail matters right up until the very last keystroke.
