A SYSTEMATIC LITERATURE REVIEW OF EVALUATING THE ACCURACY OF NVIVO, R, AND LARGE LANGUAGE MODELS IN THEMATIC ANALYSIS OF INTERVIEW DATA: A COMPARATIVE STUDY

Publisher

The Library, University of Kelaniya, Sri Lanka.

Abstract

In response to the growing integration of computational tools in qualitative research, this study presents a systematic comparison of three widely adopted approaches for conducting thematic analysis on qualitative interview data: NVivo (proprietary), R-based text analysis packages (open source), and Large Language Models (LLMs) such as ChatGPT. Thematic analysis remains a cornerstone of qualitative inquiry, and with increasing reliance on automated methods, it is essential to understand how accurately these tools replicate the depth, nuance, and contextual sensitivity of human-coded analysis. Interview transcripts, selected for their richness in expression and detail, provide a robust dataset for comparative evaluation. Manual thematic coding, grounded in Braun and Clarke's six-phase framework, is employed as the gold standard. Each tool then independently analyzes the same dataset to extract themes and patterns. The resulting outputs are assessed using established metrics such as Precision, Recall, F1 Score, and Thematic Overlap, which quantitatively capture the degree of alignment with the human-coded reference. This evaluation aims not only to identify the comparative strengths and weaknesses of each tool but also to highlight the contexts in which specific tools may be most effective or limited. Moreover, the study delves into the potential of hybrid approaches that combine human judgment with machine-driven analysis to improve both accuracy and efficiency. By systematically mapping the capabilities and limitations of these tools, the research provides valuable, evidence-based guidance for researchers and practitioners navigating the evolving domain of AI-supported qualitative analysis. The findings are intended to support more informed tool selection and promote the thoughtful integration of computational methods into human-centered research workflows.
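The abstract names Precision, Recall, F1 Score, and Thematic Overlap as the alignment metrics but does not give formulas. A minimal sketch of how such a comparison could be computed, assuming the standard set-based definitions of Precision/Recall/F1 and a Jaccard-style interpretation of Thematic Overlap (the function name and all theme labels below are hypothetical, not taken from the study):

```python
# Illustrative sketch only: compares a tool's extracted themes against a
# human-coded reference set using standard set-overlap metrics.

def theme_metrics(tool_themes, reference_themes):
    """Score a tool's themes against a human-coded reference set."""
    tool, ref = set(tool_themes), set(reference_themes)
    true_pos = len(tool & ref)  # themes the tool matched to the reference
    precision = true_pos / len(tool) if tool else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Assumed Jaccard-style reading of "Thematic Overlap":
    overlap = true_pos / len(tool | ref) if (tool | ref) else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "overlap": overlap}

# Hypothetical example themes (not from the study's dataset):
reference = {"work-life balance", "burnout", "career growth", "remote work"}
tool_output = {"work-life balance", "burnout", "compensation"}
print(theme_metrics(tool_output, reference))
```

In practice, tool-generated theme labels rarely match human labels verbatim, so a real pipeline would first map semantically equivalent labels onto each other (manually or via embedding similarity) before computing these scores.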

Citation

Jayawardana, B. C., Withanaarachchi, A. S., Jayalal, S., & Abeysekara, R. (2025). A systematic literature review of evaluating the accuracy of NVivo, R, and large language models in thematic analysis of interview data: A comparative study. Proceeding of the 3rd Desk Research Conference - DRC 2025. The Library, University of Kelaniya, Sri Lanka. (pp. 170–183).
