Since OpenAI made ChatGPT publicly available in November 2022, the use of large language models (LLMs) within academia and higher education has been widely scrutinised. Some points of view are critical of their use: using LLMs to generate written work has been considered akin to plagiarism, while software that can be used to identify work written using generative artificial intelligence (AI) is fallible. This suggests that the impact of LLM use may include restructuring of assessments and, in some cases, reverting to pen and paper-based submissions (Milano et al., 2023). Alternatively, the accessibility and widespread use of LLMs suggests that they are tools that will become common in many future careers and thus there is a responsibility of academics to teach students how to use them ethically and effectively (Hardy, 2023). Indeed, Russel Group Universities have released a statement to this effect (Russel Group, 2023).
In response to this dilemma, this project aimed to develop a workflow utilising an LLM to include critical appraisal in paragraphs suitable for written academic work. This was to be done in a manner that would facilitate writing production, without reducing the need for understanding of the scientific content.
ChatGPT 4 was provided with a sequence of prompts that utilised a combination of Generate Knowledge Prompting (Liu et al., 2021) and Prompt Chaining (Anthropic, 2024). The LLM was instructed that it would be provided with knowledge and it should then write a paragraph according to the following structure: Claim, Justification, Evidence, Conclusion. Constraints were provided that defined content and a range for the number of sentences for each part of the paragraph. Knowledge was provided that included appraisal from two journal articles, as well as providing a hierarchy regarding the importance of details. The resultant paragraph was refined by providing additional prompts for ease of reading, academic format, and to remove statements that could not be confirmed with the knowledge that had been provided. While ChatGPT 4 successfully produced a comprehensible paragraph, the text produced was identified as being generated by AI using the Turnitin AI-checker.
This workflow demonstrates that using LLMs can be a valid approach to producing written work and can be done without encouraging cognitive dissonance from the scientific content incorporated. Furthermore, using LLMs may provide more inclusive writing practices for authors who may have barriers to producing written content, for example, those for whom English is a second language. However, using such approaches provides content that will be identified as being generated by AI, which will impact future approaches to identifying and defining academic misconduct within written work.