How to solve a problem like Generative AI

Why I developed a workshop using Large Language Models to teach critical appraisal

By Dr Matthew Hardy, University of Bradford, UK, and Member of Physiology News Advisory Group  

“Education professionals were caught unprepared by GenAI and we had not had the foresight to research the capability of LLMs, or to tell our students what was permissible. Put more simply, as a teacher of writing skills, student use of GenAI for written assignments became common enough that I had no choice but to address it in my teaching.”

Paton Historical Studies Fund recipient, Matthew Hardy (University of Bradford, UK)

Why do you feel the need to innovate with generative AI?

If you only looked at my job title, you could be forgiven for thinking that most of my teaching centres on how the human body works. Indeed, I still spend a sizeable portion of my workload teaching biology and physiology. However, like many teaching-focused lecturers, I have cultivated a niche: the majority of my teaching now addresses the development of transferable and academic skills for bioscientists. In particular, I address the difficulties students have in approaching academic written work.

When ChatGPT 3.5 was made available to the wider public, it was announced with much fanfare across news media, including the suggestion that everyone may ‘become a cheat’ (Reich R, 2022). At the time I considered that making such statements may become a self-fulfilling prophecy; if you tell everyone that a new tool will write their essay for them, it is unsurprising that some students will use the tool for just that. However, regardless of intent, it became apparent in 2023 that a sizeable proportion of students were using large language models (LLMs) to assist with their written work. Whilst I have no doubt that some were using them as a shortcut (cheating), it was also clear that some were doing so out of ignorance. This widespread adoption of LLMs was relatively rapid and occurred partway through the academic year. Thus, education professionals were caught unprepared by GenAI and we had not had the foresight to research the capability of LLMs, or to tell our students what was permissible. Put more simply, as a teacher of writing skills, student use of GenAI for written assignments became common enough that I had no choice but to address it in my teaching.

Do you think that generative AI is harmful for student education?

As someone who enjoys writing, my natural instinct is to decry the advent of GenAI and the influence it has had on how some students approach written work. The reason that I have become invested in incorporating GenAI into some of my teaching is more one of necessity than innovation. A number of students are going to use GenAI in their assignments whether I teach about its use or not; to ignore that would be akin to burying my head in the sand.

That said, as I have developed teaching with GenAI I have become aware of two things. The first is that if we don’t offer any guidance on how students can use LLMs, a sizeable proportion of those who do will misuse GenAI. This is not to say they are all cheating on purpose, but rather that they are not aware of how to make effective use of the GenAI tools available to them. The consequences are academic misconduct, low-quality work, or both.

The second point is subjective. My impression is that the more I teach students about LLMs, the less inclined they are to actually use them (a notion borne out somewhat by some of my earlier workshops using GenAI (Hardy ME, 2023)). I am not sure whether this is because they become aware that appropriate use requires additional work and learning, and some therefore find it easier to simply write their assignments themselves. Alternatively, it may be because the more students are taught about the use (and misuse) of GenAI models, the more concerned they become about being flagged for misconduct (a concern students hold regardless of whether they intend to cheat). Regardless, I am hopeful that those students who develop an aversion to AI use may be developing their traditional writing skills instead.

However, although I hope that I am teaching students effective writing practices, as an educator I also need to ensure that my teaching is inclusive. Some of my students will have more of an affinity for written language than others, and some may face barriers to English communication. For example, students with dysgraphia or dyslexia are likely to encounter barriers to producing high-quality written content (Chung PJ et al., 2020; Tops W et al., 2013), and scientists for whom English is a second language take longer to read and write papers, whilst also being more likely to have their submitted articles rejected (Lenharo M, 2023; Amano T et al., 2023). By design, LLMs are good at producing grammatically correct language (as well as relatively accurate translations). In this regard they have great potential for helping students overcome these barriers to science communication.

On balance, I think that our approaches to the use of GenAI have probably been damaging to education. That said, now that GenAI has become more ingrained, there is more research and innovation that recognises the strengths and weaknesses of its use. The incoming head of the Office for Students, Edward Peck CBE, recently identified ten trends that may shape the future university landscape (Peck E et al., 2025). These include the notion that, as knowledge becomes more ubiquitous, educators may need to emphasise higher-order skills such as judgement and critical thinking. Where this occurs, it is likely that GenAI may become a partner to education and research. The workshop I describe below may be a (small) step toward that.

How did you come up with the idea for the workshop you delivered?

As more people began to use GenAI to help them produce written work, it became apparent to me that those hoping for a shortcut were not playing to the strengths of the applications. LLMs are very good at producing grammatically accurate text. However, when asked to produce facts, references, or even just academic prose, they are prone to conversational tone, bias, repetition, and hallucinations (falsehoods).

Another, less obvious, strength of LLMs is their ability to integrate information. This has previously been demonstrated with the integration of knowledge that has either been produced by another GenAI application, or stored in a data repository to which the LLM has access (Lewis P et al., 2020; Liu J et al., 2021).

The workflow I developed was designed to utilise the aspects of LLMs that can be trusted, whilst also considering how this might aid students’ writing. In this regard, I opted not to rely on knowledge produced by an LLM, which may be inaccurate, but to supply it with ‘knowledge’ it could use to produce written text. With academic writing in mind, I considered critical appraisal a useful basis for the ‘knowledge’ that students could supply. If I gave students a research question and some evidence (journal articles) with which to answer it, could they appraise the evidence into knowledge that an LLM could integrate to make sound conclusions?

This required a specific prompting methodology to instruct an LLM to produce text incorporating the critical appraisal supplied by students. It also meant that the workflow could include prompts suggesting the structure of the text to be written, providing students with an additional learning outcome regarding paragraph structure. Thus, the prompts used consist of the following:

  • Prompt 1: Instructs an LLM that it will produce text integrating knowledge that will be provided in the next prompt. The prompt includes direction as to how the text should be structured (e.g. a claim, evidence-based arguments, conclusion) and format (academic).
  • Prompt 2: Provides itemised knowledge from the critical appraisal of evidence.

From this, students working through this process may learn: critical appraisal, writing structure, and some basic prompt engineering techniques.
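As an illustration only, the two-prompt workflow above can be sketched in a few lines of Python. The wording of the prompts, the `build_prompts` helper, and the example appraisal points are my own assumptions, not the author's exact teaching materials; students would paste the resulting prompts, in order, into whichever chat-based LLM they use.

```python
def build_prompts(research_question, appraisal_points):
    """Return (prompt_1, prompt_2) for the two-step workflow.

    Prompt 1 sets the task, the required structure (claim, evidence-based
    arguments, conclusion), and the academic register; Prompt 2 supplies
    the student's itemised critical appraisal as the only 'knowledge'
    the model should integrate.
    """
    prompt_1 = (
        "You will write a short piece of academic text answering the "
        f"research question: '{research_question}'. "
        "Use ONLY the evidence I provide in my next message. "
        "Structure the text as: a claim, evidence-based arguments, "
        "and a conclusion. Use a formal academic register and do not "
        "introduce facts or references of your own."
    )
    # Itemise the appraisal so the model treats each point as a discrete
    # piece of evidence to integrate.
    items = "\n".join(
        f"{i}. {point}" for i, point in enumerate(appraisal_points, 1)
    )
    prompt_2 = "Evidence from my critical appraisal:\n" + items
    return prompt_1, prompt_2


# Example: a student's appraisal of two hypothetical journal articles.
p1, p2 = build_prompts(
    "Does caffeine improve short-term memory?",
    [
        "Randomised trial (n=120) found a small recall improvement at 200 mg.",
        "Observational study found no effect, but did not control for sleep.",
    ],
)
print(p1)
print(p2)
```

Keeping the appraisal points as an explicit numbered list makes it easy for students to see that the quality of the output depends entirely on the quality of the 'knowledge' they supply.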

Is your workflow appropriate for generating academic work?

No, this approach should not be used as a substitute for one’s own writing. Academic writing typically requires an evidence-based opinion to be expressed; in the workflow described above, the researcher merely provides the evidence whilst the opinion is generated by the AI. It would also be flagged as AI-written at institutions that use an AI-checker. That said, it is a good litmus test for the effectiveness of a student’s critical appraisal; should the text output provide a valuable insight, it indicates that the knowledge provided by the student is appropriate to the topic (and true to the evidence).

What are your concluding thoughts?

One of the issues with GenAI applications is that they have been misrepresented. Many people perceive them as an easy means of producing text or discovering facts. From an educator’s point of view, this is upsetting because it suggests they have the potential to erode many of the skills that we consider sacrosanct. Regardless of our concerns, LLMs are becoming ubiquitous in everyday life, appearing in MS applications (e.g. MS Editor) and being used by colleagues in perfectly acceptable, time-saving ways (e.g. constructing emails).

I am hopeful that as GenAI advances, so too will the skillset of those that use it. The workflow I have suggested is a useful teaching tool, but I am also aware of approaches that can help in other aspects of written work. Examples include refining search strings for database searches, suggesting questions to ask on a topic, and outlining structures for written assignments, to name but a few. All of these require knowledge of traditional methodology and of how to prompt an LLM to produce effective output. Should these be in place, then I am hopeful that LLMs may become useful aids, or partners, in education and research.

References

  1. Amano T et al. (2023). The manifold costs of being a non-native English speaker in science. PLoS Biology 21(7).
  2. Chung PJ, Patel DR and Nizami I. (2020). Disorder of written expression and dysgraphia: definition, diagnosis, and management. Transl Pediatr 9(Suppl 1), S46–S54.
  3. Hardy ME. (2023). The impact of artificial intelligence on teaching writing skills to life science students. Physiology News 132, 38–39.
  4. Lenharo M. (2023). Science’s language barrier: the cost for non-native English speakers. Nature 619, 27.
  5. Lewis P et al. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems 33, 9459–9474.
  6. Liu J et al. (2021). Generated knowledge prompting for commonsense reasoning. arXiv preprint arXiv:2110.08387.
  7. Peck E, McCarthy B and Shaw J. (2025). The future of the campus university: 10 trends that will change higher education. HEPI Policy Note 64.
  8. Reich R. (2022). Now AI can write students’ essays for them, will everyone become a cheat? The Guardian, 28 November. https://www.theguardian.com/commentisfree/2022/nov/28/ai-students-essays-cheat-teachers-plagiarism-tech Accessed 09 October 2023.
  9. Tops W et al. (2013). Beyond spelling: the writing skills of students with dyslexia in higher education. Reading and Writing: An Interdisciplinary Journal 26(5), 705–720.