The Creative Capabilities of LLMs
I’ve been spending a lot of time exploring the creative capabilities of LLMs (Large Language Models such as ChatGPT / GPT-4). I’ve found it challenging to get truly original creative results; however, I’ve discovered ways to improve the overall quality of the output these models produce. Recently, I was surprised by the qualitative improvement that came out of one of my creative exercises, a process I’d been evolving, so in this post I attempt to formalize the model I’ve developed as:
The Proposer • Critic • Synthesizer • Judge Model
Strengths & Limitations of LLMs
This process came from some insights and observations I’ve gleaned about LLMs.
1. LLMs produce better results when given ‘more time’ to ‘think aloud’
I’d seen examples where LLMs improved their logic and reasoning by working through a problem step by step and explaining their rationale at each step.
Since an LLM is effectively a predictive text engine, it stands to reason that the more contextual information it is given, or that it produces along the way, the better chance that it will be ‘steered’ towards an ideal result.
A stronger setup leads to a stronger conclusion
2. LLMs seem better at analytical tasks than creative ones
I’ve observed that LLMs excel at objective, fact-based analysis or consensus-of-opinion-based analysis. I could have one review this post, for instance, and it would give me a critical response with some well-reasoned points.
I’m not sure if this is due to the priorities of the teams that trained these models, or to the difficulty of giving them reliable feedback on subjective tasks, where one person’s opinion may differ dramatically from another’s and there is no absolute right or wrong.
However, another aspect of this is that the LLM ‘finds it difficult’ to anticipate what you, as the person writing the prompt, expect to see as an output.
It led me to wonder:
Could you tease out a creative solution from an LLM given the right process?
The Improv Experiment
One of the early challenges I found, in giving creative writing tasks to LLMs, was how little attention was given to dialogue vs. exposition. Most of the stories I had produced read like plot summaries.
So I conducted an experiment: I prompted the LLM to construct a scenario in which each of the story’s key characters was portrayed by an actor in an improv workshop, with the actors coming up with dialogue for a given situation.
Using personas to improvise character dialogue did improve the results
Persona-driven Dialogues
This led me to see the value in having different personas, created by the LLM, engage in dialogue from different perspectives. I find it an interesting technique not just for creative writing but for exploring concepts as well: for instance, having a professor of Philosophy and a professor of Computer Science discuss approaches to creating an AGI.
However, I still found this would lead to unsatisfactory results. The personas were too amenable to each other’s opinion. There was too much ‘Yes and…’ and not enough critical debate, leading to an end result that didn’t seem to stand up to scrutiny.
And yet, if I then asked the LLM to critique the result, it could produce many of the same negative points that I had, which seems to suggest it has the capability to do even better.
I began to think about Generative Adversarial Networks (GANs), in which two networks are essentially ‘playing against’ each other over the quality of the result: one makes ever-improving ‘guesses’ while the other tries to find the flaws in those attempts.
This is how I settled on the process which has greatly improved results for me.
The Proposer • Critic • Synthesizer • Judge Model
First, let me describe each of the persona roles and then I’ll go into more detail on how I’m able to utilize them.
The Proposer
This is the idealistic persona who suggests the outcome you’re looking for. Effectively this is the role that embodies your underlying prompt.
The Critic
This is the adversarial role who tries to make well-reasoned arguments against what The Proposer is suggesting. Their goal is to see that the idea is rejected.
The Synthesizer
This persona brings both sides of the discussion together, taking the criticism into account in order to suggest the best path forward to achieve a positive outcome. They should be unbiased, but generally supportive of what The Proposer wants to achieve.
The Judge
This persona weighs up what they have heard from the other three personas and gives a fair and honest assessment of how likely the proposal is to succeed. They then give it a grade from A to F on its overall fitness.
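As a rough sketch, the four roles above can be written down as plain data so they can be reused when building prompts. The `Persona` class, the `PCSJ` tuple, and `describe_roles` are names made up for this illustration, and the role wording is a paraphrase of the descriptions above rather than the exact text I use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """One participant in the workshop (hypothetical helper type)."""
    name: str
    role: str  # one-line instruction describing how this persona behaves


# The four roles, in speaking order. Wording paraphrases the post.
PCSJ = (
    Persona("The Proposer",
            "Idealistic; argues for the outcome the underlying prompt asks for."),
    Persona("The Critic",
            "Adversarial; makes well-reasoned arguments against the proposal "
            "and wants to see it rejected."),
    Persona("The Synthesizer",
            "Unbiased but generally supportive; takes the criticism into "
            "account and suggests the best path forward."),
    Persona("The Judge",
            "Weighs what the other three said, assesses how likely the "
            "proposal is to succeed, and grades it from A to F."),
)


def describe_roles(personas=PCSJ) -> str:
    """Render the roles as one 'Name: role' line each, ready to paste into a prompt."""
    return "\n".join(f"{p.name}: {p.role}" for p in personas)


if __name__ == "__main__":
    print(describe_roles())
```

Keeping the roles in one place makes it easy to restate them later in the conversation, which turns out to matter.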
The PCSJ Model in Practice
Setup
First, I ask the LLM to write a bio for each of the personas. For instance, The Proposer might be an award-winning science fiction writer. I haven’t tested this extensively, but I think this self-generated additional context leads each persona to give richer feedback in the discussion.
Context setting
With the bios of each persona generated, I tell the LLM that we’re reading the transcript of a workshop between these four personas. I have them speak in turn and I define their roles according to PCSJ as described above. I start by having them introduce themselves, to reinforce that their earlier bios have been retained as context.
It’s important to set the context as the ‘transcript of a meeting’; otherwise, the LLM will slip into the default of describing the dialogue instead of writing the dialogue.
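Here is a minimal sketch of what that context-setting prompt might look like when assembled in code. The function name `build_transcript_prompt` and the exact wording of the instructions are assumptions for illustration; the parts that come from my process are the transcript framing, the speaking order, and the opening introductions.

```python
from typing import Dict


def build_transcript_prompt(topic: str, bios: Dict[str, str]) -> str:
    """Frame the conversation as a meeting transcript (hypothetical helper).

    `bios` maps persona name to the bio the LLM wrote earlier; insertion
    order is used as the speaking order.
    """
    lines = [
        "The following is the transcript of a workshop meeting.",
        f"Topic: {topic}",
        "Participants speak in turn, in this order, and stay in role:",
    ]
    for name, bio in bios.items():
        lines.append(f"- {name}: {bio}")
    lines += [
        "Write only the spoken dialogue, with a speaker label on each turn.",
        "Do not summarize or describe what was said.",
        "Begin with each participant briefly introducing themselves.",
        "",
        "TRANSCRIPT:",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_transcript_prompt(
        "a short story idea",
        {"The Proposer": "An award-winning science fiction writer.",
         "The Critic": "(bio generated by the LLM)"},
    ))
```

Ending the prompt on a bare `TRANSCRIPT:` line is one way to nudge the model into continuing the dialogue itself rather than narrating it.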
Iterative process
Finally, I ‘run the process’ by having each persona play their role. When The Judge gives the final grade, if it’s less than an ‘A-’, I’ll have them continue their discussion, taking the comments and feedback into account. I’ll also restate their PCSJ roles; otherwise, they can slip into a more natural conversation.
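I run this loop by hand in a chat window, but it could be sketched in code roughly as follows. Everything here is an assumption made for illustration: `ask_llm` stands in for whatever function sends text to a model and returns its reply, the Judge is assumed to have been told to end with a line like "Grade: B+", and `max_rounds` is a safety cap I have added so the loop always terminates.

```python
import re
from typing import Callable, Optional, Tuple

# Letter grades from worst to best, so they can be compared by position.
GRADE_ORDER = ["F", "D-", "D", "D+", "C-", "C", "C+",
               "B-", "B", "B+", "A-", "A", "A+"]

ROLE_REMINDER = (
    "Continue the discussion, taking the Judge's feedback into account. "
    "Stay in role: the Proposer proposes, the Critic argues against, "
    "the Synthesizer reconciles, and the Judge ends with a letter grade."
)


def parse_grade(text: str) -> Optional[str]:
    """Return the last 'Grade: X' found in text, or None if there is none."""
    found = re.findall(r"Grade:\s*([A-DF][+-]?)", text)
    if not found or found[-1] not in GRADE_ORDER:
        return None
    return found[-1]


def meets_bar(grade: Optional[str], bar: str = "A-") -> bool:
    """True when the grade is at least `bar` (an A- or better by default)."""
    return grade is not None and GRADE_ORDER.index(grade) >= GRADE_ORDER.index(bar)


def run_rounds(ask_llm: Callable[[str], str], context: str,
               max_rounds: int = 5) -> Tuple[str, Optional[str]]:
    """Keep the workshop going until the Judge's grade meets the bar."""
    transcript, grade = context, None
    for _ in range(max_rounds):
        reply = ask_llm(transcript)
        transcript += "\n" + reply
        grade = parse_grade(reply)
        if meets_bar(grade):
            break
        # Restate the roles so the personas do not drift into casual chat.
        transcript += "\n" + ROLE_REMINDER
    return transcript, grade
```

For example, `run_rounds(my_model_call, context_prompt)` would return the full transcript and the last grade the Judge gave, where `my_model_call` is whatever client function you already have.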
The Synthesizer as Prompt Engineer
After I’d used this model as a technique to think through various concepts, I had another idea. In listening to examples of how people have used LLMs to generate more accurate prompts, it occurred to me that:
The role of The Synthesizer could be to write the perfect prompt to achieve the goal of The Proposer
In this sense, it becomes a little ‘meta’. The goal of the conversation becomes discussing the feasibility of creating a prompt that will achieve the intended results.
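One way to picture this ‘meta’ step is as two stages: the workshop produces a prompt, and that prompt is then used on its own. The sketch below is an illustration under assumptions, not a record of my exact setup: the "BEGIN PROMPT" / "END PROMPT" marker lines, the helper names, and the `ask_llm` callable are all hypothetical.

```python
import re
from typing import Callable, Optional

# Matches a prompt written on its own lines between the two marker lines.
PROMPT_BLOCK = re.compile(
    r"^BEGIN PROMPT[ \t]*\n(.*?)\n[ \t]*END PROMPT[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def synthesizer_instruction(goal: str) -> str:
    """Role text that turns The Synthesizer into a prompt engineer."""
    return (
        "The Synthesizer's job is to write the best possible prompt for "
        f"achieving this goal: {goal} "
        'The finished prompt goes on its own lines between a "BEGIN PROMPT" '
        'line and an "END PROMPT" line.'
    )


def extract_prompt(transcript: str) -> Optional[str]:
    """Return the most recent prompt the Synthesizer wrote, if any."""
    found = PROMPT_BLOCK.findall(transcript)
    return found[-1].strip() if found else None


def write_with_synthesized_prompt(ask_llm: Callable[[str], str],
                                  transcript: str) -> Optional[str]:
    """Second stage: run the synthesized prompt in a fresh request."""
    prompt = extract_prompt(transcript)
    return ask_llm(prompt) if prompt else None
```

Taking the last matching block means that if the personas go several rounds, only the most recently revised prompt is used.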
And that is how I was able to achieve the best creative writing results I’ve seen so far.