In my last post, I used ComfyUI-IF_AI_tools to integrate to the brxce/stable-diffusion-prompt-generator model running in Ollama. I wonder if I could use the base Mistral 7B model to help improve my uncreative prompts instead...

System Message

With Ollama, it is possible to create a modelfile to change the way the model responds. The modelfile works a bit like a Dockerfile, which builds on an underlying model by specifying additional instructions or overriding parameters. As mentioned, I will use the quantized Mistral 7B for Ollama, which should be released under the permissive Apache 2.0 license. The most important modelfile parameters are:

  • FROM defines the base model,
  • PARAMETER overrides model parameters e.g. temperature, whereby “Increasing the temperature will make the model answer more creatively.”
  • SYSTEM is a set of instructions that guide the model in generating the desired outputs. The Mistral documentaion describes it better:

A system message is an an optional message that sets the behavior and context for an AI assistant in a conversation, such as modifying its personality or providing specific instructions. A system message can include task instructions, personality traits, contextual information, creativity constraints, and other relevant guidelines to help the AI better understand and respond to the user's input.

Of course Mistral is not trained on “good” SDXL prompts, but all I am after is some ideas that will help with prompting as I am no word-smith. I am too lazy to experiment, and I am too lazy to do few-shot prompting...

Installing

Assuming you have Ollama installed: Pull the Mistral 7B v0.2 model, and about 4.1GB in size:

ollama pull mistral

Create a new model file, mine is mybyways-prompt.model:

FROM mistral
PARAMETER num_thread 6
PARAMETER temperature .9
SYSTEM """Your task is to improve on the given prompt for image generation by adding visual
elements and descriptions to the prompt. The prompt must be less than 100 words, using short
sentences. Start with an overview of the scene and its subjects. Then describe the expressions,
poses or actions of subjects. Add foreground and background elements like other people, objects
or buildings. Include color, camera, lighting and atmospheric conditions. Retain any art style or
artist names given, e.g. photo, cinematic still, anime, line art, pixel art, comic, 3D digital
art, fantasy. Do not refer to prior prompts, do not provide any further guidance."""

I am not sure what Stable Cascade’s token limit is, hence I arbitrarily set a maximum of 100 words. I must emphasize I do not know prompt engineering, and I am sure the following is poor, for both Stable Diffusion and Mistral!

  • To build the modelfile, run ollama create mybyways-prompt -f mybyways-prompt.model
  • To check it is created, ollama show mybyways-prompt --modelfile
  • To it delete, ollama rm mybyways-prompt
  • To run the model interactively for testing, ollama run mybyways-prompt
  • And last but not least, to run the Ollama server from the console, ollama serve

If you are curious, ComfyUI-IF_AI_tools uses a system message too - you can see it in the IFPromptMkrNode.py source code.

Workflow

In the example below, the prompt below produced the given output which is richer (though in all honestly, the original prompt produced an image more to my liking):

>>> watercolor fantasy painting of a baby dragon in a box

In a magical watercolor fantasy setting, a delicate box adorned with intricate patterns and
precious gems is home to a baby dragon. With scales glowing in shades of emerald and sapphire,
the young dragon gazes curiously at its surroundings. Its tiny wings flutter gently behind
gossamer membranes as it ponders the world beyond. Surrounding the box lies a soft carpet of
autumn leaves that dance in the breeze, while a tranquil forest filled with vibrant colors
stretches out in the background. Golden sunrays penetrate through the treetops, illuminating
the scene with a warm, enchanting glow.

I am using Stable Cascade because I hope it has a better understanding of natural language since “prompt alignment” has been improved:

Mistral System Message to Improve Image Generation Prompt in Stable Cascade