In my last post, I described running Mistral, a Large Language Model, locally using Ollama. To accompany that piece, I created a prompt and manually used it to generate an image with AI. Today, I’ll wire up a ComfyUI workflow to Ollama to do this seamlessly, thanks to ComfyUI-IF_AI_tools.

Installing

Read the docs. And don’t blindly follow code (like I do).

Assuming you have Ollama installed: Pull the brxce/stable-diffusion-prompt-generator model, which is based on LLaMA-7B and about 4.1GB in size, then start Ollama as a server to observe the happenings:

ollama pull brxce/stable-diffusion-prompt-generator
ollama serve
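
Before touching ComfyUI, you can sanity-check the model against Ollama’s REST API. A minimal example, assuming the default port of 11434 and a Unix-style shell (adjust the quoting for cmd.exe); the prompt itself is just an illustration:

curl http://localhost:11434/api/generate -d '{
  "model": "brxce/stable-diffusion-prompt-generator",
  "prompt": "a cat in a box on the beach",
  "stream": false
}'

The reply is a JSON object whose response field holds the embellished Stable Diffusion prompt.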

Assuming you have ComfyUI Portable installed: Follow the instructions on the ComfyUI-IF_AI_tools page, which I’ve summarized below. The batch file on the last line installs the prerequisites, many of which I’d rather skip since I’m only using Ollama (I don’t need or want the OpenAI, Anthropic, etc. libraries). Anyway:

cd ComfyUI\custom_nodes
git clone https://github.com/if-ai/ComfyUI-IF_AI_tools.git
cd ComfyUI-IF_AI_tools
embedded_install.bat
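
If, like me, you’d rather skip the unwanted prerequisites, one alternative (a sketch I haven’t tested, assuming the repository lists its dependencies in a requirements.txt, as most custom nodes do) is to comment out the OpenAI, Anthropic, etc. entries in that file and install the rest with ComfyUI Portable’s embedded Python instead of running the batch file:

rem run from the ComfyUI_windows_portable folder, after editing requirements.txt
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-IF_AI_tools\requirements.txt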

Now we can use LLMs, running locally in Ollama, to help with prompt generation, directly from the ComfyUI user interface. However, if you prefer, IF_AI_tools also supports LLM services in the Cloud.

Workflow

Just a bog-standard SDXL-type workflow (nodes in green), except we use two new IF_AI_tools nodes (in what passes for red) to “enhance” the original input prompt:

  • use the IF Prompt to Prompt (IF_PromptMkr) node to pass a very simple prompt to the model in Ollama.
  • use the IF Display Text (IF_DisplayText) node and, if required, the IF Save Text (IF_SaveText) node to see and save the embellished output prompt.

ComfyUI IF_AI_Tools custom nodes integrating to local Ollama

Sometimes I get a truly incoherent and utterly meaningless word salad that has little or no relation to the input prompt. If that level of “creativity” suits, then... great! But often the word salad has far more influence on the generated image than the actual desired subject! I toned down the temperature so that the generated prompt aligns more closely with the input: in the example above, at the default of 0.7, the “in a box” instruction was simply skipped. Even then, what happened to “on the beach”? So: the higher the temperature, the more variable and random the output.
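
The node exposes the temperature setting I adjusted, but you can also see its effect in isolation via Ollama’s options field. Another sketch, with an illustrative value of 0.3 (lower values keep the output closer to the input; higher values wander):

curl http://localhost:11434/api/generate -d '{
  "model": "brxce/stable-diffusion-prompt-generator",
  "prompt": "a cat in a box on the beach",
  "stream": false,
  "options": { "temperature": 0.3 }
}'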

And yes, I have insufficient RAM to cache the models in memory, making everything very slow. Good luck.