It is fun to make images with Stable Diffusion, but it is also frustrating when the result is not what you expect and it takes a long time to generate new pictures.
I have been playing with the cpu-only-branch of Stable Diffusion on a Linux computer with an 8th Generation Core i7 CPU and 16GB of RAM, and here are some findings.
Basic prompt and parameters
I wanted to generate a useful picture for my Dungeons & Dragons game. So, as a somewhat qualified starting point, I used this prompt:
fantasy art of village house, cliff, well, town square, market and storm, in the style of greg rutkowski
I used
- seed=2 (because I did not like seed=1)
- Sample Steps=10
- Guide=7.5
- Sample Model=Euler Ancestral
- Resolution=512×512
- The 1.4 model (the small one)
This is not very far from the default settings. My machine manages about 10s per sample step, so this picture took 1m40s to generate:
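The timing is simple arithmetic: number of steps times seconds per step. Here is a minimal sketch, assuming the roughly 10s per step I see on my machine. The function names and the `slowdown` factor are made up for illustration and are not part of any Stable Diffusion tool:

```python
def estimate_seconds(steps, seconds_per_step=10.0, slowdown=1.0):
    """Rough generation time; slowdown=2.0 models a sampler that takes twice as long."""
    return steps * seconds_per_step * slowdown


def format_duration(seconds):
    """Format a number of seconds as minutes and seconds, e.g. '1m40s'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}m{secs:02d}s"


# 10 steps at about 10 s each, as in the first picture.
print(format_duration(estimate_seconds(10)))
```

Your own seconds-per-step figure will differ, so measure one short run first and plug that number in.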

This is the unmodified 512×512 picture. Below I will publish smaller/scaled pictures, but unless otherwise mentioned they are all generated as 512×512. This picture was not so far from what I had in mind, but I don’t see any well, market or town square.
Sample Methods
I generated exactly the same thing, only changing the Sample Method parameter:

Three of the sample methods took (roughly) twice the time (200% in the name above). I can at least draw the conclusion that the sampling method is not just a mathematical detail but something that actually affects the output quite a lot.
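All my experiments follow the same pattern: keep every setting fixed and change only one. A minimal sketch of that bookkeeping is below. The dictionary keys and the `sweep` helper are my own illustrative names, not options of any particular Stable Diffusion front end, and the sampler names other than Euler Ancestral are placeholders:

```python
# Base settings from this post; the key names are illustrative only.
BASE = {
    "prompt": "fantasy art of village house, cliff, well, town square, "
              "market and storm, in the style of greg rutkowski",
    "seed": 2,
    "steps": 10,
    "guide": 7.5,
    "sampler": "Euler Ancestral",
    "width": 512,
    "height": 512,
}


def sweep(base, key, values):
    """Return one settings dict per value, changing only `key`."""
    if key not in base:
        raise KeyError(f"unknown setting: {key}")
    return [{**base, key: value} for value in values]


# Placeholder sampler names; substitute whatever your installation offers.
for settings in sweep(BASE, "sampler", ["Euler Ancestral", "sampler B", "sampler C"]):
    print(settings["sampler"], settings["seed"], settings["steps"])
```

Each dictionary can then be handed to whatever actually generates the image. The base settings are never modified, so every run differs from the reference picture in exactly one parameter.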
Sampling Steps
The next thing was to try different numbers of sampling steps, from 2 to 99:


Guide
There is a guide parameter (how strongly the image should follow the prompt), and it is not a very obvious parameter. For this purpose I used 30 Sampling Steps and tried a few guide values (0-15 are allowed values):

To my amateur eye, guide seems to be mostly about contrast and sharpness. I do not see that the pictures resemble my prompt more or less.
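For what it is worth, in standard Stable Diffusion this parameter is the classifier-free guidance scale. At each step the model predicts noise twice, once with the prompt and once without, and the scale decides how far the result is pushed from the unprompted prediction toward the prompted one. I am assuming the cpu-only-branch works the same way. A toy sketch, with short lists of made-up numbers standing in for the real tensors:

```python
def apply_guidance(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), element-wise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Made-up numbers standing in for two noise predictions.
uncond = [0.0, 1.0, -2.0]  # prediction without the prompt
cond = [1.0, 1.0, 0.0]     # prediction with the prompt

for scale in (0, 1, 7.5, 15):
    print(scale, apply_guidance(uncond, cond, scale))
```

At scale 0 the prompt is ignored, at 1 the prompted prediction is used as it is, and larger values exaggerate the difference between the two. That exaggeration may be related to the stronger contrast I see at high values, but that is a guess on my part.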
Resolution
I generated 6 images using different resolutions. Sampling Steps is now 20.

To my surprise, the images smaller than 512×512 came out OK; I have had very bad results at lower resolutions before. Changing the resolution clearly creates a different picture, much like using a different seed with the same prompt. The smaller pictures are faster and the larger ones slower to generate (as indicated by the %), and the largest image caused my 16GB computer to use its swap (but I think something else was swapped out). My conclusion is that you cannot generate many pictures at a low resolution and then regenerate the ones you want at a higher resolution with the same seed (there are probably other ways to upscale).
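One plausible explanation for why a new resolution behaves like a new seed: Stable Diffusion v1 starts from random noise in a latent grid with 4 channels and one eighth of the image's width and height. A different resolution means a differently shaped grid, so the same stream of random numbers lands in different places. The toy sketch below uses Python's `random` module instead of the real noise generator, and 256×256 is just an example size, so it only illustrates the layout effect:

```python
import random


def latent_shape(width, height, channels=4, factor=8):
    """Latent grid size for Stable Diffusion v1: (channels, height/8, width/8)."""
    return (channels, height // factor, width // factor)


def toy_noise(seed, width, height):
    """Fill one latent channel row by row from a seeded generator (toy stand-in)."""
    _, rows, cols = latent_shape(width, height)
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


small = toy_noise(2, 256, 256)
large = toy_noise(2, 512, 512)
print(latent_shape(256, 256), latent_shape(512, 512))

# The first row starts with the same numbers in both grids, but the second row
# begins at a different position in the random stream, so it differs.
print(small[0][:3] == large[0][:3], small[1][:3] == large[1][:3])
```

So even with the same seed, the model is denoising a different starting point at each resolution.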
Image type
So far all images have been “fantasy art”. I tried a few alternatives with 20 Sampling Steps:

This changes a lot. The composition is similar but the architecture is entirely different. What if I would like a drawing with the roof style of fantasy art?
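Since these experiments swap only one part of the prompt at a time, it helps to assemble the prompt from parts. A minimal sketch follows. The function name and the split into image type, subjects and artist are my own illustration, and "drawing" is just an example alternative:

```python
def build_prompt(image_type, subjects, artist):
    """Assemble a prompt in the same shape as the one used in this post."""
    return f"{image_type} of {', '.join(subjects)}, in the style of {artist}"


subjects = ["village house", "cliff", "well", "town square", "market and storm"]

# "fantasy art" is what I started with; "drawing" is an example alternative.
for image_type in ("fantasy art", "drawing"):
    print(build_prompt(image_type, subjects, "greg rutkowski"))
```

The same helper works for trying other artists or for leaving subjects out of the list.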
Artists
So far I have been using Greg Rutkowski for everything (I will buy a collection of Greg Rutkowski's work at the first opportunity, but so far I have not found one). How about different artists:

Dropping Keywords
So far I have not seen much of wells and markets in my pictures. What about dropping those keywords from the prompt?

Model Choice
There is a 1.4-model to download, and a larger (full) version. What is the difference? I tried three prompts (all fantasy art in the style of greg rutkowski):
- old well in medieval village
- medieval village on cliff
- medieval village under cliff

The conclusion here is that the result differs slightly depending on the model, but it does not make a huge difference in quality or in which picture I prefer.
Trying to get a well
Not giving up on getting a picture of a well, I made 9 pictures, using different seeds and the prompt:
- fantasy art old well in medieval village, greg rutkowski

None of them contains a well as I think of a well. If I do an image search on Google I get plenty of what I want. Perhaps Stable Diffusion does not know what a well looks like, or perhaps this is what fantasy art and/or Greg Rutkowski would draw wells as.
Conclusion
I did this because I thought I could learn something, and I did. Perhaps you learnt something from reading about my results. It is obviously possible to get cool pictures, but what if you want something specific? The prompt is important, but if you are playing with the wrong parameters you may be wasting your time.