Monthly Archives: October 2022

Stable Diffusion – Playing with parameters

It is fun to make images with Stable Diffusion, but it is also frustrating when the result is not what you expect and it takes a long time to generate new pictures.

I have been playing with the CPU-only branch of Stable Diffusion on a Linux computer with an 8th Generation Core i7 CPU and 16GB of RAM, and here are some of my findings.

Basic prompt and parameters

I wanted to generate a useful picture for my Dungeons & Dragons game. So, as a somewhat qualified start I did:

fantasy art of village house, cliff, well, town square, market and storm, in the style of greg rutkowski

I used

  • seed=2 (because I did not like seed=1)
  • Sample Steps=10
  • Guide=7.5
  • Sample Model=Euler Ancestral
  • Resolution=512×512
  • The 1.4 model (the small one)

Not very far from default settings. My performance is about 10s per sample step, thus this picture took 1m40s to generate:

This is the unmodified 512×512 picture. Below I will publish smaller/scaled pictures, but unless otherwise mentioned they are all generated at 512×512. This picture was not so far from what I had in mind, but I don’t see any well, market or town square.
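
For reference, roughly the same settings can be expressed in Python with the Hugging Face diffusers library. This is only a sketch, not the front end I actually used, and the parameter names differ slightly:

```python
# Rough diffusers equivalent of the settings above (a sketch, not the tool I used).
import torch
from diffusers import StableDiffusionPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",   # the 1.4 model (the small one)
    torch_dtype=torch.float32,         # full precision, works on CPU
)
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

prompt = ("fantasy art of village house, cliff, well, town square, "
          "market and storm, in the style of greg rutkowski")

image = pipe(
    prompt,
    num_inference_steps=10,                           # Sample Steps
    guidance_scale=7.5,                               # Guide
    height=512, width=512,                            # Resolution
    generator=torch.Generator("cpu").manual_seed(2),  # seed=2
).images[0]
image.save("village.png")
```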

Sample Methods

I generated exactly the same thing, only changing the Sample Method parameter:

Three of the sample methods took (roughly) twice the time (200% in the name above). I can at least draw the conclusion that the sampling method is not just a mathematical detail but something that actually affects the output quite a lot.
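
In diffusers terms, changing the sample method means swapping the scheduler while keeping everything else fixed. A sketch, assuming the pipe and prompt from the sketch above; these scheduler classes are just diffusers examples, not necessarily the same list as in my front end (second-order samplers like Heun call the model twice per step, which would explain why some took about twice as long):

```python
# Same prompt, seed and steps; only the sampler (scheduler) changes.
from diffusers import (DDIMScheduler, LMSDiscreteScheduler, HeunDiscreteScheduler,
                       EulerDiscreteScheduler, EulerAncestralDiscreteScheduler)

samplers = {
    "ddim": DDIMScheduler,
    "lms": LMSDiscreteScheduler,
    "euler": EulerDiscreteScheduler,
    "euler_a": EulerAncestralDiscreteScheduler,
    "heun": HeunDiscreteScheduler,   # two model calls per step, roughly twice as slow
}
for name, scheduler_cls in samplers.items():
    pipe.scheduler = scheduler_cls.from_config(pipe.scheduler.config)
    image = pipe(prompt, num_inference_steps=10, guidance_scale=7.5,
                 generator=torch.Generator("cpu").manual_seed(2)).images[0]
    image.save(f"village_{name}.png")
```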

Sampling Steps

Next thing was to try different number of sampling steps, from 2 to 99:

I find it fascinating how some buildings disappear and are replaced by others at certain thresholds here. Even if it is more expensive to run 75 steps than 10, if you are looking for results like the 75-step picture, there is no point in generating multiple images with 10 steps. To my amateur eye, more steps give more detail and more sharpness.
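
This kind of sweep is easy to script with diffusers (again assuming the pipe and prompt from the first sketch; the exact step counts here are just examples, not necessarily the ones I used):

```python
# Fixed seed, varying only the number of sampling steps.
for steps in (2, 5, 10, 20, 30, 50, 75, 99):
    image = pipe(prompt, num_inference_steps=steps, guidance_scale=7.5,
                 generator=torch.Generator("cpu").manual_seed(2)).images[0]
    image.save(f"village_{steps:02d}steps.png")
```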

Guide

There is a guide parameter (how strongly the image should follow the prompt) and it is not a very obvious parameter. For this purpose I used 30 Sampling Steps and tried a few guide values (0–15 are the allowed values):

To my amateur eye, guide seems to be mostly about contrast and sharpness. I do not see that the pictures resemble my prompt more or less.
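
The corresponding sweep over the guide value (guidance_scale in diffusers), with 30 sampling steps as above; pipe and prompt assumed from the first sketch, and the guide values here are just examples:

```python
# Fixed seed and 30 steps, varying only the guidance scale.
for guide in (0, 1, 3, 5, 7.5, 10, 15):
    image = pipe(prompt, num_inference_steps=30, guidance_scale=guide,
                 generator=torch.Generator("cpu").manual_seed(2)).images[0]
    image.save(f"village_guide{guide}.png")
```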

Resolution

I generated 6 images using different resolutions. Sampling Steps is now 20.

To my surprise, the resolutions lower than 512×512 came out OK; I have previously had very bad results at lower resolutions. It is obvious that changing the resolution creates a different picture, as if using a different seed with the same prompt. The smaller pictures are faster and the larger ones slower to generate (as indicated by the %), and the largest image caused my 16GB computer to use its swap (but I think something else was swapped out). My conclusion is that you cannot generate many pictures at low resolution and then regenerate the ones you want at higher resolution with the same seed (there are probably other ways to upscale).

Image type

So far all images have been “fantasy art”. I tried a few alternatives with 20 Sampling Steps:

This changes a lot. The composition is similar but the architecture is entirely different. What if I would like a drawing, but with the roof style of the fantasy art version?

Artists

So far I have been using Greg Rutkowski for everything (at the first opportunity I will buy a collection of Greg Rutkowski’s work when I find one – so far I have not found any). How about different artists:

Obviously picking a suitable artist is critical for your result. To my surprise, for my purposes Anders Zorn is probably more useful than Boris Vallejo.

Dropping Keywords

So far I have not seen many wells or markets in my pictures. What about dropping those keywords from the prompt?

The composition is somewhat similar, and still no wells or markets.

Model Choice

There is a 1.4 model to download, and a larger (full) version. What is the difference? I tried three prompts (all fantasy art in the style of greg rutkowski):

  • old well in medieval village
  • medieval village on cliff
  • medieval village under cliff

The conclusion here is that the result is slightly different depending on the model, but it does not make a huge difference when it comes to quality and preference.

Trying to get a well

Not giving up on getting a picture of a well, I made 9 pictures using different seeds and the prompt:

  • fantasy art old well in medieval village, greg rutkowski

None of them contains a well as I think of a well. If I do an image search on Google I get plenty of what I want. Perhaps Stable Diffusion does not know what a well looks like, or perhaps this is how fantasy art and/or Greg Rutkowski would draw wells.
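
The seed loop itself is trivial; a sketch with diffusers (pipe assumed from the first sketch, and the step count here is just an example, not necessarily what I used):

```python
# Nine different seeds, same prompt.
well_prompt = "fantasy art old well in medieval village, greg rutkowski"
for seed in range(1, 10):
    image = pipe(well_prompt, num_inference_steps=20, guidance_scale=7.5,
                 generator=torch.Generator("cpu").manual_seed(seed)).images[0]
    image.save(f"well_seed{seed}.png")
```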

Conclusion

I did this because I thought I could learn something, and I did. Perhaps you learnt something from reading about my results. It is obviously possible to get cool pictures, but what if you want something specific? The prompt is important, but if you are playing with the wrong parameters you may be wasting your time.

Stable Diffusion CPU-only

I spent a lot of time trying to install Stable Diffusion on an Intel NUC Hades Canyon with a Core i7 (8th Generation) and an AMD RX Vega (4GB), with no success. 4GB is tricky. AMD is trickier.

I gave up on my NUC and installed on my Windows laptop with a GeForce GTX 1650. That worked, and a typical image (512×512 and 20 samples) takes about 3 minutes to generate.

For practical reasons I wanted to run Stable Diffusion on my Linux NUC anyway, so I decided to give a CPU-only version of Stable Diffusion a try (stable-diffusion-cpuonly). It was a pretty easy install, and to my surprise generation is basically as fast as on my GeForce GTX 1650. I have 16GB of RAM and that works fine for 512×512. I think 8GB would be too little, and as usual, resolutions lower than 512×512 generate very bad output for me.
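
If you want to try the same thing with the standard Hugging Face diffusers library instead of the stable-diffusion-cpuonly fork, forcing CPU looks roughly like this (a sketch, not what I installed):

```python
# CPU-only generation with diffusers: full precision, no CUDA required.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float32,    # half precision is mainly a GPU optimization
).to("cpu")

image = pipe("fantasy art of an old well in a medieval village",
             num_inference_steps=20, height=512, width=512).images[0]
image.save("cpu_test.png")
```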

So when you read “Stable Diffusion requires an Nvidia GPU with at least 4GB of RAM”, know that for simple hobby purposes any computer with 16GB of RAM will be fine.

Elemental Dice 1-5

I just received 5 new Elemental Dice. Only two of the new ones were metal (Sm, Ce) and three were embedded in resin (S, Mn, Hg). Here is a picture of the complete collection.

As you can see the Ce(rium) die has already started to deteriorate. It came like that, I am not the only one, and we will see about replacement shipments.

I added Ce and Sm to my density table:

Atomic No.  Element  Textbook Density (g/cm³)  Weight (g)  Actual Density (g/cm³)  Ratio
 6          C         2.26                      7.07        1.73                   76.37%
12          Mg        1.738                     7.02        1.71                   98.61%
13          Al        2.7                      10.90        2.66                   98.56%
22          Ti        4.51                     17.94        4.38                   97.33%
24          Cr        7.19                     27.77        6.78                   94.29%
26          Fe        7.874                    30.65        7.48                   95.03%
27          Co        8.9                      34.50        8.42                   94.64%
28          Ni        8.9                      35.39        8.64                   97.08%
29          Cu        8.96                     35.54        8.68                   96.84%
30          Zn        7.133                    24.44        5.97                   83.65%
39          Y         4.47                     17.24        4.21                   94.16%
40          Zr        6.49                     25.38        6.20                   95.47%
41          Nb        8.57                     33.48        8.17                   95.38%
42          Mo       10.2                      40.08        9.79                   95.93%
45          Rh       14.4                      34.28        8.37                   58.12%
46          Pd       11.9                      34.04        8.31                   69.84%
47          Ag       10.49                     40.31        9.84                   93.82%
48          Cd        8.7                      33.02        8.06                   92.66%
50          Sn        7.31                     29.06        7.09                   97.06%
58          Ce        6.76                     26.33        6.43                   95.09%
62          Sm        7.52                     29.32        7.16                   95.19%
64          Gd        7.9                      30.61        7.47                   94.60%
74          W        19.3                      76.28       18.62                   96.49%
78          Pt       21.45                     34.25        8.36                   38.98%
79          Au       19.25                     34.52        8.43                   43.78%
82          Pb       11.29                     43.92       10.72                   94.97%
83          Bi        9.78                     38.02        9.28                   94.91%

All dice have a lower weight than expected. First, the edges are rounded, and the text and die numbers are engraved or carved out of the metal cube, so it is expected not to be 100%. Some dice are just plated, for obvious reasons (Rh, Pd, Pt, Au), so most of those are probably some Fe/Ni-something. I do not know what is up with carbon; I suppose it is a form of pure carbon other than graphite.
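
For clarity, this is how the ratio column is computed, and how an implied die volume falls out of the numbers (the volume figure is my own inference, not something from the table):

```python
# Ratio = measured density / textbook density; volume = weight / measured density.
# A few rows from the table above (computed ratios can differ in the last decimal
# because the densities in the table are rounded).
rows = [
    # symbol, textbook density (g/cm3), weight (g), actual density (g/cm3)
    ("C",   2.26,   7.07,  1.73),
    ("Fe",  7.874, 30.65,  7.48),
    ("W",  19.3,   76.28, 18.62),
    ("Au", 19.25,  34.52,  8.43),   # plated die, hence the low ratio
]
for symbol, textbook, weight, actual in rows:
    volume = weight / actual        # implied volume, comes out around 4.1 cm3
    ratio = actual / textbook       # the percentage in the last column
    print(f"{symbol}: {volume:.2f} cm3, {ratio:.1%}")
```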

Windows 11 22H2, Docker, on Dell XPS 15 7590

I have a Dell XPS 15 7590 that I use for running Docker and Business Central images. I have used Windows 10 for two years, experiencing some occasional problems with starting the Docker images, so I decided to finally upgrade to Windows 11.

Bad timing. It seems 22H2 introduced a bug for this computer (link to Dell forum).

Automatically updating to Windows 11 failed. Blue Screen on boot: INACCESSIBLE_BOOT_DEVICE. But it recovered from that and booted back to Windows 10.

I decided to make a clean installation of Windows 11 Pro. That was OK until I installed Docker (4.12.0); then I got the same Blue Screen, and this time I found no way to recover. I think the problem has to do with Hyper-V activation, but there are probably more details I am not aware of.

Eventually, after several installations and restore efforts, things seem to work:

  • BIOS AHCI (not RAID, but I do not think that matters)
  • BIOS Virtualization ON
  • BIOS Virtual Direct I/O OFF
  • Applied KB5017389

I have now learnt to use Restore Points in Windows. It is very useful to make a manual restore point before a significant configuration change. When the computer fails to start properly, you can navigate to the option of using a Restore Point, and that has worked every time (several times by now) for me. You need to have a recovery key for the computer (I got mine from aka.ms/recoverykey; you need to log in, of course, for a “personal” computer). It is the same key for every restore point, so you can write it down and keep it.

Windows 11 impressions

This is the first time I have installed and used Windows 11. I am actually somewhat satisfied, even impressed. This is the first time, ever, that using Windows has made me feel inspired and empowered.