Stable Diffusion

Overview

graph LR;
Text(Text) -- prompt --> txt2img([txt2img]) --> Image
Image -.-> img2img([img2img])

ControlNet(ControlNet) --> upscale["Upscale:tile"] --> Image
ControlNet --> shuffle --> Image
ControlNet --> inpaint,outpaint --> Image
ControlNet --> seg["Semantic Segmentation"] --> Image
ControlNet --> depth,normalbae,openpose --> Image
ControlNet --> mlsd,canny,softedge,scribble,lineart --> Image
ControlNet --> ip2p["Instruct Pix2Pix (ip2p)"] --> Image

img2img -- "Fix small" --> HiRes-Image
img2img -- "Fix color" --> VAE(["Variational Auto-Encoder (VAE)"])

VAE -- "smoother" --> MSE
VAE -- "original" --> EMA

Basic

Glossary

Basic knowledge for beginners.

txt2img
- Prompts: Syntax ((), <>), Checkpoints // What we want, described in text.
- Negative Prompts: Textual Inversion // What we do not want, described in text.
- LoRA: Modifies the cross-attention weights. // Steers the final image toward what we want.
- Hypernetworks: Modify cross-attention by inserting additional networks. // Process the image toward what we want. (Prefer LoRA)
- Embeddings: Result of fine-tuning (Textual Inversion). // Defines a new word for a certain style.
img2img // Draw something like this.
VAE: Variational Auto-Encoder // Fixes washed-out colors.
Hires.fix // Upscales with some fixes.
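The prompt syntax listed above can be illustrated with a small parser. This is a minimal sketch, assuming the common web UI conventions `(text:1.3)` for an explicit weight and `<lora:name:0.8>` for an extra network. The function name `parse_prompt` and the LoRA name `myStyle` are hypothetical, and the real web UI parser handles more cases (nesting, `[]`, escapes) than this one does.

```python
import re

# Assumed conventions: "(text:1.3)" sets an explicit weight,
# "<lora:name:0.8>" requests an extra network with a multiplier.
EXTRA_NETWORK = re.compile(r"<(\w+):([^:>]+)(?::([\d.]+))?>")
WEIGHTED = re.compile(r"\(([^():]+):([\d.]+)\)")


def parse_prompt(prompt):
    """Return (list of (text, weight), list of (kind, name, multiplier))."""
    networks = [
        (kind, name, float(mult) if mult else 1.0)
        for kind, name, mult in EXTRA_NETWORK.findall(prompt)
    ]
    # Remove the <...> tags so only the textual prompt remains.
    text = EXTRA_NETWORK.sub("", prompt)

    parts = []
    for chunk in text.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = WEIGHTED.fullmatch(chunk)
        if match:
            parts.append((match.group(1).strip(), float(match.group(2))))
        else:
            # Unweighted tags get the neutral weight 1.0.
            parts.append((chunk, 1.0))
    return parts, networks


parts, networks = parse_prompt("highres, (1girl:1.3), green eyes <lora:myStyle:0.8>")
print(parts)
print(networks)
```

Tags without an explicit weight are reported with the neutral weight 1.0, so the weighted tag stands out as the one the prompt emphasizes.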

UI

You will need a UI to generate pictures.

WEB Stable Diffusion web UI: A browser interface based on Gradio library for Stable Diffusion. // Most famous.
WEB vladmandic/automatic: A fork of AUTOMATIC1111 with advanced CUDA tuning.

Prompts

You can use BREAK to end the current chunk of the prompt so that it does not influence the words that follow. ref

absurdres , highres, ultra detailed, (1girl:1.3),
BREAK
solarization, inverted tones, experimental photography, surreal contrast, striking visuals, artistic abstraction,
BREAK
paper cut art, layered silhouettes, intricate patterns, delicate craftsmanship, shadow play, depth and dimension, creative expression,
BREAK
ice art, frozen sculptures, translucent forms, ephemeral beauty, crystalline textures, delicate craftsmanship, chilling allure,
BREAK
green eyes,
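To make the grouping in the example above more concrete, here is a minimal sketch that splits a prompt into chunks at each BREAK keyword. The function name `split_on_break` is hypothetical, and the sketch only shows which tags end up together. The real web UI works on tokens and pads the current chunk, which this sketch does not model.

```python
import re


def split_on_break(prompt):
    """Split a prompt into chunks at each standalone BREAK keyword."""
    chunks = []
    for raw in re.split(r"\bBREAK\b", prompt):
        # Strip whitespace and drop empty tags left by trailing commas.
        tags = [tag.strip() for tag in raw.split(",") if tag.strip()]
        if tags:
            chunks.append(tags)
    return chunks


prompt = """
highres, (1girl:1.3),
BREAK
paper cut art, layered silhouettes,
BREAK
green eyes,
"""
for index, tags in enumerate(split_on_break(prompt)):
    print(index, tags)
```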

Negative Prompts

You will need negative prompts like these to prevent badly drawn hands.

Models

A model is like a new artist. You can find one below. Prefer the safetensors format, because a .ckpt file (aka .zip) can contain harmful scripts.
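The risk with .ckpt files comes from Python's pickle format, which can call functions while the file is being loaded. The following is a harmless sketch of that mechanism. The `Payload` class is hypothetical and only calls `int`, whereas a malicious file could name something destructive instead. A safetensors file stores tensor data only, so loading it does not run code this way.

```python
import pickle


class Payload:
    """Hypothetical object that controls what happens at load time."""

    def __reduce__(self):
        # pickle will call int("42") when the data is loaded.
        # A malicious file could name any importable callable here.
        return (int, ("42",))


data = pickle.dumps(Payload())
loaded = pickle.loads(data)

# The loader did not get a Payload back; it got the result of the call.
print(type(loaded).__name__, loaded)
```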

Extensions

stable-diffusion-webui-state: Preserve web UI parameters (inputs, sliders, checkboxes etc.) after page reload.
sd-webui-controlnet: The WebUI extension for ControlNet and other injection-based SD controls.
sd-webui-segment-anything: Segment Anything for Stable Diffusion WebUI.
sd-webui-regional-prompter: Set different prompts for divided regions.
sd-webui-model-converter: TODO: Replace with about this link.
sd-webui-3d-open-pose-editor: TODO: Replace with about this link.
sd-3dmodel-loader: Model convert extension, used for AUTOMATIC1111's stable diffusion webui.
sd-webui-depth-lib: Depth map library for use with the Control Net extension for Automatic1111/stable-diffusion-webui.
Latent Couple extension (two shot diffusion port): This extension is an extension of the built-in Composable Diffusion. This allows you to determine the region of the latent space that reflects your subprompts.
EbSynth: AUTOMATIC1111 UI extension for creating videos using img2img and ebsynth.
DAAM: Diffusion Attentive Attribution Maps. // Shows how much and where the text influenced the picture.
sd-extension-system-info: System Info + Benchmark.

Advanced

Sampler: Most people use DPM++ 2M Karras.
CFG Scale: aka classifier-free guidance scale // Low = more creative, high = follows the prompt more closely. ref
SAM: Segment Anything // It's a magic wand.
AutoSAM: Auto Segment Anything // It's a lazy magic wand.
OpenPose // Draw a human from a skeleton.
3D Model & Pose Loader // Create skeleton from 3D.
Checkpoint Merger // Mix two checkpoints.
Train // Create a new model from our own pictures.
Real-ESRGAN:
Clip skip, Script-X/Y/Z plot: Creates a grid for comparison.
outpaint
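The "low = creative, high = prompt" behaviour of the CFG scale listed above follows from how classifier-free guidance combines two noise predictions. Below is a minimal sketch in plain Python, assuming the standard formula `uncond + scale * (cond - uncond)`. The function name `apply_cfg` and the toy numbers are illustrative only; real samplers apply this to tensors at every step.

```python
def apply_cfg(uncond, cond, scale):
    """Combine unconditional and prompt-conditioned noise predictions."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # equals the prompt-conditioned prediction
print(apply_cfg(uncond, cond, 7.0))  # pushed further in the prompt's direction
```

A scale of 0 would ignore the prompt entirely, while larger values exaggerate the difference the prompt makes.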

Modify Model

You can merge models via the Checkpoint Merger in stable-diffusion-web-ui.

graph LR;

EasyNegative["<a href='https://huggingface.co/datasets/gsdf/EasyNegative'>EasyNegative</a><br/>(Negative Embedding)"] -.- counterfeit

defacta3th[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/83eee1eb-fd3a-4497-b04a-64da85667d00/width=450/00256-4261342171.jpeg' style='object-fit:cover'/><br/><a href='https://civitai.com/models/45804/defacta3th'>Defacta3th</a>] --> defacounter-mix[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/27c1e448-198b-46b9-9869-9081e1883400/width=450/00216-1523285327.jpeg' style='object-fit:cover'/><br/> <a href='https://civitai.com/models/55237'> defacounter-mix</a>]
counterfeit[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/5f06c30d-0169-4d58-381c-3ee00ea90100/width=450/002.jpeg' style='object-fit:cover'/><br/> <a href='https://civitai.com/models/4468/counterfeit-v30'>Counterfeit-V3.0</a>] --> defacounter-mix
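A merge like the one in the diagram can be thought of as a per-weight interpolation between two checkpoints. Here is a minimal sketch of a weighted-sum merge, assuming both checkpoints share the same keys and representing each tensor as a single float for brevity. The function `weighted_sum_merge` and the key names are hypothetical, and this is not necessarily how defacounter-mix itself was produced.

```python
def weighted_sum_merge(model_a, model_b, multiplier):
    """Interpolate two state dicts: A * (1 - M) + B * M."""
    if model_a.keys() != model_b.keys():
        raise ValueError("checkpoints must have the same keys")
    return {
        key: model_a[key] * (1.0 - multiplier) + model_b[key] * multiplier
        for key in model_a
    }


# Toy "checkpoints": real ones map names to tensors, not single floats.
model_a = {"layer.weight": 2.0, "layer.bias": 0.0}
model_b = {"layer.weight": 4.0, "layer.bias": 1.0}

merged = weighted_sum_merge(model_a, model_b, 0.5)
print(merged)
```

A multiplier of 0 returns the first model unchanged and 1 returns the second, so values in between control how much of each "artist" ends up in the result.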

Rust

Libraries

tch-rs: Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorch). It aims at staying as close as possible to the original C++ api.
tch-m1: How to use LaurentMazare/tch-rs on M1.
burn-rs: This library strives to serve as a comprehensive deep learning framework, offering exceptional flexibility and written in Rust. Our objective is to cater to both researchers and practitioners by simplifying the process of experimenting, training, and deploying models.
diffusers-rs: An implementation of the diffusers api in Rust.
sic image cli: Convert images and perform image operations from the command-line.

ETC

2D - Graphite: Redefining state-of-the-art graphics editing + stable diffusion.
Video - TemporalKit: An all in one solution for adding Temporal Stability to a Stable Diffusion Render via an automatic1111 extension.
3D - ReMoDiffuse: ReMoDiffuse is a retrieval-augmented 3D human motion diffusion model. Benefiting from the extra knowledge from the retrieved samples, ReMoDiffuse is able to achieve high-fidelity on the given prompts.
Blender - Dream Texture: Stable Diffusion built-in to Blender.
ControlNetMediaPipeFace: Control Stable Diffusion with a Facial Pose
