Stable Diffusion

Overview

graph LR;
Text(Text) -- prompt --> txt2img([txt2img]) --> Image
Image -.-> img2img([img2img])

ControlNet(ControlNet) --> upscale["Upscale:tile"] --> Image
ControlNet --> shuffle --> Image
ControlNet --> inpaint,outpaint --> Image
ControlNet --> seg["Semantic Segmentation"] --> Image
ControlNet --> depth,normalbae,openpose --> Image
ControlNet --> mlsd,canny,softedge,scribble,lineart --> Image
ControlNet --> ip2p["Instruct Pix2Pix (ip2p)"] --> Image

img2img -- "Fix small" --> HiRes-Image
img2img -- "Fix color" --> VAE(["Variational Auto-Encoder (VAE)"])

VAE -- "smoother" --> MSE
VAE -- "original" --> EMA

Basic

Glossary

Basic knowledge for beginners.

  • txt2img
    • Prompts: Syntax ((), <>), Checkpoints // What we want, in text.
    • Negative Prompts: Textual Inversion // What we do not want, in text.
    • LoRA: Modifies the cross-attention weights. // Changes the final image to what we want.
    • Hypernetworks: Modify cross-attention by inserting additional networks. // Processes the image into what we want. (Prefer LoRA)
    • Embeddings: The result of fine-tuning (Textual Inversion). // Defines a new word for a certain style.
  • img2img // Draw something like this.
  • VAE: Variational Auto-Encoder // Fixes washed-out colors.
  • Hires.fix // Upscale with some fixes.

UI

You will need a UI to generate some cool pictures.

Prompts

  • You can use BREAK to end the current chunk so that it does not influence the words that follow. ref
    absurdres, highres, ultra detailed, (1girl:1.3),
    BREAK
    solarization, inverted tones, experimental photography, surreal contrast, striking visuals, artistic abstraction,
    BREAK
    paper cut art, layered silhouettes, intricate patterns, delicate craftsmanship, shadow play, depth and dimension, creative expression,
    BREAK
    ice art, frozen sculptures, translucent forms, ephemeral beauty, crystalline textures, delicate craftsmanship, chilling allure,
    BREAK
    green eyes,
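
As a rough illustration of how such a prompt is structured, the following Python sketch splits a prompt at BREAK and extracts explicit `(text:weight)` emphasis. It is not the web UI's actual parser: `split_chunks` and `parse_emphasis` are hypothetical helper names, only the explicit `(text:weight)` form is handled, and tags are assumed not to contain commas.

```python
import re

# Matches explicit emphasis such as "(1girl:1.3)"; nested or bare
# parentheses are deliberately not handled in this sketch.
EMPHASIS = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def split_chunks(prompt: str) -> list[str]:
    """Split a prompt into chunks at the standalone keyword BREAK."""
    chunks = re.split(r"\bBREAK\b", prompt)
    return [c.strip().strip(",").strip() for c in chunks if c.strip()]


def parse_emphasis(chunk: str) -> list[tuple[str, float]]:
    """Return (tag, weight) pairs; tags without an explicit weight get 1.0."""
    tags = []
    for raw in chunk.split(","):
        raw = raw.strip()
        if not raw:
            continue
        match = EMPHASIS.fullmatch(raw)
        if match:
            tags.append((match.group(1).strip(), float(match.group(2))))
        else:
            tags.append((raw, 1.0))
    return tags


prompt = """absurdres, highres, ultra detailed, (1girl:1.3),
BREAK
paper cut art, layered silhouettes,
BREAK
green eyes,"""

for index, chunk in enumerate(split_chunks(prompt)):
    print(index, parse_emphasis(chunk))
```

Each printed line corresponds to one chunk, which makes it easy to check which tags end up grouped together.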
    

Negative Prompts

You will need these negative prompts to prevent bad hands.

Models

A model is like a new artist. You can find one below; prefer the .safetensors format, because a .ckpt file (aka .zip) can contain harmful scripts.
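
To illustrate why the safetensors format is considered safer, here is a minimal sketch that reads only the header of a .safetensors file with the Python standard library. It assumes the published layout (an 8-byte little-endian header length, then a JSON header, then raw tensor bytes). `read_safetensors_header` is a hypothetical helper name, and the sample bytes are built in memory purely for demonstration.

```python
import io
import json
import struct


def read_safetensors_header(stream) -> dict:
    """Read the JSON header of a safetensors stream without touching tensor data."""
    (header_len,) = struct.unpack("<Q", stream.read(8))  # little-endian u64
    return json.loads(stream.read(header_len).decode("utf-8"))


# Build a tiny in-memory example: one float32 tensor holding two values.
header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
payload = (
    struct.pack("<Q", len(header_bytes))
    + header_bytes
    + struct.pack("<2f", 1.0, 2.0)
)

info = read_safetensors_header(io.BytesIO(payload))
for name, meta in info.items():
    print(name, meta["dtype"], meta["shape"])
```

Reading the file involves only JSON parsing and raw bytes, so nothing is executed. A .ckpt file, by contrast, is unpickled on load, and unpickling can run arbitrary code.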

Extensions

Advanced

  • Sampler: Most people use DPM++ 2M Karras.

  • CFG Scale: aka Classifier-Free Guidance scale // Low = more creative, high = follows the prompt more closely. ref

  • SAM: Segment Anything // It's a magic wand.

  • AutoSAM: Auto Segment Anything // It's a lazy magic wand.

  • OpenPose // Draw a human from a skeleton.

  • 3D Model & Pose Loader // Create a skeleton from a 3D model.

  • Checkpoint Merger // Mix two checkpoints.

  • Train // Create a new model from our own pictures.

  • Real-ESRGAN:

  • Clip skip, Script X/Y/Z plot: Creates a grid of images for comparison.

  • outpaint
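
The CFG scale entry in the list above can be made concrete with the usual classifier-free guidance formula, in which the sampler extrapolates from the unconditional noise prediction toward the prompt-conditioned one. The sketch below uses plain Python lists instead of real tensors; `apply_cfg` and the sample values are illustrative assumptions, not code taken from any UI.

```python
def apply_cfg(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Combine predictions element-wise: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # same as the conditional prediction
print(apply_cfg(uncond, cond, 7.0))  # pushed further toward the prompt
```

A scale of 1 reproduces the conditional prediction, and larger values move the result further in the direction the prompt suggests, which matches the "low = creative, high = prompt" rule of thumb.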

Modify Model

You can merge models via stable-diffusion-web-ui.

graph LR;

EasyNegative["<a href='https://huggingface.co/datasets/gsdf/EasyNegative'>EasyNegative</a><br/>(Negative Embedding)"] -.- counterfeit

defacta3th[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/83eee1eb-fd3a-4497-b04a-64da85667d00/width=450/00256-4261342171.jpeg' style='object-fit:cover'/><br/><a href='https://civitai.com/models/45804/defacta3th'>Defacta3th</a>] --> defacounter-mix[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/27c1e448-198b-46b9-9869-9081e1883400/width=450/00216-1523285327.jpeg' style='object-fit:cover'/><br/> <a href='https://civitai.com/models/55237'> defacounter-mix</a>]
counterfeit[<img class='thumb128' src='https://imagecache.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/5f06c30d-0169-4d58-381c-3ee00ea90100/width=450/002.jpeg' style='object-fit:cover'/><br/> <a href='https://civitai.com/models/4468/counterfeit-v30'>Counterfeit-V3.0</a>] --> defacounter-mix
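
A checkpoint merge like the one in the graph above is commonly a weighted sum of the two models' weights. The sketch below assumes that mode and uses plain dictionaries of floats in place of real tensors; `merge_weighted_sum`, the key names, and the `alpha` value are hypothetical.

```python
def merge_weighted_sum(model_a: dict, model_b: dict, alpha: float) -> dict:
    """Return (1 - alpha) * A + alpha * B for every key present in both models."""
    merged = {}
    for key in model_a.keys() & model_b.keys():
        merged[key] = [
            (1.0 - alpha) * a + alpha * b
            for a, b in zip(model_a[key], model_b[key])
        ]
    return merged


# Toy "checkpoints": each key maps to a flat list of weights.
model_a = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
model_b = {"layer.weight": [4.0, 6.0], "layer.bias": [3.0]}

merged = merge_weighted_sum(model_a, model_b, alpha=0.5)
print(merged["layer.weight"], merged["layer.bias"])
```

With `alpha=0` the result equals the first model and with `alpha=1` the second, so the multiplier controls how much of each "artist" survives in the mix.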

Rust

Libraries

  • tch-rs: Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorch). It aims at staying as close as possible to the original C++ api.
  • tch-m1: How to use LaurentMazare/tch-rs on M1.
  • burn-rs: This library strives to serve as a comprehensive deep learning framework, offering exceptional flexibility and written in Rust. Our objective is to cater to both researchers and practitioners by simplifying the process of experimenting, training, and deploying models.
  • diffusers-rs: An implementation of the diffusers api in Rust.
  • sic image cli: Convert images and perform image operations from the command-line.

ETC

  • 2D - Graphite: Redefining state-of-the-art graphics editing + stable diffusion.
  • Video - TemporalKit: An all in one solution for adding Temporal Stability to a Stable Diffusion Render via an automatic1111 extension.
  • 3D - ReMoDiffuse: ReMoDiffuse is a retrieval-augmented 3D human motion diffusion model. Benefiting from the extra knowledge from the retrieved samples, ReMoDiffuse is able to achieve high-fidelity on the given prompts.
  • Blender - Dream Texture: Stable Diffusion built-in to Blender.
  • ControlNetMediaPipeFace: Control Stable Diffusion with a Facial Pose