Stable Diffusion Prompting Guide: Parameters, LoRAs & Advanced Techniques
Stable Diffusion is the open-source AI image generator that gives you complete control. No API limits, no content filters, no subscription fees: just pure generative power running on your own hardware.
But that power comes with complexity. Stable Diffusion has more parameters, more models, more techniques, and more ways to fail than commercial alternatives. This comprehensive guide teaches you the complete SD prompting ecosystem, from basic syntax to advanced workflows with LoRAs and custom models.
Why Stable Diffusion Prompting Is Different
Stable Diffusion gives you the keys to the engine. While Midjourney curates aesthetics and DALL-E simplifies the interface, SD exposes every parameter, giving you both tremendous power and tremendous responsibility.
Key differences from other platforms:
More control:
- Adjust sampling steps, CFG scale, sampling method
- Load custom models trained on specific styles or subjects
- Use LoRAs (lightweight model modifications) for precise style control
- Full negative prompt support with weighting
- Exact seed control for reproducibility
More complexity:
- Need to understand technical parameters
- Model selection dramatically affects results
- Prompt syntax has special characters and keywords: (), [], BREAK, AND
- Quality depends on parameter tuning, not just prompts
More hardware-dependent:
- Runs locally (or on rented cloud GPUs)
- Performance varies by VRAM, GPU speed
- Different models have different VRAM requirements
More community-driven:
- Thousands of custom models on Civitai, Hugging Face
- Community-created LoRAs for specific styles, characters, concepts
- Constant experimentation and innovation
- No corporate filtering or safety rails (use responsibly)
When to choose Stable Diffusion:
- Maximum control over every aspect
- Custom models for specific aesthetics
- No content restrictions (within legal/ethical bounds)
- Budget-conscious (after initial hardware investment)
- Privacy (everything local, no data sent to APIs)
When to choose something else:
- Want plug-and-play simplicity (use Midjourney or DALL-E)
- Don't have GPU or don't want to manage software
- Need consistent quality without parameter tuning
Getting Started: AUTOMATIC1111 vs. ComfyUI
Two primary interfaces dominate Stable Diffusion:
AUTOMATIC1111 WebUI
Pros:
- User-friendly interface
- Most popular, best community support
- Easy to install
- Extensive extensions
- Good for beginners
Cons:
- Can be slow for complex workflows
- Less flexible than ComfyUI
- Linear generation flow
Best for: Beginners, single-image generation, straightforward workflows
ComfyUI
Pros:
- Node-based workflow (like Blender nodes)
- Extremely flexible and powerful
- Efficient for complex multi-step generation
- Better performance for advanced users
Cons:
- Steeper learning curve
- Visual programming paradigm takes time to learn
- Less beginner-friendly
Best for: Advanced users, complex workflows, production pipelines
This guide assumes AUTOMATIC1111 WebUI, but concepts apply to both.
SD Prompt Syntax Fundamentals
Stable Diffusion has special syntax for advanced prompt control.
Basic Syntax
Simple prompt:
a portrait of a woman, long hair, blue eyes, smiling
Comma-separated descriptors, read left to right.
Emphasis and Weighting: () and []
Increase emphasis with (word):
(highly detailed), photorealistic portrait
Multiple parentheses stack:
((highly detailed)) = stronger emphasis
(((highly detailed))) = even stronger
Numeric weighting (more precise):
(highly detailed:1.3) = 30% more emphasis
(blue eyes:1.5) = 50% more emphasis
(background:0.8) = 20% less emphasis
Reduce emphasis with [word]:
[blurry] = reduce influence
[[blurry]] = reduce more
Practical examples:
Basic: a woman with red hair
Weighted: a woman with (red hair:1.3), (flowing locks:1.2)
Result: Stronger emphasis on red color and hair flow
Basic: landscape with mountains
Weighted: (majestic mountains:1.4), forest, (river:0.8)
Result: Mountains emphasized, river de-emphasized
When to use weighting:
- Elements that persistently fail to appear (increase weight)
- Elements that dominate when you want them subtle (decrease weight)
- Fine-tuning composition emphasis
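To make the stacking behaviour concrete, here is a minimal sketch of how nested emphasis translates into a numeric weight. It assumes the AUTOMATIC1111 convention, where each pair of parentheses multiplies attention by 1.1 and each pair of square brackets divides it by 1.1; other front ends may use different factors. The helper name `nested_weight` is made up for this illustration.

```python
def nested_weight(paren_depth: int = 0, bracket_depth: int = 0) -> float:
    """Effective attention multiplier for nested () and [] emphasis.

    Assumption: each () multiplies by 1.1 and each [] divides by 1.1
    (the AUTOMATIC1111 convention).
    """
    return round(1.1 ** paren_depth / 1.1 ** bracket_depth, 4)


for depth in (1, 2, 3):
    token = "(" * depth + "word" + ")" * depth
    print(token, "->", nested_weight(paren_depth=depth))
print("[word] ->", nested_weight(bracket_depth=1))
```

Under that assumption, ((highly detailed)) behaves roughly like (highly detailed:1.21), which is why explicit numeric weights tend to be easier to reason about than stacked brackets.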
AND Operator: Compositing
The AND operator blends multiple concepts in the same generation.
Syntax:
concept A AND concept B
Example:
a woman with long flowing hair AND a warrior in armor
Result: Blends "woman with long flowing hair" and "warrior in armor", creating an armored woman with flowing hair.
Multiple ANDs:
fantasy landscape AND sunset lighting AND magical atmosphere
Weighting with AND:
(concept A:1.3) AND (concept B:0.8)
AUTOMATIC1111 also accepts a trailing weight for each sub-prompt, which scales that whole concept rather than individual words:
concept A :1.3 AND concept B :0.8
Use cases:
- Blending two characters into one
- Combining incompatible concepts creatively
- Multi-style fusion
Limitations:
- Can produce confusing results if concepts conflict too much
- Works best when concepts complement each other
BREAK Keyword: Prompt Regions
BREAK separates the prompt into chunks, each processed somewhat independently. Useful for long prompts.
Syntax:
main subject, details BREAK background elements, atmosphere BREAK lighting and technical parameters
Example:
a woman in elegant dress, red hair, green eyes BREAK mystical forest background, glowing mushrooms BREAK volumetric lighting, cinematic, highly detailed
When to use:
- Prompts over 75 tokens (tokens ≈ words or word fragments)
- Separating subject from background
- Complex scenes with multiple focus areas
How it works:
- CLIP (the text encoder) has a 75-token limit per chunk
- BREAK starts a new chunk
- All chunks blend in the final image
Pro tip: Use BREAK when adding more descriptors stops improving the output; you've likely hit the token limit.
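The chunking idea can be sketched in a few lines. This is only an approximation: it splits on the BREAK keyword and estimates length by counting words and punctuation marks, whereas the real CLIP tokenizer uses byte-pair encoding and often splits a word into several tokens, so the estimate usually undercounts. The function names are hypothetical.

```python
import re

CHUNK_LIMIT = 75  # CLIP token budget per chunk, as described above


def split_on_break(prompt: str) -> list[str]:
    """Split a prompt into chunks at each standalone BREAK keyword."""
    parts = re.split(r"\s*\bBREAK\b\s*", prompt)
    return [p.strip(" ,") for p in parts if p.strip(" ,")]


def rough_token_count(chunk: str) -> int:
    """Rough estimate only: one token per word or punctuation mark."""
    return len(re.findall(r"\w+|[^\w\s]", chunk))


prompt = (
    "a woman in elegant dress, red hair, green eyes BREAK "
    "mystical forest background, glowing mushrooms BREAK "
    "volumetric lighting, cinematic, highly detailed"
)
for chunk in split_on_break(prompt):
    count = rough_token_count(chunk)
    status = "ok" if count <= CHUNK_LIMIT else "over budget"
    print(f"{count:3d} tokens (approx, {status}): {chunk}")
```

For an exact count, rely on the token counter your interface shows next to the prompt box rather than on an estimate like this.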
Critical SD Parameters
These parameters fundamentally change your outputs.
CFG Scale (Classifier-Free Guidance)
Controls how closely the model follows your prompt.
Range: 1-30 (practical range: 4-15)
Effect:
- Low CFG (1-4): Creative, loose interpretation, often incoherent
- Sweet spot (7-12): Balanced, follows prompt while maintaining quality
- High CFG (13-20): Strong prompt adherence, risk of harsh contrast and oversaturated colors
- Very high (20+): Often produces artifacts, "deep-fried" look
Recommended by use case:
| Use Case | CFG Scale |
|---|---|
| Photorealism | 7-10 |
| Artistic/painterly | 8-12 |
| Anime/illustration | 7-11 |
| Experimental/abstract | 5-8 |
| Precise prompt following | 10-14 |
Model dependency:
- SDXL often works best at CFG 6-10 (lower than SD 1.5)
- Some custom models specify optimal CFG in their descriptions
Interaction with negative prompts:
- Higher CFG = negative prompts more powerful
- Lower CFG = negative prompts weaker
Pro tip: Start at 7. Increase if model ignores your prompt. Decrease if you see artifacts or over-saturation.
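Why CFG and negative prompts interact becomes clearer from the guidance formula itself. The sketch below applies the standard classifier-free guidance rule, prediction = unconditional + scale × (conditional − unconditional), to toy numbers. It assumes, as is typical in SD web interfaces, that the negative prompt takes the place of the unconditional (empty) prompt. The two-element lists are invented stand-ins for real noise predictions.

```python
def apply_cfg(uncond: list[float], cond: list[float], cfg_scale: float) -> list[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy "noise predictions" (made-up numbers, not real model output).
negative_pred = [0.10, -0.20]  # prediction for the negative (or empty) prompt
positive_pred = [0.30, 0.10]   # prediction for the positive prompt

for scale in (1.0, 7.0, 20.0):
    print(scale, apply_cfg(negative_pred, positive_pred, scale))
```

At scale 1 the result is just the positive prediction. As the scale grows, the output is pushed further along the direction from the negative prediction toward the positive one, which is why a high CFG strengthens both prompt adherence and the effect of negative prompts.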
Sampling Steps
Number of refinement iterations from noise to image.
Range: 15-150 (practical range: 20-50)
Effect:
- Low steps (15-20): Fast, lower detail, may look rough
- Sweet spot (25-35): Good balance of quality and speed
- High steps (40-60): Diminishing returns, minimal improvement
- Very high (60+): Rarely necessary, mostly wastes time
Recommended by sampler:
| Sampler | Recommended Steps |
|---|---|
| Euler a | 20-30 |
| DPM++ 2M Karras | 20-30 |
| DPM++ SDE Karras | 25-35 |
| DDIM | 30-50 |
| UniPC | 20-25 |
Quality vs. speed:
- Each step adds computation time
- After 30-40 steps, quality improvement is minimal
- Use lower steps for iteration, higher for final renders
Pro tip: Start at 28. Increase only if you see obvious quality issues.
Sampling Method
The algorithm used to refine the image from noise.
Popular samplers:
Euler a (Ancestral):
- Fast, creative
- Adds fresh noise at every step, so the image keeps changing as you add steps instead of converging
- Good for exploration
- Still reproducible: the same seed and the same settings give the same image
DPM++ 2M Karras:
- High quality, reliable
- Deterministic (same seed = same result)
- Good balance of speed and quality
- Most popular general-purpose sampler
DPM++ SDE Karras:
- Very high quality
- Slower than 2M
- Good for final high-quality renders
- Slightly more creative than 2M
DDIM:
- Classic, reliable
- Deterministic
- Slower convergence (needs more steps)
- Good for img2img workflows
UniPC:
- Very fast
- Good quality at low step counts
- Great for quick iterations
LMS Karras:
- Older, reliable
- Good quality
- Moderate speed
Comparison:
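| Sampler | Speed | Quality | Converges | Best For |
|---|---|---|---|---|
| Euler a | Fast | Good | No | Exploration, variety |
| DPM++ 2M Karras | Fast | Excellent | Yes | General use, best all-around |
| DPM++ SDE Karras | Slower | Excellent | No | Final high-quality renders |
| DDIM | Moderate | Good | Yes | img2img, consistency |
| UniPC | Very fast | Good | Yes | Rapid iteration |
Converges = adding more steps refines the same image instead of changing it. Ancestral and SDE samplers inject noise at each step, so their output shifts with the step count, but every sampler reproduces the same image when seed and settings are identical.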
Pro tip: Use DPM++ 2M Karras as default. Switch to DPM++ SDE Karras for final renders.
Seed
Random seed that determines initial noise pattern.
Value: Any integer (0 to 4,294,967,295)
Special value: -1 = random seed each time
Purpose:
- Reproducibility: Same seed + same prompt + same parameters = same image
- Iteration: Change prompt details while keeping composition
- Consistency: Generate variations of the same scene/character
Use cases:
Finding a good composition:
1. Generate with seed -1 (random) until you find a composition you like
2. Note the seed (shown in generation info)
3. Use that seed with prompt variations
Consistent character (set the value in the Seed field):
"a wizard", seed 12345 → establishes base
"a wizard, blue robes", seed 12345 → same composition, blue robes
"a wizard, blue robes, holding staff", seed 12345 → same composition, adds staff
A/B testing prompts:
Same seed for both prompts → see how specific words change output
Limitations:
- Seed consistency only works within same model and same parameters
- Changing sampler, steps, or CFG breaks seed consistency
- Different models = different results even with same seed
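The reproducibility rule follows from how the starting noise is produced: a seeded random generator always emits the same sequence. The sketch below illustrates that with Python's standard library as a stand-in; a real pipeline draws a much larger noise tensor from its own generator, and `initial_noise` is a hypothetical name.

```python
import random


def initial_noise(seed: int, size: int = 4) -> list[float]:
    """Return a small Gaussian noise vector fully determined by the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = initial_noise(12345)
same_b = initial_noise(12345)
other = initial_noise(12346)

print("same seed, same noise:", same_a == same_b)
print("different seed, same noise:", same_a == other)
```

The seed only fixes the starting point. Everything applied afterwards (model, sampler, steps, CFG) still shapes the final image, which is why changing any of them changes the result even when the seed stays the same.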
Clip Skip
Determines which layer of CLIP text encoder to use.
Range: 1-12 (practical: 1-2)
Values:
- Clip Skip 1: Default, uses final CLIP layer (most "cooked" interpretation)
- Clip Skip 2: Uses second-to-last layer (more flexible, less rigid)
When to use Clip Skip 2:
- Anime and illustration models (many are trained with CLIP skip 2)
- When you want looser interpretation
- If model description recommends it
When to use Clip Skip 1:
- Photorealistic models
- SDXL (usually)
- Default for most models
Pro tip: Check the model description on Civitai/Hugging Face; it often specifies the optimal CLIP skip.
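As a mental model, Clip Skip is just an index counted from the end of the text encoder's layer stack. The sketch below uses placeholder strings instead of real layer outputs and assumes a 12-layer encoder, matching the 1-12 range above; `select_clip_layer` is a hypothetical helper, not a WebUI function.

```python
def select_clip_layer(layer_outputs: list[str], clip_skip: int = 1) -> str:
    """Pick the layer output that a given Clip Skip value would use."""
    if not 1 <= clip_skip <= len(layer_outputs):
        raise ValueError("clip_skip must be between 1 and the number of layers")
    # Clip Skip 1 -> last layer, Clip Skip 2 -> second-to-last, and so on.
    return layer_outputs[-clip_skip]


layers = [f"layer_{i}" for i in range(1, 13)]  # placeholder outputs for 12 layers

print("Clip Skip 1 uses", select_clip_layer(layers, 1))
print("Clip Skip 2 uses", select_clip_layer(layers, 2))
```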
Positive Prompt Structure for SD
Order matters in Stable Diffusion prompts.
Optimal structure:
[Quality tags], [subject and details], [style], [lighting], [composition], [technical details]
Example:
masterpiece, best quality, highly detailed, photorealistic, a woman with flowing auburn hair, green eyes, elegant dress, golden hour lighting, shallow depth of field, bokeh background, 85mm portrait, 8K
Why this order works:
- Early tokens have more weight
- Quality tags prime the model for high-quality output
- Subject comes early for emphasis
- Style and technical details refine
Quality tags that work:
masterpiece, best quality, highly detailed, photorealistic, 8K, ultra detailed, intricate detail, sharp focus, professional photography
When quality tags help:
- Always good to include 2-3
- Especially important for anime models
- Less critical for photorealism models trained on high-quality datasets
When quality tags hurt:
- Too many (more than 5) dilute impact
- Generic combinations ("best quality masterpiece ultra detailed") become meaningless
- Better to use specific technical terms
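If you assemble prompts programmatically, the recommended ordering is easy to enforce. The following is a minimal sketch, not part of any SD tool: `build_prompt` and `MAX_QUALITY_TAGS` are hypothetical names, and the limit of 5 simply mirrors the rule of thumb above.

```python
MAX_QUALITY_TAGS = 5  # rule of thumb from this guide: more than 5 dilutes impact


def build_prompt(quality=(), subject=(), style=(), lighting=(),
                 composition=(), technical=()) -> str:
    """Join prompt parts in the recommended order, skipping empty groups."""
    if len(quality) > MAX_QUALITY_TAGS:
        raise ValueError(f"use at most {MAX_QUALITY_TAGS} quality tags")
    groups = (quality, subject, style, lighting, composition, technical)
    return ", ".join(tag for group in groups for tag in group)


print(build_prompt(
    quality=["masterpiece", "best quality"],
    subject=["a woman with flowing auburn hair", "green eyes"],
    style=["photorealistic"],
    lighting=["golden hour lighting"],
    composition=["shallow depth of field"],
    technical=["85mm portrait"],
))
```

Keeping the groups as separate arguments makes it simple to swap one part, for example the lighting, while everything else stays fixed for A/B comparisons.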
LoRA Explained
LoRA (Low-Rank Adaptation) = lightweight model modification for specific styles, characters, or concepts.
Think of base models as general knowledge. LoRAs are specializations.
How LoRAs work:
- Base model: 2-7 GB (general image generation)
- LoRA: 10-200 MB (adds specific knowledge)
- LoRAs modify base model's behavior without replacing it
- You can stack multiple LoRAs (usually 2-4 max)
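The size difference comes from the low-rank trick that gives LoRA its name: instead of storing a full replacement for a weight matrix, a LoRA stores two thin matrices whose product is added to the original. The sketch below shows the general technique on tiny pure-Python matrices. Treating the prompt weight as a simple scale on that update is an assumption about how front ends apply it, and all names here are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base, up, down, weight):
    """Return base + weight * (up @ down) without modifying base."""
    delta = matmul(up, down)
    return [[w + weight * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]


def param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Parameters in the full matrix vs. in the two low-rank factors."""
    return d_out * d_in, rank * (d_out + d_in)


base = [[1.0, 0.0], [0.0, 1.0]]  # toy 2x2 "model weight"
up = [[1.0], [0.0]]              # 2x1 factor (rank 1)
down = [[0.0, 2.0]]              # 1x2 factor (rank 1)

print(apply_lora(base, up, down, weight=0.5))
print(param_counts(768, 768, rank=8))
```

For a 768×768 layer at rank 8, the factors hold 12,288 values against 589,824 for the full matrix, which is the kind of ratio that keeps LoRA files in the megabyte range.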
LoRA syntax:
<lora:filename:weight>
Example:
a portrait, <lora:cyberpunk_style:0.8>, neon lighting, futuristic
Weight range: 0.0 to 1.5+
- 0.3-0.5: Subtle influence
- 0.7-0.9: Balanced (most common)
- 1.0-1.2: Strong influence
- 1.5+: Very strong, may override base model
Common LoRA types:
Style LoRAs:
<lora:watercolor_style:0.8>
<lora:ghibli_aesthetic:0.7>
<lora:art_nouveau:0.9>
Character LoRAs:
<lora:specific_character_name:0.8>
(popular for consistent characters in series)
Concept LoRAs:
<lora:cyberpunk_city:0.7>
<lora:fantasy_armor:0.8>
Quality/detail LoRAs:
<lora:detail_enhancer:0.5>
<lora:better_hands:0.6>
Where to find LoRAs:
- Civitai: Largest repository, quality ratings, examples
- Hugging Face: Open model repository
- Community discords and subreddits
How to install:
- Download LoRA file (.safetensors or .ckpt)
- Place in the stable-diffusion-webui/models/Lora/ directory
- Refresh LoRA list in WebUI
- Add to prompt with syntax: <lora:filename:weight>
Best practices:
- Read LoRA description for optimal weight and trigger words
- Start at weight 0.8, adjust up/down
- Stack 2-3 LoRAs max (more = conflicts)
- Match LoRA base model (SD 1.5 LoRA won't work with SDXL)
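These rules of thumb are easy to check automatically before a long batch run. The sketch below parses LoRA tags out of a prompt with a regular expression and flags stacks or weights that fall outside the ranges suggested in this guide. The thresholds come from the text above; the function names and the exact regex are assumptions made for this illustration.

```python
import re

# Matches tags of the form <lora:filename:weight>
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def parse_loras(prompt: str) -> list[tuple[str, float]]:
    """Return (name, weight) pairs for every LoRA tag in the prompt."""
    return [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]


def check_loras(prompt: str, max_stack: int = 3) -> list[str]:
    """Return warnings based on the rules of thumb in this guide."""
    warnings = []
    loras = parse_loras(prompt)
    if len(loras) > max_stack:
        warnings.append(f"{len(loras)} LoRAs stacked; more than {max_stack} may conflict")
    for name, weight in loras:
        if weight >= 1.5:
            warnings.append(f"{name}: weight {weight} may override the base model")
    return warnings


prompt = "a portrait, <lora:cyberpunk_style:0.8>, <lora:detail_enhancer:1.6>, neon lighting"
print(parse_loras(prompt))
print(check_loras(prompt))
```

A check like this cannot tell whether a LoRA matches the base model, so that still needs to be confirmed on the LoRA's download page.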
Example with multiple LoRAs:
a character portrait, <lora:anime_style:0.7>, <lora:detailed_background:0.5>, vibrant colors, dynamic pose
Textual Inversions / Embeddings
Textual Inversions (also called embeddings) are single-concept additions to the model's vocabulary.
Most common use: negative embeddings.
Negative Embeddings
Pre-trained negative prompts as single tokens.
Popular negative embeddings:
EasyNegative:
- Most popular
- General quality improvement
- Equivalent to comprehensive negative prompt
- Usage: Just add EasyNegative to the negative prompt
bad_prompt_version2:
- Comprehensive quality negative
- Anatomy and composition focus
- Usage: bad_prompt_version2
bad-hands-5:
- Specialized for hand anatomy
- Use with other negatives
- Usage: bad-hands-5
bad-artist:
- Reduces amateur-looking outputs
- Style quality focus
- Usage: bad-artist
Syntax:
Positive prompt: [your normal prompt]
Negative prompt: EasyNegative, bad-hands-5, text, watermark
Installation:
Same as LoRAs, but in a different folder:
- Download embedding file (.pt or .safetensors)
- Place in the stable-diffusion-webui/embeddings/ directory
- Use name in prompts
Benefits:
- Shorter negative prompts
- Consistent quality baseline
- Community-tested combinations
Model Selection: SDXL vs. SD 1.5 vs. SD 3.5
Different Stable Diffusion versions for different needs.
SD 1.5
Released: 2022
Size: ~2-4 GB
VRAM: 4-6 GB minimum
Pros:
- Huge library of custom models and LoRAs
- Runs on modest hardware
- Extremely well-documented
- Most community support
Cons:
- Lower base quality than SDXL
- Weaker prompt following
- More prone to anatomy issues
Best for:
- Limited VRAM (4-6 GB)
- Access to specific custom models/LoRAs
- Anime and illustration (many specialized models)
SDXL (1.0)
Released: 2023
Size: ~6-7 GB
VRAM: 8-12 GB recommended
Pros:
- Significantly better quality than SD 1.5
- Better prompt following
- Improved anatomy
- Better photorealism
- Native 1024×1024 resolution
Cons:
- Higher VRAM requirements
- Slower generation
- Smaller (but growing) library of custom models
- LoRAs need to be SDXL-specific
Best for:
- High-quality photorealism
- Modern hardware (8+ GB VRAM)
- When you need best quality
Settings differences from SD 1.5:
- Lower CFG (6-10 vs. 7-12)
- Often fewer steps needed (25 vs. 30)
- No CLIP skip needed (uses dual text encoders)
SD 3.5
Released: 2024
Size: ~8-12 GB (architecture change)
VRAM: 12+ GB recommended
Pros:
- Best anatomy and hands
- Excellent prompt following
- Multimodal Diffusion Transformer (MMDiT) architecture that processes text and image tokens jointly
- Better composition
Cons:
- Much higher VRAM requirements
- Smaller ecosystem (newer)
- Slower
- Still maturing
Best for:
- Cutting-edge quality
- High-end hardware
- Professional work where quality is paramount
Custom Models (Civitai, Hugging Face)
Thousands of community models for specific styles:
Photorealism models:
- Realistic Vision
- DreamShaper
- ChilloutMix
Anime/illustration:
- Anything V5
- CounterfeitXL
- BreakDomain
Artistic styles:
- Deliberate
- Dreamlike Photoreal
- Rev Animated
How to choose custom models:
- Browse Civitai, sort by downloads/rating
- Check example images
- Read model description for optimal settings
- Download and place in the models/Stable-diffusion/ folder
- Select in WebUI dropdown
Pro tip: Start with well-rated models (30K+ downloads on Civitai). Experiment once you understand basics.
Prompt Examples by Category
Complete prompts for common use cases.
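Each example below pairs a positive prompt, a negative prompt, and settings. If you drive the WebUI from a script instead of the browser, that triple maps naturally onto a JSON request body. The sketch below only builds and prints such a body; it sends nothing. It assumes the WebUI was launched with its API enabled and that the txt2img endpoint accepts fields named prompt, negative_prompt, steps, cfg_scale, sampler_name, and seed. Verify the field names against the API documentation page of your own install, since newer versions may split the sampler and its scheduler into separate fields. `build_txt2img_payload` is a hypothetical helper.

```python
import json


def build_txt2img_payload(positive: str, negative: str, steps: int,
                          cfg_scale: float, sampler_name: str,
                          seed: int = -1, width: int = 512, height: int = 512) -> dict:
    """Package one prompt example as a request body (field names are assumptions)."""
    return {
        "prompt": positive,
        "negative_prompt": negative,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": sampler_name,
        "seed": seed,  # -1 asks for a random seed
        "width": width,
        "height": height,
    }


payload = build_txt2img_payload(
    positive="masterpiece, best quality, photorealistic portrait, a woman in her 30s",
    negative="EasyNegative, bad-hands-5, lowres, blurry, text, watermark",
    steps=28,
    cfg_scale=8,
    sampler_name="DPM++ 2M Karras",
)
# Assumed target: POST to the WebUI's /sdapi/v1/txt2img endpoint.
print(json.dumps(payload, indent=2))
```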
Portraits
Positive: masterpiece, best quality, photorealistic portrait, a woman in her 30s, natural beauty, soft smile, auburn hair, green eyes, subtle makeup, elegant, golden hour lighting, shallow depth of field, 85mm lens, bokeh background, professional photography, 8K
Negative: EasyNegative, bad-hands-5, bad anatomy, deformed, disfigured, poorly drawn face, extra limbs, lowres, blurry, text, watermark
Settings: Steps 28, CFG 8, DPM++ 2M Karras
Landscapes
Positive: breathtaking landscape, majestic mountain range, pristine alpine lake, reflection, pine forest, golden hour lighting, dramatic clouds, vibrant colors, highly detailed, nature photography, professional, 8K, HDR
Negative: EasyNegative, people, buildings, cars, lowres, blurry, bad composition
Settings: Steps 30, CFG 7, DPM++ SDE Karras
Concept Art (Fantasy)
Positive: epic fantasy concept art, ancient dragon, massive scale, detailed scales, glowing eyes, perched on cliff, stormy sky, dramatic lighting, volumetric fog, highly detailed, trending on artstation, digital painting, cinematic composition
Negative: bad anatomy, deformed, lowres, blurry, amateur, low quality
Settings: Steps 35, CFG 9, DPM++ 2M Karras
Anime Character
Positive: masterpiece, best quality, 1girl, beautiful anime girl, long flowing hair, detailed eyes, magical girl outfit, dynamic pose, vibrant colors, sparkles and light effects, soft shading, anime style illustration
Negative: bad_prompt_version2, lowres, bad anatomy, bad hands, text, error, extra digits, fewer digits
Settings: Steps 25, CFG 7, Euler a, Clip Skip 2
Model: Anime-specific model (Anything V5, CounterfeitXL)
Product Photography
Positive: professional product photography, perfume bottle, elegant design, clean white background, studio lighting, three-point lighting, commercial photography, highly detailed, 8K, sharp focus
Negative: cluttered, messy, people, text, watermark, blurry, low quality, shadows
Settings: Steps 28, CFG 7, DPM++ 2M Karras
Inpainting and Outpainting
Inpainting: Regenerate specific areas of an image
Outpainting: Extend an image beyond its borders
Inpainting Workflow
- Generate base image
- Select inpaint tab in WebUI
- Upload image
- Mask area to regenerate
- Write prompt describing what you want in masked area
- Generate
Use cases:
- Fix anatomy (hands, faces)
- Change specific elements (clothing color, background objects)
- Remove unwanted elements
- Add new elements
Settings for inpainting:
- Denoise strength: 0.4-0.7 (lower = subtle change, higher = complete regeneration)
- Same model as original
- Mask blur: 4-8 pixels (smooth edges)
Outpainting Workflow
- Generate base image
- Select img2img tab
- Extend canvas in desired direction
- Mask or use "Outpainting" script
- Prompt describes what extends into new areas
- Generate
Use cases:
- Extend landscapes
- Enlarge canvas for composition
- Add context around subject
Batch Prompting: X/Y/Z Plots
Systematic exploration of prompts and settings by varying one or two parameters across a grid.
X/Y plot tool (in WebUI):
- Test multiple values of two parameters simultaneously
- Generates grid of all combinations
- Compare results side-by-side
Common X/Y combinations:
Test CFG scale:
- X axis: CFG Scale (6, 7, 8, 9, 10, 11, 12)
- Y axis: Your prompt variations
- See optimal CFG for each prompt
Test samplers:
- X axis: Sampler (Euler a, DPM++ 2M, DPM++ SDE, etc.)
- Y axis: Steps (20, 25, 30, 35)
- Find best sampler/step combo
Test prompt variations:
- X axis: Different prompt wordings
- Y axis: Different CFG or steps
- Optimize prompt language
Example X/Y plot use:
Prompt: portrait of a woman
X axis: CFG Scale - 6, 7, 8, 9, 10, 11, 12
Y axis: Prompt S/R - "woman" / "beautiful woman" / "elegant woman"
Result: 21 images (7 CFG × 3 prompts) showing all combinations
When to use X/Y plots:
- Dialing in settings for a new model
- Testing prompt effectiveness
- Finding optimal parameters for specific styles
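The grid in the example above is simply the Cartesian product of the two axes. The sketch below reproduces it outside the WebUI, which can help when planning how long a plot will take. It imitates the search-and-replace axis with plain string replacement; `build_grid` is a hypothetical helper, not the WebUI script itself.

```python
from itertools import product

base_prompt = "portrait of a woman"
cfg_scales = [6, 7, 8, 9, 10, 11, 12]
replacements = ["woman", "beautiful woman", "elegant woman"]


def build_grid(prompt: str, search: str, replacements: list[str],
               cfg_scales: list[int]) -> list[dict]:
    """Return one job per grid cell, row by row (one row per prompt variant)."""
    return [
        {"prompt": prompt.replace(search, repl), "cfg_scale": cfg}
        for repl, cfg in product(replacements, cfg_scales)
    ]


jobs = build_grid(base_prompt, "woman", replacements, cfg_scales)
print(len(jobs))  # 21
print(jobs[0])
```

Because the job count is the product of the axis lengths, adding a third axis multiplies generation time again, so keep each axis short when testing on slower hardware.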
Hardware Requirements
Minimum:
- GPU: NVIDIA GTX 1060 (6 GB VRAM)
- RAM: 16 GB
- Storage: 50 GB+ (models are large)
- Can run SD 1.5 at 512×512
Recommended:
- GPU: RTX 3060 (12 GB VRAM) or better
- RAM: 32 GB
- Storage: 200 GB+ SSD
- Can run SDXL comfortably
High-end:
- GPU: RTX 4090 (24 GB VRAM)
- RAM: 64 GB
- Storage: 1 TB+ NVMe SSD
- Can run SD 3.5, multiple LoRAs, high-res
VRAM by model:
| Model | Resolution | VRAM |
|---|---|---|
| SD 1.5 | 512×512 | 4 GB |
| SD 1.5 | 768×768 | 6 GB |
| SDXL | 1024×1024 | 8 GB |
| SDXL + LoRAs | 1024×1024 | 10-12 GB |
| SD 3.5 | 1024×1024 | 12-16 GB |
Optimization for low VRAM:
- Use --medvram or --lowvram flags
- Lower resolution
- Stick to SD 1.5
- Use xformers optimization
- Limit batch size to 1
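The VRAM table can double as a quick lookup when deciding what to install. The sketch below encodes the lower bound of each row and lists the configurations a given card should handle. The numbers are this guide's estimates rather than measured limits, and `runnable_configs` is a hypothetical helper.

```python
# (model, resolution, minimum VRAM in GB), taken from the table above
VRAM_TABLE = [
    ("SD 1.5", "512x512", 4),
    ("SD 1.5", "768x768", 6),
    ("SDXL", "1024x1024", 8),
    ("SDXL + LoRAs", "1024x1024", 10),
    ("SD 3.5", "1024x1024", 12),
]


def runnable_configs(vram_gb: float) -> list[tuple[str, str]]:
    """List (model, resolution) pairs whose minimum VRAM fits the given card."""
    return [(model, res) for model, res, need in VRAM_TABLE if vram_gb >= need]


for card_vram in (6, 12, 24):
    print(card_vram, "GB:", runnable_configs(card_vram))
```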
Next Steps
Start here:
- Negative Prompts Guide: Master SD's negative prompts and embeddings
- Prompt Anatomy Guide: Learn the 7 components
Compare:
- Stable Diffusion Review: Full platform breakdown
- AI Tools Directory: Compare SD vs. other generators
Explore:
- 100+ Prompt Templates: Ready-to-use SD prompts
Stable Diffusion is prompt engineering for power users. The learning curve is steep, but the ceiling is unlimited. Master the parameters, build your LoRA library, and you'll have capabilities no commercial platform can match.