Prompt Engineering in Practice: AI Painting Techniques to Precisely Generate the Images You Want!

5. 12. 2025

Tutorial Overview

This tutorial aims to help readers master the core techniques of prompt engineering, enabling them to create exquisite images that meet their needs using AI image generation tools. We will delve into the methods of constructing prompts and, through multiple practical cases, help readers progress from beginner to expert. This tutorial is suitable for readers who are interested in AI image generation and have a certain level of computer skills, such as designers, marketers, and enthusiasts interested in AI art.

By studying this tutorial, you will be able to:

  • Understand the basic principles and core concepts of prompt engineering.
  • Master various techniques and methods for constructing effective prompts.
  • Use different image generation models to create images with diverse styles.
  • Solve common problems encountered during AI image generation.
  • Enhance your creative expression and transform ideas into visual works.

Preparations

Before you start learning, you need to make the following preparations:

  1. Required Tools:

    • AI Image Generation Platform: It is recommended to use mainstream platforms such as Midjourney, DALL-E 2, and Stable Diffusion. These platforms all offer free trials or paid subscription services, and you can choose according to your needs. This tutorial will use Stable Diffusion as an example because of its open-source nature and greater customizability.
    • Text Editor: Used to write and edit prompts. It is recommended to use VS Code, Sublime Text, etc., which have features such as code highlighting and auto-completion to improve your efficiency.
    • Image Processing Software: Optional, used for post-processing of generated images. It is recommended to use Photoshop, GIMP, etc.
  2. Environment Configuration:

    • Stable Diffusion: If you choose Stable Diffusion, you need to install the Python environment (version 3.7 or above is recommended) and configure the corresponding dependency libraries. For specific installation steps, you can refer to the official documentation of Stable Diffusion or related tutorials. It is recommended to use conda to create an independent virtual environment to avoid dependency conflicts with other projects.
    • GPU: AI image generation requires high computing resources. It is recommended to use a computer with an Nvidia GPU and install drivers such as CUDA and cuDNN. If your computer does not have a GPU, you can also use cloud GPU services, such as Google Colab, Kaggle, etc.
  3. Basic Knowledge:

    • Python Basics: Understand the basic syntax and commonly used libraries of Python, such as numpy, PIL, etc.
    • Deep Learning Basics: Understand the basic concepts of deep learning, such as neural networks, convolutional neural networks, etc.
    • Image Processing Basics: Understand the basic concepts of images, such as pixels, color spaces, etc.

Explanation of Core Concepts

Understanding the following core concepts is crucial for mastering prompt engineering:

  • Prompt: A prompt is a text instruction provided by the user to the AI model to guide the model to generate a specific image. A good prompt should be clear, concise, and accurate, and be able to accurately describe the image you want to generate.
  • Positive Prompt: Describes the elements, features, styles, etc. that you want to include in the image. For example: "a photo of a cat, realistic, detailed, 8k".
  • Negative Prompt: Describes the elements, features, styles, etc. that you want to avoid including in the image. For example: "blurry, deformed, ugly, low quality".
  • Model: An AI image generation model is a neural network trained on a large amount of image data and used to generate images. Different models have different styles and characteristics. For example, Stable Diffusion has several different models, such as SD1.5, SDXL, etc.
  • Sampler: The sampling method determines how the model generates the image. Different sampling methods will produce different effects. Common sampling methods include Euler a, DPM++ 2M Karras, etc.
  • Steps: The number of iteration steps determines the fineness of the image generated by the model. The more iteration steps, the finer the image, but the generation time is also longer.
  • Seed: The random seed determines the randomness of the image generated by the model. Using the same prompt and random seed can generate the same image.
  • Prompt Weight: By adjusting a term's weight, you can control how strongly the model emphasizes it. In Stable Diffusion WebUI syntax, you wrap the term and its weight together in parentheses. For example: "(cat:1.5)" increases the weight of "cat", while "(dog:0.5)" decreases the weight of "dog".
  • LoRA (Low-Rank Adaptation): A fine-tuning technique that allows you to add specific styles or objects to the model without retraining the entire model. LoRA files are usually small and easy to share and use.
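The `(term:weight)` emphasis syntax used by Stable Diffusion WebUI can be generated programmatically. A minimal stdlib sketch (the helper names are our own, not part of any real tool):

```python
def weight_term(term: str, weight: float = 1.0) -> str:
    """Wrap a prompt term in WebUI-style (term:weight) emphasis syntax.

    A weight of 1.0 is the default, so the term is emitted bare;
    any other weight is written as "(term:weight)".
    """
    if weight == 1.0:
        return term
    return f"({term}:{weight})"


def build_prompt(terms: dict[str, float]) -> str:
    """Join weighted terms into a single comma-separated prompt."""
    return ", ".join(weight_term(t, w) for t, w in terms.items())


prompt = build_prompt({"cat": 1.5, "photo": 1.0, "dog": 0.5})
print(prompt)  # (cat:1.5), photo, (dog:0.5)
```

Because dictionaries preserve insertion order, the most important terms can simply be listed first.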

Step 1: Building a Basic Prompt

The first step in building a prompt is to clarify the content of the image you want to generate. You need to consider the following questions:

  • Subject: What is the main object of the image? For example: a cat, a castle, a landscape painting.
  • Background: What is the background of the image? For example: forest, beach, city.
  • Style: What is the style of the image? For example: realistic, cartoon, oil painting.
  • Lighting: What is the lighting effect of the image? For example: sunset, moonlight, spotlight.
  • Composition: What is the composition of the image? For example: close-up, long shot, bird's-eye view.

Organize this information into a clear and concise sentence as your basic prompt.

Precautions:

  • Use specific nouns and adjectives, and avoid using vague words.
  • Try to use concise language and avoid using overly long sentences.
  • Put the most important information at the beginning of the prompt, as the model will prioritize processing the content at the beginning.

Example:

  • Subject: A golden labrador retriever
  • Background: On a green lawn, sunny
  • Style: Realistic
  • Lighting: Natural light
  • Composition: Half-length portrait

Basic Prompt: "a realistic photo of a golden labrador retriever sitting on a green grass field, sunny day, natural light, half body portrait"

Next, we can add some details to further refine the prompt. For example, we can add the dog's expression, posture, and details of the grass, etc.

Refined Prompt: "a realistic photo of a golden labrador retriever sitting on a green grass field, sunny day, natural light, half body portrait, happy expression, looking at the camera, blades of grass swaying in the breeze"

By constantly refining the prompt, we can gradually approach the image we want to generate.
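The subject/background/style/lighting/composition breakdown above lends itself to a small helper that assembles the pieces in a fixed order, keeping the most important information first. A sketch (the field names are our own convention, not a standard):

```python
def compose_prompt(subject, background="", style="", lighting="",
                   composition="", details=()):
    """Assemble prompt parts in priority order: subject first,
    since models weight content near the start of the prompt more heavily."""
    parts = [subject, background, style, lighting, composition, *details]
    # Drop empty fields so optional parts don't leave stray commas.
    return ", ".join(p for p in parts if p)


prompt = compose_prompt(
    subject="a realistic photo of a golden labrador retriever",
    background="sitting on a green grass field, sunny day",
    lighting="natural light",
    composition="half body portrait",
    details=("happy expression", "looking at the camera"),
)
print(prompt)
```

Refining the prompt then becomes a matter of editing one field at a time instead of rewriting the whole string.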

Step 2: Using Positive and Negative Prompts

Positive prompts are used to describe the elements you want to include in the image, while negative prompts are used to describe the elements you want to avoid including in the image. Using positive and negative prompts reasonably can effectively control the image generation results.

Tips for Positive Prompts:

  • Use descriptive words, such as color, shape, texture, etc.
  • Use stylized words, such as artistic style, photography style, etc.
  • Use emotional words, such as happiness, sadness, anger, etc.
  • Use specific details, such as clothing, accessories, scenes, etc.

Tips for Negative Prompts:

  • Avoid using vague words, such as "bad", "ugly", etc.
  • Use specific words, such as "blurry", "deformed", "low quality", etc.
  • Avoid using too many negative prompts, as this can overly constrain the model's creativity.

Example:

Taking the labrador retriever above as an example, we can add the following positive and negative prompts:

  • Positive Prompts: "highly detailed, 8k, sharp focus, professional photography"
  • Negative Prompts: "blurry, deformed, low quality, cartoon, painting"

Adding these to the earlier prompt gives the final positive and negative prompts (in most interfaces, the negative prompt goes into its own separate field):

Positive: "a realistic photo of a golden labrador retriever sitting on a green grass field, sunny day, natural light, half body portrait, happy expression, looking at the camera, blades of grass swaying in the breeze, highly detailed, 8k, sharp focus, professional photography, (masterpiece, best quality:1.2)"

Negative: "blurry, deformed, low quality, cartoon, painting"

Note that (masterpiece, best quality:1.2) uses a weight of 1.2, telling the model to pay extra attention to these two terms.
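With a prompt this long, it helps to check which terms are actually weighted. The terms can be pulled out with a small regex; a sketch (the pattern only handles the simple `(text:number)` form, not nested parentheses):

```python
import re

# Matches "(some text:1.2)" — text without parentheses or colons, then a number.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")


def extract_weights(prompt: str) -> dict[str, float]:
    """Return {term: weight} for every (term:weight) group in the prompt."""
    return {m.group(1).strip(): float(m.group(2))
            for m in WEIGHT_RE.finditer(prompt)}


p = "a photo of a cat, (masterpiece, best quality:1.2), (blurry:0.5)"
print(extract_weights(p))  # {'masterpiece, best quality': 1.2, 'blurry': 0.5}
```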

Step 3: Adjusting Parameters and Optimization

It is not enough to have good prompts; you also need to adjust the appropriate parameters to generate the ideal image. Here are some commonly used parameters:

  • Sampler: Different sampling methods will produce different effects. Common sampling methods include Euler a, DPM++ 2M Karras, etc. Euler a is faster and suitable for rapid iteration; DPM++ 2M Karras has higher quality and is suitable for generating the final image.
  • Steps: The number of iteration steps determines the fineness of the image generated by the model. Generally speaking, 20-30 steps are enough. If you want a more detailed image, you can increase it to more than 50 steps. However, it should be noted that the more iteration steps, the longer the generation time.
  • Seed: The random seed determines the starting noise of the generation. The same prompt and seed (with the same settings) reproduce the same image; to get a different image, change the seed. Setting the seed to -1 asks the tool to pick a new random seed each time, so each run produces a different image.
  • CFG Scale (Classifier Free Guidance Scale): CFG Scale controls how closely the model follows the prompt. The higher the value, the more the model adheres to the prompt, but too high a value may cause image distortion. Generally, 7-12 is an appropriate range.
  • Resolution: The resolution determines the size of the image. The higher the resolution, the clearer the image, but the generation time is also longer. You need to choose the appropriate resolution according to your needs. Common resolutions include 512x512, 768x768, etc.

Optimization Tips:

  • Iterative Adjustment: Constantly adjust the parameters, observe the image generation results, and adjust according to the results.
  • Refer to Other Works: Refer to other excellent works and learn their prompts and parameter settings.
  • Use Community Resources: Join relevant communities, exchange experiences with other users, and learn skills.

Example:

For the labrador retriever example above, we can try the following parameter settings:

  • Sampler: DPM++ 2M Karras
  • Steps: 30
  • Seed: -1
  • CFG Scale: 7
  • Resolution: 512x512

By adjusting these parameters, we can generate images of different styles and qualities and choose the image that best meets our needs.
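The seed behaves like any pseudo-random seed: fixing it fixes the noise the model starts from, which is why the same prompt plus the same seed reproduces the same image. The stdlib analogy below demonstrates the principle; it stands in for the model's initial noise sampling and is not Stable Diffusion code:

```python
import random


def fake_latent_noise(seed: int, n: int = 4) -> list[float]:
    """Stand-in for the initial latent noise a diffusion model samples.

    Same seed -> same noise -> same image. A seed of -1 means
    'pick a fresh random seed', mirroring the -1 convention above.
    """
    if seed == -1:
        seed = random.randrange(2**32)
    rng = random.Random(seed)
    return [round(rng.gauss(0, 1), 4) for _ in range(n)]


a = fake_latent_noise(42)
b = fake_latent_noise(42)
c = fake_latent_noise(123)
print(a == b, a == c)  # same seed reproduces, different seed diverges
```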

Step 4: Using LoRA and ControlNet to Enhance Control

LoRA (Low-Rank Adaptation) is a fine-tuning technique that allows you to add specific styles or objects to the model without retraining the entire model. ControlNet is a neural network structure that allows you to use additional inputs (such as sketches, edge maps, depth maps) to more precisely control the image generation process.

Using LoRA:

  1. Download LoRA Model: Download the LoRA model you need from websites such as Civitai.
  2. Load LoRA Model: Load the LoRA model in your Stable Diffusion interface. Usually, you need to put the LoRA file in the models/Lora directory.
  3. Use LoRA in Prompt: Use the format <lora:model name:weight> to call the LoRA model in the prompt. For example, <lora:moreRealistic:0.8> means using the moreRealistic LoRA model with a weight of 0.8.
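The `<lora:model name:weight>` tag can be built and checked with a few lines of stdlib code. A sketch (tag format as used by Stable Diffusion WebUI; the helper names are our own):

```python
import re

# Matches "<lora:name:0.8>" — a name without colons, then a numeric weight.
LORA_RE = re.compile(r"<lora:([^:>]+):([0-9.]+)>")


def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format a WebUI-style LoRA invocation for inclusion in a prompt."""
    return f"<lora:{name}:{weight}>"


def find_loras(prompt: str) -> dict[str, float]:
    """Return {lora_name: weight} for every LoRA tag in the prompt."""
    return {m.group(1): float(m.group(2)) for m in LORA_RE.finditer(prompt)}


p = "a photo of a cat, " + lora_tag("moreRealistic", 0.8)
print(find_loras(p))  # {'moreRealistic': 0.8}
```

A quick `find_loras` check before generating helps catch typos such as a missing angle bracket, which would make the tag silently ignored.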

Using ControlNet:

  1. Install ControlNet Plugin: Install the ControlNet plugin for Stable Diffusion.
  2. Prepare ControlNet Input: According to the effect you want, prepare the corresponding input image, such as sketch, edge map, depth map, etc.
  3. Select ControlNet Preprocessor and Model: Select the appropriate preprocessor and model in the ControlNet interface. The preprocessor is used to convert the input image into a format that the model can understand, and the model is used to control the image generation process.
  4. Adjust ControlNet Parameters: Adjust the parameters of ControlNet, such as weight, starting step, etc., to get the best effect.

Example:

Suppose we want to generate a labrador retriever image with a specific artistic style, we can use the LoRA model to achieve it.

  1. Download a LoRA model named ArtStyleLoRA.
  2. Put the ArtStyleLoRA.safetensors file in the models/Lora directory.
  3. Add <lora:ArtStyleLoRA:0.7> to the prompt, indicating that the ArtStyleLoRA model is used with a weight of 0.7.

The final prompt may look like this:

Positive: "a realistic photo of a golden labrador retriever sitting on a green grass field, sunny day, natural light, half body portrait, happy expression, looking at the camera, blades of grass swaying in the breeze, highly detailed, 8k, sharp focus, professional photography, <lora:ArtStyleLoRA:0.7>, (masterpiece, best quality:1.2)"

Negative: "blurry, deformed, low quality, cartoon, painting"

ControlNet can be used to precisely control the posture of the labrador retriever.

  1. Take a photo of the labrador retriever's posture, or draw a simple sketch.
  2. Use the Canny edge detection preprocessor to extract the edge map.
  3. Select the Canny ControlNet model.
  4. Use the edge map as the input of ControlNet.

By combining LoRA and ControlNet, we can more precisely control the image generation process and create images that better meet our needs.
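To make the "edge map" idea concrete: a Canny-style preprocessor turns the input photo into a binary map that is 1 wherever pixel intensity changes sharply, and ControlNet then steers generation to respect those edges. Real preprocessing uses OpenCV's Canny detector; the toy gradient-threshold sketch below only illustrates the concept on a tiny grid:

```python
def edge_map(img, threshold=50):
    """Binary edge map from a 2D grid of grayscale intensities (0-255).

    A pixel is marked as an edge if it differs sharply from its
    right or lower neighbour — a crude stand-in for Canny.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(img[y][x] - img[y][x + 1]) if x + 1 < w else 0
            down = abs(img[y][x] - img[y + 1][x]) if y + 1 < h else 0
            edges[y][x] = 1 if max(right, down) > threshold else 0
    return edges


# A dark square on a light background: edges appear along the boundary.
img = [[200, 200, 200, 200],
       [200,  30,  30, 200],
       [200,  30,  30, 200],
       [200, 200, 200, 200]]
for row in edge_map(img):
    print(row)
```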

Common Problems and Solutions

Here are some common problems encountered during AI image generation and their solutions:

  1. The Generated Image Does Not Match the Prompt:

    • Problem: The generated image differs greatly from the content described in the prompt.
    • Solution: Check whether the prompt is clear, concise, and accurate. Try using more specific words and adjust the weight of the prompt. You can try adding negative prompts to exclude unwanted results.
  2. Poor Image Quality:

    • Problem: The generated image is blurry, distorted, or lacking in detail.
    • Solution: Increase the number of iteration steps, adjust the CFG Scale, and choose a higher quality sampling method. You can try using a higher resolution and add prompts such as "highly detailed" and "8k".
  3. Long Generation Time:

    • Problem: It takes a long time to generate an image.
    • Solution: Reduce the number of iteration steps and choose a faster sampling method. You can also try a lower resolution, or trim the prompt to reduce the model's computational load.
  4. Duplicate Generation Results:

    • Problem: The images generated each time are very similar.
    • Solution: Modify the random seed and try using different prompts. You can try using different models and LoRA models.
  5. Model Error or Crash:

    • Problem: The model encounters an error or crashes during the image generation process.
    • Solution: Check whether your hardware configuration meets the requirements. Update your drivers and make sure your software version is the latest. You can try restarting your computer or reinstalling the model.

Advanced Techniques and Best Practices

Mastering the following advanced techniques can help you better utilize AI image generation:

  • Prompt Combination: Combine multiple prompts to create more complex images. For example, you can combine "a photo of a cat" and "in the style of Van Gogh" to generate a Van Gogh-style image of a cat.
  • Prompt Splitting: Split a complex prompt into several simpler prompts to gain finer control over the generation process.
  • Using Wildcards: Using wildcards can generate images with different variations. For example, you can use {cat|dog|bird} to generate images of cats, dogs, or birds.
  • Image Editing Tools: Using image editing tools to post-process the generated images can further improve the quality and effect of the images. For example, you can use Photoshop to adjust the color, brightness, contrast, etc. of the image.
  • Learning Community Resources: Actively participate in relevant communities, learn from the experiences and skills of other users, and you can quickly improve your skills.
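The `{cat|dog|bird}` wildcard syntax above can be expanded with the stdlib. The sketch below enumerates every combination; dynamic-prompt extensions for Stable Diffusion WebUI offer this and much more, this is just the core idea:

```python
import re
from itertools import product

WILDCARD_RE = re.compile(r"\{([^{}]+)\}")


def expand_wildcards(prompt: str) -> list[str]:
    """Expand every {a|b|c} group into all combinations of concrete prompts."""
    groups = [m.group(1).split("|") for m in WILDCARD_RE.finditer(prompt)]
    if not groups:
        return [prompt]
    # Replace each wildcard group with a format placeholder, then fill it in.
    template = WILDCARD_RE.sub("{}", prompt)
    return [template.format(*combo) for combo in product(*groups)]


for p in expand_wildcards("a photo of a {cat|dog|bird}, {oil painting|sketch}"):
    print(p)  # six prompts: every animal paired with every style
```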

Best Practices:

  • Define Goals: Before you start, define the goals and style of the image you want to generate.
  • Keep Trying: Don't be afraid to try different prompts, parameters, and models.
  • Record Experience: Record your experimental results and summarize the lessons learned.
  • Share Results: Share your work with others and get feedback and suggestions.

Summary and Further Learning

This tutorial introduced the basic principles, core concepts, and practical techniques of prompt engineering. By studying this tutorial, you should be able to master the methods of constructing effective prompts and use AI image generation tools to create images that meet your needs.

Review Key Points:

  • Prompts are the key to guiding AI models to generate images.
  • Positive prompts are used to describe the elements you want to include in the image, and negative prompts are used to describe the elements you want to avoid including in the image.
  • Adjusting parameters can control the image generation effect.
  • LoRA and ControlNet can enhance control over the image.
  • Continuous learning and practice are the keys to improving skills.

Further Learning:

  • Official Documentation: Read the official documentation of platforms such as Stable Diffusion, Midjourney, and DALL-E 2 to learn more detailed information.
