This article provides an introductory explanation of checkpoints for beginners. We also briefly cover SD1.5, SDXL, SD3, and Flux.1.
What is a Checkpoint?
A checkpoint is a file that stores the "weights" (the numerical parameters) that an AI model such as Stable Diffusion learned during training. These weights determine how the model processes a prompt and generates an image.
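To make the idea of a weights file concrete, here is a minimal sketch that lists the tensors stored in a checkpoint. It assumes the checkpoint uses the `.safetensors` format, which this article does not specify but which is common for Stable Diffusion checkpoints. That format starts with an 8-byte little-endian header length, followed by a JSON header. The function names and the file name in the usage note are hypothetical.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict.

    Assumed layout: 8 bytes (little-endian unsigned 64-bit header size),
    then that many bytes of UTF-8 JSON describing each tensor.
    """
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_size).decode("utf-8"))
    return header


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

Calling `summarize_tensors(read_safetensors_header("model.safetensors"))` on a real checkpoint would list thousands of named tensors. Those named arrays of numbers are the "weights" described above.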
Commonly Used Checkpoints
Most commonly used checkpoints are models that have undergone additional training (fine-tuning) on top of the vanilla base model of Stable Diffusion.
What is a Base Model Checkpoint?
A base model checkpoint is an original model released by the model's developers, such as the official Stable Diffusion releases. These models are trained on large datasets so that they are versatile and can be applied in general scenarios.
Features:
Because they are trained extensively across many subjects, base models are versatile: they work well both for general image generation and as a starting point for further training.
Limitations:
Since they are not specialized in any particular style or theme, base models may not produce satisfying results in specific styles, such as anime or realistic photographs.
As of November 2024, SD1.5, SDXL, SD3, and Flux.1 are available as base model checkpoints.
Checkpoints After Additional Training (Fine-Tuned Models)
Fine-tuned checkpoints are models that have undergone additional training with specific datasets or styles, making them highly capable of generating images in specific genres or styles.
Features:
These models are optimized for particular themes or styles (e.g., realism, anime, landscapes, portraits, or hand-drawn styles). They exhibit unique qualities that enable them to generate images tailored to specific genres.
Pros:
Because they focus on specific styles or themes, fine-tuned models allow for precise, targeted expression, making it easy to generate images that meet the user’s needs. They often incorporate subtle refinements that base models may not achieve.
Cons:
Their specialization may limit their versatility in general image generation, making them better suited for specific purposes.
In general, SD1.5 and SDXL are mostly used through fine-tuned checkpoints. In contrast, SD3 and Flux.1 are more commonly used as base models.
Model Series Overview
Model Series | SD1.5 | SDXL | SD3 | Flux.1 |
Release Date | Oct 2022 | Jul 2023 | Jun 2024 | Aug 2024 |
Recommended Resolution | 512x512 | 1024x1024 | 1024x1024 | 1024x1024 |
Base Model Size | about 2GB | about 6.8GB | about 10GB | about 20GB |
Features | First widely popular model series | Successor to SD1.5 | Official successor to SDXL | Separate model series from former Stable Diffusion developers |
Representative Models | ChilloutMix, epiCRealism, 万象熔炉 (Anything), Meina Mix | DreamShaper XL, Animagine XL, 万象熔炉 (Anything XL), Pony Diffusion XL | SD3 (vanilla) | Flux.1 (vanilla), STOIQO NewReality, SeaArt Infinity |
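The recommended resolutions in the overview table can be turned into a small helper, sketched below. The dictionary copies the table's values. The function `suggest_size` is a hypothetical helper, and its rule (keep roughly the recommended pixel count and round each side to a multiple of 64) is an assumption based on common practice, not something this article prescribes.

```python
# Values copied from the overview table; sizes are approximate.
MODEL_SERIES = {
    "SD1.5": {"released": "Oct 2022", "base_resolution": 512, "base_size_gb": 2.0},
    "SDXL": {"released": "Jul 2023", "base_resolution": 1024, "base_size_gb": 6.8},
    "SD3": {"released": "Jun 2024", "base_resolution": 1024, "base_size_gb": 10.0},
    "Flux.1": {"released": "Aug 2024", "base_resolution": 1024, "base_size_gb": 20.0},
}


def suggest_size(series, aspect_ratio=1.0, multiple=64):
    """Suggest (width, height) with about the recommended pixel count.

    Assumption: each side is rounded to a multiple of 64, a common
    convention rather than a rule stated in this article.
    """
    base = MODEL_SERIES[series]["base_resolution"]
    target_pixels = base * base
    width = (target_pixels * aspect_ratio) ** 0.5
    height = width / aspect_ratio

    def snap(value):
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)
```

For example, `suggest_size("SD1.5")` returns `(512, 512)`, and `suggest_size("SDXL", 16 / 9)` returns `(1344, 768)`, a widescreen size with about the same pixel count as 1024x1024.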
SD1.5 Series Models
The SD1.5 series is the oldest among the Stable Diffusion models in circulation. Despite its age, it still performs well. Over time, numerous fine-tuned checkpoints have been released, enabling users to generate images in a wide variety of styles. Many LoRAs are also available, and users can combine them to create diverse images.
Due to its simplicity, the SD1.5 series is an excellent choice for beginners. The straightforward nature of these models offers high flexibility when using checkpoints and LoRA, making it a frequently used series.
Representative Models: ChilloutMix, Anything, Meina Mix.
SDXL Series Models
SDXL is the successor to SD1.5, offering higher resolution and a significantly increased number of parameters, which enhances expressiveness. Like SD1.5, there are many fine-tuned checkpoints available for SDXL. Fine-tuned SDXL checkpoints allow users to create a broader range of images, including those that previously required LoRA in SD1.5.
However, LoRA compatibility between SDXL models can be problematic, with some LoRAs only working with the model they were trained on and not transferring well to other models.
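The compatibility caveat can be expressed as a simple pre-check, sketched below. Everything here is hypothetical: the `ModelInfo` record, the `trained_on` field, and the three result labels are illustrations, not part of any real tool. The first rule (a LoRA only works within its own series) is a general fact. The second rule simply encodes this article's caution about SDXL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    series: str           # e.g. "SD1.5" or "SDXL"
    trained_on: str = ""  # for a LoRA: the checkpoint it was trained on, if known


def lora_compatibility(lora, checkpoint):
    """Return "incompatible", "uncertain", or "likely ok" for a LoRA/checkpoint pair."""
    # A LoRA from one series cannot be applied to a checkpoint from another.
    if lora.series != checkpoint.series:
        return "incompatible"
    # Within SDXL, a LoRA may not transfer to a checkpoint it was not trained on.
    if lora.series == "SDXL" and lora.trained_on and lora.trained_on != checkpoint.name:
        return "uncertain"
    return "likely ok"
```

With a hypothetical LoRA trained on Pony Diffusion XL, this check reports "uncertain" for Animagine XL and "likely ok" for Pony Diffusion XL itself. An "uncertain" result is a prompt to test the combination, not a verdict.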
Beginners are advised to start with SD1.5 before moving on to SDXL.
Representative Models: DreamShaper XL, Animagine XL, Pony Diffusion XL. (Among SDXL’s fine-tuned models, Animagine is the recommended starting point, because Pony has distinct quirks in both the images it generates and the way it is used.)
SD3 Series Models
As of November 2024, SD3 is not widely used. SD3.5 was released in late October 2024, but it has not gained much popularity, largely because Flux.1 is easier to use. However, SD3 might become more common as fine-tuning and LoRA development progress.
This series is not recommended for beginners since it is not widely adopted even among experienced Stable Diffusion users.
Representative Models: Currently limited to the base model of SD3.
Flux.1 Series Models
Flux.1 was released in August 2024. Its most notable feature is the ability to render legible text (alphabetic characters) in generated images. SD3 is also said to support this, but it often struggles to produce consistent results.
Flux.1 is also capable of generating diverse image styles with a single checkpoint. SD1.5 and SDXL checkpoints tend to be tied to one style (e.g., a realistic checkpoint for realistic images), whereas Flux.1’s larger model size enables it to learn multiple styles in one checkpoint.
Another advantage of Flux.1 is its reduced rate of finger-related errors, such as generating images with six fingers, which was a common issue with SD1.5 and SDXL. This makes Flux.1 particularly powerful for realistic image generation, especially for scenes involving water reflections, complex landscapes, or detailed human figures.
When used with appropriate prompts, Flux.1 can generate images in nearly all styles, making it suitable for beginners as well.
Representative Models: Flux.1 (base model), STOIQO NewReality, SeaArt Infinity.
About SeaArt Infinity
SeaArt Infinity will be covered in detail in a future article, so this is only a brief overview. SeaArt Infinity is a checkpoint created by SeaArt by fine-tuning Flux.1, which means it belongs to the Flux.1 lineage. On SeaArt, SeaArt Infinity and its derivative model SeaArt Realism were reclassified into the SA1.0 category in November 2024, but technically they are fine-tuned Flux.1 models.
This model is specifically trained to enhance text generation and accurate representation of hands and fingers, making the most of Flux’s strengths. Additionally, the cost of generating images with SeaArt Infinity is slightly lower than with the general Flux.1 model. Currently, it serves as a reliable "go-to" model for many situations, so beginners can also use it with ease.
Some external sites or articles on SeaArt mistakenly claim that "Infinity is based on SDXL" or "unrelated to Flux.1." However, given its text generation capabilities, it is clearly based on Flux.1.
Summary
This has been a brief overview of checkpoints in Stable Diffusion. If you have any questions, feel free to ask in the comments.