Overview
HyperGen makes it easy to load training datasets. Organize your images in a folder, optionally add captions, and load them with one line of code.
Basic Usage
Folder Structure
Images Only
The simplest structure is a single folder containing all your images. Supported formats:
- .jpg / .jpeg
- .png
- .webp
- .bmp
Images with Captions
For better results, add a caption file next to each image. Caption files should:
- Have the same name as the image (except the extension)
- Be plain text files (.txt)
- Contain a descriptive caption on the first line
- Be UTF-8 encoded
Captions are optional but highly recommended. They help the model learn what features to associate with your style or subject.
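The naming rules above can be sketched with the standard library alone. This is an illustration of the pairing convention, not HyperGen's own loader: the function name `pair_images_with_captions` and the constant `IMAGE_EXTENSIONS` are hypothetical, and the extension set is taken from the formats listed in this guide.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Extensions listed in this guide (assumption: adjust if your setup differs).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def pair_images_with_captions(folder: str) -> List[Tuple[Path, Optional[str]]]:
    """Return (image_path, caption) pairs; caption is None when no .txt file exists."""
    pairs = []
    for image_path in sorted(Path(folder).iterdir()):
        if not image_path.is_file():
            continue
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # The caption file shares the image's name, with a .txt extension.
        caption_path = image_path.with_suffix(".txt")
        caption = None
        if caption_path.is_file():
            lines = caption_path.read_text(encoding="utf-8").splitlines()
            # Only the first line is treated as the caption.
            caption = lines[0].strip() if lines else None
        pairs.append((image_path, caption))
    return pairs
```

Running this over a folder before training is a quick way to see which images would end up without a caption.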
Loading Datasets
Simple Loading
Custom Extensions
Specify which file extensions to include.
Checking Dataset Contents
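Before loading, it can help to see which extensions a folder actually contains. The sketch below is a standalone check written with the standard library; `count_by_extension` is a hypothetical helper, not part of HyperGen.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder: str) -> Counter:
    """Count regular, non-hidden files in `folder` by lowercase extension."""
    counts = Counter()
    for path in Path(folder).iterdir():
        if path.is_file() and not path.name.startswith("."):
            counts[path.suffix.lower()] += 1
    return counts
```

Comparing the number of image files with the number of `.txt` files gives a rough idea of how many captions are missing.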
Batch Iteration
Process the dataset in batches.
Dataset Guidelines
Image Count
- Minimum: 10-20 images. Enough for basic style/subject learning.
- Recommended: 50-200 images. Best balance of quality and training time.
- Maximum: 1000+ images. For complex styles or high diversity.
- Diminishing returns: beyond 500 images. More data helps, but gains are smaller.
Image Quality
Resolution:
- Minimum: 512x512
- Recommended: 1024x1024 or higher
- The model will resize images automatically
Quality:
- Use high-quality, sharp images
- Avoid heavily compressed JPEGs
- Remove watermarks if possible
- Crop to relevant content
Diversity:
- Include different angles and compositions
- Vary lighting conditions
- Mix different aspects of your subject/style
- Avoid duplicate or near-duplicate images
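Exact duplicates are easy to detect automatically. The sketch below groups image files whose bytes are identical by hashing them with SHA-256. It is a standalone helper (the name `find_exact_duplicates` is hypothetical) and it only catches byte-for-byte copies; near-duplicates such as resized or re-encoded versions would need a perceptual comparison, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

# Extensions listed in this guide (assumption: adjust if your setup differs).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def find_exact_duplicates(folder: str) -> List[List[Path]]:
    """Group image files in `folder` whose contents are byte-for-byte identical."""
    by_digest: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by more than one file.
    return [group for group in by_digest.values() if len(group) > 1]
```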
Caption Guidelines
Good captions help the model learn better.
Do:
- Describe what’s in the image objectively
- Mention key visual elements (colors, objects, actions)
- Be specific but concise (1-2 sentences)
- Use consistent terminology across captions
Don’t:
- Write subjective opinions (“beautiful”, “amazing”)
- Add metadata or keywords
- Copy the same caption for all images
- Write overly long descriptions
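Some of these guidelines can be checked mechanically. The sketch below flags subjective wording, overly long captions, and captions reused across images. The word list, the 40-word limit, and the function name `lint_captions` are all illustrative assumptions, not values defined by HyperGen.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical word list and length limit; tune both for your dataset.
SUBJECTIVE_WORDS = {"beautiful", "amazing", "stunning", "gorgeous"}
MAX_WORDS = 40


def lint_captions(captions: Dict[str, str]) -> List[str]:
    """Return human-readable warnings for captions keyed by image name."""
    warnings = []
    # Count how often each normalized caption text occurs across the dataset.
    usage = Counter(text.strip().lower() for text in captions.values())
    for name, text in sorted(captions.items()):
        words = [w.strip(".,!?\"'").lower() for w in text.split()]
        flagged = sorted(SUBJECTIVE_WORDS.intersection(words))
        if flagged:
            warnings.append(f"{name}: subjective wording ({', '.join(flagged)})")
        if len(words) > MAX_WORDS:
            warnings.append(f"{name}: {len(words)} words, longer than {MAX_WORDS}")
        if usage[text.strip().lower()] > 1:
            warnings.append(f"{name}: caption is shared with another image")
    return warnings
```

A check like this cannot judge whether a caption is accurate; it only catches the patterns that are easy to spot automatically.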
Advanced Dataset Usage
Accessing Individual Items
Dataset Properties
Custom Dataset Class
For advanced use cases, you can subclass the Dataset class.
Common Issues
No Images Found
If loading fails because no images were found:
- Check that the path is correct
- Verify that images have supported extensions
- Ensure files aren’t hidden (names don’t start with .)
- Try an absolute path instead of a relative one
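The checklist above can be run as a small diagnostic. This is a standalone sketch using `pathlib`; `diagnose_folder` and `SUPPORTED` are hypothetical names, and the extension set is the one listed in this guide rather than a value read from HyperGen.

```python
from pathlib import Path
from typing import List

# Extensions listed in this guide (assumption: adjust if your setup differs).
SUPPORTED = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def diagnose_folder(folder: str) -> List[str]:
    """Return likely reasons why no images would be found in `folder`."""
    # Resolving to an absolute path rules out working-directory surprises.
    path = Path(folder).expanduser().resolve()
    if not path.exists():
        return [f"path does not exist: {path}"]
    if not path.is_dir():
        return [f"path is not a folder: {path}"]
    files = [p for p in path.iterdir() if p.is_file()]
    images = [p for p in files if p.suffix.lower() in SUPPORTED]
    visible = [p for p in images if not p.name.startswith(".")]
    problems = []
    if not images:
        problems.append("no files with a supported extension")
    elif not visible:
        problems.append("all image files are hidden (names start with '.')")
    return problems
```

An empty result means none of these checks found a problem.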
Missing Captions
If some images don’t have captions, they’ll have None as their caption.
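One way to deal with this is to substitute a fallback caption and keep a list of the affected images. The sketch below assumes items are simple `(name, caption)` pairs, which may not match the item layout HyperGen actually returns; `fill_missing_captions` is a hypothetical helper.

```python
from typing import List, Optional, Tuple


def fill_missing_captions(
    items: List[Tuple[str, Optional[str]]], fallback: str
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Replace None captions with `fallback` and report which items were affected."""
    filled = [(name, fallback if caption is None else caption) for name, caption in items]
    missing = [name for name, caption in items if caption is None]
    return filled, missing
```

For example, `fill_missing_captions([("a.jpg", None)], "a photo")` returns the pair with the fallback text plus `["a.jpg"]` as the list of images that still need a real caption.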
Unicode/Encoding Issues
If you see encoding errors, ensure caption files are UTF-8.
Dataset Examples
Style Transfer Dataset
For learning an art style.
Subject/Character Dataset
For learning a specific person or character.
Product Dataset
For learning product photography.
Next Steps
- LoRA Training: Learn how to train with your dataset
- Training Overview: Understand the training process
- Examples: View complete training examples
- Quick Start: Train your first LoRA in 5 minutes