A single framework for diverse image synthesis across multiple domains, offering improved visual quality, diversity, and scalability.

StarGAN v2 is a PyTorch-based image-to-image translation model designed to learn mappings between different visual domains while maintaining diversity and scalability. It addresses limitations of earlier methods by handling multiple domains within a single framework and improving the diversity of generated images. The architecture comprises a generator, a mapping network, a style encoder, and a discriminator, trained adversarially to produce realistic and diverse images. Key use cases include translating between animal faces (AFHQ dataset) and manipulating attributes such as hairstyle on human faces (CelebA-HQ dataset). Performance is evaluated with Fréchet Inception Distance (FID) and Learned Perceptual Image Patch Similarity (LPIPS), showing significant improvements over baseline methods. Pre-trained networks and datasets can be downloaded via the provided bash scripts for easy setup.
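The FID metric mentioned above compares the statistics of feature embeddings of real and generated images. As a hedged illustration of the metric itself (not the repository's own evaluation code), the Fréchet distance between two Gaussians fitted to hypothetical feature sets can be computed like this:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two multivariate Gaussians:
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2*sqrt(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # sqrtm can return negligible imaginary parts
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Toy stand-in for Inception features of real vs. generated images.
rng = np.random.default_rng(0)
feats_real = rng.normal(size=(1000, 8))
feats_fake = rng.normal(loc=0.5, size=(1000, 8))

mu_r, sig_r = feats_real.mean(0), np.cov(feats_real, rowvar=False)
mu_f, sig_f = feats_fake.mean(0), np.cov(feats_fake, rowvar=False)

fid = frechet_distance(mu_r, sig_r, mu_f, sig_f)
```

Lower FID means the generated distribution is statistically closer to the real one; identical statistics give a distance of zero.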
Generates multiple diverse images from a single input by sampling from a latent space, enabling exploration of different styles and attributes.
Capable of translating images across multiple domains using a single model, eliminating the need for separate models for each domain pair.
Supports generation of high-quality images at 512x512 resolution, preserving details and visual fidelity.
Introduces the Animal Faces-HQ (AFHQ) dataset, a high-quality collection of animal faces with large inter- and intra-domain variations.
Offers pre-trained models for CelebA-HQ and AFHQ datasets, allowing users to quickly start experimenting without extensive training.
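The diversity feature above comes from sampling latent codes and mapping each to a domain-specific style code that conditions the generator. A minimal sketch of that idea, with a random affine map standing in for the learned mapping network (all names and dimensions here are illustrative assumptions, not the repository's API):

```python
import numpy as np

rng = np.random.default_rng(42)
LATENT_DIM, STYLE_DIM, NUM_DOMAINS = 16, 64, 3

# Hypothetical stand-in for StarGAN v2's mapping network: one affine map per
# domain that turns a latent code z into a domain-specific style code.
W = rng.normal(size=(NUM_DOMAINS, STYLE_DIM, LATENT_DIM))

def style_code(z, domain):
    """Map a latent code to a style code for the given target domain."""
    return W[domain] @ z

# Sampling several latent codes for the same target domain yields distinct
# style codes -- this is what drives diverse outputs from a single input image.
styles = [style_code(rng.normal(size=LATENT_DIM), domain=1) for _ in range(5)]
```

In the actual model the generator then synthesizes one output image per style code, so each sampled latent produces a different rendition of the same source image.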
Clone the repository: git clone https://github.com/clovaai/stargan-v2.git
Navigate to the directory: cd stargan-v2/
Create a conda environment: conda create -n stargan-v2 python=3.6.7
Activate the environment: conda activate stargan-v2
Install PyTorch and dependencies: conda install -y pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.0 -c pytorch
Install additional packages: conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
Install pip packages: pip install opencv-python==4.1.2.30 ffmpeg-python==0.2.0 scikit-image==0.16.2 pillow==7.0.0 scipy==1.2.1 tqdm==4.43.0 munch==2.5.0
Download datasets and pre-trained networks using provided bash scripts (e.g., bash download.sh celeba-hq-dataset)
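The steps above can be collected into a single setup script. Package versions and the download target are taken verbatim from the list; note that `conda activate` inside a non-interactive script may require conda's shell hook to be initialized first:

```shell
# Clone and enter the repository
git clone https://github.com/clovaai/stargan-v2.git
cd stargan-v2/

# Create and activate the conda environment
conda create -y -n stargan-v2 python=3.6.7
conda activate stargan-v2

# Core dependencies (versions as listed above)
conda install -y pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.0 -c pytorch
conda install -y x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge
pip install opencv-python==4.1.2.30 ffmpeg-python==0.2.0 scikit-image==0.16.2 \
    pillow==7.0.0 scipy==1.2.1 tqdm==4.43.0 munch==2.5.0

# Datasets / pre-trained networks via the provided script
bash download.sh celeba-hq-dataset
```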
User feedback: StarGAN v2 is highly regarded for generating diverse, high-quality images, though users note that setup and training can be complex.