Replies: 3 comments 8 replies
-
VAECache is the module responsible for the actual caching: https://github.com/bghira/SimpleTuner/blob/main/helpers/caching/vae.py The caching is initiated during dataset configuration: https://github.com/bghira/SimpleTuner/blob/main/helpers/data_backend/factory.py#L297 The aspect sampler is responsible for parsing bucketed data and returning valid batches of samples: https://github.com/bghira/SimpleTuner/blob/main/helpers/multiaspect/sampler.py#L313 It is also at that point that the most information is available about a sample. The image_metadata dict is gathered and provided to the collate_fn: https://github.com/bghira/SimpleTuner/blob/main/helpers/training/collate.py#L174 which further adjusts and calculates some runtime information for the samples, including whether or not dropout is applied to the caption. The trainer then consumes these samples here:
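To make the collate-time caption dropout concrete, here is a rough sketch of the idea (the function and parameter names are hypothetical illustrations, not SimpleTuner's actual API):

```python
import random

def collate_fn(batch, caption_dropout_probability=0.1):
    """Illustrative collate step: gather per-sample metadata and
    decide caption dropout at runtime (names are hypothetical)."""
    captions = []
    for image_metadata in batch:
        caption = image_metadata.get("caption", "")
        # With some probability, replace the caption with an empty
        # string so the model also learns unconditional generation.
        if random.random() < caption_dropout_probability:
            caption = ""
        captions.append(caption)
    return {"captions": captions, "metadata": batch}
```

Deciding dropout here rather than at dataset-build time means the same sample can be conditional in one epoch and unconditional in the next.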
You might be interested in the metadata scan process if you are going to be actively reading images in the dataset to collect metadata, which is an expensive operation on S3 buckets. It would allow you to keep a quick list of the properties you need for later. That is here: https://github.com/bghira/SimpleTuner/blob/main/helpers/multiaspect/bucket.py#L642
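The scan-once-and-persist pattern looks roughly like this. A minimal sketch, assuming a hypothetical `read_image_size` callback and a JSON cache file; this is not SimpleTuner's actual implementation:

```python
import json
from pathlib import Path

def scan_metadata(image_paths, read_image_size, cache_path="metadata_cache.json"):
    """Scan each image once, record the properties needed later
    (e.g. width/height), and persist them so subsequent epochs
    never re-read the objects from the storage backend."""
    cache_file = Path(cache_path)
    if cache_file.exists():
        # Cache hit: no expensive reads at all.
        return json.loads(cache_file.read_text())
    metadata = {}
    for path in image_paths:
        width, height = read_image_size(path)  # the only expensive read
        metadata[path] = {
            "width": width,
            "height": height,
            "aspect_ratio": round(width / height, 3),
        }
    cache_file.write_text(json.dumps(metadata))
    return metadata
```

On an S3 backend the first scan pays the per-object read cost once; every later epoch only deserializes the small JSON file.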
-
@zjysteven Hello! I'm just starting to look into using GLIGEN for a workflow and I found that it is only available for SD15. I found this issue thread and was wondering if you had made any headway on creating GLIGEN for SDXL? I would love to give it a try if you have. Thanks!
-
@zjysteven That's a shame. Thank you for letting me know.
-
Hi,
I'm trying to adapt GLIGEN to SDXL, which conditions on both captions and object bounding box coordinates to generate images. For this reason some customization of the dataset is needed, and I would greatly appreciate some instructions on how to achieve this (e.g., which module or part of the codebase I should be looking at).
Also, would you kindly point me to where the VAE output caching takes place in train_sdxl.py? I somehow failed to locate where it happens. Thank you in advance for your time.
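To make the conditioning format I have in mind concrete, here is a rough sketch of how the caption-plus-box pairs could be packed per sample (all names, the xyxy normalization, and the padding scheme are my own assumptions, not GLIGEN's actual API):

```python
def prepare_grounding_inputs(caption, boxes, image_width, image_height,
                             max_objects=30):
    """Pack GLIGEN-style grounding inputs: each object is a
    (phrase, box) pair with the box normalized to [0, 1] xyxy."""
    grounded = []
    for phrase, (x0, y0, x1, y1) in boxes:
        grounded.append({
            "phrase": phrase,
            "box": [x0 / image_width, y0 / image_height,
                    x1 / image_width, y1 / image_height],
        })
    # Pad with empty entries so every batch element has a fixed length
    # and can be stacked by the collate function.
    while len(grounded) < max_objects:
        grounded.append({"phrase": "", "box": [0.0, 0.0, 0.0, 0.0]})
    return {"caption": caption, "objects": grounded[:max_objects]}
```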