Replies: 13 comments
-
never tried.
-
thanks for your reply
-
@bghira can I train using the PixArt dataset module? i.e. the JSON file with image path, caption, and prompt
-
possibly by just configuring the parquet backend to use it
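As a sketch of what that dataset entry might look like (the `parquet` field names below are assumptions from memory, not a verified schema; check DATALOADER.md for the exact keys, and the file paths are placeholders):

```json
{
  "id": "my-json-dataset",
  "type": "local",
  "instance_data_dir": "/path/to/images",
  "metadata_backend": "parquet",
  "caption_strategy": "parquet",
  "parquet": {
    "path": "/path/to/captions.parquet",
    "caption_column": "caption",
    "filename_column": "image_path"
  }
}
```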
-
the data is local, is this compatible?
-
Local parquet loading shouldn't be an issue. If you face any, do report it.
-
```
ValueError: Your dataloader config must contain at least one image dataset AND at least one text_embed dataset. See this link for more information about dataset_type: https://github.com/bghira/SimpleTuner/blob/main/documentation/DATALOADER.md#configuration-options
```
-
I do not have a text_embed dataset, I just have a JSON file with image paths, captions, and prompts
-
Oh then you should probably use the
-
how can I make the embed dataset?
-
Cc: @bghira
-
please follow the quickstart and use the demo dataset first, just so you know you have it working
-
the link to the quickstart is here for Stable Diffusion 3. I do not have a specific one for SDXL, but the basic concepts apply; the main difference is what the config looks like:

```json
[
  {
    "id": "pseudo-camera-10k-sd3",
    "type": "local",
    "crop": true,
    "crop_aspect": "square",
    "crop_style": "center",
    "resolution": 0.5,
    "minimum_image_size": 0.25,
    "maximum_image_size": 1.0,
    "target_downsample_size": 1.0,
    "resolution_type": "area",
    "cache_dir_vae": "cache/vae/sd3/pseudo-camera-10k",
    "instance_data_dir": "datasets/pseudo-camera-10k",
    "disabled": false,
    "skip_file_discovery": "",
    "caption_strategy": "filename",
    "metadata_backend": "json"
  },
  {
    "id": "text-embeds",
    "type": "local",
    "dataset_type": "text_embeds",
    "default": true,
    "cache_dir": "cache/text/sd3/pseudo-camera-10k",
    "disabled": false,
    "write_batch_size": 128
  }
]
```

it's not anything you have to create manually; the trainer will do that for you. this entry merely points to the storage location where the embeds can be stored. it might seem needlessly complicated, and it is, because the trainer can split the storage locations of everything but the VAE cache objects, for efficiency purposes. you can store the image data locally via NVMe and the text embeds in an S3 storage bucket, for example. hopefully after following the quickstart you'll have something working for your model, and then you can expand the configuration from there.

because i don't have access to multiple nodes to train on, i have no ability to test or verify anything about multi-node configuration or runtime problems. please report any issues you do have, and we can work together on solving them.
-
is multi-node, multi-GPU training supported?