A repo to perform benchmarks and experiments for my local multi-GPU setup. My current rig has two GPUs: a 3090 Ti and a 3090 (the 3090, unfortunately, sits in a physically x16 slot that only runs at PCIe Gen 3.0 x4). My CPU is an Intel i9-11900K (8 cores).
I'm using PyTorch 1.12.0. On Windows I use the gloo backend for the distributed data parallel experiments, and on Linux (tested on Ubuntu) I use nccl (there are some issues with nccl hanging; I will investigate further).
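For reference, a minimal sketch of how that backend selection could look with `torch.distributed` (the function name, address, and port here are illustrative, not taken from the scripts in this repo):

```python
import os
import platform

import torch.distributed as dist


def init_distributed(rank: int, world_size: int) -> None:
    # Illustrative setup: gloo on Windows, nccl on Linux.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "12355")
    backend = "gloo" if platform.system() == "Windows" else "nccl"
    dist.init_process_group(backend, rank=rank, world_size=world_size)
```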
The benchmark declares a dataset of 1000 random images of size 600 x 600, which is fed through a ResNet50 for 5 epochs at various batch sizes.
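As a rough illustration of that setup, here is a minimal single-GPU sketch; the function name, optimizer choice, and timing details are my own assumptions, not necessarily what the benchmark scripts in this repo do:

```python
import time

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet50

NUM_IMAGES, IMAGE_SIZE, NUM_EPOCHS = 1000, 600, 5


def run_benchmark(batch_size: int, device: str = "cuda:0") -> float:
    # 1000 random 3 x 600 x 600 images with dummy labels.
    images = torch.randn(NUM_IMAGES, 3, IMAGE_SIZE, IMAGE_SIZE)
    labels = torch.randint(0, 1000, (NUM_IMAGES,))
    loader = DataLoader(TensorDataset(images, labels), batch_size=batch_size)

    model = resnet50().to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    start = time.perf_counter()
    for _ in range(NUM_EPOCHS):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(model(x.to(device)), y.to(device))
            loss.backward()
            optimizer.step()
    torch.cuda.synchronize(device)  # wait for queued GPU work before stopping the clock
    return time.perf_counter() - start  # training time in seconds
```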
Estimated training times:

1 GPU, batch size 8: 87.884 seconds
1 GPU, batch size 16: 86.74 seconds
2 GPUs, batch size 8: 108.168 seconds
2 GPUs, batch size 16: 75.209 seconds
2 GPUs, batch size 8: 50.848 seconds
2 GPUs, batch size 16: 47.357 seconds

1 GPU, batch size 8: 91.801 seconds
1 GPU, batch size 16: 87.89 seconds
2 GPUs, batch size 8: 106.887 seconds
2 GPUs, batch size 16: 77.81 seconds
2 GPUs, batch size 8: 65.234 seconds
2 GPUs, batch size 16: 52.446 seconds