
Seems that all models are loaded to the first GPU? #11

Open
DDDOH opened this issue Nov 25, 2024 · 2 comments

DDDOH commented Nov 25, 2024

I used four A100s to run `python -m prover.launch --config=configs/RMaxTS.py --log_dir=logs/RMaxTS_results`, but I always get a CUDA out-of-memory error.
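For reference, a quick way to watch per-GPU memory from Python while the launcher runs (plain PyTorch API, nothing repo-specific):

```python
import torch

# Report free/total memory on each visible GPU. Under this bug, only
# cuda:0 fills up while the other three A100s stay essentially idle.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"cuda:{i}: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB")
```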

I added some print statements to show which device each model is placed on:

[screenshot: added print statements]

Then I get:

[screenshot: output showing every model placed on the first GPU]

Every model tries to load onto the first GPU. Is this expected? Or has PyTorch's behavior changed in newer versions? (I am using `2.5.1+cu121`, while requirements.txt pins `2.2.1`.)
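A minimal check of what each spawned worker process actually sees (the exact launcher internals may differ; nothing here is repo-specific):

```python
import os
import torch

# If every worker prints "device 0 of 4", then no per-process GPU
# restriction is in effect, and all models end up on the first GPU.
print("CUDA_VISIBLE_DEVICES =", os.environ.get("CUDA_VISIBLE_DEVICES"))
print(f"device {torch.cuda.current_device()} of {torch.cuda.device_count()}")
```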


Luobots commented Dec 15, 2024

Same error here. vLLM won't let me set the CUDA device index.
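A common workaround (general vLLM/CUDA behavior, not something this repo documents) is to restrict GPU visibility in each worker process before torch/vLLM are imported, so every worker sees exactly one GPU and addresses it as cuda:0. Note that changing `CUDA_VISIBLE_DEVICES` after the CUDA runtime has initialized in a process has no effect, which is likely why setting an index later appears to be ignored. A sketch, with a hypothetical rank variable and a placeholder model path:

```python
import os

# Must run before torch/vllm are imported in this worker process.
rank = int(os.environ.get("WORKER_RANK", "0"))  # hypothetical rank variable
os.environ["CUDA_VISIBLE_DEVICES"] = str(rank)

from vllm import LLM

# The sole visible GPU is now cuda:0 inside this process.
llm = LLM(model="path/to/prover-model")  # substitute the model from the config
```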

@zzhisthebest

@Luobots How did you solve this problem? I'm running into the same error.
