```
/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py:366: UserWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
  warnings.warn(
2024-09-19 10:22:35,681 INFO worker.py:1788 -- Started a local Ray instance.
Traceback (most recent call last):
  File "/usr/local/bin/vllm", line 33, in <module>
    sys.exit(load_entry_point('vllm', 'console_scripts', 'vllm')())
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/scripts.py", line 149, in main
    args.dispatch_function(args)
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/scripts.py", line 29, in serve
    asyncio.run(run_server(args))
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/entrypoints/openai/api_server.py", line 289, in run_server
    app = await init_app(args, llm_engine)
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/entrypoints/openai/api_server.py", line 229, in init_app
    if llm_engine is not None else AsyncLLMEngine.from_engine_args(
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/engine/async_llm_engine.py", line 476, in from_engine_args
    executor_class = cls._get_executor_cls(engine_config)
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/engine/async_llm_engine.py", line 423, in _get_executor_cls
    initialize_ray_cluster(engine_config.parallel_config)
  File "/data/wenxue_model_adaptation/projects/vllm-fork/vllm/executor/ray_utils.py", line 126, in initialize_ray_cluster
    raise ValueError(
ValueError: The number of required hpus exceeds the total number of available hpus in the placement group.
```
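For context, this is roughly the shape of the check that fires in `initialize_ray_cluster`: the engine asks Ray's placement group for one HPU bundle per tensor-parallel rank, and the error means the group could not supply that many. The function below is an illustrative sketch, not the actual vLLM code; the name `check_placement_group` and its parameters are hypothetical.

```python
# Hypothetical sketch of the availability check behind the traceback above.
# The real logic lives in vllm/executor/ray_utils.py (initialize_ray_cluster).
def check_placement_group(required_hpus: int, available_hpus: int) -> None:
    """Raise if the engine needs more HPU bundles than Ray can provide."""
    if required_hpus > available_hpus:
        raise ValueError(
            "The number of required hpus exceeds the total number of "
            "available hpus in the placement group.")

# With tensor-parallel-size=2 but only one HPU registered with Ray,
# the check fails exactly as in the traceback:
try:
    check_placement_group(required_hpus=2, available_hpus=1)
except ValueError as exc:
    print("raised:", exc)
```

The practical implication is that the fix is rarely in this code path: either fewer cards are visible inside the container than expected, or Ray started before the device environment was set up, so it registered fewer HPUs than are mounted.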
Your current environment
pip list
How would you like to use vllm
Docker image:
vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1
I am mounting cards 5, 6, and 7 into the container. When tensor-parallel-size=2, I get the error message shown in the traceback above.
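One quick sanity check before launching, sketched below: confirm that the number of HPUs actually visible inside the container is at least the requested tensor-parallel size. This assumes the container restricts devices via `HABANA_VISIBLE_DEVICES` (analogous to `CUDA_VISIBLE_DEVICES`); the launch command at the end is left as a comment since the model name is not specified in this issue.

```shell
# Sanity check: tensor-parallel-size must not exceed the HPUs the
# container can see, or Ray's placement group cannot be filled.
HABANA_VISIBLE_DEVICES="5,6,7"   # the three mounted cards
TP_SIZE=2

# Count comma-separated device indices.
NUM_VISIBLE=$(echo "$HABANA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "visible HPUs: $NUM_VISIBLE, tensor-parallel-size: $TP_SIZE"

if [ "$TP_SIZE" -gt "$NUM_VISIBLE" ]; then
    echo "tensor-parallel-size exceeds visible HPUs" >&2
    exit 1
fi
# vllm serve <model> --tensor-parallel-size "$TP_SIZE"   # actual launch
```

If the count comes out lower than expected (e.g. only one card despite three being mounted), the mismatch is in the container/device setup rather than in vLLM.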