Description
Model artifacts are in the (TRT-LLM) LMI model format:
```
aws s3 ls ***
                           PRE 1/
2024-10-25 14:59:16        739 config.json
2024-10-25 14:59:16      11222 config.pbtxt
2024-10-25 14:59:16        194 generation_config.json
2024-10-25 14:59:16         21 requirements.txt
2024-10-25 14:59:16        444 special_tokens_map.json
2024-10-25 14:59:16    9085698 tokenizer.json
2024-10-25 14:59:16      52097 tokenizer_config.json
```
Latest DJL TensorRT-LLM container being used: 763104351884.dkr.ecr.region.amazonaws.com/djl-inference:0.29.0-tensorrtllm0.11.0-cu124
DJL looks for Hugging Face artifacts to convert and fails when it does not find any:
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /tmp/.djl.ai/download/bf8a789f03a76ad3e0773d75f1a9a366b66e57ba.
Expected Behavior
The model was converted and quantized prior to deployment, so the expectation is that these artifacts should deploy properly with the DJL TensorRT-LLM container.
Error Message
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory /tmp/.djl.ai/download/bf8a789f03a76ad3e0773d75f1a9a366b66e57ba.
How to Reproduce?
Steps to reproduce
1. Convert Hugging Face format artifacts to quantized TRT-LLM artifacts using a DJL container in a SageMaker notebook.
2. Push the TensorRT quantized artifacts to a new S3 path.
3. Deploy to SageMaker using the DJL container 763104351884.dkr.ecr.region.amazonaws.com/djl-inference:0.29.0-tensorrtllm0.11.0-cu124 (a minimal deployment sketch follows these steps).
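For context, here is a minimal sketch of the deployment step with the SageMaker Python SDK. It assumes the usual LMI pattern of packaging only a serving.properties file as the model "code" artifact, while option.model_id points at the S3 prefix holding the compiled TRT-LLM engines. The role ARN, bucket, instance type, and endpoint name are placeholders, not values from this report:

```python
# Hypothetical sketch of the deploy step; all names below are placeholders.
import sagemaker
from sagemaker.model import Model

sess = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder role ARN

# Container from this report ("region" is a placeholder, as written above)
image_uri = (
    "763104351884.dkr.ecr.region.amazonaws.com/"
    "djl-inference:0.29.0-tensorrtllm0.11.0-cu124"
)

# model.tar.gz contains only serving.properties (see sketch in the next section);
# the actual TRT-LLM artifacts stay at the S3 prefix referenced by option.model_id.
code_artifact = sess.upload_data(
    "model.tar.gz", bucket=sess.default_bucket(), key_prefix="trtllm-repro"
)

model = Model(image_uri=image_uri, model_data=code_artifact, role=role)
model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",    # placeholder instance type
    endpoint_name="trtllm-lmi-repro",  # placeholder endpoint name
    container_startup_health_check_timeout=900,
)
```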
What have you tried to solve it?
Tried different paths in serving.properties without any success.
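For reference, the serving.properties being tried looks roughly like the sketch below, assuming the standard LMI TensorRT-LLM options; the S3 prefix and tensor parallel degree are placeholders rather than the report's actual values:

```properties
engine=MPI
option.rolling_batch=trtllm
option.tensor_parallel_degree=4
# Placeholder S3 prefix holding the pre-compiled, quantized TRT-LLM artifacts
option.model_id=s3://my-bucket/trtllm-artifacts/
```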