Loading compiled fails: model_type=bert -> transformers being used in compiled config. #744
Comments
Hi @michaelfeil, I think there was a mismatch between the auto-detected library ("sentence transformers") and the class used for inference. The following code, using NeuronModelForSentenceTransformers, works:

import torch
from optimum.neuron import NeuronModelForSentenceTransformers  # type: ignore
from transformers import AutoConfig, AutoTokenizer  # type: ignore[import-untyped]

compiler_args = {"auto_cast": "matmul", "auto_cast_type": "fp16"}
input_shapes = {"batch_size": 4, "sequence_length": 512}

model = NeuronModelForSentenceTransformers.from_pretrained(
    model_id="TaylorAI/bge-micro-v2",  # BERT small
    export=True,
    **compiler_args,
    **input_shapes,
)
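As a quick sanity check, the exported model can then be queried along these lines (a minimal sketch added for illustration, not from the original comment; the max-length padding and the sentence_embedding output field are assumptions based on the optimum-neuron sentence-transformers API):

tokenizer = AutoTokenizer.from_pretrained("TaylorAI/bge-micro-v2")
# Neuron graphs are compiled for static shapes, so pad to the compiled length.
inputs = tokenizer(
    ["an example sentence"],
    padding="max_length",
    max_length=512,
    truncation=True,
    return_tensors="pt",
)
outputs = model(**inputs)
embeddings = outputs.sentence_embedding  # assumed output field; shape (batch_size, hidden_size)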
@JingyaHuang I am not sure if I want that. On 1.18, that part no longer works & is a breaking change. Here is how to run it: https://github.com/michaelfeil/infinity/tree/main/infra/aws_neuron
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Not stale.
I see @michaelfeil, will open a PR to put back the support without sentence transformers, thanks for reporting.
Hi @michaelfeil, I opened a pull request here: #756, could you check if this fixes the issue? Thx.
Thanks, will look into it!
Hi @michaelfeil, I just merged the fix. Let me know if it works and feel free to reopen if there are any further questions. Thx :D!
System Info
import torch
from optimum.neuron import NeuronModelForFeatureExtraction  # type: ignore
from transformers import AutoConfig, AutoTokenizer  # type: ignore[import-untyped]

# get_nc_count() and self.* come from the reporter's codebase (infinity);
# get_nc_count() returns the number of available NeuronCores.
compiler_args = {"num_cores": get_nc_count(), "auto_cast_type": "fp16"}
input_shapes = {
    "batch_size": 4,
    "sequence_length": (
        self.config.max_position_embeddings
        if hasattr(self.config, "max_position_embeddings")
        else 512
    ),
}
self.model = NeuronModelForFeatureExtraction.from_pretrained(
    model_id="TaylorAI/bge-micro-v2",  # BERT small
    revision=None,
    trust_remote_code=True,
    export=True,
    **compiler_args,
    **input_shapes,
)
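For context, here is a minimal sketch of the loading step the issue title refers to. The save directory is hypothetical, and it is an assumption that reloading the saved artifacts trips over the same bad model_type as the compile cache:

# Hypothetical continuation of the snippet above: persist the compiled model,
# then reload it. The reload is (assumed) where the reported mismatch bites,
# because the config.json written with the compiled artifacts says
# model_type "transformer", which transformers cannot map back to BERT.
save_dir = "./bge-micro-neuron"  # hypothetical path
self.model.save_pretrained(save_dir)
reloaded = NeuronModelForFeatureExtraction.from_pretrained(save_dir)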
Analysis:
In /var/tmp/neuron-compile-cache/neuronxcc-2.14.227.0+2d4f85be/MODULE_4aeca57e8a4997651e84/config.json, the model_type is "transformer" but should be "bert".
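The mismatch can be confirmed directly from the cached config (a minimal sketch; the cache path is the one from this report and will differ per machine and compiler version):

import json

with open(
    "/var/tmp/neuron-compile-cache/neuronxcc-2.14.227.0+2d4f85be/MODULE_4aeca57e8a4997651e84/config.json"
) as f:
    cfg = json.load(f)
print(cfg["model_type"])  # prints "transformer" here; expected "bert"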
Reproduction:
docker run -it --device /dev/neuron0 michaelf34/aws-neuron-base-img:inf-repro
Also fails with the same command with:
Also fails with:
Does not fail with the same command with:
pip3 install --upgrade neuronx-cc==2.15.* torch-neuronx torchvision transformers-neuronx libneuronxla protobuf optimum-neuron==0.0.20