There has been significant refactoring of the loss functions in transformers 4.46 that will render the cross-entropy patching ineffective. We need a different `ModelPatcherRule` for the new transformers version. CC: @anhuong

So now there are 3 possibilities:

1. A `custom_loss_function` is passed into `Trainer`.
2. The model has migrated to the `custom_loss_function` API.
3. The model has not migrated (like Granite now).

For 3: this is the easy one, because it means no code changes.

For 1: I'm thinking we do not patch anything, because if a user wants to do this, we can't control what loss function they use.

For 2: in this case we want to patch `fixed_cross_entropy`, but this should be done on a per-model basis. So we need to somehow have the model instantiate the loss function, e.g. `ForCausalLMLoss`, and only patch `fixed_cross_entropy` during this instantiation, putting the original back after it is done.
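The temporary patch described for case 2 could be sketched with a context manager that swaps the function in and restores it afterwards. This is only an illustration: `loss_utils` here is a stand-in namespace (the real target would be the transformers loss module), and `fused_cross_entropy` is a hypothetical replacement kernel.

```python
# Sketch: temporarily replace a module-level loss function while the
# model resolves its loss callable, then restore the original.
from contextlib import contextmanager
import types

# Stand-in for the transformers loss module (assumption: real code
# would target transformers' own module instead of this namespace).
loss_utils = types.SimpleNamespace()

def fixed_cross_entropy(logits, labels):
    return "slow-ce"          # placeholder for the stock implementation

def fused_cross_entropy(logits, labels):
    return "fused-ce"         # hypothetical fast replacement

loss_utils.fixed_cross_entropy = fixed_cross_entropy

@contextmanager
def patched(module, name, replacement):
    """Swap module.<name> for `replacement` inside the with-block only."""
    original = getattr(module, name)
    setattr(module, name, replacement)
    try:
        yield
    finally:
        # Restore the original even if instantiation raises.
        setattr(module, name, original)

def ForCausalLMLoss(logits, labels):
    # Looks the function up at call time, so it sees the patch.
    return loss_utils.fixed_cross_entropy(logits, labels)

with patched(loss_utils, "fixed_cross_entropy", fused_cross_entropy):
    inside = ForCausalLMLoss(None, None)   # uses the replacement
outside = ForCausalLMLoss(None, None)      # original is back
print(inside, outside)  # fused-ce slow-ce
```

The `try/finally` is the important part: the original function must be restored even if model construction fails, so the patch never leaks outside the per-model scope.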
fabianlim changed the title from "FOAK Cross Entropy Loss Will Not Work with New Loss Functions" to "FOAK Cross Entropy Loss Will Not Work with New Loss Functions After Transformers 4.46" on Oct 29, 2024.
Related: huggingface/transformers#34191
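Since the fix hinges on the 4.46 boundary, the rule selection could be gated on the installed transformers version. A minimal sketch, where the strategy names are hypothetical placeholders and only the 4.46 cutoff comes from this issue:

```python
# Hedged sketch: choose a patching strategy from the transformers
# version, since the direct cross-entropy patch stops working at 4.46.

def _major_minor(version: str) -> tuple:
    # Crude parse of "major.minor[.patch...]"; assumes numeric parts.
    parts = version.split(".")
    return (int(parts[0]), int(parts[1]))

def select_patch_strategy(transformers_version: str) -> str:
    """Below 4.46 the old rule applies; from 4.46 on, patch
    fixed_cross_entropy on a per-model basis instead."""
    if _major_minor(transformers_version) < (4, 46):
        return "patch-cross-entropy-directly"
    return "patch-fixed-cross-entropy-per-model"

print(select_patch_strategy("4.45.2"))  # patch-cross-entropy-directly
print(select_patch_strategy("4.46.0"))  # patch-fixed-cross-entropy-per-model
```

In practice a proper version parser would be safer than the crude tuple comparison above, but the point is just that both `ModelPatcherRule` variants need to coexist, keyed on the detected version.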