Commit
Update torchtune generation to be more flexible
Summary: The existing softmax-sampling-trick implementation in the torchtune generator is not flexible enough to handle vocab-pruned models (where the number of logits produced does not match the size of the embedding layer). This limitation is unnecessary and easy to remove: simply create the `q` tensor to match the size of the logits tensor instead of the embedding layer.

NOTE: this is just a draft diff to gather feedback on possible changes to the OSS torchtune package before submitting a proper pull request.

Differential Revision: D65480353
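A minimal sketch of the idea, not the actual diff: size the noise tensor `q` used in the softmax (exponential) sampling trick from the logits tensor itself rather than from the embedding dimension, so models whose output head produces fewer logits than the embedding has rows still sample correctly. The function name and signature below are illustrative and are not torchtune's exact API.

```python
import torch

def sample(logits: torch.Tensor, temperature: float = 1.0, q: torch.Tensor | None = None) -> torch.Tensor:
    """Sample a token id from `logits` using the softmax sampling trick.

    `logits` has shape (..., vocab_size_of_output_head), which may be smaller
    than the embedding table for vocab-pruned models.
    """
    probs = torch.nn.functional.softmax(logits / temperature, dim=-1)
    if q is None:
        # Allocate the exponential noise to match the logits/probs shape,
        # not the embedding layer, so pruned vocabularies are handled.
        q = torch.empty_like(probs).exponential_(1)
    # argmax of probs / Exp(1) noise is equivalent to sampling from the softmax.
    return torch.argmax(probs / q, dim=-1, keepdim=True).to(dtype=torch.int)
```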