Issues: pytorch/torchtune
#2267 Expose FSDP2 MixedPrecisionPolicy params
Labels: enhancement (New feature or request), triaged (This issue has been assigned an owner and appropriate label)
Opened Jan 14, 2025 by EugenHotaj

#2264 Training with lora_finetune_distributed is slower than single_device; profiling shows NCCL is the cause
Labels: distributed (Anything related to distributed env: multi-GPU, multi-node), triaged
Opened Jan 14, 2025 by seekerzz

#2263 Adding support for LR schedule for full distributed finetune
Labels: best practice (Things we should be doing but aren't), better engineering (Tasks which help improve eng productivity, e.g. building tools, cleaning up code, writing docs), triaged
Opened Jan 13, 2025 by tginart

#2261 [RFC] Additional chat loss masking strategies
Labels: community help wanted (We would love the community's help completing this issue), discussion (Start a discussion), enhancement, good first issue (Good for newcomers), rfc (Request for comments)
Opened Jan 13, 2025 by RdoubleA

#2258 Request: adding py.typed for type checkers
Labels: better engineering, triaged
Opened Jan 13, 2025 by jamesbraza

#2255 QLoRA uses more memory than regular LoRA
Labels: triaged
Opened Jan 11, 2025 by AndrewMead10

#2250 LoRA and DoRA finetuning produce identical results
Labels: bug (Something isn't working), high-priority
Opened Jan 10, 2025 by AndrewMead10

#2246 Finetuning Llama 3.1 8B Base Model on ChatML Format Dataset – Loss Reaches NaN After 2000 Steps
Labels: triaged
Opened Jan 10, 2025 by abdul-456

#2241 Overriding KV cache entries in torchtune models
Labels: discussion, triaged
Opened Jan 9, 2025 by telgamal-1

#2237 Finetune meta-llama/Llama-Guard-3-1B
Labels: triaged
Opened Jan 8, 2025 by jingzhaoou

#2229 Quantization recipe should mimic checkpointer.save_checkpoint
Labels: better engineering
Opened Jan 4, 2025 by felipemello1

#2226 Improvement: define a protocol to handle base loss and all chunked losses
Labels: enhancement
Opened Jan 2, 2025 by insop

#2225 Improvement: add a "division by zero" check in chunked loss handling in kd_losses.py
Labels: enhancement
Opened Jan 2, 2025 by insop

#2224 Hugging Face from_pretrained() with merged weights raises KeyError: 'base_model_name_or_path'
Labels: bug, triaged
Opened Jan 2, 2025 by chg0901

#2222 How to use train and test splits with the recipes?
Labels: enhancement, triaged
Opened Jan 1, 2025 by 7rabbit

#2221 Add a page to the live docs on quickly setting up custom data
Opened Jan 1, 2025 by RdoubleA

#2218 Packed errors
Labels: bug, triaged
Opened Dec 31, 2024 by chg0901

#2217 [feature request] Support input/output to fsspec paths
Labels: enhancement, triaged
Opened Dec 31, 2024 by leoleoasd

#2215 First example dataset for instruct datasets has no _component
Opened Dec 30, 2024 by johnowhitaker

#2213 How to estimate GPU memory needed for knowledge distillation?
Labels: discussion, triaged
Opened Dec 30, 2024 by chuangzhidan

#2212 [Question] What to do when a model doesn't have tokenizer.model?
Labels: high-priority
Opened Dec 29, 2024 by steveepreston