-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: change tracker API to initialize tracker early and track additional metrics. #50
Conversation
cc @VassilisVassiliadis comments are welcome |
Tracker now takes command line arguments as config. Aim stack is the default tracker and code contains example to measure additional metrics seamlessly into aimstack like 'model_load_time' Signed-off-by: Dushyant Behl <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
Fix duplicate argument Signed-off-by: Dushyant Behl <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking of something along these lines (need to double check the indentation as I used the web interface to suggest the changes)
FSDP. Add tracker factory. Signed-off-by: Dushyant Behl <[email protected]>
@VassilisVassiliadis the design and some functions had changed due to inconsistency with FSDP runs. I feel most of your ask is incorporated now. Can you check again if you have any other questions. This argument |
Thanks! this looks good to me |
4bbe5d3
to
8661cc0
Compare
Signed-off-by: Dushyant Behl <[email protected]>
@alex-jw-brooks PTAL |
def train( | ||
model_args: configs.ModelArguments, | ||
data_args: configs.DataArguments, | ||
train_args: configs.TrainingArguments, | ||
peft_config: Optional[ | ||
Union[peft_config.LoraConfig, peft_config.PromptTuningConfig] | ||
] = None, | ||
callbacks: Optional[List[TrainerCallback]] = None, | ||
tracker: Optional[Tracker] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the train
function originall accepts only configurations by design, I feel we need a strong reason to allow it to also include objects. The more natural way would be to accept one "tracker config" input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a yaml as config was the first thought that came too. But given HF has chosen to use arguments for most configurations we went with it as a design choice. But if in fms-hf-tuning we choose to use config files for configs, we can do that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Train function expects to take callbacks which need to be associated with the training hence main takes the config and initializes tracker (which is needed for the callback)
Tracker here is the tracker object which train function needs to track any extra metrics and metadata hence the design choice to pass tracker separately.
callbacks = [peft_saving_callback, file_logger_callback] | ||
|
||
# Initialize the tracker | ||
tracker = get_tracker(tracker_name, tracker_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just have a tracker_install_callbacks
function somewhere to reduce 3 lines into 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Here too it's a design choice we have to make. We have tried to keep it consistent with how config is managed in fms-hf-tuning here too. This has tried to follow the pattern used in the peft config block above. But if we choose to have a different approach we can follow that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its simple choice to either perform things explicitly or run the code in some other funciton reverting to no callback under the hood. Does not affect the functionality. If you strongly feeling a meta function can help we can make that change.
requirements.txt
Outdated
@@ -2,7 +2,7 @@ numpy | |||
accelerate>=0.20.3 | |||
transformers>=4.34.1 | |||
torch | |||
aim==3.17.5 | |||
aim==3.18.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since there's a desire to implement multiple trackers, did we want to make the dependency (and imports) optional, just used when available and configured?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be done..but then do we still list these in the requirements.txt?
or do we throw an error and ask user to install the required tracker before importing it in the code.
1fd64ba
to
93e2d03
Compare
Signed-off-by: Dushyant Behl <[email protected]>
Signed-off-by: Dushyant Behl <[email protected]>
See. Bump aim from 3.17.5 to 3.18.1 #42
Example of how new api can be invoked