
Improvements to Benchmark Scripts and Config Generation Workflow #13

Merged
merged 9 commits into from
May 20, 2024

Conversation

fabianlim
Contributor

@fabianlim fabianlim commented May 17, 2024

Improvements to benchmarks

From now on, all benchmarks must be run in a tox environment, for package version hygiene.

tox -e run_benches
  • note: since this PR is not yet merged, FHT_BRANCH=accel-pr must be set when running the command above

In addition

  • the tox command above accepts the environment variables DRY_RUN, NO_DATA_PROCESSING, and NO_OVERWRITE; see scripts/run_benchmarks.sh
  • run_benchmarks.sh clears RESULT_DIR if it already exists, to avoid contamination with old results. To protect against accidental overwrites, always run with NO_OVERWRITE=true.
  • run_benchmarks.sh now produces two CSVs:
    • raw_summary.csv: the original file previously called summary.csv; it contains the raw results
    • benchmarks.csv: a processed version of raw_summary.csv that contains only the columns that differ across runs, for easier viewing
    • TODO: consider providing one more file for the columns that are the same
  • run_benchmarks.sh also runs pip freeze inside the tox environment to produce a requirements.txt file
    • we should check in this file as well
    • TODO: does tox have a version lock file?
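The benchmarks.csv post-processing described above can be sketched roughly as follows. This is a minimal illustration of the "keep only the columns that differ" idea, not the actual logic in scripts/run_benchmarks.sh; the function names are assumptions, only the two file names match the PR.

```python
import csv

def columns_that_differ(rows):
    """Return the column names whose values are not identical across all rows."""
    if not rows:
        return []
    return [col for col in rows[0] if len({row[col] for row in rows}) > 1]

def write_processed(raw_path, out_path):
    # Read the raw results (e.g. raw_summary.csv) and keep only the
    # columns that vary between benchmark runs (e.g. benchmarks.csv).
    with open(raw_path, newline="") as f:
        rows = list(csv.DictReader(f))
    keep = columns_that_differ(rows)
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=keep)
        writer.writeheader()
        for row in rows:
            writer.writerow({col: row[col] for col in keep})
```

Columns that are constant across every run (the TODO above) would be the complement of `columns_that_differ`.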

Script for Producing CSV report

After running a few benchmarks, we can gather all the results into a single CSV report.

  • this can be done incrementally, even while the benches are still running
  • it works with multiple benchmark directories; just specify them one after another
# do the following in the repo directory
# activate the tox environment
source .tox/run-benches/bin/activate

# run the display-bench-results.py on a directory with benchmark results 
# - say "benchmark_outputs"
PYTHONPATH= python scripts/benchmarks/display-bench-results.py benchmark_outputs

This will produce output like the following, and the resulting .csv report can then be read with pandas.read_csv:

***************** Report Created ******************
Total lines: '48'
Number columns included: '20'
Number columns excluded: '20'
Excluding number of exceptions caught: '0'
Written report to 'results.csv'
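Loading the report back for analysis is then a one-liner. A small sketch (the helper name is an assumption; only the default file name matches the log above):

```python
import pandas as pd

def load_report(path="results.csv"):
    """Load the CSV report produced by display-bench-results.py.

    Each row is one benchmark run; the columns are whatever the
    report script decided to include.
    """
    return pd.read_csv(path)
```

From there, standard pandas operations (sorting, grouping by scenario, plotting) apply directly.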

Improvements to Generate Configs

We added a new tox -e verify-configs environment to ensure that the configs are correctly generated.

  • this is now enabled as a workflow
  • the workflow runs tox -e gen-configs and tests the output against the files that were checked in; if they differ, the build fails
  • this ensures that the sample-configurations are always up-to-date with the plugin configs
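The verification amounts to a regenerate-and-diff check. A minimal sketch of the idea, under the assumption that the freshly generated configs land in one path and the checked-in copies in another (the function names are hypothetical; the real check lives in the tox/workflow configuration):

```python
import filecmp
import sys

def configs_match(generated_path: str, checked_in_path: str) -> bool:
    """Return True if a freshly generated config file is byte-identical
    to the checked-in copy (shallow=False compares contents, not stat)."""
    return filecmp.cmp(generated_path, checked_in_path, shallow=False)

def verify_or_fail(generated_path: str, checked_in_path: str) -> None:
    # Fail the build (non-zero exit) when the generated configs have drifted
    # from what was checked in.
    if not configs_match(generated_path, checked_in_path):
        sys.exit("generated configs differ from checked-in configs")
```

Exiting non-zero on a mismatch is what makes the CI workflow fail the build.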

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim fabianlim changed the base branch from main to dev May 17, 2024 06:26
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim fabianlim requested a review from achew010 May 18, 2024 08:31
@fabianlim fabianlim merged commit 1c790ed into dev May 20, 2024
2 checks passed
@fabianlim fabianlim deleted the gen-configs branch May 20, 2024 10:17
fabianlim added a commit that referenced this pull request May 27, 2024
…or GPTQ-LoRA (#20)

* Add GitHub Workflow for Linting , Formatting and Test. Activate Workflow for Framework (#7)

* add lint workflow

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add pylintrc, update .tox fix files

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* activate test and minor fix

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* lint benchmarks.py and add workflow to dev

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Improvements to Benchmark Scripts and Config Generation Workflow (#13)

* fix benches and add verify configs

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* update readme and add workflow

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* add packaging dep

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* update torch dep in framework and run-benches

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* take host env in run-benches

* add display bench results script

* rename summary.csv to raw_summary.csv and update run_benchmarks.sh

* export environment variables in shell command

* dump out pip requirements for repro, and add default FHT_branch

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Added support for running official HF baseline FSDP-QLoRA benchmark (#16)

* new baseline scenario

* rename variables

* added warning when plugin allows SFTTrainer to handle PEFT on single device

* Fix FSDP when performing GPTQ-LoRA with Triton V2  (#15)

* wrap in parameters and torch view to correct dtype

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* refactor to apply patch only on FSDP and simplify

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>

* Provide Memory Benchmarking Feature to Benchmarking Code (#14)

* add gpu memory logging support

* made improvements to GPU reference and result collation

* Renamed memory logging argument to reflect its readings as reserved memory using nvidia-smi and changed aggregation function in result collation

* variable renames

* manual linting

* added memory logging functionality via HFTrainer

* added support to benchmark memory using HFTrainer and updated README with explanation of the 2 memory benchmarking options

* addressed changes requested in PR #14

* fix bug and simplify gpu logs aggregation logic

* fixes to calculation of HFTrainer Mem Logging values

* fix calculations

* more fixes

* fix to ignore including stage inside max calculation of alloc memory

* more comments and README updates

* added fix to keyerror due to empty output dict from OOM

* manual linting

* added benchmark results to refs

* remove unnecessary columns in results gathering

* made changes to results gathering

---------

Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: achew010 <[email protected]>