
Update yml file to run 8b tests on presubmit and 70b and 405b tests nightly #387

Merged
merged 32 commits into main from large_llama_ci_tests
Nov 6, 2024

Conversation

aviator19941
Collaborator

Updates yml file to run 8b tests on each pull_request and 70b and 405b tests nightly.

Comment on lines 82 to 84
- name: Run llama 70b and 405b tests
if: github.event_name != 'pull_request'
run: pytest sharktank/tests/models/llama/benchmark_amdgpu_test.py -v -s --longrun --iree-hip-target=gfx942 --html=out/index.html
Member

Changing which tests a workflow runs based on the event trigger feels like a recipe for confusion.

Workflows should be as simple and predictable as possible; don't overengineer this.

I would split into one workflow that runs on push and pull_request then a separate workflow that runs on schedule (and both should support workflow_dispatch for debugging).
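The suggested split might look roughly like this (file names and the cron time are assumptions, not the final layout; the point is that each file has one simple, predictable trigger set):

```yaml
# ci-llama-quick.yml (hypothetical name): fast 8b coverage on every change
on:
  push:
  pull_request:
  workflow_dispatch:  # manual runs for debugging

# ci-llama-nightly.yml (hypothetical name): long-running 70b/405b benchmarks
on:
  schedule:
    - cron: "0 0 * * *"  # assumed nightly time, UTC
  workflow_dispatch:  # manual runs for debugging
```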

If you're concerned about duplicating boilerplate... that needs to be fixed too. The nightly workflow should be using nightly release packages, and we need to roll out the new packaging into these workflows. Lines 59-76 should be condensed down to one line.

Collaborator Author

alright sounds good, will split into two workflow files

Member

If the tests fit nicely alongside other sharktank tests, whatever runs on push and pull_request could be part of https://github.com/nod-ai/SHARK-Platform/blob/main/.github/workflows/ci-sharktank.yml.

What were the original reasons for creating a separate workflow?

Collaborator Author

The original reason for creating a separate workflow was to be able to have a few of the small llama tests (8b) run on pull_request in order to test any regressions in sharktank with all the changes coming in. The larger llama tests (70b and 405b) take much longer, so would just be run nightly.
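The trigger-to-test-set policy described here can be made explicit in a tiny sketch (illustrative only; the real split lives in the workflow files):

```python
def select_test_suite(event_name: str) -> str:
    """Map a CI trigger to the llama test set described above.

    Illustrative sketch only, not code from the repository.
    """
    if event_name in ("pull_request", "push"):
        return "8b"        # small, fast regression coverage on presubmit
    if event_name in ("schedule", "workflow_dispatch"):
        return "70b+405b"  # long-running benchmarks, nightly only
    raise ValueError(f"unexpected event: {event_name}")
```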

CC @saienduri

Contributor

Yeah let's go with two workflows due to the machine constraint (won't be able to slot nicely into ci-sharktank).

Contributor

@saienduri Oct 30, 2024

The main thing is the hardcoded paths, so it would have to be its own step that only runs when the matrix.runs_on is the mi300x machine, which I guess is fine. For the long tests, we only want to run those on a schedule/workflow_dispatch, so having them in their own workflow file makes the most sense.
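A step gated on the matrix value could look something like this (the matrix key and runner label are assumptions; the pytest invocation mirrors the one under review):

```yaml
# Hypothetical step: only runs on the MI300X entry of the matrix
- name: Run llama 8b tests
  if: matrix.runs_on == 'mi300x'
  run: pytest sharktank/tests/models/llama/benchmark_amdgpu_test.py -v -s --iree-hip-target=gfx942
```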

Contributor

So, Avi, we can probably add the 8b testing to ci-sharktank.yml and the full llama testing to a new workflow file, ci-sharktank-nightly.yml.

Member

Well we shouldn't have any hardcoded paths. That needs to be fixed. Write tests first for developers to use (on any system that has the right hardware, enough disk space, etc.), then teach the CI to run them.
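One common way to eliminate hardcoded paths is to resolve artifact locations from the environment with a portable fallback. A minimal sketch (the env var name and default are assumptions, not sharktank's actual convention):

```python
import os
from pathlib import Path

def resolve_artifact_dir(env_var: str = "LLAMA_ARTIFACT_DIR",
                         default: str = "artifacts") -> Path:
    """Resolve the model artifact directory from the environment.

    Falls back to a relative default so the tests run on any machine
    that has the artifacts and enough disk space, instead of baking
    in one CI machine's absolute path. (Names are hypothetical.)
    """
    return Path(os.environ.get(env_var, default)).expanduser()
```

The CI then just exports the variable before invoking pytest, and developers can point it wherever their local copies live.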

Contributor

@aviator19941 is it possible to have the 8b files in huggingface?

Contributor

Fully agree, but looks like there are too many moving parts/files/configurations right now (model size, data type, batch size, TP sharding, decomposed vs decodeposed attention) which makes it hard to move to huggingface quickly. Let's go with separate workflows for now (just so it doesn't halt testing coverage for llama), but the end goal here should be moving to huggingface and the ci-sharktank.yml.

@pytest.mark.xfail(
-    reason="Test not yet implemented", strict=True, raises=AttributeError
+    reason="Test not yet implemented", strict=True, raises=ExportMlirException
Contributor

Can you update the reasons appropriately for all the tests?
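For reference, an updated marker might look like the sketch below (the reason string and the exception stub are hypothetical; in the real tests `ExportMlirException` comes from sharktank, and each test should carry a reason describing its actual failure mode):

```python
import pytest

class ExportMlirException(Exception):
    """Stand-in for the real sharktank export error type (assumption)."""

@pytest.mark.xfail(
    reason="MLIR export fails for this model config",  # hypothetical reason text
    strict=True,
    raises=ExportMlirException,
)
def test_export_llama_405b():
    # Placeholder body: the real test would invoke the export pipeline.
    raise ExportMlirException("export unimplemented")
```

With `strict=True`, the test turns into a hard failure the moment export starts working, so stale reasons get noticed.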

Contributor

@saienduri left a comment

LGTM

@aviator19941 force-pushed the large_llama_ci_tests branch 2 times, most recently from 1f21e83 to cef7e2b on November 2, 2024 01:17
@aviator19941 force-pushed the large_llama_ci_tests branch 2 times, most recently from 9b92a79 to e8f90d4 on November 5, 2024 14:58
Signed-off-by: aviator19941 <[email protected]>
@aviator19941 aviator19941 merged commit 8ff3c95 into main Nov 6, 2024
5 checks passed
@aviator19941 aviator19941 deleted the large_llama_ci_tests branch November 6, 2024 12:26
@ScottTodd
Member

In the future, when making substantial changes to a pull request after obtaining approval, please re-request review. I wasn't paying attention here because there were no new comments on the review thread and the commit history is confusing (32 commits with multiple rounds of force-pushing over many days). A PR with +539 −384 lines changed should, at a minimum, have a longer summary / commit message and go through closer review.
