Update yml file to run 8b tests on presubmit and 70b and 405b tests nightly #387
Conversation
.github/workflows/ci-llama.yaml (outdated)

```yaml
- name: Run llama 70b and 405b tests
  if: github.event_name != 'pull_request'
  run: pytest sharktank/tests/models/llama/benchmark_amdgpu_test.py -v -s --longrun --iree-hip-target=gfx942 --html=out/index.html
```
Changing which tests a workflow runs based on the event trigger feels like a recipe for confusion.
- What happens with the `workflow_dispatch` trigger?
- Will developers know that https://github.com/nod-ai/SHARK-Platform/actions/workflows/ci-llama.yaml?query=event%3Apull_request and https://github.com/nod-ai/SHARK-Platform/actions/workflows/ci-llama.yaml?query=event%3Aschedule are testing entirely different things?

Workflows should be as simple and predictable as possible - don't overengineer this. I would split this into one workflow that runs on `push` and `pull_request`, then a separate workflow that runs on `schedule` (and both should support `workflow_dispatch` for debugging).
If you're concerned about duplicating boilerplate... that needs to be fixed too. The nightly workflow should be using nightly release packages, and we need to roll out the new packaging into these workflows. Lines 59-76 should be condensed down to one line.
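As a rough sketch of the suggested trigger split (the file names, branch filter, and cron time below are illustrative assumptions, not taken from the repository):

```yaml
# Hypothetical ci-llama-quick.yaml: fast 8b tests on every push/PR.
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:   # manual runs for debugging

# Hypothetical ci-llama-nightly.yaml: long 70b/405b tests on a schedule.
# on:
#   schedule:
#     - cron: "0 7 * * *"   # example time only
#   workflow_dispatch:
```

Each file then carries only the jobs for its tier, so a run's purpose is obvious from the workflow name alone.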
Alright, sounds good; will split into two workflow files.
If the tests fit nicely alongside other sharktank tests, whatever runs on `push` and `pull_request` could be part of https://github.com/nod-ai/SHARK-Platform/blob/main/.github/workflows/ci-sharktank.yml.
What were the original reasons for creating a separate workflow?
The original reason for creating a separate workflow was to be able to have a few of the small llama tests (8b) run on `pull_request` in order to catch any regressions in sharktank with all the changes coming in. The larger llama tests (70b and 405b) take much longer, so they would just be run nightly.
CC @saienduri
Yeah let's go with two workflows due to the machine constraint (won't be able to slot nicely into ci-sharktank).
The main thing is the hardcoded paths, so it would have to be its own step that only runs when `matrix.runs_on` is the MI300X machine, which I guess is fine. For the long tests, we only want to run those on a schedule/workflow_dispatch, so having that in its own workflow file makes the most sense.
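A runner-gated step of the kind described might look roughly like this (the matrix key and label value are assumptions based on the comment, not the actual workflow):

```yaml
steps:
  - name: Run llama 8b tests
    # Only run on the MI300X machine; `matrix.runs_on` and the
    # 'mi300x' label are hypothetical names for illustration.
    if: matrix.runs_on == 'mi300x'
    run: pytest sharktank/tests/models/llama/benchmark_amdgpu_test.py -v -s
```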
So, Avi, we can probably add the 8b testing to ci-sharktank.yml and the full llama testing to a new workflow file, ci-sharktank-nightly.yml.
Well we shouldn't have any hardcoded paths. That needs to be fixed. Write tests first for developers to use (on any system that has the right hardware, enough disk space, etc.), then teach the CI to run them.
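One way to avoid hardcoding a machine-specific path, as a minimal sketch (the environment variable name and default cache location are invented for illustration, not from the repository):

```python
import os
from pathlib import Path


def artifact_dir() -> Path:
    """Resolve the test artifact directory without a hardcoded machine path.

    SHARKTANK_ARTIFACT_DIR is a hypothetical override; the per-user cache
    default lets any developer with suitable hardware run the tests.
    """
    default = Path.home() / ".cache" / "sharktank" / "artifacts"
    return Path(os.environ.get("SHARKTANK_ARTIFACT_DIR", str(default)))
```

CI can then point the variable at its local artifact cache while developer machines fall back to the default.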
@aviator19941 is it possible to have the 8b files in huggingface?
Fully agree, but it looks like there are too many moving parts/files/configurations right now (model size, data type, batch size, TP sharding, decomposed vs decodeposed attention), which makes it hard to move to huggingface quickly. Let's go with separate workflows for now (just so it doesn't halt testing coverage for llama), but the end goal here should be moving to huggingface and the ci-sharktank.yml.
```diff
 @pytest.mark.xfail(
-    reason="Test not yet implemented", strict=True, raises=AttributeError
+    reason="Test not yet implemented", strict=True, raises=ExportMlirException
```
Can you update the reasons appropriately for all the tests?
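For reference, a strict xfail marker of the kind shown in the diff looks like this (the test body and exception here are illustrative, not from the repository):

```python
import pytest


# strict=True turns an unexpected pass into a hard failure, and raises=...
# restricts which exception type counts as the expected failure.
@pytest.mark.xfail(
    reason="Test not yet implemented", strict=True, raises=AttributeError
)
def test_unimplemented_feature():
    raise AttributeError("not implemented yet")  # the expected failure
```

With `raises=` pinned, only that exception type counts as the expected failure; any other exception fails the test outright, which is why keeping the reason and exception accurate per test matters.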
LGTM
…n nightly Signed-off-by: aviator19941 <[email protected]>
…mpiles very slowly Signed-off-by: aviator19941 <[email protected]>
In the future, when making substantial changes to a pull request after obtaining approval, please re-request review. I wasn't paying attention here because there were no new comments on the review thread and the commit history is confusing (32 commits with multiple rounds of force pushing over many days). A PR with +539 −384 lines changed should, at a minimum, have a longer summary / commit message and go through closer review.
Updates yml file to run 8b tests on each pull_request and 70b and 405b tests nightly.