forked from Dao-AILab/flash-attention
Enable fwd and varlen_fwd on AMD (#63)
* flash_attn_func works. Squashed from 12 commits: add scripts; save; add our kernel; import our kernel; round trip; use bshd layout; figure out segfault; fix; show backward failure with prints; save backward work; run forward only; test smallest config on everything; add test; fix; remove pre-commit; install triton; skip dropout; pin d 32; factor d; just run power of 2; remove timeout; run serially; clean up; clean up 2
* Varlen works. Squashed from 6 commits: save; some tests passing; enable more; enable everything; move around; alibi works
* keep interface and kernel separate
* clean up
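The tests this commit enables (test_flash_attn_output and test_flash_attn_varlen_output, run in the workflow below) compare the Triton kernel's forward output against a plain attention reference. As a rough illustration of what such a reference computes, here is a minimal NumPy sketch of scaled dot-product attention; this is not the repository's actual test code, and the shapes and masking convention are assumptions:

```python
import numpy as np

def ref_attention(q, k, v, causal=False):
    """Reference scaled dot-product attention.
    q, k, v: (seqlen, nheads, headdim) arrays; returns the same shape as q."""
    seqlen, nheads, d = q.shape
    scale = 1.0 / np.sqrt(d)
    # scores has shape (nheads, seqlen_q, seqlen_k)
    scores = np.einsum("qhd,khd->hqk", q, k) * scale
    if causal:
        # mask out keys strictly after each query position
        mask = np.triu(np.ones((seqlen, seqlen), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # numerically stable softmax over the key axis
    scores = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(scores)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return np.einsum("hqk,khd->qhd", probs, v)
```

With all-zero queries the scores are uniform, so the non-causal output is simply the mean of the values over the key axis, which makes the reference easy to sanity-check by hand.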
1 parent 320fb59, commit 508a92a. Showing 7 changed files with 2,000 additions and 10 deletions.
@@ -0,0 +1,63 @@
name: AMD Perf Kernel Tests

on:
  workflow_dispatch:
  pull_request:
    branches: [main_perf]
  merge_group:
    branches: [main_perf]
    types: [checks_requested]
  push:
    branches: [main_perf]

concurrency:
  group: ${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main_perf' }}

permissions: read-all

jobs:
  Runner-Preparation-AMD:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    outputs:
      matrix-HIP: ${{ steps.set-matrix.outputs.matrix-HIP }}
    steps:
      - name: Prepare runner matrix
        id: set-matrix
        run: |
          if [ x"${{ github.repository }}" == x"ROCm/flash-attention" ]; then
            echo '::set-output name=matrix-HIP::[["self-hosted", "rocm"]]'
          else
            echo '::set-output name=matrix-HIP::[["ubuntu-latest"]]'
          fi

  Integration-Tests-AMD:
    needs: Runner-Preparation-AMD
    if: needs.Runner-Preparation-AMD.outputs.matrix-HIP != ''
    runs-on: ${{ matrix.runner }}
    strategy:
      matrix:
        runner: ${{ fromJson(needs.Runner-Preparation-AMD.outputs.matrix-HIP) }}
    container:
      image: rocm/pytorch:rocm6.0.2_ubuntu22.04_py3.10_pytorch_2.1.2
      options: --device=/dev/kfd --device=/dev/dri --security-opt seccomp=unconfined --group-add video --user root
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Triton
        run: |
          pip uninstall -y triton
          pip install matplotlib pandas pytest
          git clone https://github.com/triton-lang/triton
          cd triton
          pip install --verbose -e python
          cd ..
      - name: Build
        run: |
          python setup.py install
      - name: Test
        run: |
          pytest tests/test_flash_attn.py::test_flash_attn_output
          pytest tests/test_flash_attn.py::test_flash_attn_varlen_output
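The varlen tests exercise the packed variable-length layout, in which sequences of different lengths are concatenated along the token axis and delimited by cumulative offsets (the cu_seqlens convention the varlen_fwd path consumes). A minimal sketch of that packing, using NumPy rather than torch for brevity, with the helper name hypothetical:

```python
import numpy as np

def pack_varlen(seqs):
    """Concatenate variable-length sequences of shape (len_i, nheads, headdim)
    into one packed array, returning the packed array, the cumulative
    sequence offsets (cu_seqlens), and the maximum sequence length."""
    lens = [s.shape[0] for s in seqs]
    # cu_seqlens[i] is where sequence i starts; cu_seqlens[-1] is total tokens
    cu_seqlens = np.concatenate([[0], np.cumsum(lens)]).astype(np.int32)
    packed = np.concatenate(seqs, axis=0)  # (total_tokens, nheads, headdim)
    return packed, cu_seqlens, max(lens)

# Two sequences of lengths 3 and 5, with 2 heads and head dim 4
a = np.zeros((3, 2, 4), dtype=np.float32)
b = np.ones((5, 2, 4), dtype=np.float32)
packed, cu_seqlens, max_seqlen = pack_varlen([a, b])
```

A kernel working in this layout recovers sequence i as packed[cu_seqlens[i]:cu_seqlens[i + 1]], which is why the tests pass cumulative offsets and a max_seqlen alongside the packed tensors.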
@@ -24,4 +24,11 @@ var/
.idea/

# Dev
venv

# Other
.eggs
.vscode
core
scripts
log*