[Queued PR] Port fixes from 0.7.2b #56

Draft
wants to merge 4 commits into
base: xinyazhang/meff-nonsquare_causal

xinyazhang
Collaborator

@xinyazhang xinyazhang commented Nov 14, 2024

Major Changes

  1. varlen support fixes
  2. Fix the numerical error caused by rounding differences between FMA and MUL+SUB
    • The test test_large_bf16_nan_values is added in {test,tritonsrc}/test_backward.py
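
To illustrate item 2: a fused multiply-add rounds once, while a separate multiply followed by a subtract rounds twice, so the two can disagree. The sketch below is only a generic float64 illustration, not the kernel code from this PR. The helper names `mul_then_sub` and `fused_mul_sub` are hypothetical, and the fused path is emulated with exact rational arithmetic rather than a hardware FMA instruction.

```python
from fractions import Fraction


def mul_then_sub(a: float, b: float, c: float) -> float:
    """MUL+SUB: the product is rounded to float64 before the subtraction."""
    product = a * b        # first rounding
    return product - c     # second rounding


def fused_mul_sub(a: float, b: float, c: float) -> float:
    """Emulated FMA: compute a*b - c exactly, then round a single time."""
    exact = Fraction(a) * Fraction(b) - Fraction(c)
    return float(exact)    # the only rounding step


# a*b equals 1 - 2**-60 exactly, which rounds to 1.0 in float64.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = 1.0

two_step = mul_then_sub(a, b, c)
fused = fused_mul_sub(a, b, c)

print("MUL+SUB:", two_step)
print("FMA    :", fused)
```

With these inputs the two-step result loses the small residual that the single-rounding result keeps. In low-precision formats such as bf16 the same effect appears at much larger magnitudes.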

Note: "Fix NaN created by 0.0 (from sm_scale) * -inf (from masking)." was
first developed on main and then ported to 0.7.2b

The test is added to tritonsrc/test_backward.py as
test_large_bf16_nan_values
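
The NaN named in the note above follows from IEEE 754 arithmetic, where 0.0 * -inf is NaN rather than -inf or 0.0. The sketch below reproduces that in plain Python and shows one possible way to avoid it by scaling before masking. The function names are hypothetical, and this ordering is an assumption for illustration, not necessarily the fix applied in this PR.

```python
import math

NEG_INF = float("-inf")


def mask_then_scale(score: float, sm_scale: float, masked: bool) -> float:
    """Mask first, then scale: a masked -inf meets sm_scale == 0.0."""
    s = NEG_INF if masked else score
    return s * sm_scale


def scale_then_mask(score: float, sm_scale: float, masked: bool) -> float:
    """Scale the finite score first, then overwrite masked entries with -inf."""
    scaled = score * sm_scale
    return NEG_INF if masked else scaled


bad = mask_then_scale(1.5, 0.0, masked=True)    # 0.0 * -inf
good = scale_then_mask(1.5, 0.0, masked=True)   # stays -inf

print("mask then scale:", bad)
print("scale then mask:", good)
```

A masked entry that stays at -inf contributes exp(-inf) = 0 to the softmax, whereas a NaN propagates through every later reduction.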

This change adds a performance penalty of about 7% with the current tuning
database.
This requires a fix to aotriton_flash.mk_aotensor
@xinyazhang xinyazhang changed the title Port fixes from 0.7.2b [Queued PR] Port fixes from 0.7.2b Nov 14, 2024