[Feature](bangc-ops): Fix bugs of ms_deform_attn_forward and align na… #852

@Unireverse Unireverse commented Oct 11, 2023

…n/inf status

Thanks for your contribution and we appreciate it a lot. 🚀🚀

1. Motivation

Fix a bug in the ms_deform_attn_forward small-channel kernel and align the NaN/inf behavior of all kernels.

2. Modification

modified: bangc-ops/kernels/ms_deform_attn_forward/msda_forward_small_channel_union1.mlu

3. Test Report

For how to run operator tests, see GTest-User-Guide-zh.

3.1 Modification Details

3.1.1 Accuracy Acceptance Standard

For static threshold standard details, see: MLU-OPS Accuracy Acceptance Standard.

  • static threshold
    • diff1
      • float32 mlu diff1 <= 1e-5
      • float32 mlu diff1 <= 3e-3
      • float16 mlu diff1 <= 3e-3
    • diff2
      • float32 mlu diff2 <= 1e-5
      • float32 mlu diff2 <= 3e-3
      • float16 mlu diff2 <= 3e-3
    • diff3
      • mlu diff3 == 0
      • mlu diff3_1 == 0
      • mlu diff3_2 == 0
  • dynamic threshold
    • diff1: mlu diff1 <= max(baseline diff1 * 10, static threshold)
    • diff2: mlu diff2 <= max(baseline diff2 * 10, static threshold)
    • diff3: mlu diff3 <= max(baseline diff3 * 10, static threshold)
      • float32, threshold = 1e-5
      • float16, threshold = 1e-3
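
The dynamic-threshold rule above can be sketched as a small check. This is a minimal illustration, not the project's test harness: the names `STATIC_THRESHOLD`, `dynamic_threshold`, and `passes` are hypothetical, and it assumes the float32/float16 values listed under "dynamic threshold" serve as the static floor for each dtype.

```python
# Assumed static floors, taken from the float32/float16 values listed
# under "dynamic threshold" above. Names here are hypothetical.
STATIC_THRESHOLD = {"float32": 1e-5, "float16": 1e-3}


def dynamic_threshold(baseline_diff: float, dtype: str) -> float:
    """Allowed error: max(baseline diff * 10, static threshold)."""
    return max(baseline_diff * 10, STATIC_THRESHOLD[dtype])


def passes(mlu_diff: float, baseline_diff: float, dtype: str) -> bool:
    """True if the MLU diff stays within the dynamic threshold."""
    return mlu_diff <= dynamic_threshold(baseline_diff, dtype)
```

With a very accurate baseline the static floor dominates, so `passes(2e-5, 1e-7, "float32")` is rejected, while a looser baseline raises the bound, so `passes(2e-4, 5e-5, "float32")` is accepted.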

3.1.2 Operator Scheme checklist

  • Supported hardware
    • MLU370
    • MLU590
  • Job types
    • BLOCK
    • UNION1
    • UNION2
    • UNION4
    • The operator will dynamically select the most suitable task type, for example, UNION8

3.2 Accuracy Test

3.2.1 Accuracy Test

If you have checked the following items, please tick the relevant box.

  • Data type test (e.g. float32/int8)
  • Multi-dimensional tensor test
  • Layout test
  • Different sizes / integer-remainder tail segment / aligned and misaligned test
  • Zero dimensional tensor test/zero element test
  • Stability test
  • Multiple platform test
  • Gen_case module test, see: Gencase-User-Guide-zh
  • NaN/INF tests
  • Bug fix tests
  • For memory leak check details, see: GTest-User-Guide-zh
  • For code coverage check details, see: GTest-User-Guide-zh
  • For I/O calculation efficiency check details, see: MLU-OPS-Performance-Acceptance-Standard
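
For the NaN/INF item in the checklist above, one plausible reading of "aligned" NaN/inf status is that the MLU output and the baseline have NaN in the same positions and infinities of the same sign in the same positions. The PR does not define this, so the sketch below is an assumption, and `nan_inf_status_matches` is a hypothetical helper, not part of the test framework.

```python
import math
from typing import Sequence


def nan_inf_status_matches(mlu_out: Sequence[float],
                           baseline_out: Sequence[float]) -> bool:
    """Check that NaN and +/-inf appear at the same positions in both outputs.

    Finite values are not compared here; that is left to diff1/diff2/diff3.
    """
    if len(mlu_out) != len(baseline_out):
        return False
    for a, b in zip(mlu_out, baseline_out):
        # NaN must appear in both outputs or in neither.
        if math.isnan(a) != math.isnan(b):
            return False
        # If either value is infinite, both must be the same signed infinity.
        if (math.isinf(a) or math.isinf(b)) and a != b:
            return False
    return True
```

For example, `nan_inf_status_matches([1.0, float("nan")], [1.5, float("nan")])` holds because only the special values are compared, whereas `[float("inf")]` against `[float("-inf")]` does not match.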

3.2.2 Parameter Check

No update.

3.2.3 Accuracy Test Result

Platform: MLU370
[ OK ] ms_deform_attn_forward/TestSuite.mluOp/23 (1 ms)
[----------] 24 tests from ms_deform_attn_forward/TestSuite (48 ms total)

[----------] Global test environment tear-down
[2023-10-11 15:23:32] [MLUOP] [Vlog]:TearDown CNRT environment.
[ SUMMARY ] Total 24 cases of 1 op(s).
ALL PASSED.
[==========] 24 test cases from 1 test suite ran. (713 ms total)
[ PASSED ] 24 test cases.

Platform: MLU590
[ OK ] ms_deform_attn_forward/TestSuite.mluOp/23 (2 ms)
[----------] 24 tests from ms_deform_attn_forward/TestSuite (346 ms total)

[----------] Global test environment tear-down
[ SUMMARY ] Total 24 cases of 1 op(s).
ALL PASSED.
[==========] 24 test cases from 1 test suite ran. (2087 ms total)
[ PASSED ] 24 test cases.

3.3 Performance Test

See MLU-OPS Performance Acceptance Standard for details.

No update.

3.4 Summary Analysis

All tests passed.

@Unireverse Unireverse force-pushed the fix_ms_deform_forward_bug_and_align_nan_inf branch from b2b9b78 to 75fa852 Compare October 11, 2023 07:14
@PetrelYy PetrelYy added the BANGC label Oct 11, 2023
@duzekunKTH duzekunKTH merged commit 4c2a685 into Cambricon:master Oct 11, 2023
1 check passed
@Unireverse Unireverse added this to the v0.9.0 milestone Oct 11, 2023