
Fix MPSNDArrayDescriptor wrapper #502

Merged: 3 commits merged into JuliaGPU:main on Dec 19, 2024

Conversation

**christiangnrd** (Contributor) commented Dec 18, 2024

I realized that automatically reversing dimensions in the descriptor as part of the low-level wrappers is a terrible idea.

This PR also includes a few more improvements.

I'm also using this as an opportunity to test `[skip benchmark]` and benchmarking from a fork. Edit: it works.
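For context, here is a hedged sketch of the distinction being drawn (the names and signatures are illustrative, not the actual Metal.jl code): a low-level wrapper should pass dimensions through to the Objective-C API verbatim, while any reconciliation between Julia's column-major sizes and the framework's expected ordering belongs in a higher-level constructor:

```julia
# Hypothetical sketch; not the actual Metal.jl implementation.

# Low-level wrapper: mirrors the MPSNDArrayDescriptor API exactly,
# passing `dims` through with no reordering.
function descriptor_with_shape(dtype, dims::Vector{Int})
    # ... would call into the MPSNDArrayDescriptor ObjC API with `dims` as given ...
    return (dtype, dims)
end

# High-level constructor: this is the one place where Julia's
# column-major convention is reconciled with the framework's ordering.
function ndarray_descriptor(dtype, julia_size::Dims)
    descriptor_with_shape(dtype, collect(Int, reverse(julia_size)))
end
```

Keeping the reversal out of the low-level layer means the wrapper stays a faithful 1:1 mapping of the underlying API, and callers who read Apple's documentation are not surprised by hidden reordering.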

@christiangnrd changed the title from "Followup to MPSNDArray PR" to "Fix MPSNDArrayDescriptor wrapper" on Dec 18, 2024

**codecov bot** commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 41.17647% with 10 lines in your changes missing coverage. Please review.

Project coverage is 76.04%. Comparing base (52d7056) to head (2b16b45).
Report is 386 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| lib/mps/ndarray.jl | 43.75% | 9 Missing ⚠️ |
| lib/mps/MPS.jl | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main     #502      +/-   ##
==========================================
+ Coverage   71.04%   76.04%   +5.00%
==========================================
  Files          36       57      +21
  Lines        1143     2793    +1650
==========================================
+ Hits          812     2124    +1312
- Misses        331      669     +338
```

☔ View full report in Codecov by Sentry.

**christiangnrd** (Contributor, Author) commented:

@maleadt It seems that PR benchmark jobs are not commenting, and their results are being uploaded instead, because

`comment-always: ${{ github.event_name == 'pull_request' }}`

and

`auto-push: ${{ github.event_name != 'pull_request' }}`

were overlooked when the trigger was changed from `pull_request` to `pull_request_target`.

Should I add it to this PR or would it be better for you to commit the fix directly?
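The kind of change being discussed can be sketched as follows (a hypothetical excerpt from the workflow file; the actual `Benchmark.yml` in the repository may differ in structure). `comment-always` and `auto-push` are inputs of the `github-action-benchmark` action, and conditions keyed on the old event name stop matching once the trigger becomes `pull_request_target`:

```yaml
# Hypothetical excerpt from .github/workflows/Benchmark.yml.
on:
  pull_request_target:

jobs:
  benchmark:
    steps:
      - uses: benchmark-action/github-action-benchmark@v1
        with:
          # Updated to match the new trigger name, so PR runs
          # comment on the PR and non-PR runs push results:
          comment-always: ${{ github.event_name == 'pull_request_target' }}
          auto-push: ${{ github.event_name != 'pull_request_target' }}
```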

**maleadt** (Member) commented Dec 18, 2024

Ah oops. Feel free to commit directly on master, because the modified Benchmark.yml in this PR wouldn't get loaded (at least I think it wouldn't, since the workflow now uses `pull_request_target` instead of `pull_request`).

We should probably also edit out or revert the upload on the gh-pages branch.

**christiangnrd** (Contributor, Author) commented:

> We should probably also edit out or revert the upload on the gh-pages branch.

Done. I assume the benchmark page will update the next time a benchmark action runs?

> Feel free to commit directly on master

I don't think I have permission to commit directly to main. I tried and got the following error:

```
remote: error: GH006: Protected branch update failed for refs/heads/main.
remote:
remote: - Changes must be made through a pull request.
To https://github.com/JuliaGPU/Metal.jl.git
 ! [remote rejected]   main -> main (protected branch hook declined)
error: failed to push some refs to 'https://github.com/JuliaGPU/Metal.jl.git'
```

**github-actions bot** left a comment:

Metal Benchmarks

| Benchmark suite | Current: 2b16b45 | Previous: 5a8daaa | Ratio |
| --- | --- | --- | --- |
| private array/construct | 26680.5 ns | 26673.5 ns | 1.00 |
| private array/broadcast | 462104.5 ns | 458208 ns | 1.01 |
| private array/random/randn/Float32 | 745125 ns | 852625 ns | 0.87 |
| private array/random/randn!/Float32 | 631375 ns | 658334 ns | 0.96 |
| private array/random/rand!/Int64 | 538625 ns | 555625 ns | 0.97 |
| private array/random/rand!/Float32 | 577333 ns | 589208 ns | 0.98 |
| private array/random/rand/Int64 | 772188 ns | 802125 ns | 0.96 |
| private array/random/rand/Float32 | 621291.5 ns | 595625 ns | 1.04 |
| private array/copyto!/gpu_to_gpu | 643020.5 ns | 679834 ns | 0.95 |
| private array/copyto!/cpu_to_gpu | 604041.5 ns | 666958.5 ns | 0.91 |
| private array/copyto!/gpu_to_cpu | 819812 ns | 518021 ns | 1.58 |
| private array/accumulate/1d | 1316917 ns | 1339437.5 ns | 0.98 |
| private array/accumulate/2d | 1371666 ns | 1374708 ns | 1.00 |
| private array/iteration/findall/int | 2087834 ns | 2065062.5 ns | 1.01 |
| private array/iteration/findall/bool | 1799333.5 ns | 1816354 ns | 0.99 |
| private array/iteration/findfirst/int | 1689500 ns | 1690417 ns | 1.00 |
| private array/iteration/findfirst/bool | 1664583 ns | 1658583 ns | 1.00 |
| private array/iteration/scalar | 3601875 ns | 3599687.5 ns | 1.00 |
| private array/iteration/logical | 3197625 ns | 3169854 ns | 1.01 |
| private array/iteration/findmin/1d | 1753916 ns | 1746000 ns | 1.00 |
| private array/iteration/findmin/2d | 1340791 ns | 1351313 ns | 0.99 |
| private array/reductions/reduce/1d | 1046833 ns | 1041021 ns | 1.01 |
| private array/reductions/reduce/2d | 655875 ns | 655041 ns | 1.00 |
| private array/reductions/mapreduce/1d | 1035812.5 ns | 1040833 ns | 1.00 |
| private array/reductions/mapreduce/2d | 660417 ns | 660979.5 ns | 1.00 |
| private array/permutedims/4d | 2540625 ns | 2523625 ns | 1.01 |
| private array/permutedims/2d | 1017917 ns | 1020958 ns | 1.00 |
| private array/permutedims/3d | 1584167 ns | 1584104.5 ns | 1.00 |
| private array/copy | 560166 ns | 580395.5 ns | 0.97 |
| latency/precompile | 5263931459 ns | 5246987875 ns | 1.00 |
| latency/ttfp | 6619689354 ns | 6666328874.5 ns | 0.99 |
| latency/import | 1164853499.5 ns | 1165891041.5 ns | 1.00 |
| integration/metaldevrt | 684146 ns | 718542 ns | 0.95 |
| integration/byval/slices=1 | 1520958 ns | 1537583 ns | 0.99 |
| integration/byval/slices=3 | 10693916.5 ns | 8342167 ns | 1.28 |
| integration/byval/reference | 1622854.5 ns | 1552229 ns | 1.05 |
| integration/byval/slices=2 | 2617791.5 ns | 2618584 ns | 1.00 |
| kernel/indexing | 443583 ns | 458000 ns | 0.97 |
| kernel/indexing_checked | 442916.5 ns | 473584 ns | 0.94 |
| kernel/launch | 9513.833333333332 ns | 10069.5 ns | 0.94 |
| metal/synchronization/stream | 14417 ns | 15479.5 ns | 0.93 |
| metal/synchronization/context | 14833 ns | 15084 ns | 0.98 |
| shared array/construct | 27600.75 ns | 26243.083333333336 ns | 1.05 |
| shared array/broadcast | 448500 ns | 473062.5 ns | 0.95 |
| shared array/random/randn/Float32 | 615624.5 ns | 806375 ns | 0.76 |
| shared array/random/randn!/Float32 | 459250 ns | 662875 ns | 0.69 |
| shared array/random/rand!/Int64 | 370083 ns | 555333 ns | 0.67 |
| shared array/random/rand!/Float32 | 412667 ns | 589000 ns | 0.70 |
| shared array/random/rand/Int64 | 670812.5 ns | 737791.5 ns | 0.91 |
| shared array/random/rand/Float32 | 558104 ns | 636875 ns | 0.88 |
| shared array/copyto!/gpu_to_gpu | 87708 ns | 89084 ns | 0.98 |
| shared array/copyto!/cpu_to_gpu | 87417 ns | 84916 ns | 1.03 |
| shared array/copyto!/gpu_to_cpu | 83375 ns | 77708 ns | 1.07 |
| shared array/accumulate/1d | 1385083.5 ns | 1339729 ns | 1.03 |
| shared array/accumulate/2d | 1386625 ns | 1387291 ns | 1.00 |
| shared array/iteration/findall/int | 1960958 ns | 1813291 ns | 1.08 |
| shared array/iteration/findall/bool | 1680125 ns | 1572375 ns | 1.07 |
| shared array/iteration/findfirst/int | 1405875 ns | 1385750 ns | 1.01 |
| shared array/iteration/findfirst/bool | 1377145.5 ns | 1363396 ns | 1.01 |
| shared array/iteration/scalar | 157167 ns | 158084 ns | 0.99 |
| shared array/iteration/logical | 2981958 ns | 2988208 ns | 1.00 |
| shared array/iteration/findmin/1d | 1450020.5 ns | 1467875 ns | 0.99 |
| shared array/iteration/findmin/2d | 1361562.5 ns | 1368229 ns | 1.00 |
| shared array/reductions/reduce/1d | 724500 ns | 731459 ns | 0.99 |
| shared array/reductions/reduce/2d | 654771 ns | 669917 ns | 0.98 |
| shared array/reductions/mapreduce/1d | 731875 ns | 733042 ns | 1.00 |
| shared array/reductions/mapreduce/2d | 654167 ns | 671958 ns | 0.97 |
| shared array/permutedims/4d | 2332875 ns | 2426458 ns | 0.96 |
| shared array/permutedims/2d | 1014874.5 ns | 1015708 ns | 1.00 |
| shared array/permutedims/3d | 1573062.5 ns | 1591625 ns | 0.99 |
| shared array/copy | 223667 ns | 250687 ns | 0.89 |

This comment was automatically generated by workflow using github-action-benchmark.

@maleadt merged commit 84447c4 into JuliaGPU:main on Dec 19, 2024. 2 checks passed.
@christiangnrd deleted the followup branch on December 19, 2024.