Metal library parsing: using CodecBzip2 feature to ignore padding. #504

Merged: 1 commit into main from tb/bzip2, Dec 24, 2024

@maleadt (Member) commented Dec 19, 2024

@maleadt marked this pull request as ready for review on December 24, 2024, 08:42

@github-actions bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: 5859853 | Previous: 1820957 | Ratio |
|---|---|---|---|
| private array/construct | 26416.714285714286 ns | 26795.083333333336 ns | 0.99 |
| private array/broadcast | 462875 ns | 465417 ns | 0.99 |
| private array/random/randn/Float32 | 911083.5 ns | 766458 ns | 1.19 |
| private array/random/randn!/Float32 | 595667 ns | 658791 ns | 0.90 |
| private array/random/rand!/Int64 | 553875 ns | 565416 ns | 0.98 |
| private array/random/rand!/Float32 | 555125 ns | 585541 ns | 0.95 |
| private array/random/rand/Int64 | 920917 ns | 794583 ns | 1.16 |
| private array/random/rand/Float32 | 831125 ns | 603208 ns | 1.38 |
| private array/copyto!/gpu_to_gpu | 568709 ns | 661083 ns | 0.86 |
| private array/copyto!/cpu_to_gpu | 702500 ns | 821334 ns | 0.86 |
| private array/copyto!/gpu_to_cpu | 639292 ns | 769291 ns | 0.83 |
| private array/accumulate/1d | 1453000 ns | 1343959 ns | 1.08 |
| private array/accumulate/2d | 1555188 ns | 1393208.5 ns | 1.12 |
| private array/iteration/findall/int | 2338229 ns | 2103125 ns | 1.11 |
| private array/iteration/findall/bool | 2110499.5 ns | 1841854 ns | 1.15 |
| private array/iteration/findfirst/int | 1846416.5 ns | 1694416 ns | 1.09 |
| private array/iteration/findfirst/bool | 1752917 ns | 1667875 ns | 1.05 |
| private array/iteration/scalar | 2750416 ns | 3979292 ns | 0.69 |
| private array/iteration/logical | 3595625 ns | 3188666 ns | 1.13 |
| private array/iteration/findmin/1d | 1936916 ns | 1761458 ns | 1.10 |
| private array/iteration/findmin/2d | 1444541 ns | 1362292 ns | 1.06 |
| private array/reductions/reduce/1d | 1016542 ns | 1043000 ns | 0.97 |
| private array/reductions/reduce/2d | 713250 ns | 663833.5 ns | 1.07 |
| private array/reductions/mapreduce/1d | 1004000 ns | 1059834 ns | 0.95 |
| private array/reductions/mapreduce/2d | 710083 ns | 662417 ns | 1.07 |
| private array/permutedims/4d | 2694750 ns | 2564437 ns | 1.05 |
| private array/permutedims/2d | 1105541.5 ns | 1031750 ns | 1.07 |
| private array/permutedims/3d | 1857249.5 ns | 1593354.5 ns | 1.17 |
| private array/copy | 871584 ns | 591521 ns | 1.47 |
| latency/precompile | 5924859583.5 ns | 5763160083.5 ns | 1.03 |
| latency/ttfp | 6778519146 ns | 6663954270.5 ns | 1.02 |
| latency/import | 1195181250 ns | 1169291583 ns | 1.02 |
| integration/metaldevrt | 771458 ns | 714333 ns | 1.08 |
| integration/byval/slices=1 | 1685500 ns | 1647416 ns | 1.02 |
| integration/byval/slices=3 | 20474041.5 ns | 8868125 ns | 2.31 |
| integration/byval/reference | 1674125 ns | 1566291 ns | 1.07 |
| integration/byval/slices=2 | 2839250 ns | 2727167 ns | 1.04 |
| kernel/indexing | 465792 ns | 457375 ns | 1.02 |
| kernel/indexing_checked | 474625 ns | 456500 ns | 1.04 |
| kernel/launch | 8125 ns | 37291.75 ns | 0.22 |
| metal/synchronization/stream | 14750 ns | 14916 ns | 0.99 |
| metal/synchronization/context | 15250 ns | 15042 ns | 1.01 |
| shared array/construct | 26089.285714285714 ns | 27152.833333333332 ns | 0.96 |
| shared array/broadcast | 463167 ns | 472750 ns | 0.98 |
| shared array/random/randn/Float32 | 915562.5 ns | 791708.5 ns | 1.16 |
| shared array/random/randn!/Float32 | 604333 ns | 659791.5 ns | 0.92 |
| shared array/random/rand!/Int64 | 550916 ns | 552083.5 ns | 1.00 |
| shared array/random/rand!/Float32 | 555125 ns | 595834 ns | 0.93 |
| shared array/random/rand/Int64 | 954000 ns | 782875 ns | 1.22 |
| shared array/random/rand/Float32 | 829917 ns | 578000 ns | 1.44 |
| shared array/copyto!/gpu_to_gpu | 80666 ns | 86667 ns | 0.93 |
| shared array/copyto!/cpu_to_gpu | 82417 ns | 89875 ns | 0.92 |
| shared array/copyto!/gpu_to_cpu | 83000 ns | 78083 ns | 1.06 |
| shared array/accumulate/1d | 1485187.5 ns | 1354292 ns | 1.10 |
| shared array/accumulate/2d | 1541375 ns | 1402292 ns | 1.10 |
| shared array/iteration/findall/int | 2077667 ns | 1845625 ns | 1.13 |
| shared array/iteration/findall/bool | 1800834 ns | 1610291 ns | 1.12 |
| shared array/iteration/findfirst/int | 1530333 ns | 1395270.5 ns | 1.10 |
| shared array/iteration/findfirst/bool | 1467083 ns | 1363083.5 ns | 1.08 |
| shared array/iteration/scalar | 156375 ns | 158584 ns | 0.99 |
| shared array/iteration/logical | 3370375 ns | 2985875 ns | 1.13 |
| shared array/iteration/findmin/1d | 1621958.5 ns | 1464479.5 ns | 1.11 |
| shared array/iteration/findmin/2d | 1470479 ns | 1373417 ns | 1.07 |
| shared array/reductions/reduce/1d | 740041.5 ns | 739562.5 ns | 1.00 |
| shared array/reductions/reduce/2d | 717708 ns | 671771 ns | 1.07 |
| shared array/reductions/mapreduce/1d | 741187.5 ns | 751584 ns | 0.99 |
| shared array/reductions/mapreduce/2d | 732833 ns | 668417 ns | 1.10 |
| shared array/permutedims/4d | 2708250.5 ns | 2553000 ns | 1.06 |
| shared array/permutedims/2d | 1117208 ns | 1035875 ns | 1.08 |
| shared array/permutedims/3d | 1873479.5 ns | 1593333 ns | 1.18 |
| shared array/copy | 217833 ns | 251771 ns | 0.87 |

This comment was automatically generated by workflow using github-action-benchmark.

@maleadt merged commit b949b14 into main on Dec 24, 2024
2 checks passed
@maleadt deleted the tb/bzip2 branch on December 24, 2024, 09:47