inline asm

This tutorial shows how to use inline MFMA GCN assembly in a HIP kernel.

Introduction:

MI-100 supports the MFMA (Matrix Fused Multiply Add) instruction set. This example shows how to call and compile the following in a HIP source kernel:

  1. mfma fp32
  2. mfma fp16
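As a point of reference for what "matrix fused multiply add" means arithmetically, here is a minimal host-side sketch of D = A * B + C in plain C++. It is not the inline assembly itself and does not model the hardware's lane layout, register usage, or rounding behavior. The name `mfma_reference` and the row-major layout are assumptions made for illustration only. A helper like this could be used to check a kernel's output on the host.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not part of HIP or this example's kernel):
// computes D = A * B + C, where A is M x K, B is K x N, and C, D are M x N.
// All matrices are assumed to be stored row-major in flat vectors.
std::vector<float> mfma_reference(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  const std::vector<float>& c,
                                  std::size_t m, std::size_t n, std::size_t k) {
    std::vector<float> d(c);  // start from the accumulator input C
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = d[i * n + j];
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];  // multiply-accumulate
            }
            d[i * n + j] = acc;
        }
    }
    return d;
}
```

Because the accumulation order and rounding on the GPU may differ from this loop, comparing against device results would normally use a small tolerance rather than exact equality.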

For more insight, please read the following blog posts by Ben Sander: "The Art of AMDGCN Assembly: How to Bend the Machine to Your Will" and "AMD GCN Assembly: Cross-Lane Operations".

For more information, see the AMD GCN3 ISA Architecture Manual and the User Guide for AMDGPU Back-end.

Requirement: