
Commit

[TRT] Disable TRT MatMul Plugin Cases, plugin can only be called via extra flag 'ffn'
doxutx committed Mar 20, 2024
1 parent 6be1b0d commit 4d3dc00
Showing 1 changed file with 3 additions and 0 deletions.
@@ -134,6 +134,8 @@ ILayer* MatMulTRTPluginLayerBuilder::AddToNetwork(INetworkDefinition* network) n
// Calling Plugin CUBLAS GEMM may hurt performance, so we impose a very strict prerequisite.
// Ideally, Batched-GEMM plugin should only be called by Models with Transformer Kernels.
// Update: Disable custom plugin for case 2 above for Myelin optimization to speed up the network.
// Update: Disable all plugin cases below, plugin should only be called via extra flag "ffn"
/*
if (opA == MatrixOperation::kNONE && opB == MatrixOperation::kNONE &&
input_tensors.size() == 2 &&
input_tensors[0]->getDimensions().nbDims == input_tensors[1]->getDimensions().nbDims) {
@@ -151,6 +153,7 @@ ILayer* MatMulTRTPluginLayerBuilder::AddToNetwork(INetworkDefinition* network) n
return TensorRTPluginLayerBuilder::AddToNetwork(network);
}
}
*/
IMatrixMultiplyLayer* layer = network->addMatrixMultiply(*matrix_a, opA, *matrix_b, opB);

if (layer != nullptr) {
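For context, here is a minimal sketch of the gating this commit leaves in place; it is not the actual TNN source, and the function name AddMatMulGated and the use_ffn_plugin parameter are hypothetical stand-ins for however the extra "ffn" flag is surfaced. The idea is that the batched CUBLAS GEMM plugin path is now reached only when the caller opts in via that flag, and every other case falls through to TensorRT's native matrix-multiply layer so Myelin can optimize the surrounding network.

#include <NvInfer.h>

using namespace nvinfer1;

// Hypothetical helper mirroring the behavior after this commit: the plugin
// path is taken only when the "ffn" flag is set; otherwise use native matmul.
static ILayer* AddMatMulGated(INetworkDefinition* network,
                              ITensor* matrix_a, ITensor* matrix_b,
                              MatrixOperation opA, MatrixOperation opB,
                              bool use_ffn_plugin) noexcept {
    if (!use_ffn_plugin) {
        // Default path (this commit): native TensorRT matrix multiply,
        // which lets Myelin fuse/optimize the surrounding network.
        return network->addMatrixMultiply(*matrix_a, opA, *matrix_b, opB);
    }
    // Opt-in path ("ffn" flag set): in the real builder this is where
    // TensorRTPluginLayerBuilder::AddToNetwork(network) is called to use the
    // batched CUBLAS GEMM plugin intended for Transformer FFN layers.
    return nullptr;  // placeholder; the plugin builder is outside this sketch
}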
