Onnxruntime and Pytorch for Multi AMD GPUs Runs #1515
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Thank you for these useful changes. Can you please see the review comments?
@@ -43,6 +46,12 @@ def get_args():
help="audit config for LoadGen settings during compliance runs")
parser.add_argument("--max_examples", type=int,
help="Maximum number of examples to consider (not limited by default)")
parser.add_argument("--batch_size", type=int, default=1,
help="Check your input patch size")
batch :)
language/bert/pytorch_SUT.py
Outdated
@@ -15,6 +15,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# Modified by SCC23 UCSD Zixian Wang, Nov 11, 2023
There are no modifications in this file, right?
I modified a few places in the code so that it runs on multiple GPU devices.
I think for some reason those changes are not part of this PR. Can you please check?
Sorry, I think you are correct. This is not my modified version of pytorch_SUT.py. Should I push the newer one and then open a new pull request?
You can just push to this branch and the PR will be updated automatically :)
I pushed the modified pytorch_SUT.py. Thank you.
#*.Server.target_qps = 1.0
These changes are not required, right?
Not required.
@@ -43,6 +46,12 @@ def get_args():
help="audit config for LoadGen settings during compliance runs")
parser.add_argument("--max_examples", type=int,
help="Maximum number of examples to consider (not limited by default)")
parser.add_argument("--batch_size", type=int, default=1,
Thank you for this change!
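For readers following the `--batch_size` discussion, here is a minimal sketch of how such an option might be parsed and then used to group samples into batches. It is not the code from this PR: the trimmed-down `get_args`, the `chunk` helper, and the help string for `--batch_size` are hypothetical and only illustrate the idea.

```python
import argparse


def get_args(argv=None):
    # Hypothetical, trimmed-down parser: only the two options visible in the diff.
    parser = argparse.ArgumentParser()
    parser.add_argument("--max_examples", type=int,
                        help="Maximum number of examples to consider (not limited by default)")
    # Assumed help text; the PR's own wording is under review above.
    parser.add_argument("--batch_size", type=int, default=1,
                        help="Number of samples per inference batch")
    return parser.parse_args(argv)


def chunk(samples, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


if __name__ == "__main__":
    args = get_args(["--batch_size", "4"])
    print(chunk(list(range(10)), args.batch_size))
```

With the default of 1, every sample forms its own batch, so behavior without the flag would stay unchanged.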
@@ -0,0 +1,137 @@
# coding=utf-8
.ipynb files need not be uploaded, right?
I am not sure why the notebook checkpoint files were uploaded. I think they are not necessary.
print("Constructing SUT...")
print("Constructing SUT...")
print("Constructing SUT...")
print statement repeated...
recheck
I modified pytorch_SUT.py, onnxruntime_SUT.py, squad_QSL.py, and run.py for BERT inference. They can run on multiple AMD GPUs as long as the right environment is set up.
Credit to AMD mentor Miro Hodak for guiding and directing me in optimizing the code, and to Khai Vu of the Student Cluster Competition UCSD team at SC23 for helping me set up the Onnxruntime environment for AMD ROCm.
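The modified SUT files are not shown in this conversation, so the following is only a sketch of one common way to spread inference work over several GPUs: assigning batches to device indices in round-robin order. The function name `assign_round_robin` is hypothetical, and how each index maps to an actual PyTorch device or Onnxruntime session is left as an assumption for the reader's environment.

```python
from collections import defaultdict


def assign_round_robin(batches, num_devices):
    """Map each device index to the list of batches it should process.

    Batches are handed out in order, cycling through device indices
    0 .. num_devices - 1. This is an illustrative scheme, not the PR's code.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be >= 1")
    plan = defaultdict(list)
    for i, batch in enumerate(batches):
        plan[i % num_devices].append(batch)
    return dict(plan)


if __name__ == "__main__":
    batches = [[0, 1], [2, 3], [4, 5]]
    # Each key would correspond to one GPU; each worker then runs its own batches.
    print(assign_round_robin(batches, num_devices=2))
```

Round-robin keeps the per-device load within one batch of each other when batches cost roughly the same, which is why it is a reasonable first choice before trying anything queue-based.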