Onnxruntime and Pytorch for Multi AMD GPUs Runs #1515

@zixianwang2022 zixianwang2022 commented Nov 12, 2023

I modified pytorch_SUT.py, onnxruntime_SUT.py, squad_QSL.py, and run.py for BERT inference. With the right environment set up, they can run on multiple AMD GPUs.

Credit to AMD mentor Miro Hodak for guiding me in optimizing the code, and to Khai Vu of the UCSD Student Cluster Competition team at SC23 for helping me set up the ONNX Runtime environment for AMD ROCm.
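
For readers unfamiliar with the idea, here is a rough sketch of one common way to spread inference batches over several GPUs: chunk the sample indices into batches of `batch_size`, then hand the batches to device IDs in round-robin order. It is not taken from this PR. The helper names `make_batches`, `assign_round_robin`, and `rocm_providers` are hypothetical, and the `ROCMExecutionProvider` name with its `device_id` option assumes an ONNX Runtime build that ships the ROCm execution provider.

```python
from typing import Dict, List, Sequence


def make_batches(indices: Sequence[int], batch_size: int) -> List[List[int]]:
    """Split sample indices into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    return [list(indices[i:i + batch_size])
            for i in range(0, len(indices), batch_size)]


def assign_round_robin(batches: List[List[int]],
                       num_devices: int) -> Dict[int, List[List[int]]]:
    """Map each device ID to the batches it should run, in round-robin order."""
    if num_devices < 1:
        raise ValueError("num_devices must be >= 1")
    plan: Dict[int, List[List[int]]] = {d: [] for d in range(num_devices)}
    for n, batch in enumerate(batches):
        plan[n % num_devices].append(batch)
    return plan


def rocm_providers(device_id: int) -> list:
    """Provider list for one ONNX Runtime session pinned to a single GPU.

    Assumption: the installed ONNX Runtime build includes ROCMExecutionProvider.
    The list would be passed as
    onnxruntime.InferenceSession(model_path, providers=rocm_providers(i)).
    """
    return [("ROCMExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider"]  # CPU fallback if ROCm is unavailable


if __name__ == "__main__":
    plan = assign_round_robin(make_batches(range(10), batch_size=4),
                              num_devices=2)
    for device, work in plan.items():
        print(device, work)
```

Creating one session (or one model replica) per device and feeding it only its own batches keeps the devices independent, at the cost of loading the model once per GPU.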

@zixianwang2022 zixianwang2022 requested a review from a team as a code owner November 12, 2023 03:43

github-actions bot commented Nov 12, 2023

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

Contributor

@arjunsuresh arjunsuresh left a comment

Thank you for these useful changes. Can you please see the review comments?

@@ -43,6 +46,12 @@ def get_args():
help="audit config for LoadGen settings during compliance runs")
parser.add_argument("--max_examples", type=int,
help="Maximum number of examples to consider (not limited by default)")
parser.add_argument("--batch_size", type=int, default=1,
help="Check your input patch size")
Contributor

batch :)
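
With the typo fixed, the option might read as follows. This is a minimal sketch: `build_parser` is a hypothetical stand-in for `get_args()`, and the new help wording is only a suggestion, not the text used in the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-in for the argument setup in get_args()."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--max_examples", type=int,
                        help="Maximum number of examples to consider (not limited by default)")
    # Suggested wording: says what the value controls instead of "patch size".
    parser.add_argument("--batch_size", type=int, default=1,
                        help="Number of samples per inference batch")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--batch_size", "8"])
    print(args.batch_size)
```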

@@ -15,6 +15,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# Modified by SCC23 UCSD Zixian Wang, Nov 11, 2023
Contributor

No modification in this file, right?

Author

I modified several places in the code so that it runs on multiple GPU devices.

Contributor

I think for some reason those changes are not part of this PR. Can you please check?

Author

Sorry, I think you are correct. This is not my modified version of pytorch_SUT.py. Should I push the newer one and then open a new pull request?

Contributor

You can just push to this branch and the PR will be updated automatically :)

Author

I pushed the modified pytorch_SUT.py. Thank you.

#*.Server.target_qps = 1.0
Contributor

These changes are not required, right?

Author

Not required.

@@ -43,6 +46,12 @@ def get_args():
help="audit config for LoadGen settings during compliance runs")
parser.add_argument("--max_examples", type=int,
help="Maximum number of examples to consider (not limited by default)")
parser.add_argument("--batch_size", type=int, default=1,
Contributor

Thank you for this change!

@@ -0,0 +1,137 @@
# coding=utf-8
Contributor

.ipynb files need not be uploaded, right?

Author

I am not sure why the notebook checkpoint files were uploaded. I do not think they are necessary.


print("Constructing SUT...")
print("Constructing SUT...")
print("Constructing SUT...")
Contributor

print statement repeated...

@arjunsuresh
Contributor

recheck

@arjunsuresh
Contributor

recheck

@zixianwang2022 zixianwang2022 closed this by deleting the head repository Nov 16, 2024