[Feature]: add a batched version of hsa_amd_profiling_convert_tick_to_system_domain. #243

Open
benvanik opened this issue Oct 1, 2024 · 0 comments


benvanik commented Oct 1, 2024

Suggestion Description

I'm capturing timestamps on device with __builtin_readsteadycounter (or extracting them from signals myself) and end up with large buffers of them that I'd like to translate without the API overhead of calling hsa_amd_profiling_convert_tick_to_system_domain on each one in a loop. For such cases it'd be nice to have a hsa_amd_profiling_convert_tick_batch_to_system_domain that accepts a list of ticks and either updates them in place or writes the results to an output buffer.
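To make the request concrete, here is a rough sketch of the shape such a call could take. Everything in it is hypothetical: the `sketch_` types and functions are stand-ins so the snippet compiles without the HSA headers, the per-tick stand-in uses placeholder math, and the batched body is only a fallback loop that shows the calling convention, not the optimization being asked for.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for hsa_agent_t / hsa_status_t so this compiles without the HSA
// headers. All names prefixed with sketch_ are hypothetical.
struct sketch_agent_t { uint64_t handle; };
enum sketch_status_t {
  SKETCH_STATUS_SUCCESS = 0,
  SKETCH_STATUS_ERROR_INVALID_ARGUMENT = 1,
};

// Stand-in for the existing per-tick conversion. The real call translates
// through the runtime's clock calibration; this placeholder adds a constant.
sketch_status_t sketch_convert_tick(sketch_agent_t /*agent*/,
                                    uint64_t agent_tick,
                                    uint64_t* system_tick) {
  if (!system_tick) return SKETCH_STATUS_ERROR_INVALID_ARGUMENT;
  *system_tick = agent_tick + 1000;  // placeholder math, not the real mapping
  return SKETCH_STATUS_SUCCESS;
}

// Hypothetical batched shape: `system_ticks` may alias `agent_ticks`, which
// covers the in-place case, or point at a separate output buffer.
sketch_status_t sketch_convert_tick_batch(sketch_agent_t agent,
                                          size_t count,
                                          const uint64_t* agent_ticks,
                                          uint64_t* system_ticks) {
  if (count != 0 && (!agent_ticks || !system_ticks)) {
    return SKETCH_STATUS_ERROR_INVALID_ARGUMENT;
  }
  for (size_t i = 0; i < count; ++i) {
    // Each input is read by value before its output slot is written, so
    // aliasing the two buffers is safe.
    sketch_status_t status =
        sketch_convert_tick(agent, agent_ticks[i], &system_ticks[i]);
    if (status != SKETCH_STATUS_SUCCESS) return status;
  }
  return SKETCH_STATUS_SUCCESS;
}
```

Allowing the output pointer to alias the input keeps it to a single entry point for both the in-place and the separate-buffer use.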

What I noticed is that GpuAgent::TranslateTime takes a lock, does some looping math to see if synchronization is required, and potentially synchronizes. In a batched mode that could be done once, and the lock need not be held for the entire duration of the translation (t0/t1 can be reused). Batching trades off some accuracy, since the skew can change over the course of a batch, but translating all the ticks consistently is better behavior than an outer loop: today the timestamps can change base in the middle of translation and produce inconsistent results, which messes up reporting. The user of such an API could choose the batch/flush frequency to bound the drift and rebase when it makes sense (in between top-level invocations/frames/etc., where there are natural points to rebase).
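A minimal sketch of the lock-once idea, assuming the translation is a linear map through two paired clock samples (the t0/t1 mentioned above). The `Calibration` and `Translator` types are hypothetical and are not the runtime's actual data layout or code; the resynchronization check is left as a comment.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical calibration: two paired (GPU tick, system tick) samples.
// Assumes gpu_t1 != gpu_t0.
struct Calibration {
  uint64_t gpu_t0, sys_t0;
  uint64_t gpu_t1, sys_t1;
};

class Translator {
 public:
  explicit Translator(Calibration c) : cal_(c) {}

  // Called when the clocks are resampled; may race with TranslateBatch.
  void Update(Calibration c) {
    std::lock_guard<std::mutex> guard(lock_);
    cal_ = c;
  }

  // Translates `count` ticks against one calibration snapshot. `out` may
  // alias `ticks` for in-place conversion.
  void TranslateBatch(const uint64_t* ticks, uint64_t* out,
                      size_t count) const {
    Calibration c;
    {
      // The lock is held only long enough to copy the calibration. A real
      // implementation would also decide here, once, whether to resync.
      std::lock_guard<std::mutex> guard(lock_);
      c = cal_;
    }
    const double slope = static_cast<double>(c.sys_t1 - c.sys_t0) /
                         static_cast<double>(c.gpu_t1 - c.gpu_t0);
    for (size_t i = 0; i < count; ++i) {
      // Signed delta so ticks captured before gpu_t0 still map correctly.
      const int64_t delta = static_cast<int64_t>(ticks[i] - c.gpu_t0);
      const int64_t scaled =
          static_cast<int64_t>(static_cast<double>(delta) * slope);
      out[i] = static_cast<uint64_t>(static_cast<int64_t>(c.sys_t0) + scaled);
    }
  }

 private:
  mutable std::mutex lock_;
  Calibration cal_;
};
```

Because every tick in the batch goes through the same snapshot, a concurrent `Update` cannot change the base halfway through. The cost is that the whole batch shares whatever skew the snapshot had, which is the drift the caller would bound by choosing how often to flush.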

Operating System

No response

GPU

No response

ROCm Component

No response
