Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
i#6751 trace_entry_t: pretty printer #6771
base: master
Are you sure you want to change the base?
i#6751 trace_entry_t: pretty printer #6771
Changes from 6 commits
b48eed0
20db00b
dd80234
b10b1a2
00eefa0
8788d3b
793f6a4
c7881a6
b3a3d38
2af0d67
8b1ef16
73f8295
58c6032
cdb3eaa
612a8ea
81e6b24
81ff649
c940f18
5390027
7baee4e
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That page should be updated to point at this new tool, but please leave the
od
commands: as noted below some on-disk-format debugging needs to be at the real file level and not the trace_entry_t-stream level.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a unit test similar to clients/drcachesim/tests/view_test.cpp.
Also some test that verifies the output for some checked in raw trace (e.g. clients/drcachesim/tests/drmemtrace.threadsig.aarch64.raw/raw) like clients/drcachesim/tests/offline-view.templatex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's in the TODO (see PR description).
However, before working on that, I'd like to come to a consensus on the pretty printed output of internal_record_view (<--- new tool name), as that is what I will test against.
Here's what it prints now (a short "representative" snippet, note that I OMITTED a few things because I wasn't sure I could share them here... the actual output has those values):
Is it acceptable? Do we need "--------" dividers? A header at the top? Do we want to number each line? Tabs?
I'm fine with how it looks now, just wondering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's worth spending any more time on this (esp for things like dividers and ordinals that the main view tool already has): this PR is just adding an internal tool that honestly will likely be rarely used, as evidenced by it not being needed in the past. Even now there likely is no use case where you wouldn't also go run the final view tool afterward, since that's the real view seen by tools. E.g., for your own ISA transform and filtering, you can't just go by the output of this record view tool: you have to make sure it works at the memref_t layer, so you have to use the real view tool plus invariant checks anyway. This internal tool is only a limited-use debugging step pretty much always followed up by using the "real" pretty printer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking more about when I've looked at the on-disk view with
od
: the most recent was debugging record_filter zip chunking support. There I had to look just before and after the 10M-instr chunk boundaries. Runningod
on one particular zipfile subfile at a time made this very easy. Looking at the output above with no ordinals, and having to pass in the whole trace file/dir at once: how would I focus on chunk boundaries? I might end up going back tood
! I guess-skip_instructions
is technically implemented for record_reader_t? But it's slow: the fast chunk skips are not implemented. Plus if the record_filter being debugged got the chunk boundary wrong I would have to go search for the chunk ord marker; if that's missing or wrong: would have to fall back tood
.Not sure there's an action item here: just an observation that this new tool is not enough to debug all on-disk issues as it abstracts away the file splits and I'm not sure it would have been useful the last time I did that. Adding ordinals would help -- but I don't know it's worth the effort. I would probably go run the regular view tool skipped to the chunk transition and take the closest timestamps and search for that in this tool's output or sthg. But I would again want to be at the real file layer instead of the trace_entry_t stream layer's abstracting away of the file divisions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since for trace version the enum value matters, let's add the corresponding enum value too to these strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in, for example: "trace_entry_version_no_kernel_pc" -> "trace_entry_version_no_kernel_pc_2" ? (2 is the enum value)
Although I'm not quite understanding why the enum value matters in the string representation of TRACE_ENTRY_VERSION_ enum. They have different names after all.
Unless you're referring to trace_version_names[0] (and [1]), which are "unknown"? If so, you mean I should just differentiate those 2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm considering a use case where the user wants to know whether their trace has a version more recent than some other version (some of our tools want a minimum trace version). Having the enum value would be easier, but I guess the user can always look up the enum value themselves using just the enum string. So it's fine either way I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we print the enum value along side the string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I didn't understand that the enum value had a meaning (larger == more recent trace version?).
I like @xdje42 suggestion. Now printing it when we print the trace_entry_t record (but not including it in
trace_version_names[]
).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we want the strings at clients/drcachesim/tools/view.cpp to be replaced by these? Maybe. Add a TODO at view.cpp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would change all the output recorded in the docs here and in the sample traces repo and in internal docs so would not be taken lightly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reverse might be an easier thing to do. See all the comments about sharing code: eliminate this array and use the existing public pretty-printer's marker printing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#6771 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified this array to be consistent with view.cpp (which, now, view.cpp also uses).
I think we still want a simple string representation of a marker type, we don't want to get rid of this array.
What view.cpp does is merging together the string representation of a marker type + a marker value.
Now,
trace_marker_names[]
contains the string representation of marker types only, and it's used in the newtrace_marker_type_value_as_string()
, which returns a string representation of marker type + value (and it's, for the most part, in common between view.cpp and record_view.cpp).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing this with a switch-case may be more readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would make it even larger though, and we'd be basically replicating utility functions like: type_is_intr() or type_has_address().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean there'd be more lines of code? I'd still suggest that a switch fits the bill better. It makes it clear that we're only doing equality checks. Also: https://stackoverflow.com/questions/427760/when-to-use-if-else-if-else-over-switch-statements-and-vice-versa.
Can you clarify this concern? I don't understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it would be larger (not my main concern though).
With "replicating utility functions like type_is_instr()" I mean that if we move to a switch we can't use type_is_instr() or type_has_address() anymore, and a case like:
Becomes:
Listing all the TRACE_TYPE_ cases in the switch for which type_is_instr() returns
true
is basically replicating what type_is_instr() does, which also means that if in the future we add a new TRACE_TYPE_ that is supposed to represent an instr, then we'd have (at least) two places to update (the switch in record_view.cpp and type_is_instr() in trace_entry.h) instead of one.I do agree that a switch would look more readable, but I think the cons outweight the pros in this case.
Unless I'm missing something?
Perhaps you meant replacing with a switch only the simple cases like:
And not the whole
if-else-if
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try to share more code with clients/drcachesim/tools/view.cpp where possible; e.g. see other comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sharing of TRACE_TYPE_ pretty printing between view.cpp and record_view.cpp we need to come to a decision for what is the right way to print TRACE_TYPE_ enum values.
Right now there are two string representations: one hardcoded in view.cpp and one in
trace_type_names[]
(used by other tools).#6771 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clients/drcachesim/tools/view.cpp already has an exhaustive switch-case. It can be shared with this tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
Although not all of it can be shared.
view.cpp does some delayed printing and extra processing, which would defeat the purpose of "record_view": we want one line per trace_entry_t to show the exact sequence of on-disk records, we don't want to do the extra processing of view.cpp, it should be a 1 to 1 "mirror" of the trace file, just human readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now sharing (between view and record_view) the printing of all TRACE_MARKER_TYPE_ cases that do not require extra processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not how encodings should be printed for AArch{32,64}.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clients/drcachesim/tools/view.cpp disambiguates various kinds of instrs. We should share that logic with this tool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mhm, view.cpp is actually just renaming trace_type_names[], which is weird, because every other tool uses trace_type_names[], but view.cpp does not.
Should we switch this around and have view.cpp use trace_type_names[]?
Or change trace_type_names[] with what view.cpp prints?
Or does this break too much documentation and/or other assumptions I'm not aware of?
(If we decide to go this way, the aforementioned changes should go in a separate PR IMHO).