Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#6934: Initial commit for Memory trace synthesis framework from Instruction Replay, useful for kernel memtracing and beyond #6931

Open
wants to merge 56 commits into
base: master
Choose a base branch
from

Conversation

iansseijelly
Copy link

@iansseijelly iansseijelly commented Aug 23, 2024

This PR adds a Mirage subproject to drcachesim, providing a translation from x86 to an ISA-agnostic, RISC-style intermediate representation. The mirage subproject contains the following:

  1. dr_mir_api. This file in the root directory provides the APIs for dr tools to access the mirage translator and replayer.
  2. ir. This directory contains the definition of the IR, including opcodes, operands, instructions, and linked list of mir instructions.
  3. frontend. This directory contains the translation context and generation routine from dr_ir to mirage IR.
  4. backend. This directory contains the backend for interpreting and replaying the mirage IR. Unimplemented for now.
  5. common. This directory contains the commonly used data structure definitions like bitmaps and linked list.

Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial observations. Have not yet really looked at the code. I assume this will be split into multiple PR's for actual merging.

@@ -168,6 +168,11 @@ add_exported_library(drmemtrace_func_view STATIC tools/func_view.cpp)
add_exported_library(drmemtrace_invariant_checker STATIC tools/invariant_checker.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is meant to get feel for all the pieces of this prototype rather than being sent for merging.

To help future PR's for actual merging: note that if this PR were sent as-is for merging our first reaction would be that it's too big: the unified patch
https://patch-diff.githubusercontent.com/raw/DynamoRIO/dynamorio/pull/6931.patch is over 6K lines while https://dynamorio.org/page_code_reviews.html#autotoc_md114 suggests "Review diffs larger than about 1500 lines should be avoided." Generally large changes should be split into smaller pieces and each of those sent separately. If possible during development PR's should be sent early; sometimes an end-to-end stage has to be reached before designs can be settled, in which case the code has to kept in pieces or split later for multiple reviews.

For merging, is there a logical, natural way to split up the code into pieces? If some pieces need others, their separate PR's won't fully run and won't be able to have end-to-end tests, but they could have unit tests or possibly have no tests in the same PR with a message about tests in later PR's that fill in other pieces. For starters, I see an access_region and a reuse_pattern tool: those should each be their own PR's.

@@ -168,6 +168,11 @@ add_exported_library(drmemtrace_func_view STATIC tools/func_view.cpp)
add_exported_library(drmemtrace_invariant_checker STATIC tools/invariant_checker.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see https://dynamorio.org/page_code_reviews.html#sec_commit_messages on conventions on PR description title and body.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So should I open a new issue to acquire an issue number?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as I don't think one exists yet.

@@ -0,0 +1,242 @@
#include "gen_opnd_api.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more descriptive name than "fe/": I don't know what "fe" means. "Front end"? "front_end/" would be better.

@@ -0,0 +1,40 @@
cmake_minimum_required(VERSION 3.7)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every file should have a header with copyright and license information.

@@ -168,6 +168,11 @@ add_exported_library(drmemtrace_func_view STATIC tools/func_view.cpp)
add_exported_library(drmemtrace_invariant_checker STATIC tools/invariant_checker.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a lot easier to review this prototype code with a good overview and description. Please edit the PR description to describe the components being added here and what each one does.

@iansseijelly iansseijelly changed the title Mirage: Initial commit for Memory trace synthesis framework from Instruction Replay, useful for kernel memtracing and beyond i#6934: Initial commit for Memory trace synthesis framework from Instruction Replay, useful for kernel memtracing and beyond Aug 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants