Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compute dominator tree using semi-NCA algorithm #9603

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

amartosch
Copy link
Contributor

@amartosch amartosch commented Nov 13, 2024

Compute dominator tree using semi-NCA algorithm, described in Linear-Time Algorithms for Dominators and Related Problems. Loukas Georgiadis, Princeton University, November 2005.
The algorithm has quadratic worst-case complexity, but usually is much faster. The speedup is minor, but can be noticed on large functions, i.e. in this file.
Also add a differential fuzzing job for the new algorithm, with the old one as baseline. This addition is dubious, because it couldn't find anything overnight.
Described in #9481.

@amartosch amartosch requested review from a team as code owners November 13, 2024 20:19
@amartosch amartosch requested review from alexcrichton and cfallin and removed request for a team November 13, 2024 20:19
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure labels Nov 13, 2024
Copy link

Subscribe to Label Action

cc @fitzgen

This issue or pull request has been labeled: "cranelift", "fuzzing"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@alexcrichton alexcrichton removed their request for review November 14, 2024 05:29
@@ -32,7 +423,7 @@ struct DomNode {
}

/// The dominator tree for a single function.
pub struct DominatorTree {
pub struct SimpleDominatorTree {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to a separate file to make the code cleaner. At the moment it looks messy with 2 separate algorithms in the same file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to a separate file

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for this implementation! Some thoughts mostly on comments below but nothing major; given that you've done a fuzzing-based verification of the new algorithm against the old, and seen some compile-time speedups, I'm otherwise confident in it.

label: u32,
semi: u32,
idom: u32,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put doc-comments on the fields above? label, semi and idom don't tell me much about what's going on.


/// Returns pre_number for the new node
fn push(&mut self, ancestor: u32, block: Block) -> u32 {
// Push the virtual root if not present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a little cleaner if we pushed the virtual root at construction time (and updated clear to truncate to length 1 instead), I think?

}

/// Traversal event to compute both preorder spanning tree
/// and postorder block list. Don't use `Dfs` from traversals.rs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "don't use..." comment seems out of place here: did you mean to give a reason why we don't use it here? Or is it a reminder (for what context?) or...?

idom: PackedOption<Block>,
/// 0 for unreachable blocks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer full sentences in doc-comments here (or at least complete thoughts) -- maybe something like "Preorder traversal number, or 0 for unreachable blocks." ?

}

/// The dominator tree for a single function.
/// The dominator tree for a single function,
/// computed using Semi-NCA algorithm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slight nitpick, but end comments with .?


valid: bool,
}

/// Methods for querying the dominator tree.
/// Query methods
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we changed the doc-comment from Methods for querying the dominator tree. to Query methods (saying the same thing but missing detail, and no period)?

let pre_number = self.stree.push(parent, block);
node.pre_number = pre_number;

// Use the same traversal heuristics as in traversals.rs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than refer to another source file here (which may change or be removed in the future), can we say what the heuristics are, briefly?

self.dfs_worklist.extend(
func.block_successors(block)
.rev()
.filter(|successor| self.nodes[*successor].pre_number == 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 0 sentinel keeps occurring and I think it'd be clearer if we had a name for it -- could you define const NOT_VISITED: u32 = 0; or similar in this module?

self.nodes[block] = DomNode {
idom: self.compute_idom(block, cfg).into(),
rpo_number: (rpo_idx as u32 + 3) * STRIDE,
/// Eval-compress
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name ("eval-compress") doesn't mean anything to the reader -- could you give a more descriptive doc-comment?

@@ -300,14 +300,14 @@ pub fn verify_context<'a, FOI: Into<FlagsOrIsa<'a>>>(
struct Verifier<'a> {
func: &'a Function,
expected_cfg: ControlFlowGraph,
expected_domtree: DominatorTree,
expected_domtree: SimpleDominatorTree,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we have to use the old ("simple") implementation here? Or is the idea that when doing verification, we want to fall back to the more trusted algorithm? (Could we document this reasoning in a comment if so?)

Actually, thinking aloud a bit: I think I'm OK with moving everything over to the new algorithm, including the verifier; it should be enough to have the fuzz-target that checks the algorithm itself against the old algorithm. (If we don't trust that, then we shouldn't use the new algorithm anywhere!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants