Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WinEH] Update CoreCLR EH state numbering #123

Merged
merged 273 commits into from
Dec 30, 2015

Conversation

JosephTremoulet
Copy link
Contributor

Fix the CLR state numbering to generate correct tables for the new funclet
EH representation, and update the lit test to verify them.

Functionally, there are four changes:

  • Arity of cleanuppad parameters needs to use getNumArgOperands() instead
    of getNumOperands() now that cleanuppad has a parent token operand.
  • We assign a catchswitch the state of its first catch, not its unwind
    destination.
  • Don't queue catchpads' users -- this could visit handlers for inner
    tries before handlers for outer tries, and we'll visit everything we
    need to, in correct order, by queuing pads' predecessors. Change the
    initial worklist conditions from isToplevelPadForMSVC to a suitably
    implemented isToplevelPadForCLR accordingly.
  • Adjust getEHPadFromPredecessor to allow CLR state numbering to call it
    and have preds returned regardless of their parentage.

Also change some variable names to improve readability.

lic9 and others added 30 commits December 12, 2015 00:08
…ication-to-shift conversion.

because it broke buildbot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255395 91177308-0d34-0410-b5e6-96231b3b80d8
Many tests are now passing due to eliminateFrameIndex implementation and
the list needs to be re-triaged because it unblocks other failures, and
some previous failures are different. However I'm about to churn it more
by implementing more lowering, so will wait on that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255396 91177308-0d34-0410-b5e6-96231b3b80d8
This branch adds hints for highly biased branches on the PPC architecture. Even
in absence of profiling information, LLVM will mark code reaching unreachable
terminators and other exceptional control flow constructs as highly unlikely to
be reached.

Patch by Tom Jablin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255398 91177308-0d34-0410-b5e6-96231b3b80d8
This change is discussed in D15392 and should allow us to effectively
revert:
http://llvm.org/viewvc/llvm-project?view=revision&revision=255261
if we canonicalize bitcasts ahead of extracts.

It should be safe to convert any pair of bitcasts into a single bitcast, 
however, it was mentioned here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html
that we're not allowed to bitcast from an x86_mmx to some other types, but I'm 
not seeing any failures from that, and we have regression tests in CodeGen/X86
that appear to cover all of those cases. 

Some day we'll get to remove that MMX wart from LLVM IR completely?

Differential Revision: http://reviews.llvm.org/D15468



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255399 91177308-0d34-0410-b5e6-96231b3b80d8
…conversion.

Summary: This patch adds support of conversion (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication can not be converted to LEA + SHL or LEA + LEA. LLVM has already supported this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planing to add another patch to support negative cases after this.

Reviewers: craig.topper, RKSimon

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255415 91177308-0d34-0410-b5e6-96231b3b80d8
We don't need to pass OutStreamer as a parameter to LowerSTACKMAP and
LowerPATCHPOINT. It is a member variable of PPCAsmPrinter, and thus, is already
available. NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255418 91177308-0d34-0410-b5e6-96231b3b80d8
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255422 91177308-0d34-0410-b5e6-96231b3b80d8
The builder complains thusly:
error C2027: use of undefined type 'llvm::raw_ostream'

Try to make it happy by including raw_ostream.h

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255425 91177308-0d34-0410-b5e6-96231b3b80d8
It is X86 specific and will not be properly exercised unless LLVM is
built with the X86 target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255426 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255429 91177308-0d34-0410-b5e6-96231b3b80d8
Cleanup/regenerate some tests for some upcoming patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255432 91177308-0d34-0410-b5e6-96231b3b80d8
…lement(bitcast X))

This change was discussed in D15392. It allows us to remove the fold that was added
in:
http://reviews.llvm.org/r255261

...and it will allow us to generalize this fold:
http://reviews.llvm.org/rL112232

while preserving the order of bitcast + extract that it produces and testing shows
is better handled by the backend.

Note that the existing check for "isVectorTy()" wasn't strong enough in general
and specifically because: x86_mmx. It's not a vector, but it's not vectorizable
either. So here we check VectorType::isValidElementType() directly before 
proceeding with the transform.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255433 91177308-0d34-0410-b5e6-96231b3b80d8
Before the patch, -fprofile-instr-generate compile will fail
if no integrated-as is specified when the file contains
any static functions (the -S output is also invalid).

This is the second try. The fix in this patch is very localized.
Only profile symbol names of profile symbols with internal 
linkage are fixed up while initializer of name syms are not 
changes. This means there is no format change nor version bump.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255434 91177308-0d34-0410-b5e6-96231b3b80d8
…truction if address space != 0.

Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256.  This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view.  On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.

This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls.  If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.

The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86).  NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction.  All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator).  X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.

I call this a "partial fix" because there are still cases that can't be
lowered.  A fatal error is reported in these cases.

Reviewers: arsenm, theraven, compnerd, hfinkel

Subscribers: hfinkel, llvm-commits, alex

Differential Revision: http://reviews.llvm.org/D7241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255441 91177308-0d34-0410-b5e6-96231b3b80d8
This is a very simple implementation of a thread pool using C++11
thread. It accepts any std::function<void()> for asynchronous
execution. Individual task can be synchronize using the returned
future, or the client can block on the full queue completion.

In case LLVM is configured with Threading disabled, it falls back
to sequential execution using std::async with launch:deferred.

This is intended to support parallelism for ThinLTO processing in
linker plugin, but is generic enough for any other uses.

Differential Revision: http://reviews.llvm.org/D15464

From: Mehdi Amini <[email protected]>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255444 91177308-0d34-0410-b5e6-96231b3b80d8
EABI attributes should only be emitted on EABI targets.  This prevents the
emission of the optimization goals EABI attribute on Windows ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255448 91177308-0d34-0410-b5e6-96231b3b80d8
…y ignoring specific instructions.

LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the
register usage for specific VFs. However, it takes into account many
instructions that won't be vectorized, such as induction variables,
GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative
when choosing VF. In this patch, the induction variables that won't be
vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set
so that their register usage won't be considered any more.


Differential revision: http://reviews.llvm.org/D15177




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255454 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255455 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255459 91177308-0d34-0410-b5e6-96231b3b80d8
…y ignoring specific instructions.

(This is the second attempt to check in this patch: REQUIRES: asserts is added
to reg-usage.ll now.)

LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the
register usage for specific VFs. However, it takes into account many
instructions that won't be vectorized, such as induction variables,
GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative
when choosing VF. In this patch, the induction variables that won't be
vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set
so that their register usage won't be considered any more.


Differential revision: http://reviews.llvm.org/D15177




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255460 91177308-0d34-0410-b5e6-96231b3b80d8
The .even directive aligns content to an evan-numbered address.

In at&t syntax .even 
In Microsoft syntax even (without the dot).

Differential Revision: http://reviews.llvm.org/D15413



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255462 91177308-0d34-0410-b5e6-96231b3b80d8
Further investigation on the failures is ongoing.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255463 91177308-0d34-0410-b5e6-96231b3b80d8
adrian-prantl and others added 11 commits December 17, 2015 18:25
The rules for removing trivially dead stores are a lot less complicated than loads. Since we know the later store post dominates the former and the former dominates the later, unless the former has side effects other than the actual store, we can remove it. One slightly surprising thing is that we can freely remove atomic stores, even if the later one isn't atomic. There's no guarantee the atomic one was every visible.

For the moment, we don't handle DSE of ordered atomic stores. We could extend the same chain of reasoning to them, but the catch is we'd then have to model the ordering effect without a store instruction. Since our fences are a stronger than our operation orderings, simple using a fence isn't an obvious win. This arguable calls for a refinement in our fence specification, but that's (much) later work.

Differential Revision: http://reviews.llvm.org/D15352



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255914 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a failure on Windows buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255919 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
[InstCombine] Adding '\n' to debug output. NFC.

Patch by Zhaoshi Zheng <[email protected]>

Reviewers: apazos, majnemer, weimingz

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15403

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255920 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255926 91177308-0d34-0410-b5e6-96231b3b80d8
These functions were deprecated in r97608.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255927 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r255895. The patch breaks internal tests. Reverting until a
fix is ready.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255928 91177308-0d34-0410-b5e6-96231b3b80d8
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.
@JosephTremoulet
Copy link
Contributor Author

I've updated this to match the latest version of D15325, which means expanding the FI to include r255674 aka b46bb54. Again, the final commit (now ad51af9) shows the non-FI part.
@AndyAyersMS PTAL.

@JosephTremoulet
Copy link
Contributor Author

This change and dotnet/llilc#968 will need to go in together; http://dotnet-ci.cloudapp.net/view/dotnet_llilc/job/dotnet_llilc_prtest/1344/ is running a test of the combination.

@JosephTremoulet
Copy link
Contributor Author

http://dotnet-ci.cloudapp.net/view/dotnet_llilc/job/dotnet_llilc_prtest/1344/ is running a test of the combination.

Those tests are failing, so I've been investigating. Part of 8cec2f2 has WinEHPrepare build up a map from invokes to "eh states" stored in the WinEHFuncInfo, which is later consulted by SelectionDAGBuilder when lowering invokes. In LLILC, RewriteStatepointsForGC swoops in between and, for all invokes which are GC safepoints, replaces the original invoke with an invoke of @llvm.gc.statepoint that takes the callee as an argument, which severs the connection with the map in the WinEHFuncInfo. :/

Seems to be a classic case of two different things trying to "run last", but it's not immediately obvious how to improve on that without teaching one directly about the details of the other... maybe the InvokeStateMap should be a ValueMap so that its keys will be rewritten when the safepoint rewriting calls replaceAllUsesWith.

@JosephTremoulet
Copy link
Contributor Author

maybe the InvokeStateMap should be a ValueMap so that its keys will be rewritten when the safepoint rewriting calls replaceAllUsesWith.

Not that simple; the lookup goes through a CallLoweringInfo object that also can't find the original invoke...

@AndyAyersMS
Copy link
Member

Hmm, side tables strike again.

@JosephTremoulet
Copy link
Contributor Author

I can reproduce this on e.g. JIT\Methodical\explicit\coverage\expl_float_1_r\expl_float_1_r with conservative GC. @swaroop-sridhar, is it expected that we'd be trying to lower statepoints with conservative GC?

@swaroop-sridhar
Copy link
Contributor

Even with conservative GC, Statepoints are inserted and lowered for GC-Transitions.
But the insertion happens in the reader -- PlaceSafepoints and RewriteStatepointsforGC phases are not run if DoInsertStatepoints == false. These statepoints without any actual gc-pointers tracked within them, and lowered as usual.

@JosephTremoulet
Copy link
Contributor Author

Ah, ok, thanks. Then the issue is purely that the CallLoweringInfo constructed by lowerStatepoint doesn't link in the CallSite, which lowerCallOperands now demands on invokes in functions with funclets. Need to understand why lowerStatepoint is using that overload to construct the CallLoweringInfo and whether there's a better channel to pass that information...

The CallLoweringInfo passed to lowerInvokable for statepoint calls does
not have the CallSite associated with it, but when lowerInvokable gets
passed a funclet-style EHPadBB it needs to find the InvokeInst so that it
can look up the state number in the InvokeStateMap.  The lowerInvokable
method is a private helper, so add an explicit InvokeInst parameter to it
and update both callers to set it appropriately.
@JosephTremoulet
Copy link
Contributor Author

I did some poking around. The two means of creating CallLoweringInfos (i.e. with and without the CallSite) have existed for as long as CallLoweringInfo has. The EH state map needs to use InvokeInsts as keys now because there are no longer EndPads and so it can't just use the unwind destination like it did previously. The InvokeInst is needed in SelectionDAGBuilder::lowerInvokable, which is a private helper method. It has two callers, one of which is for statepoints and the other of which always passes a CallLoweringInfo with a CallSite. To me, it makes more sense to just add the InvokeInst as a parameter to the helper, to avoid upsetting expectations in the call lowering code about what it means when a CallLoweringInfo has a CallSite attached to it and to avoid expanding CallLoweringInfo to include EH state map info that's only relevant for funclet EH. So, I've pushed a change does just that and would like to merge it. I'd have submitted the same for upstream review, except that it ought to be accompanied by a lit test, and reproducing the error requires a statepoint invoke which unwinds to a funclet EH pad, and the upstream statepoint lowering code fails elsewhere first on that combination -- #106 stubbed out those paths so they don't crash, but I think it makes sense to upstream this change only after the stubs have been replaced with a real implementation (tracked by #105) and that has been upstreamed as well.

@AndyAyersMS PTAL. The only change is the additional commit.

@AndyAyersMS
Copy link
Member

Latest commit LGTM.

JosephTremoulet added a commit that referenced this pull request Dec 30, 2015
[WinEH] Update CoreCLR EH state numbering
@JosephTremoulet JosephTremoulet merged commit 141baab into microsoft:MS Dec 30, 2015
@JosephTremoulet JosephTremoulet deleted the NewRep-LLILC branch December 30, 2015 01:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.