RFC: New EH representation for MSVC compatibility

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
38 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
On 2015-05-18 18:11:29 -0400, Reid Kleckner said:

>
> - Win32 (x86) frame-based SEH
>
> But for __C_specific_handler, I see a potential issue versus x86-64, in
> that RtlUnwind can't restore non-volatile registers, so when registers
> are garbage when control is transferred to the landing pad.  When I
> read the Itanium ABI documentation, it says that landing pads do get
> non-volatile registers restored, so I guess that's probably the working
> assumption of LLVM.
>
> That's pretty frustrating, given that the xdata unwinder already knows
> where the non-volatile registers are saved. Anyway, I think it can be
> overcome in the backend with the right register allocation constraints.

I'm referring to 32-bit code here, i.e. the FS:[0] SEH frames, if I
wasn't clear. There's no xdata on 32-bit.

>
> __C_specific_handler's registration frame saves down EBP, but no other
> registers, even ESP. If we use dynamic alloca or frame pointer
> omission, we are dead in the water, right?
>
> Are you sure the unwinder doesn't restore RSP? Anyway, the address of a
> dynamic alloca can easily be spilled to the stack and reloaded.

Should be s/__C_specific_handler/__except_handler3/

I think ESP isn't reliable because there could be intermediate
functions, in between where RaiseException got called and the target
__except block, that do not install any SEH FS:[0] frames at all. Those
intermediate functions might not need to run any SEH clean-up code
themselves, but they could muck with registers in any way they like,
expecting to restore them in their epilogues. But their epilogues don't
get executed (or simulated), unlike in Win64.

>
> I think LLVM has to know about the table format and landingpad PC
> values, because that's its business. The RTTI data, though, is
> completely between the frontend and the EH personality. I could imagine
> a personality that uses an Itanium LSDA, but the RTTI pointers are
> really pointers to functions that get called during phase 1 to
> implement SEH filters. The new representation will actually allow you
> to pass more data here to support passing in "adjectives" as required
> for MSVC, but LLVM will have to know where to put it in the table and
> there's no way to avoid that.

Perfect, I was wondering if you're re-designing EH in LLVM, would it
support that.

Anyway, thank you, Reid, for patiently explaining this stuff to me. I
think SEH's pretty decent and reasonable, but, __CxxFrameHandler3 is
well, horrible...

Steve



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Joseph Tremoulet
In reply to this post by Reid Kleckner-2

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Monday, May 18, 2015 5:36 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 12:03 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

Thanks for sending this out.  We're looking forward to seeing this come about, since we need funclet separation for LLILC as well (and I have cycles to spend on it, if that would be helpful).

 

Some questions about the new proposal:

 

- Do the new forms of resume have any implied read/write side-effects, or do they work just like a branch?  In particular, I'm wondering what prevents reordering a call across a resume.  Is this just something that code motion optimizations are expected to check for explicitly to avoid introducing UB per the "Executing such an invoke [or call] that does not transitively unwind to the correct catchend block has undefined behavior" rule?

 

Yes, crossing a resume from a catchblock ends the lifetime of the exception object, so I'd say that's a "writes escaped memory" constraint. That said, a resume after a cleanupblock doesn't, but I'm not sure it's worth having this kind of fine-grained analysis. I'm OK teaching SimplifyCFG to combine cleanupblocks and leaving it at that.

  

- Does LLVM already have other examples of terminators that are glued to the top of their basic blocks, or will these be the first?  I ask because it's a somewhat nonstandard thing (a block in the CFG that can't have instructions added to it) that any code placement algorithms (PRE, PGO probe insertion, Phi elimination, RA spill/copy placement, etc.) may need to be adjusted for.  The adjustments aren't terrible (conceptually it's no worse than having unsplittable edges from each of the block's preds to each of its succs), but it's something to be aware of.

 

No, LLVM doesn't have anything like this yet. It does have unsplittable critical edges, which can come from indirectbr and the unwind edge of an invoke. I don't think it'll be too hard to teach transforms how to deal with one more, but maybe that's unrealistic youthful optimism. :)

 

- Since this will require auditing any code with special processing of resume instructions to make sure it handles the new resume forms correctly, I wonder if it might be helpful to give resume (or the new forms of it) a different name, since then it would be immediately clear which code has/hasn't been updated to the new model.

 

There aren't that many references to ResumeInst across LLVM, so I'm not too scared. I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas.

 

- Is the idea that a resume (of the sort that resumes normal execution) ends only one catch/cleanup, or that it can end any number of them?  Did you consider having it end a single one, and giving it a source that references (in a non-flow-edge-inducing way) the related catchend?  If you did that, then:

+ The code to find a funclet region could terminate with confidence when it reaches this sort of resume, and

+ Resumes which exit different catches would have different sources and thus couldn't be merged, reducing the need to undo tail-merging with code duplication in EH preparation (by blocking the tail-merging in the first place)

 

We already have something like this for cleanupblocks because the resume target and unwind label of the cleanupblock must match. It isn't as strong as having a reference to the catchblock itself, because tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us.

 

- What is the plan for cleanup/__finally code that may be executed on either normal paths or EH paths?  One could imagine a number of options here:

+ require the IR producer to duplicate code for EH vs non-EH paths

+ duplicate code for EH vs non-EH paths during EH preparation

+ use resume to exit these even on the non-EH paths; code doesn't need to be duplicated (but could and often would be as an optimization for hot/non-EH paths), and normal paths could call the funclet at the end of the day

and it isn't clear to me which you're suggesting.  Requiring duplication can worst-case quadratically expand the code (in that if you have n levels of cleanup-inside-cleanup-inside-cleanup-…, and each cleanup has k code bytes outside the next-inner cleanup, after duplication you'll have k*n + k*(n-1) + … or O(k*n^2) bytes total [compared to k*n before duplication]), which I'd think could potentially be a problem in pathological inputs.

 

I want to have separate normal and exceptional codepaths, but at -O0 all the cleanup work should be bundled up in a function that gets called from both those paths.

 

Today, for C++ destructors, we emit two calls to the destructor: one on the normal path and one on the EH path. For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument. I think this is a great place to be. It keeps our -O0 code size small, simplifies the implementation, and allows us to inline one or both call sites if we think it's profitable.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?


EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?


True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.
 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.


I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.
 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.


For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.
 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…


'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

Some possible new syntax:

recover from label %maycatch.int to label %endcatchbb
unwind from label %cleanup.obj to label %nextaction
unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

recover to label %endcatchbb
unwind i8* %ehptr to label %nextaction
unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Joseph Tremoulet

> teach the inliner how to undo framerecover and let the optimization fall out that way

You'd still have the problem that the referenced variables were frameescaped and appear to be read/written at unrelated calls.

 

> __finally is pretty rare

`__finally` may be rare in SEH code, but `finally` is quite common in .Net code, so we're going to need a solution that avoids early `finally` outlining for the same reasons we're avoiding early `catch` outlining.

While we do need something that's amenable to duplicating `finally` code on hot/non-EH paths (since this is an important optimization for .Net code), I think there also needs to be a path that can be taken for very cold code / very large cleanups / very deeply-nested cleanups to avoid quadratic code expansion on those cases.  Early outlining of just those cases could maybe be made to work, but would require that the decision be made early; it would be much more palatable to make the decision in the natural course of optimization (e.g. some sort of tail duplication).

So I think we should have a way to enter/exit cleanup blocks on non-EH paths.

The smallest change to your proposal that comes to mind which would allow this would be:

   - a normal branch can target a cleanupblock

   - the resume/unwind terminator that ends a cleanup would have another optional target label (%normalsucc).  The semantics would be that the EH edge to %nextaction would be taken if the cleanupblock were entered by an EH edge, and the normal edge to %normalsucc would be taken otherwise.

   - converting this to landingpads would manifest an i1 that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a condbr on that

   - codegen for windows could turn the branch to cleanupblock into a call to the handler followed by a jump to the %normalsucc

A change that I'd consider preferable since it avoids the need for an auxiliary switch in the non-EH code would be:

    - some new terminator along the lines of 'entercleanup %cleanupblock recover to %recoverblock', rather than a plain branch, can target cleanupblocks

    - the resume/unwind terminator that ends a cleanup would have any number of additional successors, and just like the rule that the EH target must match the %nextaction referenced on the cleanupblock, the non-EH targets would need to match one of the %recoverblocks specified on one of the entercleanups targeting the cleanupblock

    - converting this to landingpads would manifest an int that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a switch

    - codegen for windows could turn entercleanup into a call to the handler followed by a jump to the %recoverblock

Those are the first two options that come to mind, anyway.  I'm sure there are other options, but the key points are:

1) can avoid early outlining of finally bodies, and

2) can avoid runaway code expansions

 

Regarding naming, I like your new suggestions (though the changes I'm suggesting to cleanup-ending terminators are a mix between recover and unwind… FWIW, 'endfinally' would be a natural name for that in my world J.  Personally I wouldn't hate something verbose like recover_or_unwind for the forms that may be doing either.  'endcleanup' also comes to mind.).

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 3:40 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

 

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

 

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

 

I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.

 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

 

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

 

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

 

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

 

Some possible new syntax:

 

recover from label %maycatch.int to label %endcatchbb

unwind from label %cleanup.obj to label %nextaction

unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

 

recover to label %endcatchbb

unwind i8* %ehptr to label %nextaction

unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
I think adding transitions to cleanupblocks on the normal execution path would be an optimization barrier. Most passes would see the cleanupblock instruction and run the other way. It's definitely appealing from the perspective of getting the smallest possible code, but I'm OK with having no more than two copies of everything in the finally block.

I think with the addition of the 'from' label we can just use shared basic blocks without runaway code duplication, like so:
try {
  f();
} finally {
  do_finally();
}

  ...
  invoke void @f()
    to label %cont unwind label %cleanup
cont:
  br label %finally
cleanup:
  cleanupblock
  br label %finally
finally:
  %abnormal = phi i1 [true, %cleanup], [false, %cont]
  call void @do_finally()
  br i1 %abnormal, label %finally.unwind, label %finally.normal
finally.unwind:
  unwind from label %cleanupblock
finally.normal:
  ret void

With two reachability checks we can see that cleanup, finally, and finally.unwind are the only BBs in the finally funclet, and only finally is shared. We can duplicate it in WinEHPrepare and fix up the CFG to reflect that. I don't think the quadratic case is possible, even in a nested case like:

try {
  try {
    f();
  } finally {
    do_finally(1);
  }
} finally {
  do_finally(2);
}

We'll end up with four calls to do_finally in this example. Does that seem OK?

If that much duplication isn't OK, you can also have your frontend do early outlining without using frameescape. Instead of passing the frame pointer, it would pass the address of each escaped local variable. If the cleanup is simple, it'll be inlined and some locals may be eliminated, and if not, the outlined funclet will only contain a single call with lots of lea+push pairs.

On Tue, May 19, 2015 at 2:22 PM, Joseph Tremoulet <[hidden email]> wrote:

> teach the inliner how to undo framerecover and let the optimization fall out that way

You'd still have the problem that the referenced variables were frameescaped and appear to be read/written at unrelated calls.

 

> __finally is pretty rare

`__finally` may be rare in SEH code, but `finally` is quite common in .Net code, so we're going to need a solution that avoids early `finally` outlining for the same reasons we're avoiding early `catch` outlining.

While we do need something that's amenable to duplicating `finally` code on hot/non-EH paths (since this is an important optimization for .Net code), I think there also needs to be a path that can be taken for very cold code / very large cleanups / very deeply-nested cleanups to avoid quadratic code expansion on those cases.  Early outlining of just those cases could maybe be made to work, but would require that the decision be made early; it would be much more palatable to make the decision in the natural course of optimization (e.g. some sort of tail duplication).

So I think we should have a way to enter/exit cleanup blocks on non-EH paths.

The smallest change to your proposal that comes to mind which would allow this would be:

   - a normal branch can target a cleanupblock

   - the resume/unwind terminator that ends a cleanup would have another optional target label (%normalsucc).  The semantics would be that the EH edge to %nextaction would be taken if the cleanupblock were entered by an EH edge, and the normal edge to %normalsucc would be taken otherwise.

   - converting this to landingpads would manifest an i1 that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a condbr on that

   - codegen for windows could turn the branch to cleanupblock into a call to the handler followed by a jump to the %normalsucc

A change that I'd consider preferable since it avoids the need for an auxiliary switch in the non-EH code would be:

    - some new terminator along the lines of 'entercleanup %cleanupblock recover to %recoverblock', rather than a plain branch, can target cleanupblocks

    - the resume/unwind terminator that ends a cleanup would have any number of additional successors, and just like the rule that the EH target must match the %nextaction referenced on the cleanupblock, the non-EH targets would need to match one of the %recoverblocks specified on one of the entercleanups targeting the cleanupblock

    - converting this to landingpads would manifest an int that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a switch

    - codegen for windows could turn entercleanup into a call to the handler followed by a jump to the %recoverblock

Those are the first two options that come to mind, anyway.  I'm sure there are other options, but the key points are:

1) can avoid early outlining of finally bodies, and

2) can avoid runaway code expansions

 

Regarding naming, I like your new suggestions (though the changes I'm suggesting to cleanup-ending terminators are a mix between recover and unwind… FWIW, 'endfinally' would be a natural name for that in my world J.  Personally I wouldn't hate something verbose like recover_or_unwind for the forms that may be doing either.  'endcleanup' also comes to mind.).

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 3:40 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

 

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

 

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

 

I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.

 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

 

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

 

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

 

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

 

Some possible new syntax:

 

recover from label %maycatch.int to label %endcatchbb

unwind from label %cleanup.obj to label %nextaction

unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

 

recover to label %endcatchbb

unwind i8* %ehptr to label %nextaction

unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Joseph Tremoulet

> Most passes would see the cleanupblock instruction and run the other way

1) That statement surprises me.  I would have expected the cleanupblock instruction to look pretty benign; it doesn't have any side-effects, does it?  And its only source is a label reference?  Likewise, I would have expected unwind to pretty much just look like a branch

2) Regardless, I certainly do agree that optimizations have less freedom when sharing the IR for the cleanup between the EH and non-EH paths, because of the impact on the flow-graph (the tail of the non-EH path now looks like it could have been reached from any point in the 'try' where an exception was raised without completing the 'try'), which is why I think it's an important optimization to do the duplication … I just want there to be a fallback that's guaranteed to be correct and avoid explosive code growth for the pathological cases.

 

Regarding your suggestion to have the normal path branch into the finally (after the cleanupblock) and back out, it delays the duplication to EH preparation, but still requires duplication.  I want something like that but where EH preparation can also rewrite the non-EH paths to call the funclet, and to do that robustly I think we'd need to wed the cleanupblock to that %finally label and wed that conditional branch to the unwind.

 

Regarding how bad the growth can be, the problem case is finally-inside-finally, not try-inside-try:

 

foo() {

  try {

    expr1;

  }  finally {

    try {

      expr2;

    } finally {

      try {

        expr3;

      } finally {

        try {

          expr4;

        } finally

           …

        }

      }

    }

  }

}

 

With duplication, the non-EH path through foo will include each of expr1 through exprn.  The outermost funclet will include (in its non-EH path) each of expr2 through exprn.  The next funclet will include each of expr3 through exprn.  Etc.

 

Another problem I'd have relying on early inlining is that when the CLR asks the LLILC Jit to jit one method, it expects the result to be one method (plus funclets), not n methods…

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 7:26 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

I think adding transitions to cleanupblocks on the normal execution path would be an optimization barrier. Most passes would see the cleanupblock instruction and run the other way. It's definitely appealing from the perspective of getting the smallest possible code, but I'm OK with having no more than two copies of everything in the finally block.

 

I think with the addition of the 'from' label we can just use shared basic blocks without runaway code duplication, like so:

try {

  f();
} finally {

  do_finally();
}

 

  ...

  invoke void @f()

    to label %cont unwind label %cleanup

cont:

  br label %finally

cleanup:

  cleanupblock

  br label %finally

finally:

  %abnormal = phi i1 [true, %cleanup], [false, %cont]

  call void @do_finally()

  br i1 %abnormal, label %finally.unwind, label %finally.normal

finally.unwind:

  unwind from label %cleanupblock

finally.normal:

  ret void

 

With two reachability checks we can see that cleanup, finally, and finally.unwind are the only BBs in the finally funclet, and only finally is shared. We can duplicate it in WinEHPrepare and fix up the CFG to reflect that. I don't think the quadratic case is possible, even in a nested case like:

 

try {

  try {

    f();
  } finally {

    do_finally(1);
  }

} finally {

  do_finally(2);
}

 

We'll end up with four calls to do_finally in this example. Does that seem OK?

 

If that much duplication isn't OK, you can also have your frontend do early outlining without using frameescape. Instead of passing the frame pointer, it would pass the address of each escaped local variable. If the cleanup is simple, it'll be inlined and some locals may be eliminated, and if not, the outlined funclet will only contain a single call with lots of lea+push pairs.

 

On Tue, May 19, 2015 at 2:22 PM, Joseph Tremoulet <[hidden email]> wrote:

> teach the inliner how to undo framerecover and let the optimization fall out that way

You'd still have the problem that the referenced variables were frameescaped and appear to be read/written at unrelated calls.

 

> __finally is pretty rare

`__finally` may be rare in SEH code, but `finally` is quite common in .Net code, so we're going to need a solution that avoids early `finally` outlining for the same reasons we're avoiding early `catch` outlining.

While we do need something that's amenable to duplicating `finally` code on hot/non-EH paths (since this is an important optimization for .Net code), I think there also needs to be a path that can be taken for very cold code / very large cleanups / very deeply-nested cleanups to avoid quadratic code expansion on those cases.  Early outlining of just those cases could maybe be made to work, but would require that the decision be made early; it would be much more palatable to make the decision in the natural course of optimization (e.g. some sort of tail duplication).

So I think we should have a way to enter/exit cleanup blocks on non-EH paths.

The smallest change to your proposal that comes to mind which would allow this would be:

   - a normal branch can target a cleanupblock

   - the resume/unwind terminator that ends a cleanup would have another optional target label (%normalsucc).  The semantics would be that the EH edge to %nextaction would be taken if the cleanupblock were entered by an EH edge, and the normal edge to %normalsucc would be taken otherwise.

   - converting this to landingpads would manifest an i1 that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a condbr on that

   - codegen for windows could turn the branch to cleanupblock into a call to the handler followed by a jump to the %normalsucc

A change that I'd consider preferable since it avoids the need for an auxiliary switch in the non-EH code would be:

    - some new terminator along the lines of 'entercleanup %cleanupblock recover to %recoverblock', rather than a plain branch, can target cleanupblocks

    - the resume/unwind terminator that ends a cleanup would have any number of additional successors, and just like the rule that the EH target must match the %nextaction referenced on the cleanupblock, the non-EH targets would need to match one of the %recoverblocks specified on one of the entercleanups targeting the cleanupblock

    - converting this to landingpads would manifest an int that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a switch

    - codegen for windows could turn entercleanup into a call to the handler followed by a jump to the %recoverblock

Those are the first two options that come to mind, anyway.  I'm sure there are other options, but the key points are:

1) can avoid early outlining of finally bodies, and

2) can avoid runaway code expansions

 

Regarding naming, I like your new suggestions (though the changes I'm suggesting to cleanup-ending terminators are a mix between recover and unwind… FWIW, 'endfinally' would be a natural name for that in my world J.  Personally I wouldn't hate something verbose like recover_or_unwind for the forms that may be doing either.  'endcleanup' also comes to mind.).

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 3:40 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

 

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

 

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

 

I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.

 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

 

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

 

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

 

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

 

Some possible new syntax:

 

recover from label %maycatch.int to label %endcatchbb

unwind from label %cleanup.obj to label %nextaction

unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

 

recover to label %endcatchbb

unwind i8* %ehptr to label %nextaction

unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.

 


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
This example is pretty compelling, but I still think I want to limit the scope of this change to not include this feature. It's probably reasonable to hold onto the idea as future work, though. In the meantime, frontends can decide for themselves whether they want code size or stronger optimization of cleanups by doing early outlining or not.

The only way we can hit the quadratic growth from clang is if someone is nesting inside SEH __finally blocks, at which point I'm pretty OK letting that get bloated. If we're not going to use the construct in Clang, LLVM passes won't see it as often, and it'll stay buggy longer.

The version of this feature that I like so far is probably entercleanup / endcleanup, something along these lines:

try {
  f(1);
} finally {
  f(2);
}

  invoke void @f(i32 1)
    to label %cont unwind label %cleanup
cont:
  entercleanup label %cleanup
return:
  ret void
cleanup:
  cleanupblock ; unwind label would go here if there were an action
  call void @f(i32 2)
  endcleanup from label %cleanup to label %return ; unwind label would go here if there were an action

The endcleanup instruction would use 'to' and 'unwind' label operands to mirror the structure of an invoke. In fact, endcleanup is almost nicer than 'unwind'. Maybe we should use that.

On Tue, May 19, 2015 at 7:19 PM, Joseph Tremoulet <[hidden email]> wrote:

> Most passes would see the cleanupblock instruction and run the other way

1) That statement surprises me.  I would have expected the cleanupblock instruction to look pretty benign; it doesn't have any side-effects, does it?  And its only source is a label reference?  Likewise, I would have expected unwind to pretty much just look like a branch

2) Regardless, I certainly do agree that optimizations have less freedom when sharing the IR for the cleanup between the EH and non-EH paths, because of the impact on the flow-graph (the tail of the non-EH path now looks like it could have been reached from any point in the 'try' where an exception was raised without completing the 'try'), which is why I think it's an important optimization to do the duplication … I just want there to be a fallback that's guaranteed to be correct and avoid explosive code growth for the pathological cases.

 

Regarding your suggestion to have the normal path branch into the finally (after the cleanupblock) and back out, it delays the duplication to EH preparation, but still requires duplication.  I want something like that but where EH preparation can also rewrite the non-EH paths to call the funclet, and to do that robustly I think we'd need to wed the cleanupblock to that %finally label and wed that conditional branch to the unwind.

 

Regarding how bad the growth can be, the problem case is finally-inside-finally, not try-inside-try:

 

foo() {

  try {

    expr1;

  }  finally {

    try {

      expr2;

    } finally {

      try {

        expr3;

      } finally {

        try {

          expr4;

        } finally

           …

        }

      }

    }

  }

}

 

With duplication, the non-EH path through foo will include each of expr1 through exprn.  The outermost funclet will include (in its non-EH path) each of expr2 through exprn.  The next funclet will include each of expr3 through exprn.  Etc.

 

Another problem I'd have relying on early inlining is that when the CLR asks the LLILC Jit to jit one method, it expects the result to be one method (plus funclets), not n methods…

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 7:26 PM


To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

I think adding transitions to cleanupblocks on the normal execution path would be an optimization barrier. Most passes would see the cleanupblock instruction and run the other way. It's definitely appealing from the perspective of getting the smallest possible code, but I'm OK with having no more than two copies of everything in the finally block.

 

I think with the addition of the 'from' label we can just use shared basic blocks without runaway code duplication, like so:

try {

  f();
} finally {

  do_finally();
}

 

  ...

  invoke void @f()

    to label %cont unwind label %cleanup

cont:

  br label %finally

cleanup:

  cleanupblock

  br label %finally

finally:

  %abnormal = phi i1 [true, %cleanup], [false, %cont]

  call void @do_finally()

  br i1 %abnormal, label %finally.unwind, label %finally.normal

finally.unwind:

  unwind from label %cleanupblock

finally.normal:

  ret void

 

With two reachability checks we can see that cleanup, finally, and finally.unwind are the only BBs in the finally funclet, and only finally is shared. We can duplicate it in WinEHPrepare and fix up the CFG to reflect that. I don't think the quadratic case is possible, even in a nested case like:

 

try {

  try {

    f();
  } finally {

    do_finally(1);
  }

} finally {

  do_finally(2);
}

 

We'll end up with four calls to do_finally in this example. Does that seem OK?

 

If that much duplication isn't OK, you can also have your frontend do early outlining without using frameescape. Instead of passing the frame pointer, it would pass the address of each escaped local variable. If the cleanup is simple, it'll be inlined and some locals may be eliminated, and if not, the outlined funclet will only contain a single call with lots of lea+push pairs.

 

On Tue, May 19, 2015 at 2:22 PM, Joseph Tremoulet <[hidden email]> wrote:

> teach the inliner how to undo framerecover and let the optimization fall out that way

You'd still have the problem that the referenced variables were frameescaped and appear to be read/written at unrelated calls.

 

> __finally is pretty rare

`__finally` may be rare in SEH code, but `finally` is quite common in .Net code, so we're going to need a solution that avoids early `finally` outlining for the same reasons we're avoiding early `catch` outlining.

While we do need something that's amenable to duplicating `finally` code on hot/non-EH paths (since this is an important optimization for .Net code), I think there also needs to be a path that can be taken for very cold code / very large cleanups / very deeply-nested cleanups to avoid quadratic code expansion on those cases.  Early outlining of just those cases could maybe be made to work, but would require that the decision be made early; it would be much more palatable to make the decision in the natural course of optimization (e.g. some sort of tail duplication).

So I think we should have a way to enter/exit cleanup blocks on non-EH paths.

The smallest change to your proposal that comes to mind which would allow this would be:

   - a normal branch can target a cleanupblock

   - the resume/unwind terminator that ends a cleanup would have another optional target label (%normalsucc).  The semantics would be that the EH edge to %nextaction would be taken if the cleanupblock were entered by an EH edge, and the normal edge to %normalsucc would be taken otherwise.

   - converting this to landingpads would manifest an i1 that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a condbr on that

   - codegen for windows could turn the branch to cleanupblock into a call to the handler followed by a jump to the %normalsucc

A change that I'd consider preferable since it avoids the need for an auxiliary switch in the non-EH code would be:

    - some new terminator along the lines of 'entercleanup %cleanupblock recover to %recoverblock', rather than a plain branch, can target cleanupblocks

    - the resume/unwind terminator that ends a cleanup would have any number of additional successors, and just like the rule that the EH target must match the %nextaction referenced on the cleanupblock, the non-EH targets would need to match one of the %recoverblocks specified on one of the entercleanups targeting the cleanupblock

    - converting this to landingpads would manifest an int that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a switch

    - codegen for windows could turn entercleanup into a call to the handler followed by a jump to the %recoverblock

Those are the first two options that come to mind, anyway.  I'm sure there are other options, but the key points are:

1) can avoid early outlining of finally bodies, and

2) can avoid runaway code expansions

 

Regarding naming, I like your new suggestions (though the changes I'm suggesting to cleanup-ending terminators are a mix between recover and unwind… FWIW, 'endfinally' would be a natural name for that in my world J.  Personally I wouldn't hate something verbose like recover_or_unwind for the forms that may be doing either.  'endcleanup' also comes to mind.).

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 3:40 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

 

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

 

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

 

I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.

 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

 

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

 

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

 

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

 

Some possible new syntax:

 

recover from label %maycatch.int to label %endcatchbb

unwind from label %cleanup.obj to label %nextaction

unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

 

recover to label %endcatchbb

unwind i8* %ehptr to label %nextaction

unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.

 



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Joseph Tremoulet

Sure, I could do this as a follow-on change when LLILC is in a position to exercise it.  If you wanted to go ahead and use the "endcleanup from label %cleanup unwind label %nextaction" naming/syntax in the meantime, that would be cool.

 

> If we're not going to use the construct in Clang

Once the construct and the attendant code duplication optimization are in place, would there be any reason not to switch Clang over to use it?  I'd think that doing so would let you eschew the frameescape and get better-optimized code (even on the non-EH paths) in functions that use SEH __finally.

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Wednesday, May 20, 2015 2:17 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

This example is pretty compelling, but I still think I want to limit the scope of this change to not include this feature. It's probably reasonable to hold onto the idea as future work, though. In the meantime, frontends can decide for themselves whether they want code size or stronger optimization of cleanups by doing early outlining or not.

 

The only way we can hit the quadratic growth from clang is if someone is nesting inside SEH __finally blocks, at which point I'm pretty OK letting that get bloated. If we're not going to use the construct in Clang, LLVM passes won't see it as often, and it'll stay buggy longer.

 

The version of this feature that I like so far is probably entercleanup / endcleanup, something along these lines:

 

try {

  f(1);
} finally {
  f(2);

}

 

  invoke void @f(i32 1)

    to label %cont unwind label %cleanup

cont:

  entercleanup label %cleanup

return:

  ret void

cleanup:

  cleanupblock ; unwind label would go here if there were an action

  call void @f(i32 2)

  endcleanup from label %cleanup to label %return ; unwind label would go here if there were an action

 

The endcleanup instruction would use 'to' and 'unwind' label operands to mirror the structure of an invoke. In fact, endcleanup is almost nicer than 'unwind'. Maybe we should use that.

 

On Tue, May 19, 2015 at 7:19 PM, Joseph Tremoulet <[hidden email]> wrote:

> Most passes would see the cleanupblock instruction and run the other way

1) That statement surprises me.  I would have expected the cleanupblock instruction to look pretty benign; it doesn't have any side-effects, does it?  And its only source is a label reference?  Likewise, I would have expected unwind to pretty much just look like a branch

2) Regardless, I certainly do agree that optimizations have less freedom when sharing the IR for the cleanup between the EH and non-EH paths, because of the impact on the flow-graph (the tail of the non-EH path now looks like it could have been reached from any point in the 'try' where an exception was raised without completing the 'try'), which is why I think it's an important optimization to do the duplication … I just want there to be a fallback that's guaranteed to be correct and avoid explosive code growth for the pathological cases.

 

Regarding your suggestion to have the normal path branch into the finally (after the cleanupblock) and back out, it delays the duplication to EH preparation, but still requires duplication.  I want something like that but where EH preparation can also rewrite the non-EH paths to call the funclet, and to do that robustly I think we'd need to wed the cleanupblock to that %finally label and wed that conditional branch to the unwind.

 

Regarding how bad the growth can be, the problem case is finally-inside-finally, not try-inside-try:

 

foo() {

  try {

    expr1;

  }  finally {

    try {

      expr2;

    } finally {

      try {

        expr3;

      } finally {

        try {

          expr4;

        } finally

           …

        }

      }

    }

  }

}

 

With duplication, the non-EH path through foo will include each of expr1 through exprn.  The outermost funclet will include (in its non-EH path) each of expr2 through exprn.  The next funclet will include each of expr3 through exprn.  Etc.

 

Another problem I'd have relying on early inlining is that when the CLR asks the LLILC Jit to jit one method, it expects the result to be one method (plus funclets), not n methods…

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 7:26 PM


To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

I think adding transitions to cleanupblocks on the normal execution path would be an optimization barrier. Most passes would see the cleanupblock instruction and run the other way. It's definitely appealing from the perspective of getting the smallest possible code, but I'm OK with having no more than two copies of everything in the finally block.

 

I think with the addition of the 'from' label we can just use shared basic blocks without runaway code duplication, like so:

try {

  f();
} finally {

  do_finally();
}

 

  ...

  invoke void @f()

    to label %cont unwind label %cleanup

cont:

  br label %finally

cleanup:

  cleanupblock

  br label %finally

finally:

  %abnormal = phi i1 [true, %cleanup], [false, %cont]

  call void @do_finally()

  br i1 %abnormal, label %finally.unwind, label %finally.normal

finally.unwind:

  unwind from label %cleanupblock

finally.normal:

  ret void

 

With two reachability checks we can see that cleanup, finally, and finally.unwind are the only BBs in the finally funclet, and only finally is shared. We can duplicate it in WinEHPrepare and fix up the CFG to reflect that. I don't think the quadratic case is possible, even in a nested case like:

 

try {

  try {

    f();
  } finally {

    do_finally(1);
  }

} finally {

  do_finally(2);
}

 

We'll end up with four calls to do_finally in this example. Does that seem OK?

 

If that much duplication isn't OK, you can also have your frontend do early outlining without using frameescape. Instead of passing the frame pointer, it would pass the address of each escaped local variable. If the cleanup is simple, it'll be inlined and some locals may be eliminated, and if not, the outlined funclet will only contain a single call with lots of lea+push pairs.

 

On Tue, May 19, 2015 at 2:22 PM, Joseph Tremoulet <[hidden email]> wrote:

> teach the inliner how to undo framerecover and let the optimization fall out that way

You'd still have the problem that the referenced variables were frameescaped and appear to be read/written at unrelated calls.

 

> __finally is pretty rare

`__finally` may be rare in SEH code, but `finally` is quite common in .Net code, so we're going to need a solution that avoids early `finally` outlining for the same reasons we're avoiding early `catch` outlining.

While we do need something that's amenable to duplicating `finally` code on hot/non-EH paths (since this is an important optimization for .Net code), I think there also needs to be a path that can be taken for very cold code / very large cleanups / very deeply-nested cleanups to avoid quadratic code expansion on those cases.  Early outlining of just those cases could maybe be made to work, but would require that the decision be made early; it would be much more palatable to make the decision in the natural course of optimization (e.g. some sort of tail duplication).

So I think we should have a way to enter/exit cleanup blocks on non-EH paths.

The smallest change to your proposal that comes to mind which would allow this would be:

   - a normal branch can target a cleanupblock

   - the resume/unwind terminator that ends a cleanup would have another optional target label (%normalsucc).  The semantics would be that the EH edge to %nextaction would be taken if the cleanupblock were entered by an EH edge, and the normal edge to %normalsucc would be taken otherwise.

   - converting this to landingpads would manifest an i1 that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a condbr on that

   - codegen for windows could turn the branch to cleanupblock into a call to the handler followed by a jump to the %normalsucc

A change that I'd consider preferable since it avoids the need for an auxiliary switch in the non-EH code would be:

    - some new terminator along the lines of 'entercleanup %cleanupblock recover to %recoverblock', rather than a plain branch, can target cleanupblocks

    - the resume/unwind terminator that ends a cleanup would have any number of additional successors, and just like the rule that the EH target must match the %nextaction referenced on the cleanupblock, the non-EH targets would need to match one of the %recoverblocks specified on one of the entercleanups targeting the cleanupblock

    - converting this to landingpads would manifest an int that would need to be set before branching to the cleanupblock, and turn the unwind/resume into a switch

    - codegen for windows could turn entercleanup into a call to the handler followed by a jump to the %recoverblock

Those are the first two options that come to mind, anyway.  I'm sure there are other options, but the key points are:

1) can avoid early outlining of finally bodies, and

2) can avoid runaway code expansions

 

Regarding naming, I like your new suggestions (though the changes I'm suggesting to cleanup-ending terminators are a mix between recover and unwind… FWIW, 'endfinally' would be a natural name for that in my world J.  Personally I wouldn't hate something verbose like recover_or_unwind for the forms that may be doing either.  'endcleanup' also comes to mind.).

 

Thanks

-Joseph

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, May 19, 2015 3:40 PM
To: Joseph Tremoulet
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

On Mon, May 18, 2015 at 8:40 PM, Joseph Tremoulet <[hidden email]> wrote:

> I want to have separate normal and exceptional codepaths

I assume you at least mean that's what you'll be doing in Clang's initial IR generation.  Do you also mean to impose this as a restriction on other IR generators, and as a property that IR transformations must preserve?  I.e., is this something that EH preparation can assume?

 

EH preparation should not assume that each basic block lives in exactly one funclet. People seem to get really upset when I suggest that we change the language rules to maintain that invariant. :-)

 

Instead, the EH preparation pass is responsible for establishing that invariant by duplicating blocks, but it should be *way* easier with real basic block terminator instructions to mark handler transitions.

 

> For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument

But then you have to frameescape any __finally-referenced local before optimization, and doesn't that defeat the purpose of delaying funclet outlining to EH preparation?

 

True. I'm not committed to this approach, I just want to preserve functionality until we implement the de-commoning part of EH preparation, which I figured we could do later, since it's one of the more obscure corner cases.

 

Also, __finally is pretty rare, so it's not too crazy to teach the inliner how to undo framerecover and let the optimization fall out that way.

 

> tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us

For one, it seems counterproductive to let tail merge think it can kick in when it's doomed to be undone.

 

I'm only saying that the tail merging is legal, and not that it is desirable. We can teach simplifycfg that tail merging 'resume' instructions is a waste of time, for example.

 

For another, if we're talking about a setup where EH paths might mingle with non-EH paths but the funclet will only be invoked for the EH cases, then I believe this would help the  "pruning as many unreachable CFG edges as possible " step be more effective -- after finding out which blocks are reachable from a catch/cleanup, you could intersect that with the set of blocks from which a corresponding resume can be reached.  Any funclet blocks ending in condbr/switch that only wind up with one successor in the funclet could then have their terminators rewritten as unconditional branches, without needing to recover dataflow and chase constants through phis and resolve compares/switches and all that.

 

For discussion purposes, let's imagine that 'resume' takes a 'from' label which must be an EH block (it starts with one of these new instructions). The nice thing about using a label here is that, unlike most SSA values, labels cannot be phi'd. Now tail merging will have to give up or insert an i1 ph and a conditional branch on the resume instruction, which realistically it won't since it's not a clear win.

 

Originally, I was thinking that this extra funclet cloning precision wasn't worth it, because we'd arrange the frontend and optimizers to make it unlikely, but I'm coming around to it. It mirrors the EH pointer value required by Itanium EH.

 

> I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas

The first thing that comes to mind is 'endatch/exitcatch', but to use that you'd need to rename other things since it would be confusing vis-à-vis catchendblock and lack symmetry with catchblock that isn't prefixed with begin/enter.

You could consider 'filter' (or 'filterblock') for 'catchblock', since conceptually it plays the role of a filter (typically one which consults type information; I've seen such things called "typetestfilter" before).  Or 'dispatch'/'dispatchblock'/'exceptiondispatch'/'dispatchexception' (isn't that what Clang names the blocks it creates for the explicit dispatch code?); 'catchendblock' would then be something like 'unwinddispatch' or 'continuedispatch' or 'resumedispatch' and the resume that returns to normal execution could be 'exitdispatch' or 'exitcatch' or even 'uncatch'.

For the resumes that end cleanups, something like 'endcleanup' might work.

Names are hard…

 

'dispatch' might work for the instruction which transitions from a cleanup to the next EH block. :) I don't really see how to work filter in, especially since we've already used it for landingpad filters, which exist to support exception specifications.

 

Chatting around the office, we came up with 'recover' and 'unwind' for ending catch and cleanup blocks respectively.

 

A 'recover' instruction would end a catch block, and would target the block where the exception is over. This instruction would modify memory, because it destroys the exception object. 

 

An 'unwind' instruction would end a cleanup block, and would target the next action. I like this because I talk a lot about "unwind edges" in the CFG, and a cleanup finishing feels like an unwind edge. I could also see 'dispatch' here.

 

Some possible new syntax:

 

recover from label %maycatch.int to label %endcatchbb

unwind from label %cleanup.obj to label %nextaction

unwind from label %cleanup.obj    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

For Itanium, tail merging is profitable and doable with phis, so we might want to do this instead:

 

recover to label %endcatchbb

unwind i8* %ehptr to label %nextaction

unwind i8* %ehptr    ; unwinds out of the function, hook it up to the unwind edge of an inlined call site

 

Itanium requires threading the active exception pointer through to the next action or _Unwind_Resume, so it passes along an SSA value. MSVC needs this weird kind of control dependence instead. We could drop the 'from' token as unnecessary, but I want to make it easier to reason about the textual representation.

 

 


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
On Wed, May 20, 2015 at 12:04 PM, Joseph Tremoulet <[hidden email]> wrote:

Sure, I could do this as a follow-on change when LLILC is in a position to exercise it.  If you wanted to go ahead and use the "endcleanup from label %cleanup unwind label %nextaction" naming/syntax in the meantime, that would be cool.

 

> If we're not going to use the construct in Clang

Once the construct and the attendant code duplication optimization are in place, would there be any reason not to switch Clang over to use it?  I'd think that doing so would let you eschew the frameescape and get better-optimized code (even on the non-EH paths) in functions that use SEH __finally.


Yeah, once LLVM supports entercleanup/endcleanup then we could definitely use it for SEH __finally. 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Reid Kleckner-2
In reply to this post by Kaylor, Andrew
On Mon, May 18, 2015 at 3:28 PM, Kaylor, Andrew <[hidden email]> wrote:

I hadn’t noticed the “noexcept” specifier in your example.  That clears up part of my concerns, but I still have some problems.

 

 

With regard to the multiple meanings of ‘resume’ I am more concerning about developers who are reading the IR understanding it than about passes operating on it.  Apart from making it harder to debug problems related to control flow at resume instructions I think this makes it more likely that code which mishandles it will be introduced down the road.  If I’m reading things correctly, a resume instruction in your proposal could mean:

 

a) We’re done handling this exception, continue normal execution at this label.

b) We’re done handling this exception, continue execution in an enclosing catch handler at this label.

c) We’re done executing this termination handler, check the catch handler at this label to see if it can handle the current exception.

d) We’re done executing this termination handler, now execute the termination handler at this label.

e) We’re done executing this termination handler, continue handling the exception in the runtime.

 

I suppose (a) and (b) are more or less the same and it doesn’t entirely matter whether the destination is in normal code or exception code.  In practical terms (c) and (d) may be the same also, but logically, in terms of how the runtime works, they are different.  I’m pretty sure there’s a gap in my understanding of your proposal because I don’t understand how e() is represented at all.


Yeah, Joseph agreed this was overloading resume too much. The new idea is that a-b would be handled by 'endcatch' (I previously called this 'recover', but endcatch will be more readable) and c-e are represented with an 'endcleanup' instruction. The endcleanup will have an unwind label operand indicating which EH action runs next, which handles both c and d, depending on the nature of the block. If the unwind label is missing, then execution continues in the calling function's actions.
  

As an exercise, I tried to work through the IR that would be produced in the non-optimized case for the following code:

 

void test() {

  try {

    Obj o1;

    try {

      f();

    } catch (int) {}

    Obj o2;

    try {

      g();

    } catch (int) {}

    h();

  } catch (int) {}

}

 

Here’s what I came up with:

 

define void @foo() personality i32 (...)* @__CxxFrameHandler3 {

  %e.addr = alloca i32

  invoke void @f(i32 1)

    to label %cont1 unwind label %cleanup.Obj

cont1:

  invoke void @g(i32 2)

    to label %cont2 unwind label %cleanup.Obj.1

cont2:

  invoke void @h(i32 2)

    to label %cont3 unwind label %cleanup.Obj.2

cont3:

  call void @~Obj()

  call void @~Obj()

  br label %return

return:

  ret void

 

cleanup.Obj:

  cleanupblock unwind label %maycatch.int

  call void @~Obj()

  resume label %maycatch.int

 

maycatch.int:

  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]

    to label %catch.int unwind label %catchend

catch.int:

  resume label %cont1

catchend:

  resume

 

cleanup.Obj.1:

  cleanupblock unwind label %maycatch.int.1

  call void @~Obj()

  call void @~Obj()

  resume label %maycatch.int.1

 

maycatch.int.1:

  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]

    to label %catch.int.1 unwind label %catchend.1

catch.int.1:

  resume label %cont2

catchend.1:

  resume

 

 

cleanup.Obj.2:

  cleanupblock unwind label %maycatch.int.2

  call void @~Obj()

  call void @~Obj()

  resume label %maycatch.int.2

 

maycatch.int.2:

  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]

    to label %catch.int.2 unwind label %catchend.2

catch.int.2:

  resume label %return

catchend.2:

  resume

}

 

I don’t know if I got that right, but it seems to me that there are a couple of problems with this.  Most obviously, there is a good bit of duplicated code here (which the optimization passes will probably want to combine).


First, each destructor gets it's own cleanupblock, and ultimately ends up in it's own outlined funclet. This cuts down on ~Obj duplication.

Second, you seem to always have the invokes unwinding to cleanups which then unwind to catch handlers, even if the try is more closely nested than the cleanup. If f() throws int, we don't actually run cleanup.Obj, we run the catch and rejoin normal control flow, which will call ~Obj for o1.

Here's how I'd translate that to IR, assuming Obj is empty and has no ctor:

define void @foo() personality i32 (...)* @__CxxFrameHandler3 {
entry:
  %o1 = alloca i8
  %o2 = alloca i8
  invoke void @f()
    to label %cont1 unwind label %maycatch.int2
cont1:
  invoke void @g()
    to label %cont2 unwind label %maycatch.int3
cont2:
  invoke void @h()
    to label %cont3 unwind label %cleanup.Obj1
cont3:
  call void @~Obj(i8* %o2)
  call void @~Obj(i8* %o1)
  br label %return
return:
  ret void

maycatch.int2:
  catchblock [i8* @typeid.int]
    to label %catch.int2 unwind label %catchendblock2
catch.int2:
  endcatch label %cont2
catchendblock2:
  catchendblock unwind label %cleanup.Obj1

maycatch.int3:
  catchblock [i8* @typeid.int]
    to label %catch.int3 unwind label %catchendblock3
catch.int3:
  endcatch label %cont3
catchendblock3:
  catchendblock unwind label %cleanup.Obj2

cleanup.Obj2:
  cleanupblock unwind label %cleanup.Obj1
  call void @~Obj(i8* %o2)
  endcleanup from label %cleanup.Obj2 unwind label %cleanup.Obj1

cleanup.Obj1:
  cleanupblock unwind label %maycatch.int1
  call void @~Obj(i8* %o1)
  endcleanup from label %cleanup.Obj1 unwind label %maycatch.int1

maycatch.int1:
  catchblock [i8* @typeid.int]
    to label %catch.int1 unwind label %catchendblock1
catch.int1:
  endcatch label %return
catchendblock1:
  catchendblock ; no unwind label, uncaught exceptions leave the function
}

This has a total of 4 calls to ~Obj, so each cleanup is emitted exactly twice: once for normal flow and once for exceptional flow.
 

More significantly though is that it doesn’t correctly describe what happens if a non-int exception is thrown in any of the called functions.  For instance, if a non-int exception is thrown from g() that is caught somewhere further down the stack, the runtime should call a terminate handler that destructs o1 and then call a terminate handler that destructs o2.  However, my IR doesn’t describe a terminate handler that destructs just o2 and I don’t know how I could get it to do so within the scheme that you have proposed.


The idea is that the program notionally "executes" the EH blocks and each block describes the intended successor. In the Itanium landingpad model, that's pretty close to what actually happens, with an optimization to skip execution of frames that have no cleanups and have no matching catches.

If h throws a float exception, we should run, in order, maycatch.int3, catchendblock3, cleanup.Obj2, cleanup.Obj1, maycatch.int1, catchendblock4, and then unwind to the parent. Make sense?
 

In a mostly unrelated matter, have you thought about what needs to be done to prevent catchblock blocks from being combined?  For example, suppose you have code that looks like this:

 

void test() {

  try {

    f();

  } catch (int) {

    x();

    y();

    z();

  }

  try {

    g();

  } catch (…) {

  }

  try {

    h();

  } catch (int) {

    x();

    y();

    z();

  }

}

 

I think it’s very likely that if we don’t do anything to prevent it the IR generated for this will be indistinguishable from the IR generated for this:

 

void test() {

  try {

    f();

    try {

      g();

    } catch (…) {

    }

    h();

  } catch (int) {

    x();

    y();

    z();

  }

}


In the original source, the catch-all block unwinds to the parent frame. In the transformed example, it doesn't. Mid-level optimization passes should keep the catch-all in the first example unwinding out to the parent, otherwise things will break down.

Practically, the current middle-end will be unable to merge the catchblocks because of catchendblock and this rule: "Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to."

The catchblocks will look like:

maycatch.int1:
  catchblock [i8* @typeid.int]
    to label %catch.int1 ; unwind to parent
catch.int1:
  invoke void @x() to label %cont1 unwind label %catchendblock1
cont1:
  invoke void @y() to label %cont2 unwind label %catchendblock1
cont2:
  invoke void @z() to label %cont3 unwind label %catchendblock1
cont4:
  endcatch label %...
catchendblock1:
  catchendblock ; unwind to parent

Any code merging optimization would have to ensure that those xyz calls unwind to the catchendblock referenced by the catchblock. Practically, this will require merging the entire catchblock, which *is* a legal transformation, and we can compensate for it in the ip2state tables. Imagine assigning these states:

call f() // state 0
call g() // state 2
catchall // state 3
call h() // state 0
catchint // state 1

So, f and h can have the same states even though they are split across the g call in source order.

I need to write more about the state numbering algorithm we came up with, it's the primary constraint on the design.
 

In this case that might be OK, but theoretically the calls to f() and h() should get different states and there are almost certainly cases where failing to recognize that will cause problems.  What’s more, the same basic pattern arises for this case:

 

void test() {

  try {

    f();

  } catch (int) {

    x();

    y();

    z();

  }

  try {

    g();

  } catch (float) {

  }

  try {

    h();

  } catch (int) {

    x();

    y();

    z();

  }

}

 

But in this case, if we get the state numbering wrong an int-exception from g() could end up being incorrectly caught by the xyz handler.

 

BTW, finding cases like this is the primary reason that I’ve been trying to push my current in-flight patch onto the sinking ship that is our current implementation.  I mentioned to you before that the test suite I’m using passes with my proposed patch, but that’s only true with optimizations disabled.  With optimizations turned on I’m seeing all kinds of fun things like similar handlers being combined and common instructions being hoisted above a shared(!) eh_begincatch call in if-else paired handlers.  I don’t know if it will be worth trying to fix these problems, but seeing them in action has been very instructive.


 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: New EH representation for MSVC compatibility

John McCall-2
In reply to this post by Reid Kleckner-2
On May 15, 2015, at 3:37 PM, Reid Kleckner <[hidden email]> wrote:
After a long tale of sorrow and woe, my colleagues and I stand here before you defeated. The Itanium EH representation is not amenable to implementing MSVC-compatible exceptions. We need a new representation that preserves information about how try-catch blocks are nested.

A couple quick apologies: this response is pretty late, and for the same reasons I’ve only been able to skim the rest of the thread.

I think the basic ideas in this proposal seem reasonable, although it seems they may have evolved a bit over the course of the thread.  A few points do stand out to me:

Instead, all MSVC EH personality functions (x86, x64, ARM) cross (C++, SEH) are implemented with interval tables that express the nesting levels of various source constructs like destructors, try ranges, catch ranges, etc. When you rinse your program through LLVM IR today, this structure is what gets lost.

Yes.  It seems to me that the main additional thing that your proposed IR preserves is the ability to very easily reconstruct the tree of control flow.  Correct?

New information
-------------------------

Recently, we have discovered that the tables for __CxxFrameHandler3 have the additional constraint that the EH states assigned to a catch body must immediately follow the state numbers assigned to the try body. The natural scoping rules of C++ make it so that doing this numbering at the source level is trivial, but once we go to LLVM IR CFG soup, scopes are gone. If you want to know exactly what corner cases break down, search the bug database and mailing lists. The explanations are too long for this RFC.

I don’t quite understand this constraint (aren’t the EH states within the catch body in a different function and therefore numbered separately?), and I don’t really understand why it makes anything about EH harder (aren’t you still going to have exactly the same numbering problems with cleanups/catches being shared between invoke sites?) as opposed to simply being something that you didn’t design your current implementation around, but it doesn’t really matter.  If the new representation is better, it’s better.

New representation
------------------------------

I propose adding the following new instructions, all of which (except for resume) are glued to the top of their basic blocks, just like landingpads.

The fact that most of these are terminators makes pinning to the absolutely beginning really problematic.  An edge that can’t support arbitrary code is one thing (although usually they’re at least splittable!), but at the very least, we need to be able to drop phis and debug instructions in basic blocks, or you’re going to completely wreck basic optimizability.

You should consider giving these a consistent prefix, like “eh_” or “eh.”, just to clearly distinguish them.  You can rename “resume” and “landingpad”, too.

They all have an optional ‘unwind’ label operand, which provides the IR with a tree-like structure of what EH action to take after this EH action completes. The unwind label only participates in the formation of the CFG when used in a catch block, and in other blocks it is considered opaque, personality-specific information. If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.

For the terminators, this unwind label makes sense.  For the non-terminators (just cleanupblock, I think), you’re going to need to define what it means, its strength of reference, etc.  For example, if I have a cleanup that doesn’t terminate (imagine a destructor that calls abort()), the IR will contain no CFG links from the basic block with the cleanupblock to the basic block with the resume.  The basic block with the resume, and all the downstream EH blocks, may even be unreachable; and so the natural tendency would be to remove them.  Does a reference from a cleanupblock keep its target alive?

The new representation is designed to be able to represent Itanium EH in case we want to converge on a single EH representation in LLVM and Clang. An IR pass can convert these actions to landingpads, typeid selector comparisons, and branches, which means we can phase this representation in on Windows at first and experiment with it slowly on other platforms. Over time, we can move the landingpad conversion lower and lower in the stack until it’s moved into DwarfEHPrepare. We’ll need to support landingpads at least until LLVM 4.0, but we may want to keep them because they are the natural representation for Itanium-style EH, and have a relatively low support burden.

I agree that we could migrate Itanium to this pattern fairly successfully, as long as we’re agreed that we’re not setting of goal of eventually emitting identical IR for different personalities.

resume
-------------

; Old form still works, still means control is leaving the function.
resume <valty> %val
; New form overloaded for intra-frame unwinding or resuming normal execution
resume <valty> %val, label %nextaction
; New form for EH personalities that produce no value
resume void

Now resume takes an optional label operand which is the next EH action to run. The label must point to a block starting with an EH action. The various EH action blocks impose personality-specific rules about what the targets of the resume can be.

I agree with the feedback elsewhere that you should separate these instructions.

catchendblock
----------------

catchend unwind label %nextaction

The catchend is a terminator that unconditionally unwinds to the next action. It is merely a placeholder to help reconstruct which invokes were part of the catch blocks of a try. Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to. Executing such an invoke that does not transitively unwind to the correct catchend block has undefined behavior.

I think the rule you’re looking for here is that it’s undefined behavior if control flow from a catchblock doesn’t eventually reach the corresponding catchendblock (or reaches the catchblock again before that point).  It’s not unwind-specific.

cleanupblock
--------------------

%val = cleanupblock <valty> unwind label %nextaction

This is not a terminator, and control is expected to flow into a resume instruction which indicates which EH block runs next. If the resume instruction and the unwind label disagree, behavior is undefined.

What’s the expectation here?  Is each cleanupblock conceptually an independent cleanup, or is it a legal transformation to combine successive cleanupblocks?  Is it okay for the code within a cleanupblock to be reachable from multiple cleanupblock instructions?  Is there an expectation about how many resumes are reachable?  Does the cleanupblock instruction itself prevent reordering in any way?

terminateblock
----------------------

; for noexcept
terminateblock [void ()* @std.terminate] unwind label %nextaction
; for exception specifications, throw(int)
terminateblock [void ()* @__cxa_unexpected, @typeid.int, ...] unwind label %nextaction

This is a terminator, and the unwind label is where execution will continue if the program continues execution. It also has an opaque, personality-specific list of constant operands interpreted by the backend of LLVM. The convention is that the first operand is the function to call to end the program, and the rest determine if the program should end.

It looks like you’re expecting this be useful for things like EH filters; that's cool, but it suggests the name might not be appropriate.  Maybe the instruction should be called “eh.filter” and it should take a string label to tell the personality the kind of filtering to do?  And then the function pointer is just another thing in the list of constant parameters.

Also, std::unexpected doesn’t necessarily terminate the program; it gets a chance to remap the exception.  I don’t know that that changes anything — it just means that control continues to the unwind label — but you should keep that in mind in your examples.

sehfilterblock?
------------------

One big hole in the new representation is SEH filter expressions. They present a major complication because they do not follow a stack discipline. Any EH action is reachable after an SEH filter runs. Because the CFG is so useless for optimization purposes, it’s better to outline the filter in the frontend and assume the filter can run during any potentially throwing function call.

It’s not that it doesn’t follow a stack discipline, it’s that it’s not quite part of the same stack as other EH actions.  It’s similar in spirit to a catch block in Smalltalk, which runs before the stack is unwound because it has the ability to resume control at the throw point.  One way to model this would be to have a list of such blocks handing off the invoke (or a landingpad-like instruction that’s easily found from the invoke), where inlining would pull them from outer invokes to inner calls.  But the control flow is inherently strange because it can essentially loop within the invoke, and that’s really hard to model.

John.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Philip Reames-4
In reply to this post by Kaylor, Andrew


On 05/18/2015 04:36 PM, Kaylor, Andrew wrote:

> optimizing EH codepaths is not usually performance critical.

>> Leaving aside the rest of the thread, I feel the need to refute this point in isolation.  I've found that optimizing (usually simplifying and eliminating) exception paths ends up being *extremely* important for my workloads.  Failing to optimize exception paths sufficiently tends to indirectly hurt things like inlining for example.  Any design which starts with the assumption that optimizing exception paths isn't important is going to be extremely problematic for me. 

That’s interesting. 

Sorry for not responding here for so long.

 

I wasn’t thinking about performance so much as code size in my original comment.  I’ve been looking at IR recently where code from multiple exception handlers was combined while still maintaining the basic control flow of the EH code.  This kind of optimization is wreaking havoc for our current MSVC compatible EH implementation (hence the redesign), but I guess the Itanium ABI scheme doesn’t have a problem with it.

Since I'm pretty sure I added at least one optimization to simplify cfg which sounds problematic for you, sorry! :)

 

I suppose that is closely related to your concerns about inlining, I just hadn’t made the connection.

The effect on inlining is that combining the exception handler blocks reduces the perceived cost to inline a function with lots of exceptional control flow.  When you're coming from a language where *every* call is an invoke and you need some per function/frame code to recover/rethrow just about any exception (that's just an implementation detail), commoning exceptional control flow turns out to be rather important. 

 

In theory the funclets should be able to share code blocks without any problem.  The entry and exit points are the critical parts that make them funclets.  I’m just not sure how we can get the optimization passes to recognize this fact while still meeting the MSVC runtime constraints.  Reid’s proposal of separate catch blocks should help with that, but I’m still not sure we’ll want to use this representation for targets that don’t need it.

If you fundamentally need the funclets, I think that makes you incompatible with the optimizations we want to support for itanium and related ABIs.  I have no problem with supporting both separately, though I am worried about how much maintenance burden that will create going forward. 



Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful. 

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Reid Kleckner-2
On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:
Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

We should do an updated summary, but I've been busy.

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Joseph Tremoulet

Hi,

 

I'd be interested in an updated summary as well, especially any thoughts you have on when the various pieces might come online and where extra hands could be helpful; we're eager to build our EH support in LLILC on top of this.

 

Thanks

-Joseph

 

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Thursday, June 18, 2015 12:22 PM
To: Philip Reames
Cc: Nick Lewycky; LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:

Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

 

We should do an updated summary, but I've been busy.

 

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

 

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

David Majnemer


On Wed, Jul 1, 2015 at 2:10 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

I'd be interested in an updated summary as well, especially any thoughts you have on when the various pieces might come online and where extra hands could be helpful; we're eager to build our EH support in LLILC on top of this.


A preliminary step was to push personality routines off of LandingPadInst and onto functions, this is already in trunk.

On the IR front, we ended up with the following set of instructions:

catcblock, catchret, cleanupblock, cleanupret, terminateblock

Of these, I've locally added basic support for catchblock, catchret and cleanupret.  I'm working on cleanupblock and terminateblock as we speak.

We'll still need to hook up CodeGenPrep and SDAG up to the new representation.

Extra hands will be appreciated auditing LLVM's optimizations to ensure that they do not miscompile in the face of our new instructions.

 

Thanks

-Joseph

 

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Thursday, June 18, 2015 12:22 PM
To: Philip Reames
Cc: Nick Lewycky; LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:

Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

 

We should do an updated summary, but I've been busy.

 

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

 

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

David Majnemer


On Wed, Jul 1, 2015 at 4:42 PM, David Majnemer <[hidden email]> wrote:


On Wed, Jul 1, 2015 at 2:10 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

I'd be interested in an updated summary as well, especially any thoughts you have on when the various pieces might come online and where extra hands could be helpful; we're eager to build our EH support in LLILC on top of this.


A preliminary step was to push personality routines off of LandingPadInst and onto functions, this is already in trunk.

On the IR front, we ended up with the following set of instructions:

catcblock, catchret, cleanupblock, cleanupret, terminateblock

Of these, I've locally added basic support for catchblock, catchret and cleanupret.  I'm working on cleanupblock and terminateblock as we speak.

We'll still need to hook up CodeGenPrep and SDAG up to the new representation.

Extra hands will be appreciated auditing LLVM's optimizations to ensure that they do not miscompile in the face of our new instructions.

And I forgot to mention: I hope to see the new instructions out for review by the end of the week.
 

 

Thanks

-Joseph

 

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Thursday, June 18, 2015 12:22 PM
To: Philip Reames
Cc: Nick Lewycky; LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:

Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

 

We should do an updated summary, but I've been busy.

 

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

 

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Philip Reames-4


On 07/01/2015 04:45 PM, David Majnemer wrote:


On Wed, Jul 1, 2015 at 4:42 PM, David Majnemer <[hidden email]> wrote:


On Wed, Jul 1, 2015 at 2:10 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

I'd be interested in an updated summary as well, especially any thoughts you have on when the various pieces might come online and where extra hands could be helpful; we're eager to build our EH support in LLILC on top of this.


A preliminary step was to push personality routines off of LandingPadInst and onto functions, this is already in trunk.

On the IR front, we ended up with the following set of instructions:

catcblock, catchret, cleanupblock, cleanupret, terminateblock

Of these, I've locally added basic support for catchblock, catchret and cleanupret.  I'm working on cleanupblock and terminateblock as we speak.

We'll still need to hook up CodeGenPrep and SDAG up to the new representation.

Extra hands will be appreciated auditing LLVM's optimizations to ensure that they do not miscompile in the face of our new instructions.

And I forgot to mention: I hope to see the new instructions out for review by the end of the week.
I look forward to these patches.  Please CC me as a reviewer so that I see them; I'm a bit behind on commit traffic at the moment.

Please make sure the LangRef changes are reasonable complete when you post the patch.  The semantics will be the part I'm most concerned about. 
 

 

Thanks

-Joseph

 

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Thursday, June 18, 2015 12:22 PM
To: Philip Reames
Cc: Nick Lewycky; LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:

Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

 

We should do an updated summary, but I've been busy.

 

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

 

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev





_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: New EH representation for MSVC compatibility

Joseph Tremoulet

> Extra hands will be appreciated auditing LLVM's optimizations to ensure that they do not miscompile in the face of our new instructions

thinking through what the output of that would be...  I suppose that for each relevant optimization, we'd want either a plausible explanation why it won't miscompile as-is, or a lit test demonstrating a miscompile (accompanied by a fix if small or a bug if large), is that the right idea?  Any particular optimization(s) I should start with?

 

> I look forward to these patches.  Please CC me as a reviewer

Likewise.

 

 

Thanks

-Joseph

 

 

From: Philip Reames [mailto:[hidden email]]
Sent: Thursday, July 2, 2015 8:25 PM
To: David Majnemer; Joseph Tremoulet
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

 

On 07/01/2015 04:45 PM, David Majnemer wrote:

 

 

On Wed, Jul 1, 2015 at 4:42 PM, David Majnemer <[hidden email]> wrote:

 

 

On Wed, Jul 1, 2015 at 2:10 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

I'd be interested in an updated summary as well, especially any thoughts you have on when the various pieces might come online and where extra hands could be helpful; we're eager to build our EH support in LLILC on top of this.

 

A preliminary step was to push personality routines off of LandingPadInst and onto functions, this is already in trunk.

 

On the IR front, we ended up with the following set of instructions:

 

catcblock, catchret, cleanupblock, cleanupret, terminateblock

 

Of these, I've locally added basic support for catchblock, catchret and cleanupret.  I'm working on cleanupblock and terminateblock as we speak.

 

We'll still need to hook up CodeGenPrep and SDAG up to the new representation.

 

Extra hands will be appreciated auditing LLVM's optimizations to ensure that they do not miscompile in the face of our new instructions.

 

And I forgot to mention: I hope to see the new instructions out for review by the end of the week.

I look forward to these patches.  Please CC me as a reviewer so that I see them; I'm a bit behind on commit traffic at the moment.

Please make sure the LangRef changes are reasonable complete when you post the patch.  The semantics will be the part I'm most concerned about. 

 

 

 

Thanks

-Joseph

 

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Thursday, June 18, 2015 12:22 PM
To: Philip Reames
Cc: Nick Lewycky; LLVM Developers Mailing List
Subject: Re: [LLVMdev] New EH representation for MSVC compatibility

 

On Wed, Jun 17, 2015 at 6:18 PM, Philip Reames <[hidden email]> wrote:

Since I haven't been following the thread closely, is there an updated summary of the original proposal available?  I know that various parts I had concerns with (the unsplitable blocks for instance) got addressed, but I'm having trouble tracking all the changes in discussion.  Even just a quick list of "and we changed X" would be helpful.  

 

We should do an updated summary, but I've been busy.

 

We really do need to keep the unsplittable catchblocks, though. Fundamentally, what we have is control flow described by data. This proposal essentially adds LLVM instructions that model that data. We can't insert real instructions like loads and stores between EH dispatch table entries, so it's forbidden.

 

We can compensate for all the other CFG merging operations after the fact by duplicating code, but it's really hard to pattern match back EH dispatch once it's been allowed to round trip through memory, phis, GVN, etc.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

 

 




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

 


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
12