RFC: New EH representation for MSVC compatibility


RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
After a long tale of sorrow and woe, my colleagues and I stand here before you defeated. The Itanium EH representation is not amenable to implementing MSVC-compatible exceptions. We need a new representation that preserves information about how try-catch blocks are nested.

WinEH background
-------------------------------

Skip this if you already know a lot about Windows exceptions. On Windows, every exceptional action that you can imagine is a function call. Throwing an exception is a call. Destructor cleanups and finally blocks are calls to outlined functions that run the cleanup code. Even catching an exception is implemented as an outlined catch handler function which returns the address of the basic block at which normal execution should continue.

This is *not* how Itanium landingpads work, where cleanups and catches are executed after unwinding and clearing old function frames off the stack. The transition to a landingpad is *not* like a function call, and this is the only special control transfer used for Itanium EH. In retrospect, having exactly one kind of control transfer turns out to be a great design simplification. Go Itanium!

Instead, the MSVC EH personality functions for every combination of architecture (x86, x64, ARM) and exception flavor (C++, SEH) are implemented with interval tables that express the nesting levels of various source constructs like destructors, try ranges, catch ranges, etc. When you rinse your program through LLVM IR today, this structure is what gets lost.

New information
-------------------------

Recently, we have discovered that the tables for __CxxFrameHandler3 have the additional constraint that the EH states assigned to a catch body must immediately follow the state numbers assigned to the try body. The natural scoping rules of C++ make it so that doing this numbering at the source level is trivial, but once we go to LLVM IR CFG soup, scopes are gone. If you want to know exactly what corner cases break down, search the bug database and mailing lists. The explanations are too long for this RFC.
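
To make the constraint concrete, here is a minimal C++ sketch. It assumes a simplified try-block table entry that stores only three state numbers; the struct and its field names (TryBlockEntry, tryLow, tryHigh, catchHigh) are illustrative assumptions, not a definition taken from this RFC. Under that assumption there is no separate lower bound for the catch states, so the catch body can only be described as the states that come right after the try body.

```cpp
#include <cassert>

// Hypothetical, simplified view of one try-block table entry.
struct TryBlockEntry {
  int tryLow;     // first state of the try body
  int tryHigh;    // last state of the try body
  int catchHigh;  // last state of the catch bodies
};

// A state belongs to the try body when it lies in [tryLow, tryHigh].
bool inTryBody(const TryBlockEntry& e, int state) {
  assert(e.tryLow <= e.tryHigh);
  return e.tryLow <= state && state <= e.tryHigh;
}

// No "catchLow" exists in this model, so the catch bodies are implicitly
// the states in (tryHigh, catchHigh].
bool inCatchBody(const TryBlockEntry& e, int state) {
  assert(e.tryHigh <= e.catchHigh);
  return e.tryHigh < state && state <= e.catchHigh;
}
```

With an entry such as {0, 1, 3}, states 0 and 1 are the try body and states 2 and 3 are the catch body; a numbering that placed an unrelated state between them could not be expressed in this model.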


New representation
------------------------------

I propose adding the following new instructions, all of which (except for resume) are glued to the top of their basic blocks, just like landingpads. They all have an optional ‘unwind’ label operand, which provides the IR with a tree-like structure of what EH action to take after this EH action completes. The unwind label only participates in the formation of the CFG when used in a catch block, and in other blocks it is considered opaque, personality-specific information. If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.
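
The inlining rule in the last sentence can be sketched in a few lines of C++. This is only an illustration under stated assumptions: EHBlock, kNoUnwind, and rewireForInlining are hypothetical names rather than LLVM APIs, and block references are modeled as plain integer indices that are assumed to be already remapped into the caller's numbering.

```cpp
#include <cassert>
#include <vector>

const int kNoUnwind = -1;  // models a missing unwind label

// Hypothetical stand-in for an EH block; not an LLVM class.
struct EHBlock {
  int unwind;  // index of the next EH action, or kNoUnwind
};

// Returns the callee's EH blocks as they might look after inlining into a
// call site whose unwind destination is callSiteUnwind. Blocks that already
// name a next action keep it; blocks with a missing label inherit the call
// site's destination.
std::vector<EHBlock> rewireForInlining(std::vector<EHBlock> calleeBlocks,
                                       int callSiteUnwind) {
  assert(callSiteUnwind >= kNoUnwind);
  for (EHBlock& b : calleeBlocks)
    if (b.unwind == kNoUnwind) b.unwind = callSiteUnwind;
  return calleeBlocks;
}
```

If the call site itself has no unwind destination (a plain call, modeled here by passing kNoUnwind), the labels stay missing and control still leaves the function after the action completes.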

The new representation is designed to be able to represent Itanium EH in case we want to converge on a single EH representation in LLVM and Clang. An IR pass can convert these actions to landingpads, typeid selector comparisons, and branches, which means we can phase this representation in on Windows at first and experiment with it slowly on other platforms. Over time, we can move the landingpad conversion lower and lower in the stack until it’s moved into DwarfEHPrepare. We’ll need to support landingpads at least until LLVM 4.0, but we may want to keep them because they are the natural representation for Itanium-style EH, and have a relatively low support burden.

resume
-------------

; Old form still works, still means control is leaving the function.
resume <valty> %val
; New form overloaded for intra-frame unwinding or resuming normal execution
resume <valty> %val, label %nextaction
; New form for EH personalities that produce no value
resume void

Now resume takes an optional label operand which is the next EH action to run. The label must point to a block starting with an EH action. The various EH action blocks impose personality-specific rules about what the targets of the resume can be.

catchblock
---------------

%val = catchblock <valty> [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int unwind label %nextaction

The catchblock is a terminator that conditionally selects which block to execute based on the opaque operands interpreted by the personality function. If the exception is caught, the ‘to’ block is executed. If unwinding should continue, the ‘unwind’ block is executed. Because the catchblock is a terminator, no instructions can be inserted into a catchblock. The MSVC personality function requires more than just a pointer to RTTI data, so a variable list of operands is accepted. For an Itanium personality, only one RTTI operand is needed. The ‘unwind’ label of a catchblock must point to a catchend.

catchendblock
----------------

catchendblock unwind label %nextaction

The catchend is a terminator that unconditionally unwinds to the next action. It is merely a placeholder to help reconstruct which invokes were part of the catch blocks of a try. Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to. Executing such an invoke that does not transitively unwind to the correct catchend block has undefined behavior.
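
A verifier-style check for this rule might look like the following C++ sketch. It is a rough model, not proposed implementation code: Block, Kind, unwindsTo, and catchBodyObeysCatchEndRule are hypothetical names, blocks are referred to by index, and edges created by resume are assumed not to be listed in normalSuccs, so the walk stops where control leaves the catch body.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kNone = -1;  // models a missing unwind label

enum class Kind { Normal, CatchBlock, CatchEnd, Cleanup };

// Hypothetical CFG node; not an LLVM class.
struct Block {
  Kind kind;
  std::vector<int> normalSuccs;  // br targets and 'to' labels, no unwind edges
  int unwind;                    // unwind label of the block, or kNone
  bool endsInInvoke;             // true if the terminator is an invoke
};

// Follows unwind edges starting at 'from' and reports whether 'target' is
// reached. The step counter guards against malformed cyclic input.
bool unwindsTo(const std::vector<Block>& cfg, int from, int target) {
  std::size_t steps = 0;
  for (int b = from; b != kNone && steps <= cfg.size();
       b = cfg[b].unwind, ++steps)
    if (b == target) return true;
  return false;
}

// Checks that every invoke reachable from the catchblock through normal
// edges transitively unwinds to the catchend the catchblock unwinds to.
bool catchBodyObeysCatchEndRule(const std::vector<Block>& cfg, int catchBlock) {
  assert(cfg[catchBlock].kind == Kind::CatchBlock);
  const int catchEnd = cfg[catchBlock].unwind;
  assert(catchEnd != kNone && cfg[catchEnd].kind == Kind::CatchEnd);
  std::vector<bool> seen(cfg.size(), false);
  std::vector<int> work(cfg[catchBlock].normalSuccs);
  while (!work.empty()) {
    int b = work.back();
    work.pop_back();
    if (seen[b]) continue;
    seen[b] = true;
    if (cfg[b].endsInInvoke && !unwindsTo(cfg, cfg[b].unwind, catchEnd))
      return false;
    for (int s : cfg[b].normalSuccs) work.push_back(s);
  }
  return true;
}
```

An invoke inside the catch body may unwind through intermediate actions, such as a cleanup, as long as the chain of unwind labels eventually arrives at the expected catchend.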

cleanupblock
--------------------

%val = cleanupblock <valty> unwind label %nextaction

This is not a terminator, and control is expected to flow into a resume instruction which indicates which EH block runs next. If the resume instruction and the unwind label disagree, behavior is undefined.

terminateblock
----------------------

; for noexcept
terminateblock [void ()* @std.terminate] unwind label %nextaction
; for exception specifications, throw(int)
terminateblock [void ()* @__cxa_unexpected, @typeid.int, ...] unwind label %nextaction

This is a terminator, and the unwind label is where execution will continue if the program continues execution. It also has an opaque, personality-specific list of constant operands interpreted by the backend of LLVM. The convention is that the first operand is the function to call to end the program, and the rest determine if the program should end.

sehfilterblock?
------------------

One big hole in the new representation is SEH filter expressions. They present a major complication because they do not follow a stack discipline. Any EH action is reachable after an SEH filter runs. Because the CFG is so useless for optimization purposes, it’s better to outline the filter in the frontend and assume the filter can run during any potentially throwing function call.

MSVC EH implementation strategy
----------------------------------------------

Skim this if you just need the semantics of the representation above, and not the implementation details.

The new EH block representation allows WinEHPrepare to get a lot simpler. EH blocks should now look a lot more familiar: they are single-entry, multi-exit regions of code. Such a region is essentially a function, and we can call these regions funclets. The plan is to generate code for the parent function first, skipping all exceptional blocks, and then generate separate MachineFunctions for each funclet in turn. I repeat, we can stop doing outlining in IR. Outlining in IR was just a mistake that I made because I was afraid of grappling with CodeGen.

WinEHPrepare will have two jobs now:
1. Mark down which basic blocks are reachable from which handler. Duplicate any blocks that are reachable from two handlers until each block belongs to exactly one funclet, pruning as many unreachable CFG edges as possible.
2. Demote SSA values that are defined in a funclet and used in another funclet.
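
The first job can be pictured as a coloring problem. The C++ sketch below is one possible reading, with assumptions called out: Block, colorBlocks, and blocksToDuplicate are hypothetical names, block 0 is assumed to be the parent function's entry, only normal (non-unwind) successors are recorded, and a walk stops at any block that starts another funclet.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical CFG node; not an LLVM class.
struct Block {
  std::vector<int> succs;  // normal (non-unwind) successors
  bool startsFunclet;      // true if the block begins with an EH action
};

// For every block, collect the set of funclet entries (plus block 0 for the
// parent function) that can reach it without entering another funclet.
std::vector<std::set<int> > colorBlocks(const std::vector<Block>& cfg) {
  std::vector<std::set<int> > colors(cfg.size());
  for (int entry = 0; entry < static_cast<int>(cfg.size()); ++entry) {
    if (entry != 0 && !cfg[entry].startsFunclet) continue;
    std::vector<int> work(1, entry);
    while (!work.empty()) {
      int b = work.back();
      work.pop_back();
      if (!colors[b].insert(entry).second) continue;  // already visited
      for (int s : cfg[b].succs) {
        assert(s >= 0 && s < static_cast<int>(cfg.size()));
        if (!cfg[s].startsFunclet) work.push_back(s);
      }
    }
  }
  return colors;
}

// Blocks with more than one color are shared between funclets and would
// have to be duplicated until each copy has exactly one owner.
std::vector<int> blocksToDuplicate(const std::vector<std::set<int> >& colors) {
  std::vector<int> result;
  for (int b = 0; b < static_cast<int>(colors.size()); ++b)
    if (colors[b].size() > 1) result.push_back(b);
  return result;
}
```

The sketch only identifies the shared blocks; the actual cloning and the pruning of unreachable edges are left out.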

The instruction selection pass is the pass that builds MachineFunctions from IR Functions. This is the pass that will be responsible for the split. It will maintain information about the offsets of static allocas in FunctionLoweringInfo, and will throw it away when all funclets have been generated for this function. This means we don’t need to insert framerecover calls anymore.

Generating EH state numbers for the TryBlockMap and StateUnwindTable is a matter of building a tree of EH blocks and invokes. Every unwind edge from an invoke or an EH block represents that the instruction is a child of the target block. If the unwind edge is empty, it is a child of the parent function, which is the root node of the tree. State numbers can be assigned by doing a DFS traversal where invokes are visited before EH blocks, and EH blocks can be visited in an arbitrary-but-deterministic order that vaguely corresponds to source order. Invokes are immediately assigned the current state number. Upon visiting an EH block, the state number is recorded as the “low” state of the block. All invokes are assigned this state number. The state number is incremented, and each child EH block is visited, passing in the state number and producing a new state number. The final state number is returned to the parent node.
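
The traversal can be sketched in C++ as follows. This is one literal reading of the paragraph above, not a claim about the exact numbers __CxxFrameHandler3 expects: Node and assignStates are hypothetical names, the tree is assumed to be already built from the unwind edges, and the initial state is whatever the caller passes in.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tree node: either an invoke or an EH block (the root stands
// for the parent function). Not an LLVM class.
struct Node {
  bool isInvoke;
  std::vector<int> children;  // nodes whose unwind edge targets this node
  int state;                  // invoke: its state; EH block: its "low" state
};

// Numbers the subtree rooted at 'node', starting from 'state', and returns
// the next unused state number to the caller.
int assignStates(std::vector<Node>& tree, int node, int state) {
  assert(node >= 0 && node < static_cast<int>(tree.size()));
  assert(!tree[node].isInvoke);
  tree[node].state = state;  // record the "low" state of this block
  for (int c : tree[node].children)  // invokes are visited first
    if (tree[c].isInvoke) tree[c].state = state;
  ++state;
  for (int c : tree[node].children)  // then child EH blocks, in stored order
    if (!tree[c].isInvoke) state = assignStates(tree, c, state);
  return state;
}
```

Child EH blocks are visited in the order they are stored, which stands in for the arbitrary-but-deterministic order mentioned above.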

Example IR from Clang
----------------------------------------

The C++:

struct Obj { ~Obj(); };
void f(int);
void foo() noexcept {
  try {
    f(1);
    Obj o;
    f(2);
  } catch (int e) {
    f(3);
    try {
      f(4);
    } catch (...) {
      f(5);
    }
  }
}

The IR for __CxxFrameHandler3:

define void @foo() personality i32 (...)* @__CxxFrameHandler3 {
  %e.addr = alloca i32
  invoke void @f(i32 1)
    to label %cont1 unwind label %maycatch.int
cont1:
  invoke void @f(i32 2)
    to label %cont2 unwind label %cleanup.Obj
cont2:
  call void @~Obj()
  br label %return
return:
  ret void

cleanup.Obj:
  cleanupblock unwind label %maycatch.int
  call void @~Obj()
  resume label %maycatch.int

maycatch.int:
  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int unwind label %catchend1
catch.int:
  invoke void @f(i32 3)
    to label %cont3 unwind label %catchend1
cont3:
  invoke void @f(i32 4)
    to label %cont4 unwind label %maycatch.all
cont4:
  resume label %return

maycatch.all:
  catchblock void [i8* null, i32 0, i8* null]
    to label %catch.all unwind label %catchend2
catch.all:
  invoke void @f(i32 5)
    to label %cont5 unwind label %catchend2
cont5:
  resume label %cont4

catchend2:
  catchendblock unwind label %catchend1

catchend1:
  catchendblock unwind label %callterminate

callterminate:
  terminateblock [void ()* @std.terminate]
}

From this IR, we can recover the original scoped nesting form that the table formation requires.

I think that covers it. Feedback welcome. :)

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: New EH representation for MSVC compatibility

Kaylor, Andrew

I like the way this sorts out with regard to funclet code generation.  It feels very natural for Windows EH, though obviously not as natural for non-Windows targets and I think it is likely to block some optimizations that are currently possible with those targets.

> If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.

Is this saying that a “missing” unwind label corresponds to telling the runtime to continue the search at the next frame?

Your example looks wrong in this regard, unless I’m misunderstanding it.  It looks like any exceptions that aren’t caught in that function will lead to a terminate call.

> Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to.

I’m not sure I understand this correctly.  In particular, I’m confused about the roles of resume and catchend.

> %val = cleanupblock <valty> unwind label %nextaction

Why isn’t this a terminator?  It seems like it performs the same sort of role as catchblock, except presumably it is always entered.  I suppose that’s probably the answer to my question, but it strikes me as an ambiguity in the scheme.  The catchblock instruction is more or less a conditional branch whereas the cleanupblock is more like a label with a hint as to an unconditional branch that will happen later.  And I guess that’s another thing that bothers me -- a resume instruction at the end of a catch implementation means something subtly different than a resume instruction at the end of a cleanup implementation.

From: Reid Kleckner [mailto:[hidden email]]
Sent: Friday, May 15, 2015 3:38 PM
To: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: RFC: New EH representation for MSVC compatibility

 


Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
In reply to this post by Reid Kleckner-2
On 2015-05-15 18:37:58 -0400, Reid Kleckner said:

> After a long tale of sorrow and woe, my colleagues and I stand here
> before you defeated. The Itanium EH representation is not amenable to
> implementing MSVC-compatible exceptions. We need a new representation
> that preserves information about how try-catch blocks are nested.

Hi,

Newbie here. This must be a dumb question, but it's not something I can
understand from reading the design documents and RFCs.

Why don't we write and use our own personality function, and then we
avoid all these restrictions on the interval tables? On Windows, we
would still have to catch exceptions with SEH, of course. But SEH
should be language-independent, in the sense that it concerns only
unwinding for the "low level" parts of the ABI, like restoring
non-volatile registers. It doesn't seem to make sense that LLVM, being
a language-independent IR, should concern itself with personality
functions specific to Microsoft's C++ run-time library.

I understand we want link compatibility with object code from Visual
Studio, but I didn't think that the personality-specific unwind tables
were actually an ABI "artifact". I mean, let's say you compile a
function in Visual Studio; the result is a function with some mangled
symbol that other object code can reference through the linker.
But the other object code never incestuously refers to the unwind
tables of the function it calls, right?

I'm speaking from the point of view of an implementor of a new
programming language who wants to interoperate with C++. I've already
got code that can decode the Microsoft RTTI info coming from C++
exceptions, and the pointers to those can be extracted with SEH
GetExceptionPointers, etc. To support catching exceptions in my
language, or at least calling cleanups while unwinding, I really don't
care about the C++ personality function. After all my language might
not even have the concept of (nested) try/catch blocks. The custom
personality function does have to be supplied as part of the run-time,
but as a frontend implementor I'm prepared to have to write a run-time
anyway.

Steve



Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
On Sat, May 16, 2015 at 7:29 AM, Steve Cheng <[hidden email]> wrote:

Right, doing our own personality function is possible, but still has half the challenge of using __CxxFrameHandler3, and introduces a new runtime dependency that isn't there currently. Given that it won't save that much work, I'd rather not introduce a dependency that wasn't there before.

The reason it's still hard is that you have to split the main function up into more than one subfunction. The exception object is allocated in the frame of the function calling _CxxThrowException, and it has to stay alive until control leaves the catch block receiving the exception. This is different from Itanium, where the exception object is constructed in heap memory and the pointer is saved in TLS. If this were not the case, we'd use the __gxx_personality_v0-style landingpad approach and make a new personality variant that understands MS RTTI.

We could try to do all this outlining in Clang, but that blocks a lot of LLVM optimizations. Any object with a destructor (std::string) is now escaped into the funclet that calls the destructor, and simple transformations (SROA) require interprocedural analysis. This affects the code on the normal code path and not just the exceptional path. While EH constructs like try / catch are fairly rare in C++, destructor cleanups are very, very common, and I'd rather not pessimize so much code.


Re: RFC: New EH representation for MSVC compatibility

Kaylor, Andrew
In reply to this post by Steve Cheng
We already have something like what you describe in the form of mingw support.  It runs on Windows and handles exceptions using a DWARF-style personality function and (I think) an LLVM-provided implementation of the libc++abi library.

What this doesn't do is provide interoperability with MSVC-compiled objects.  For instance, you can't throw an exception from MSVC-compiled code and catch it with clang/LLVM-compiled code or vice versa.  With the (too fragile) implementation we have in place right now you can do that (at least in cases that don't break for other reasons), and we want to be able to continue that capability with a new, more robust, solution.

-Andy

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Steve Cheng
Sent: Saturday, May 16, 2015 7:30 AM
To: [hidden email]
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

On 2015-05-15 18:37:58 -0400, Reid Kleckner said:

> After a long tale of sorrow and woe, my colleagues and I stand here
> before you defeated. The Itanium EH representation is not amenable to
> implementing MSVC-compatible exceptions. We need a new representation
> that preserves information about how try-catch blocks are nested.

Hi,

Newbie here. This must be a dumb question, but it's not something I can understand from reading the design documents and RFCs.

Why don't we write and use our own personality function, and then we avoid all these restrictions on the interval tables? On Windows, we would still have to catch exceptions with SEH, of course. But SEH should be language-independent, in the sense that it concerns only unwinding for the "low level" parts of the ABI, like restoring non-volatile registers. It doesn't seem to make sense that LLVM, being a language-independent IR, should concern itself with personality functions specific to Microsoft's C++ run-time library.

I understand we want link compatibility with object code from Visual Studio, but I didn't think that the personality-specific unwind tables were actually an ABI "artifact". I mean, let's say you compile a function in Visual Studio; the result is a function with some mangled symbol that other object code can reference through the linker.
But the other object code never incestuously refers to the unwind tables of the function it calls, right?

I'm speaking from the point of view of an implementor of a new programming language who wants to interoperate with C++. I've already got code that can decode the Microsoft RTTI info coming from C++ exceptions, and the pointers to those can be extracted with SEH GetExceptionPointers, etc. To support catching exceptions in my language, or at least calling cleanups while unwinding, I really don't care about the C++ personality function. After all my language might not even have the concept of (nested) try/catch blocks. The custom personality function does have to be supplied as part of the run-time, but as a frontend implementor I'm prepared to have to write a run-time anyway.

Steve


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


Re: RFC: New EH representation for MSVC compatibility

Vadim Chugunov
Are you guys talking specifically about Win32 EH here?   AFAIK, Win64
EH works with gcc-style personality routines just fine.


Re: RFC: New EH representation for MSVC compatibility

Kaylor, Andrew
No, we're talking about 32- and 64-bit programs.  The goal is specifically to get these programs to work with the MSVC runtime.  If the MSVC runtime starts handling an exception (for instance, within a library compiled with MSVC) it is only going to dispatch that exception to a handler that it is able to recognize.

If all you are interested in is handling exceptions within a closed system, then there are certainly a lot of ways it can be accomplished.  It's the desire for MSVC compatibility that constrains the implementation.

-----Original Message-----
From: Vadim Chugunov [mailto:[hidden email]]
Sent: Monday, May 18, 2015 11:17 AM
To: Kaylor, Andrew
Cc: Steve Cheng; [hidden email]
Subject: Re: [LLVMdev] RFC: New EH representation for MSVC compatibility

Are you guys talking specifically about Win32 EH here?   AFAIK, Win64
EH works with gcc-style personality routines just fine.


Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
In reply to this post by Reid Kleckner-2
On 2015-05-18 13:32:54 -0400, Reid Kleckner said:
>
> Right, doing our own personality function is possible, but still has
> half the challenge of using __CxxFrameHandler3, and introduces a new
> runtime dependency that isn't there currently. Given that it won't save
> that much work, I'd rather not introduce a dependency that wasn't there
> before.
>
> The reason it's still hard is that you have to split the main function
> up into more than one subfunction.

I see, but I thought we are able to outline cleanup code already?

And that the hiccup you are encountering is because __CxxFrameHandler3
requires unwind tables with properly-ordered state transitions? The
compiler SEH personality (__C_specific_handler) doesn't have that,
right? If you could manage __try, __finally already, doesn't that
provide the solution?

Let me be precise. Let's take your example with the "ambiguous IR lowering":

void test1() {
  // EH state = -1
  try {
    // EH state = 0
    try {
      // EH state = 1
      throw 1;
    } catch(...) {
      // EH state = 2
      throw;
    }
    // EH state = 0
  } catch (...) {
    // EH state = 3
  }
  // EH state = -1
}

If I were "lowering" to compiler SEH, I would do something like this:

  __try {
    // [0]
    // [1]
    __try {
      // [2]
      throw 1;
      // [3]
    } __except( MyCxxFilter1() ) {
      // [4]
      throw;
      // [5]
    }
    // [6]
    // [7]
  } __except( MyCxxFilter2() ) {
    // [8]
    // [9]
  }
  // [10]
  // [11]

My scope tables for __C_specific_handler look like this:

  BeginAddress EndAddress HandlerAddress JumpTarget
  [0]          [1]        MyCxxFilter2   [8]
  [2]          [3]        MyCxxFilter1   [4]
  [4]          [5]        MyCxxFilter2   [8]
  [6]          [7]        MyCxxFilter2   [8]

I'm "cheating" in that I can look at the source code,
but again, you can already lower __try, __except using
__C_specific_handler.  There are no state transitions encoded
in the compiler SEH scope table, so they aren't an issue...?
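
To make the table above concrete, here is a minimal C++ sketch of how a handler could consult such a scope table. Everything in it is an assumption made for illustration: the ScopeEntry layout, the integers standing in for the bracketed code positions, and the names findScope and exampleTable are invented here and are not the SEH personality's real data structures.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one scope-table row. The integers stand in for the
// bracketed code positions [0]..[11] used in the listing above.
struct ScopeEntry {
  int begin;           // BeginAddress
  int end;             // EndAddress (treated as inclusive in this sketch)
  std::string filter;  // HandlerAddress (name of the filter function)
  int jumpTarget;      // JumpTarget (start of the __except body)
};

// Return the first row whose [begin, end] range covers the faulting position,
// or nullptr if no row covers it (the search would then move to the caller).
const ScopeEntry* findScope(const std::vector<ScopeEntry>& table, int pc) {
  for (const ScopeEntry& e : table)
    if (pc >= e.begin && pc <= e.end)
      return &e;
  return nullptr;
}

// The four rows from the table above.
std::vector<ScopeEntry> exampleTable() {
  return {
      {0, 1, "MyCxxFilter2", 8},
      {2, 3, "MyCxxFilter1", 4},
      {4, 5, "MyCxxFilter2", 8},
      {6, 7, "MyCxxFilter2", 8},
  };
}
```

With this table, a throw at position 2 is offered to MyCxxFilter1 and continues at [4], while the rethrow at position 4 is offered to MyCxxFilter2 and continues at [8]. The lookup needs only address ranges, which is the point being made: no EH state numbers appear anywhere.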

Now there is a subtle problem in my "lowering" in that
there may be local objects with destructors, which have to
be lowered to __try, __finally.  Microsoft's compiler SEH,
and __C_specific_handler, does not allow a __try block
to have both __except and __finally following.  That's why
I suggest writing a personality function to replace
__C_specific_handler, one that does allow __finally + __except.


> The exception object is allocated in the frame of the function calling
> __CxxThrow, and it has to stay alive until control leaves the catch
> block receiving the exception.
> This is different from Itanium, where the exception object is
> constructed in heap memory and the pointer is saved in TLS. If this
> were not the case, we'd use the __gxx_personality_v0-style landingpad
> approach and make a new personality variant that understands MS RTTI.

I'm surprised, I really want to check this myself later this week. I
always thought that MSVCRT always copied your exception object because
I have always seen it invoking the copy constructor on throw X. (It was
a pain in my case because I didn't really want my exception objects to
be copyable, only movable, and at least VS 2010 still insisted that I
implement a copy constructor.)

Furthermore, the "catch info" in the MS ABI does have a field for the
destructor that the catch block has to call. It's not theoretical, I've
got code that calls that function pointer so I can properly catch C++
exceptions from a SEH handler. Though I might be mistaken in that the
field points to just an in-place destructor and not a deleting
destructor.

Also, I thought the stack is already unwound completely when you reach
the beginning of the catch block (but not a __finally block, i.e. the
cleanup code). At least, that's the impression I get from reading
reverse-engineered source code for the personality functions and the
Windows API RtlUnwindEx.

>
> We could try to do all this outlining in Clang, but that blocks a lot
> of LLVM optimizations. Any object with a destructor (std::string) is
> now escaped into the funclet that calls the destructor, and simple
> transformations (SROA) require interprocedural analysis. This affects
> the code on the normal code path and not just the exceptional path.
> While EH constructs like try / catch are fairly rare in C++, destructor
> cleanups are very, very common, and I'd rather not pessimize so much
> code.

Right, but __CxxFrameHandler3 already forces you to outline destructor
cleanups into funclets. So if you wanted to stop doing that, you would
have to write your own personality function, right?

What I am saying is, if you can design the personality function so that
it works naturally with LLVM IR --- which can't see the source-level
scopes --- that seems a whole lot less work versus:

* Changing the existing Itanium-based EH model in LLVM
* Incurring the wrath of people who like the Itanium model
* Having to maintain backwards compatibility or provide an upgrade path

Also, I think, if we want to eventually support trapped operations
(some kind of invoke div_with_trap mentioned in another thread),
wouldn't it be way easier to implement and optimize if the personality
function can be designed in the right way?

Steve




Re: New EH representation for MSVC compatibility

Reid Kleckner-2
In reply to this post by Kaylor, Andrew
On Fri, May 15, 2015 at 5:27 PM, Kaylor, Andrew <[hidden email]> wrote:

> I like the way this sorts out with regard to funclet code generation.  It feels very natural for Windows EH, though obviously not as natural for non-Windows targets and I think it is likely to block some optimizations that are currently possible with those targets.

Right, it will block some of today's optimizations by default. I'm OK with this because we can add those optimizations back by checking if the personality is Itanium-family (sjlj, arm, or dwarf), and optimizing EH codepaths is not usually performance critical.

>> If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.
>
> Is this saying that a “missing” unwind label corresponds to telling the runtime to continue the search at the next frame?

Yep. For the C++ data structure it would simply be a missing or null operand.

> Your example looks wrong in this regard, unless I’m misunderstanding it.  It looks like any exceptions that aren’t caught in that function will lead to a terminate call.

Well, those are the intended semantics of noexcept, unless I'm mistaken. And the inliner *should* wire up the unwind edge of the terminateblock to the unwind edge of the inlined invoke instruction, because it's natural to lower terminateblock to a catch-all plus termination call block. I wanted to express that as data, though, so that in the common case that the noexcept function is not inlined, we can simply flip the "noexcept" bit in the EH info. There's a similar optimization we can do for Itanium that we miss today.

>> Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to.
>
> I’m not sure I understand this correctly.  In particular, I’m confused about the roles of resume and catchend.

catchendblock is really there to support figuring out which calls were inside the catch scope. resume has two roles: moving to the next EH action after a cleanup, and transitioning from the catch block back to normal control flow. Some of my coworkers said it should be split into two instructions for each purpose, and I could go either way.

>> %val = cleanupblock <valty> unwind label %nextaction
>
> Why isn’t this a terminator?  It seems like it performs the same sort of role as catchblock, except presumably it is always entered.  I suppose that’s probably the answer to my question, but it strikes me as an ambiguity in the scheme.  The catchblock instruction is more or less a conditional branch whereas the cleanupblock is more like a label with a hint as to an unconditional branch that will happen later.  And I guess that’s another thing that bothers me -- a resume instruction at the end of a catch implementation means something subtly different than a resume instruction at the end of a cleanup implementation.

Yeah, reusing the resume instruction for both these things might not be good. I liked not having to add more terminator instructions, though. I think most optimizations will not care about the differences between the two kinds of resume. For CFG formation purposes, it either has one successor or none, and that's enough for most users.

I felt that cleanupblock should not be a terminator because it keeps the IR more concise. The smaller an IR construct is, the more people seem to understand it, so I tried to go with that.


Re: RFC: New EH representation for MSVC compatibility

Joseph Tremoulet
In reply to this post by Reid Kleckner-2

Hi,

Thanks for sending this out.  We're looking forward to seeing this come about, since we need funclet separation for LLILC as well (and I have cycles to spend on it, if that would be helpful).

Some questions about the new proposal:

- Do the new forms of resume have any implied read/write side-effects, or do they work just like a branch?  In particular, I'm wondering what prevents reordering a call across a resume.  Is this just something that code motion optimizations are expected to check for explicitly to avoid introducing UB per the "Executing such an invoke [or call] that does not transitively unwind to the correct catchend block has undefined behavior" rule?

- Does LLVM already have other examples of terminators that are glued to the top of their basic blocks, or will these be the first?  I ask because it's a somewhat nonstandard thing (a block in the CFG that can't have instructions added to it) that any code placement algorithms (PRE, PGO probe insertion, Phi elimination, RA spill/copy placement, etc.) may need to be adjusted for.  The adjustments aren't terrible (conceptually it's no worse than having unsplittable edges from each of the block's preds to each of its succs), but it's something to be aware of.

- Since this will require auditing any code with special processing of resume instructions to make sure it handles the new resume forms correctly, I wonder if it might be helpful to give resume (or the new forms of it) a different name, since then it would be immediately clear which code has/hasn't been updated to the new model.

- Is the idea that a resume (of the sort that resumes normal execution) ends only one catch/cleanup, or that it can end any number of them?  Did you consider having it end a single one, and giving it a source that references (in a non-flow-edge-inducing way) the related catchend?  If you did that, then:
  + The code to find a funclet region could terminate with confidence when it reaches this sort of resume, and
  + Resumes which exit different catches would have different sources and thus couldn't be merged, reducing the need to undo tail-merging with code duplication in EH preparation (by blocking the tail-merging in the first place)

- What is the plan for cleanup/__finally code that may be executed on either normal paths or EH paths?  One could imagine a number of options here:
  + require the IR producer to duplicate code for EH vs non-EH paths
  + duplicate code for EH vs non-EH paths during EH preparation
  + use resume to exit these even on the non-EH paths; code doesn't need to be duplicated (but could and often would be as an optimization for hot/non-EH paths), and normal paths could call the funclet at the end of the day

and it isn't clear to me which you're suggesting.  Requiring duplication can worst-case quadratically expand the code (in that if you have n levels of cleanup-inside-cleanup-inside-cleanup-…, and each cleanup has k code bytes outside the next-inner cleanup, after duplication you'll have k*n + k*(n-1) + … or O(k*n^2) bytes total [compared to k*n before duplication]), which I'd think could potentially be a problem in pathological inputs.
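
To put numbers on that estimate: assuming the series runs all the way down to a final term of k, it sums to k*n*(n+1)/2. The small C++ sketch below adds the terms one by one so the growth can be checked for concrete n and k. The function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Bytes of cleanup code before duplication: n nested cleanups, each with
// k bytes of its own outside the next-inner cleanup.
std::uint64_t bytesBeforeDuplication(std::uint64_t n, std::uint64_t k) {
  return k * n;
}

// Bytes after duplication, summing k*n + k*(n-1) + ... + k*1 term by term.
// Assumption: the series in the text continues down to a last term of k.
std::uint64_t bytesAfterDuplication(std::uint64_t n, std::uint64_t k) {
  std::uint64_t total = 0;
  for (std::uint64_t level = 1; level <= n; ++level)
    total += k * level;
  return total;
}
```

For n = 4 and k = 10 this gives 100 bytes after duplication against 40 before; for n = 100 and k = 16 it gives 80800 against 1600.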

 

Thanks
-Joseph

From: [hidden email] [mailto:[hidden email]] On Behalf Of Reid Kleckner
Sent: Friday, May 15, 2015 6:38 PM
To: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky; Kaylor, Andrew
Subject: [LLVMdev] RFC: New EH representation for MSVC compatibility

 

After a long tale of sorrow and woe, my colleagues and I stand here before you defeated. The Itanium EH representation is not amenable to implementing MSVC-compatible exceptions. We need a new representation that preserves information about how try-catch blocks are nested.

 

WinEH background

-------------------------------

 

Skip this if you already know a lot about Windows exceptions. On Windows, every exceptional action that you can imagine is a function call. Throwing an exception is a call. Destructor cleanups and finally blocks are calls to outlined functions that run the cleanup code. Even catching an exception is implemented as an outlined catch handler function which returns the address of the basic block at which normal execution should continue.

 

This is *not* how Itanium landingpads work, where cleanups and catches are executed after unwinding and clearing old function frames off the stack. The transition to a landingpad is *not* like a function call, and this is the only special control transfer used for Itanium EH. In retrospect, having exactly one kind of control transfer turns out to be a great design simplification. Go Itanium!

 

Instead, all MSVC EH personality functions (x86, x64, ARM) cross (C++, SEH) are implemented with interval tables that express the nesting levels of various source constructs like destructors, try ranges, catch ranges, etc. When you rinse your program through LLVM IR today, this structure is what gets lost.

 

New information

-------------------------

 

Recently, we have discovered that the tables for __CxxFrameHandler3 have the additional constraint that the EH states assigned to a catch body must immediately follow the state numbers assigned to the try body. The natural scoping rules of C++ make it so that doing this numbering at the source level is trivial, but once we go to LLVM IR CFG soup, scopes are gone. If you want to know exactly what corner cases break down, search the bug database and mailing lists. The explanations are too long for this RFC.

 

 

New representation

------------------------------

 

I propose adding the following new instructions, all of which (except for resume) are glued to the top of their basic blocks, just like landingpads. They all have an optional ‘unwind’ label operand, which provides the IR with a tree-like structure of what EH action to take after this EH action completes. The unwind label only participates in the formation of the CFG when used in a catch block, and in other blocks it is considered opaque, personality-specific information. If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.

 

The new representation is designed to be able to represent Itanium EH in case we want to converge on a single EH representation in LLVM and Clang. An IR pass can convert these actions to landingpads, typeid selector comparisons, and branches, which means we can phase this representation in on Windows at first and experiment with it slowly on other platforms. Over time, we can move the landingpad conversion lower and lower in the stack until it’s moved into DwarfEHPrepare. We’ll need to support landingpads at least until LLVM 4.0, but we may want to keep them because they are the natural representation for Itanium-style EH, and have a relatively low support burden.

 

resume
-------------

; Old form still works, still means control is leaving the function.
resume <valty> %val
; New form overloaded for intra-frame unwinding or resuming normal execution
resume <valty> %val, label %nextaction
; New form for EH personalities that produce no value
resume void

Now resume takes an optional label operand which is the next EH action to run. The label must point to a block starting with an EH action. The various EH action blocks impose personality-specific rules about what the targets of the resume can be.

 

catchblock
---------------

%val = catchblock <valty> [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int unwind label %nextaction

The catchblock is a terminator that conditionally selects which block to execute based on the opaque operands interpreted by the personality function. If the exception is caught, the ‘to’ block is executed. If unwinding should continue, the ‘unwind’ block is executed. Because the catchblock is a terminator, no instructions can be inserted into a catchblock. The MSVC personality function requires more than just a pointer to RTTI data, so a variable list of operands is accepted. For an Itanium personality, only one RTTI operand is needed. The ‘unwind’ label of a catchblock must point to a catchend.

 

catchendblock
----------------

catchend unwind label %nextaction

The catchend is a terminator that unconditionally unwinds to the next action. It is merely a placeholder to help reconstruct which invokes were part of the catch blocks of a try. Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to. Executing such an invoke that does not transitively unwind to the correct catchend block has undefined behavior.

 

cleanupblock
--------------------

%val = cleanupblock <valty> unwind label %nextaction

This is not a terminator, and control is expected to flow into a resume instruction which indicates which EH block runs next. If the resume instruction and the unwind label disagree, behavior is undefined.

 

terminateblock
----------------------

; for noexcept
terminateblock [void ()* @std.terminate] unwind label %nextaction
; for exception specifications, throw(int)
terminateblock [void ()* @__cxa_unexpected, @typeid.int, ...] unwind label %nextaction

This is a terminator, and the unwind label is where execution will continue if the program continues execution. It also has an opaque, personality-specific list of constant operands interpreted by the backend of LLVM. The convention is that the first operand is the function to call to end the program, and the rest determine if the program should end.

 

sehfilterblock?

------------------

 

One big hole in the new representation is SEH filter expressions. They present a major complication because they do not follow a stack discipline. Any EH action is reachable after an SEH filter runs. Because the CFG is so useless for optimization purposes, it’s better to outline the filter in the frontend and assume the filter can run during any potentially throwing function call.

 

MSVC EH implementation strategy

----------------------------------------------

 

Skim this if you just need the semantics of the representation above, and not the implementation details.

 

The new EH block representation allows WinEHPrepare to get a lot simpler. EH blocks should now look a lot more familiar, they are single entry, multi-exit regions of code. This is exactly equivalent to a function, and we can call them funclets. The plan is to generate code for the parent function first, skipping all exceptional blocks, and then generate separate MachineFunctions for each subfunction in turn. I repeat, we can stop doing outlining in IR. This was just a mistake, because I was afraid of grappling with CodeGen.

 

WinEHPrepare will have two jobs now:

1. Mark down which basic blocks are reachable from which handler. Duplicate any blocks that are reachable from two handlers until each block belongs to exactly one funclet, pruning as many unreachable CFG edges as possible.

2. Demote SSA values that are defined in a funclet and used in another funclet.
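
Job 1 can be pictured as a coloring problem. The C++ sketch below is only an illustration under assumptions: the CFG is a plain map from block names to successor names, a "funclet entry" is either the function's entry block or a block starting with an EH action, and the names colorBlocks and blocksToDuplicate are invented here. It reports which blocks would need cloning and does not perform the cloning or edge pruning itself.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Block = std::string;
using CFG = std::map<Block, std::vector<Block>>;  // block -> successors

// For each funclet entry, collect the blocks reachable from it without
// walking through another funclet entry. The result maps every visited
// block to the set of funclet entries that reach it.
std::map<Block, std::set<Block>> colorBlocks(const CFG& cfg,
                                             const std::set<Block>& entries) {
  std::map<Block, std::set<Block>> colors;
  for (const Block& entry : entries) {
    std::vector<Block> worklist{entry};
    while (!worklist.empty()) {
      Block b = worklist.back();
      worklist.pop_back();
      if (!colors[b].insert(entry).second)
        continue;  // already visited on behalf of this funclet
      auto it = cfg.find(b);
      if (it == cfg.end())
        continue;
      for (const Block& succ : it->second)
        if (!entries.count(succ))  // a different funclet starts here; stop
          worklist.push_back(succ);
    }
  }
  return colors;
}

// Blocks claimed by more than one funclet are the candidates for duplication.
std::set<Block> blocksToDuplicate(
    const std::map<Block, std::set<Block>>& colors) {
  std::set<Block> result;
  for (const auto& kv : colors)
    if (kv.second.size() > 1)
      result.insert(kv.first);
  return result;
}
```

After duplication each block would carry exactly one color, which is what lets each funclet be emitted as its own MachineFunction.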

 

The instruction selection pass is the pass that builds MachineFunctions from IR Functions. This is the pass that will be responsible for the split. It will maintain information about the offsets of static allocas in FunctionLoweringInfo, and will throw it away when all funclets have been generated for this function. This means we don’t need to insert framerecover calls anymore.

 

Generating EH state numbers for the TryBlockMap and StateUnwindTable is a matter of building a tree of EH blocks and invokes. Every unwind edge from an invoke or an EH block represents that the instruction is a child of the target block. If the unwind edge is empty, it is a child of the parent function, which is the root node of the tree. State numbers can be assigned by doing a DFS traversal where invokes are visited before EH blocks, and EH blocks can be visited in an arbitrary-but-deterministic order that vaguely corresponds to source order. Invokes are immediately assigned the current state number. Upon visiting an EH block, the state number is recorded as the “low” state of the block. All invokes are assigned this state number. The state number is incremented, and each child EH block is visited, passing in the state number and producing a new state number. The final state number is returned to the parent node.
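
The numbering walk described above can be sketched in a few lines of C++. This is an illustration under assumptions, not code from WinEHPrepare: the flat EHNode vector, the Numbering maps, the name assignStates, and the choice of -1 as the starting state for the parent function are all invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One node of the EH tree: the parent function (root) or an EH block.
// A node's children are the invokes and EH blocks whose unwind edge
// targets it.
struct EHNode {
  std::string name;
  std::vector<std::string> invokes;  // invokes unwinding directly to this node
  std::vector<int> children;         // indices of child EH blocks, in visit order
};

struct Numbering {
  std::map<std::string, int> lowState;     // EH block -> its "low" state
  std::map<std::string, int> invokeState;  // invoke -> assigned state
};

// Visit nodes[index] with the incoming state number and return the next
// unused state number to the caller.
int assignStates(const std::vector<EHNode>& nodes, int index, int state,
                 Numbering& out) {
  const EHNode& node = nodes[index];
  out.lowState[node.name] = state;    // record the "low" state of the block
  for (const std::string& inv : node.invokes)
    out.invokeState[inv] = state;     // invokes are numbered before EH blocks
  ++state;
  for (int child : node.children)
    state = assignStates(nodes, child, state, out);
  return state;                       // final state goes back to the parent
}
```

Calling assignStates(nodes, 0, -1, out) on a tree whose root is node 0 numbers the root's own invokes -1 and hands out increasing states to the EH blocks in the order their indices are listed, which is where the arbitrary-but-deterministic ordering comes in.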

 

Example IR from Clang
----------------------------------------

 

The C++:

 

struct Obj { ~Obj(); };
void f(int);
void foo() noexcept {
  try {
    f(1);
    Obj o;
    f(2);
  } catch (int e) {
    f(3);
    try {
      f(4);
    } catch (...) {
      f(5);
    }
  }
}

 

The IR for __CxxFrameHandler3:

 

define void @foo() personality i32 (...)* @__CxxFrameHandler3 {
  %e.addr = alloca i32
  invoke void @f(i32 1)
    to label %cont1 unwind label %maycatch.int
cont1:
  invoke void @f(i32 2)
    to label %cont2 unwind label %cleanup.Obj
cont2:
  call void @~Obj()
  br label %return
return:
  ret void

cleanup.Obj:
  cleanupblock unwind label %maycatch.int
  call void @~Obj()
  resume label %maycatch.int

maycatch.int:
  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int unwind label %catchend1
catch.int:
  invoke void @f(i32 3)
    to label %cont3 unwind label %catchend1
cont3:
  invoke void @f(i32 4)
    to label %cont4 unwind label %maycatch.all
cont4:
  resume label %return

maycatch.all:
  catchblock void [i8* null, i32 0, i8* null]
    to label %catch.all unwind label %catchend2
catch.all:
  invoke void @f(i32 5)
    to label %cont5 unwind label %catchend2
cont5:
  resume label %cont4

catchend2:
  catchendblock unwind label %catchend1

catchend1:
  catchendblock unwind label %callterminate

callterminate:
  terminateblock [void ()* @std.terminate]
}

 

From this IR, we can recover the original scoped nesting form that the table formation requires.

 

I think that covers it. Feedback welcome. :)


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
In reply to this post by Kaylor, Andrew
On 2015-05-18 14:02:44 -0400, Kaylor, Andrew said:

> We already have something like what you describe in the form of mingw
> support.  It runs on Windows and handles exceptions using a DWARF-style
> personality function and (I think) an LLVM-provided implementation of
> the libc++abi library.
>
> What this doesn't do is provide interoperability with MSVC-compiled
> objects.  For instance, you can't throw an exception from MSVC-compiled
> code and catch it with clang/LLVM-compiled code or vice versa.  With
> the (too fragile) implementation we have in place right now you can do
> that (at least in cases that don't break for other reasons), and we
> want to be able to continue that capability with a new, more robust,
> solution.

I skimmed the source code in libgcc of that personality function. It's
rather tricky in that it threads the SEH personality function through
an existing Itanium-style personality function. I agree completely that
the code might not be interoperable with MSVC, though I can't tell for
sure. But I wasn't thinking of threading an Itanium-style personality.
I was thinking of a personality that still adhered to SEH semantics as
much as possible but lifted the restrictions of __CxxFrameHandler3 that
block what you're doing so far.

Even the problem that Reid mentioned about _CxxThrowException putting
the exception object in the wrong place, I think, can be worked around
with a new personality. The personality just has to copy the exception
object into the stack frame of the function with the catch block (i.e.
"landing pad" in Itanium parlance) before RtlUnwindEx transfers control
to the landing pad. Obviously, you have to pre-allocate for the size of
the exception object, I guess, in your WinEHPrepare pass. Obviously
__CxxFrameHandler3 does not do that but we could.

Steve



Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
On 2015-05-18 15:40:34 -0400, I said:

> Even the problem that Reid mentioned about _CxxThrowException putting
> the exception object in the wrong place, I think, can be worked around
> with a new personality. The personality just has to copy the exception

Should have thought more before opening my mouth :)
Scratch that because it won't work with rethrows unless I get to patch
the address of the in-flight exception object. Damn global variables.



Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
In reply to this post by Steve Cheng
On Mon, May 18, 2015 at 11:48 AM, Steve Cheng <[hidden email]> wrote:
On 2015-05-18 13:32:54 -0400, Reid Kleckner said:

>> Right, doing our own personality function is possible, but still has half the challenge of using __CxxFrameHandler3, and introduces a new runtime dependency that isn't there currently. Given that it won't save that much work, I'd rather not introduce a dependency that wasn't there before.
>>
>> The reason it's still hard is that you have to split the main function up into more than one subfunction.
>
> I see, but I thought you are able to outline cleanup code already?

We can, but frankly it's unreliable. The new representation should help make the job easier.
 
> And that the hiccup you are encountering is because __CxxFrameHandler3 requires unwind tables with properly-ordered state transitions? The compiler SEH personality (__C_specific_handler) doesn't have that, right? If you could manage __try, __finally already, doesn't that provide the solution?

Right, __CxxFrameHandler3 is a lot more constraining than __C_specific_handler. The SEH personality doesn't let you rethrow exceptions, so once you catch the exception you're done, you're in the parent function. My understanding is that C++ works by having an active catch handler on the stack.
 
> Let me be precise. Let's take your example with the "ambiguous IR lowering":

I snipped the example, but in general, yes, I agree we could do another personality with a less restrictive table format. I'm still not convinced it's worth it.

>> The exception object is allocated in the frame of the function calling _CxxThrowException, and it has to stay alive until control leaves the catch block receiving the exception.
>> This is different from Itanium, where the exception object is constructed in heap memory and the pointer is saved in TLS. If this were not the case, we'd use the __gxx_personality_v0-style landingpad approach and make a new personality variant that understands MS RTTI.
>
> I'm surprised; I really want to check this myself later this week. I always thought that MSVCRT always copied your exception object because I have always seen it invoking the copy constructor on throw X. (It was a pain in my case because I didn't really want my exception objects to be copyable, only movable, and at least VS 2010 still insisted that I implement a copy constructor.)

Right, the type does have to be copyable. I think it's supposed to be copied as part of the throw-expression, but if not, then it has to go fill out the CatchableType tables, which have copy constructors in them. Anyway, I might be wrong about where precisely the exception lives in memory, but I'm sure the catches are outlined to support rethrow.
 
> Furthermore, the "catch info" in the MS ABI does have a field for the destructor that the catch block has to call. It's not theoretical, I've got code that calls that function pointer so I can properly catch C++ exceptions from a SEH handler. Though I might be mistaken in that the field points to just an in-place destructor and not a deleting destructor.

Yep.
 
> Also, I thought the stack is already unwound completely when you reach the beginning of the catch block (but not a __finally block, i.e. the cleanup code). At least, that's the impression I get from reading reverse-engineered source code for the personality functions and the Windows API RtlUnwindEx.

For __try / __except, yes, the stack is unwound at the point of the __except. For try / catch, the stack unwinds after you leave the catch body by fallthrough, goto, break, continue, return or whatever else you like, because after that point you cannot rethrow anymore.
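
A small standard C++ illustration of that rethrow constraint (nothing here is MSVC-specific, and the function names are made up): the inner handler rethrows with `throw;`, and the outer handler observes the very same exception object, so the object has to stay alive for as long as a rethrow is still possible.

```cpp
#include <stdexcept>

// Address of the exception object as seen by the inner handler.
static const void *InnerAddr = nullptr;

void inner() {
  try {
    throw std::runtime_error("boom");
  } catch (const std::runtime_error &E) {
    InnerAddr = &E;
    throw; // rethrow: reactivates the existing exception object
  }
}

// Returns true if the outer handler sees the same exception object.
bool rethrowReusesObject() {
  try {
    inner();
  } catch (const std::runtime_error &E) {
    return InnerAddr == &E;
  }
  return false;
}
```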

>> We could try to do all this outlining in Clang, but that blocks a lot of LLVM optimizations. Any object with a destructor (std::string) is now escaped into the funclet that calls the destructor, and simple transformations (SROA) require interprocedural analysis. This affects the code on the normal code path and not just the exceptional path. While EH constructs like try / catch are fairly rare in C++, destructor cleanups are very, very common, and I'd rather not pessimize so much code.
>
> Right, but __CxxFrameHandler3 already forces you to outline destructor cleanups into funclets. So if you wanted to stop doing that you'd have to write your own personality function, right?

No, I believe if we want to be ABI compatible, we need to outline at least destructor cleanups, regardless of what personality we use.
 
> What I am saying is, if you can design the personality function so that it works naturally with LLVM IR --- which can't see the source-level scopes --- that seems a whole lot less work versus:
>
> * Changing the existing Itanium-based EH model in LLVM
> * Incurring the wrath of people who like the Itanium model
> * Having to maintain backwards compatibility or provide an upgrade path

So, the nice thing about this design is that there are no scopes in normal control flow. The scoping is all built into the EH blocks, which most optimization passes don't care about. If you do a quick search through lib/Transforms, you'll see there are very few passes that operate on LandingPadInst and ResumeInst. Changing these instructions is actually relatively cheap, if we can agree on the new semantics.
 
> Also, I think, if we want to eventually support trapped operations (some kind of invoke div_with_trap mentioned in another thread), wouldn't it be way easier to implement and optimize if the personality function can be designed in the right way?

Right, asynch exceptions are definitely something that users keep asking for, so I'd like to see it done right if we want to do it at all. I think this change is separable, though. Asynch exceptions have a lot more to do with how you represent the potentially trapping operations (BB unwind labels, lots of invoked-intrinsics, more instructions) than how you represent the things to do on exception.

Thanks for taking a look!


Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
In reply to this post by Joseph Tremoulet
On Mon, May 18, 2015 at 12:03 PM, Joseph Tremoulet <[hidden email]> wrote:

Hi,

 

Thanks for sending this out.  We're looking forward to seeing this come about, since we need funclet separation for LLILC as well (and I have cycles to spend on it, if that would be helpful).

 

Some questions about the new proposal:

 

- Do the new forms of resume have any implied read/write side-effects, or do they work just like a branch?  In particular, I'm wondering what prevents reordering a call across a resume.  Is this just something that code motion optimizations are expected to check for explicitly to avoid introducing UB per the "Executing such an invoke [or call] that does not transitively unwind to the correct catchend block has undefined behavior" rule?


Yes, crossing a resume from a catchblock ends the lifetime of the exception object, so I'd say that's a "writes escaped memory" constraint. That said, a resume after a cleanupblock doesn't, but I'm not sure it's worth having this kind of fine-grained analysis. I'm OK teaching SimplifyCFG to combine cleanupblocks and leaving it at that.
  

- Does LLVM already have other examples of terminators that are glued to the top of their basic blocks, or will these be the first?  I ask because it's a somewhat nonstandard thing (a block in the CFG that can't have instructions added to it) that any code placement algorithms (PRE, PGO probe insertion, Phi elimination, RA spill/copy placement, etc.) may need to be adjusted for.  The adjustments aren't terrible (conceptually it's no worse than having unsplittable edges from each of the block's preds to each of its succs), but it's something to be aware of.


No, LLVM doesn't have anything like this yet. It does have unsplittable critical edges, which can come from indirectbr and the unwind edge of an invoke. I don't think it'll be too hard to teach transforms how to deal with one more, but maybe that's unrealistic youthful optimism. :)

- Since this will require auditing any code with special processing of resume instructions to make sure it handles the new resume forms correctly, I wonder if it might be helpful to give resume (or the new forms of it) a different name, since then it would be immediately clear which code has/hasn't been updated to the new model.


There aren't that many references to ResumeInst across LLVM, so I'm not too scared. I'm not married to reusing 'resume', other candidate names include 'unwind' and 'continue', and I'd like more ideas.
 

- Is the idea that a resume (of the sort that resumes normal execution) ends only one catch/cleanup, or that it can end any number of them?  Did you consider having it end a single one, and giving it a source that references (in a non-flow-edge-inducing way) the related catchend?  If you did that, then:

+ The code to find a funclet region could terminate with confidence when it reaches this sort of resume, and

+ Resumes which exit different catches would have different sources and thus couldn't be merged, reducing the need to undo tail-merging with code duplication in EH preparation (by blocking the tail-merging in the first place)


We already have something like this for cleanupblocks because the resume target and unwind label of the cleanupblock must match. It isn't as strong as having a reference to the catchblock itself, because tail merging could kick in like you mention. Undoing this would be and currently is the job of WinEHPrepare. I guess I felt like the extra representational complexity wasn't worth the confidence that it would buy us.
 

- What is the plan for cleanup/__finally code that may be executed on either normal paths or EH paths?  One could imagine a number of options here:

+ require the IR producer to duplicate code for EH vs non-EH paths

+ duplicate code for EH vs non-EH paths during EH preparation

+ use resume to exit these even on the non-EH paths; code doesn't need to be duplicated (but could and often would be as an optimization for hot/non-EH paths), and normal paths could call the funclet at the end of the day

and it isn't clear to me which you're suggesting.  Requiring duplication can worst-case quadratically expand the code (in that if you have n levels of cleanup-inside-cleanup-inside-cleanup-…, and each cleanup has k code bytes outside the next-inner cleanup, after duplication you'll have k*n + k*(n-1) + … or O(k*n^2) bytes total [compared to k*n before duplication]), which I'd think could potentially be a problem in pathological inputs.
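
As a quick check of the arithmetic in the question above, assuming exactly the model described there (n nested cleanups, k bytes in each one outside the next-inner cleanup), with hypothetical function names:

```cpp
#include <cstdint>

// Bytes after duplication, following the count in the question:
// k*n + k*(n-1) + ... + k*1.
std::uint64_t duplicatedBytes(std::uint64_t K, std::uint64_t N) {
  std::uint64_t Total = 0;
  for (std::uint64_t Level = 1; Level <= N; ++Level)
    Total += K * Level;
  return Total;
}

// Closed form of the same sum: k * n * (n + 1) / 2, which is O(k * n^2).
std::uint64_t duplicatedBytesClosedForm(std::uint64_t K, std::uint64_t N) {
  return K * N * (N + 1) / 2;
}
```

For k = 10 and n = 4 this gives 100 bytes after duplication, against k*n = 40 before.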


I want to have separate normal and exceptional codepaths, but at -O0 all the cleanup work should be bundled up in a function that gets called from both those paths.

Today, for C++ destructors, we emit two calls to the destructor: one on the normal path and one on the EH path. For __finally, we outline the finally body early in clang and emit two calls to it as before, but passing in the frameaddress as an argument. I think this is a great place to be. It keeps our -O0 code size small, simplifies the implementation, and allows us to inline one or both call sites if we think it's profitable.
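
A portable sketch of that shape, assuming nothing about the real lowering: finallyBody stands in for the outlined __finally body, Frame for the parent frame it receives, and the catch-and-rethrow is only an emulation of the EH-path call site (real SEH runs the body during unwinding).

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the parent frame that the outlined body reads.
struct Frame {
  std::string Log;
};

// The "outlined" finally body: one function, two call sites.
void finallyBody(bool AbnormalTermination, Frame &F) {
  F.Log += AbnormalTermination ? "cleanup(eh);" : "cleanup(normal);";
}

void body(bool ShouldThrow, Frame &F) {
  F.Log += "body;";
  if (ShouldThrow)
    throw std::runtime_error("boom");
}

// Emulates: __try { body(); } __finally { ... }
std::string run(bool ShouldThrow) {
  Frame F;
  try {
    try {
      body(ShouldThrow, F);
    } catch (...) {
      finallyBody(/*AbnormalTermination=*/true, F); // EH path call site
      throw;
    }
    finallyBody(/*AbnormalTermination=*/false, F); // normal path call site
  } catch (const std::runtime_error &) {
    F.Log += "caught;";
  }
  return F.Log;
}
```

Because both paths call the same small function, either call site can be inlined independently if that looks profitable.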


Re: RFC: New EH representation for MSVC compatibility

Steve Cheng
In reply to this post by Reid Kleckner-2
Hi Reid,

> Right, __CxxFrameHandler3 is a lot more constraining than
> __C_specific_handler. The SEH personality doesn't let you rethrow
> exceptions, so once you catch the exception you're done, you're in the
> parent function. My understanding is that C++ works by having an active
> catch handler on the stack.

Okay, I checked the Wine source code for __CxxFrameHandler3. I stand corrected.

While we are on the topic of Windows EH, I'd like to know your (and
others', of course) thoughts on the following. It's my wishlist as a
frontend implementor :)

- Win32 (x86) frame-based SEH

For __CxxFrameHandler3, since destructors and catch blocks execute as
funclets while the throwing function's stack frame is still active,
it's not going to be a problem right?

But for __C_specific_handler, I see a potential issue versus x86-64, in
that RtlUnwind can't restore non-volatile registers, so registers are
garbage when control is transferred to the landing pad.  When I
read the Itanium ABI documentation, it says that landing pads do get
non-volatile registers restored, so I guess that's probably the working
assumption of LLVM.

__C_specific_handler's registration frame saves down EBP, but no other
registers, even ESP. If we use dynamic alloca or frame pointer
omission, we are dead in the water, right?

- Writing one's own personality functions

This makes a lot of sense if one is implementing a different language
than C++ that has exceptions, and is prepared to provide their own
run-time support.

Say, if the language supports resuming from exceptions, or can query
type information in more flexible ways than C++'s std::type_info
matching. Does it really make sense for the backend, LLVM, to hard-code
knowledge about the language-specific data area (LSDA)? Even in the
Itanium ABI it's explicitly stated that the personality is specific to
the source language, yet multiple personalities can interoperate in the
same program. Ideally, I would prefer the backend to take control of
everything to do with arranging the landing pads, branches within
landing pads, and so on, but NOT the language-dependent exception
matching.

Taken to the extreme, LLVM would have to expose tables that the LLVM
client would have to translate to their own formats, like the garbage
collection "unwind" tables. If that's too complicated at least it would
be nice to supply custom filter functions for catch clauses. Inspired
by SEH filters obviously, but we might devise a slightly more portable
version.

Even for C++ I actually wouldn't mind being able to arbitrarily replace
the personality, and/or the runtime functions for throwing and
resuming. In my C++ source code I always throw exceptions wrapped in a
macro, because I want to instrument all my throw statements. In
particular, I can construct a reliable stack trace on the spot with
RtlVirtualUnwind (or walking the EBP chain on x86). It would be a nice
bonus if we could implement this kind of instrumentation with Clang.
Encouragement to switch from MSVC :)

Steve



Re: RFC: New EH representation for MSVC compatibility

Reid Kleckner-2
 On Mon, May 18, 2015 at 2:42 PM, Steve Cheng <[hidden email]> wrote:
Hi Reid,

>> Right, __CxxFrameHandler3 is a lot more constraining than __C_specific_handler. The SEH personality doesn't let you rethrow exceptions, so once you catch the exception you're done, you're in the parent function. My understanding is that C++ works by having an active catch handler on the stack.
>
> Okay, I checked the Wine source code for __CxxFrameHandler3. I stand corrected.
>
> While we are on the topic of Windows EH, I'd like to know your (and others', of course) thoughts on the following. It's my wishlist as a frontend implementor :)
>
> - Win32 (x86) frame-based SEH
>
> For __CxxFrameHandler3, since destructors and catch blocks execute as funclets while the throwing function's stack frame is still active, it's not going to be a problem, right?

My understanding is that __CxxFrameHandler3 does something like the following:

for (void (*Cleanup)(bool, void*) : Cleanups) {
  __try {
    Cleanup(/*AbnormalTermination=*/true, EstablisherFrame);
  } __except(1) {
    std::terminate(); // can't rethrow
  }
}
__try {
  CallCatchBlock();
} __except(__CxxDetectRethrow(), EXCEPTION_CONTINUE_SEARCH) {
}

So I guess it's not really that the catch block has an active frame, and more that __CxxFrameHandler3 is there saying "hey, I saw a rethrow exception go by during phase 1, here's what that exception was supposed to be".

> But for __C_specific_handler, I see a potential issue versus x86-64, in that RtlUnwind can't restore non-volatile registers, so registers are garbage when control is transferred to the landing pad.  When I read the Itanium ABI documentation, it says that landing pads do get non-volatile registers restored, so I guess that's probably the working assumption of LLVM.

That's pretty frustrating, given that the xdata unwinder already knows where the non-volatile registers are saved. Anyway, I think it can be overcome in the backend with the right register allocation constraints.
 
> __C_specific_handler's registration frame saves down EBP, but no other registers, even ESP. If we use dynamic alloca or frame pointer omission, we are dead in the water, right?

Are you sure the unwinder doesn't restore RSP? Anyway, the address of a dynamic alloca can easily be spilled to the stack and reloaded.
 
> - Writing one's own personality functions
>
> This makes a lot of sense if one is implementing a different language than C++ that has exceptions, and is prepared to provide their own run-time support.
>
> Say, if the language supports resuming from exceptions, or can query type information in more flexible ways than C++'s std::type_info matching. Does it really make sense for the backend, LLVM, to hard-code knowledge about the language-specific data area (LSDA)? Even in the Itanium ABI it's explicitly stated that the personality is specific to the source language, yet multiple personalities can interoperate in the same program. Ideally, I would prefer the backend to take control of everything to do with arranging the landing pads, branches within landing pads, and so on, but NOT the language-dependent exception matching.
>
> Taken to the extreme, LLVM would have to expose tables that the LLVM client would have to translate to their own formats, like the garbage collection "unwind" tables. If that's too complicated at least it would be nice to supply custom filter functions for catch clauses. Inspired by SEH filters obviously, but we might devise a slightly more portable version.

I think LLVM has to know about the table format and landingpad PC values, because that's its business. The RTTI data, though, is completely between the frontend and the EH personality. I could imagine a personality that uses an Itanium LSDA, but the RTTI pointers are really pointers to functions that get called during phase 1 to implement SEH filters. The new representation will actually allow you to pass more data here to support passing in "adjectives" as required for MSVC, but LLVM will have to know where to put it in the table and there's no way to avoid that.
 
> Even for C++ I actually wouldn't mind being able to arbitrarily replace the personality, and/or the runtime functions for throwing and resuming. In my C++ source code I always throw exceptions wrapped in a macro, because I want to instrument all my throw statements. In particular, I can construct a reliable stack trace on the spot with RtlVirtualUnwind (or walking the EBP chain on x86). It would be a nice bonus if we could implement this kind of instrumentation with Clang. Encouragement to switch from MSVC :)



Re: New EH representation for MSVC compatibility

Kaylor, Andrew
In reply to this post by Reid Kleckner-2

I hadn’t noticed the “noexcept” specifier in your example.  That clears up part of my concerns, but I still have some problems.

 

 

With regard to the multiple meanings of ‘resume’, I am more concerned about developers who are reading the IR understanding it than about passes operating on it.  Apart from making it harder to debug problems related to control flow at resume instructions, I think this makes it more likely that code which mishandles it will be introduced down the road.  If I’m reading things correctly, a resume instruction in your proposal could mean:

 

a) We’re done handling this exception, continue normal execution at this label.

b) We’re done handling this exception, continue execution in an enclosing catch handler at this label.

c) We’re done executing this termination handler, check the catch handler at this label to see if it can handle the current exception.

d) We’re done executing this termination handler, now execute the termination handler at this label.

e) We’re done executing this termination handler, continue handling the exception in the runtime.

 

I suppose (a) and (b) are more or less the same and it doesn’t entirely matter whether the destination is in normal code or exception code.  In practical terms (c) and (d) may be the same also, but logically, in terms of how the runtime works, they are different.  I’m pretty sure there’s a gap in my understanding of your proposal because I don’t understand how (e) is represented at all.

 

As an exercise, I tried to work through the IR that would be produced in the non-optimized case for the following code:

 

void test() {
  try {
    Obj o1;
    try {
      f();
    } catch (int) {}
    Obj o2;
    try {
      g();
    } catch (int) {}
    h();
  } catch (int) {}
}

 

Here’s what I came up with:

 

define void @foo() personality i32 (...)* @__CxxFrameHandler3 {
  %e.addr = alloca i32
  invoke void @f(i32 1)
    to label %cont1 unwind label %cleanup.Obj
cont1:
  invoke void @g(i32 2)
    to label %cont2 unwind label %cleanup.Obj.1
cont2:
  invoke void @h(i32 2)
    to label %cont3 unwind label %cleanup.Obj.2
cont3:
  call void @~Obj()
  call void @~Obj()
  br label %return
return:
  ret void

cleanup.Obj:
  cleanupblock unwind label %maycatch.int
  call void @~Obj()
  resume label %maycatch.int

maycatch.int:
  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int unwind label %catchend
catch.int:
  resume label %cont1
catchend:
  resume

cleanup.Obj.1:
  cleanupblock unwind label %maycatch.int.1
  call void @~Obj()
  call void @~Obj()
  resume label %maycatch.int.1

maycatch.int.1:
  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int.1 unwind label %catchend.1
catch.int.1:
  resume label %cont2
catchend.1:
  resume

cleanup.Obj.2:
  cleanupblock unwind label %maycatch.int.2
  call void @~Obj()
  call void @~Obj()
  resume label %maycatch.int.2

maycatch.int.2:
  catchblock void [i8* @typeid.int, i32 7, i32* %e.addr]
    to label %catch.int.2 unwind label %catchend.2
catch.int.2:
  resume label %return
catchend.2:
  resume
}

 

I don’t know if I got that right, but it seems to me that there are a couple of problems with this.  Most obviously, there is a good bit of duplicated code here (which the optimization passes will probably want to combine).

 

More significant, though, is that it doesn’t correctly describe what happens if a non-int exception is thrown in any of the called functions.  For instance, if a non-int exception is thrown from g() that is caught somewhere further down the stack, the runtime should call a termination handler that destructs o2 and then call a termination handler that destructs o1.  However, my IR doesn’t describe a termination handler that destructs just o2 and I don’t know how I could get it to do so within the scheme that you have proposed.
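
For reference, the source-level behavior in question can be checked with plain C++. This sketch assumes g() throws a float and adds logging to Obj (both are additions for illustration); it shows both destructors running, in reverse order of construction, before the exception is handled by the caller:

```cpp
#include <string>

static std::string Log;

// Obj with a name so that the destructor order shows up in the log.
struct Obj {
  const char *Name;
  explicit Obj(const char *N) : Name(N) {}
  ~Obj() { Log += std::string("~") + Name + ";"; }
};

void f() {}
void g() { throw 1.5f; } // assumed: a non-int exception
void h() {}

void test() {
  try {
    Obj o1("o1");
    try { f(); } catch (int) {}
    Obj o2("o2");
    try { g(); } catch (int) {} // no handler in test() matches a float
    h();
  } catch (int) {}
}

// The caller plays the role of the frame that finally catches the exception.
std::string runTest() {
  Log.clear();
  try {
    test();
  } catch (float) {
    Log += "caught float;";
  }
  return Log;
}
```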

 

Do you have a way to handle this case that I haven’t perceived?

 

 

In a mostly unrelated matter, have you thought about what needs to be done to prevent catchblock blocks from being combined?  For example, suppose you have code that looks like this:

 

void test() {
  try {
    f();
  } catch (int) {
    x();
    y();
    z();
  }
  try {
    g();
  } catch (...) {
  }
  try {
    h();
  } catch (int) {
    x();
    y();
    z();
  }
}

 

I think it’s very likely that if we don’t do anything to prevent it the IR generated for this will be indistinguishable from the IR generated for this:

 

void test() {
  try {
    f();
    try {
      g();
    } catch (...) {
    }
    h();
  } catch (int) {
    x();
    y();
    z();
  }
}

 

In this case that might be OK, but theoretically the calls to f() and h() should get different states and there are almost certainly cases where failing to recognize that will cause problems.  What’s more, the same basic pattern arises for this case:

 

void test() {
  try {
    f();
  } catch (int) {
    x();
    y();
    z();
  }
  try {
    g();
  } catch (float) {
  }
  try {
    h();
  } catch (int) {
    x();
    y();
    z();
  }
}

 

But in this case, if we get the state numbering wrong an int-exception from g() could end up being incorrectly caught by the xyz handler.
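
The expected source-level behavior of that last case can be pinned down with a small standard C++ test. The bodies of f, g, and h and the single xyz() helper (standing in for x(); y(); z();) are assumptions for illustration:

```cpp
#include <string>

static std::string Trace;

void f() {}
void g() { throw 42; } // assumed: g() throws an int
void h() {}
void xyz() { Trace += "xyz;"; } // stands in for x(); y(); z();

void test() {
  try { f(); } catch (int) { xyz(); }
  try { g(); } catch (float) {}
  try { h(); } catch (int) { xyz(); }
}

// An int thrown from g() must leave test() without running either xyz handler.
std::string runTest() {
  Trace.clear();
  try {
    test();
    Trace += "returned;";
  } catch (int) {
    Trace += "escaped;";
  }
  return Trace;
}
```

If the two catch (int) handlers were merged and the state numbering let g() share their try range, the trace would contain "xyz;" instead.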

 

BTW, finding cases like this is the primary reason that I’ve been trying to push my current in-flight patch onto the sinking ship that is our current implementation.  I mentioned to you before that the test suite I’m using passes with my proposed patch, but that’s only true with optimizations disabled.  With optimizations turned on I’m seeing all kinds of fun things like similar handlers being combined and common instructions being hoisted above a shared(!) eh_begincatch call in if-else paired handlers.  I don’t know if it will be worth trying to fix these problems, but seeing them in action has been very instructive.

 

-Andy

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Monday, May 18, 2015 11:54 AM
To: Kaylor, Andrew
Cc: LLVM Developers Mailing List; Bill Wendling; Nick Lewycky
Subject: Re: New EH representation for MSVC compatibility

 

On Fri, May 15, 2015 at 5:27 PM, Kaylor, Andrew <[hidden email]> wrote:

I like the way this sorts out with regard to funclet code generation.  It feels very natural for Windows EH, though obviously not as natural for non-Windows targets and I think it is likely to block some optimizations that are currently possible with those targets.

 

Right, it will block some of today's optimizations by default. I'm OK with this because we can add those optimizations back by checking if the personality is Itanium-family (sjlj, arm, or dwarf), and optimizing EH codepaths is not usually performance critical.

 

> If the unwind label is missing, then control leaves the function after the EH action is completed. If a function is inlined, EH blocks with missing unwind labels are wired up to the unwind label used by the inlined call site.

 

Is this saying that a “missing” unwind label corresponds to telling the runtime to continue the search at the next frame?

 

Yep. For the C++ data structure it would simply be a missing or null operand.

 

Your example looks wrong in this regard, unless I’m misunderstanding it.  It looks like any exceptions that aren’t caught in that function will lead to a terminate call.

 

Well, those are the intended semantics of noexcept, unless I'm mistaken. And the inliner *should* wire up the unwind edge of the terminateblock to the unwind edge of the inlined invoke instruction, because it's natural to lower terminateblock to a catch-all plus termination call block. I wanted to express that as data, though, so that in the common case that the noexcept function is not inlined, we can simply flip the "noexcept" bit in the EH info. There's a similar optimization we can do for Itanium that we miss today.
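
For readers less familiar with the source-level picture, here is a rough sketch of the two shapes being compared. The names `callee`, `fNoexcept`, and `fLowered` are made up for illustration, and the equivalence is only approximate: the standard leaves it implementation-defined whether the stack is unwound before `std::terminate` is called for a `noexcept` violation, whereas the explicit catch-all always unwinds to the handler first.

```cpp
#include <cassert>
#include <exception>

static int calleeCalls = 0;

// Hypothetical callee that may or may not throw.
void callee(bool shouldThrow) {
  ++calleeCalls;
  if (shouldThrow)
    throw 42;
}

// The "data" form: the function carries a noexcept bit, and an exception
// escaping callee() ends in std::terminate().
void fNoexcept(bool shouldThrow) noexcept { callee(shouldThrow); }

// The "lowered" form: a catch-all handler plus an explicit termination call.
void fLowered(bool shouldThrow) {
  try {
    callee(shouldThrow);
  } catch (...) {
    std::terminate();
  }
}

// Only the first form is visible to the type system as non-throwing.
static_assert(noexcept(fNoexcept(false)), "noexcept bit is set");
static_assert(!noexcept(fLowered(false)), "lowered form has no noexcept bit");
```

Both functions behave the same when `callee` returns normally; passing `true` to either one ends the process.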

 

> Invokes that are reached after a catchblock without following any unwind edges must transitively unwind to the first catchend block that the catchblock unwinds to.

 

I’m not sure I understand this correctly.  In particular, I’m confused about the roles of resume and catchend.

 

catchendblock is really there to support figuring out which calls were inside the catch scope. resume has two roles: moving to the next EH action after a cleanup, and transitioning from the catch block back to normal control flow. Some of my coworkers said it should be split into two instructions, one for each purpose, and I could go either way.

 

> %val = cleanupblock <valty> unwind label %nextaction

 

Why isn’t this a terminator?  It seems like it performs the same sort of role as catchblock, except presumably it is always entered.  I suppose that’s probably the answer to my question, but it strikes me as an ambiguity in the scheme.  The catchblock instruction is more or less a conditional branch whereas the cleanupblock is more like a label with a hint as to an unconditional branch that will happen later.  And I guess that’s another thing that bothers me -- a resume instruction at the end of a catch implementation means something subtly different than a resume instruction at the end of a cleanup implementation.

 

Yeah, reusing the resume instruction for both these things might not be good. I liked not having to add more terminator instructions, though. I think most optimizations will not care about the differences between the two kinds of resume. For CFG formation purposes, it either has one successor or none, and that's enough for most users. 

 

I felt that cleanupblock should not be a terminator because it keeps the IR more concise. The smaller an IR construct is, the more people seem to understand it, so I tried to go with that.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: New EH representation for MSVC compatibility

Philip Reames-4
In reply to this post by Reid Kleckner-2


On 05/18/2015 11:53 AM, Reid Kleckner wrote:
> On Fri, May 15, 2015 at 5:27 PM, Kaylor, Andrew <[hidden email]> wrote:
>
>> I like the way this sorts out with regard to funclet code generation.  It feels very natural for Windows EH, though obviously not as natural for non-Windows targets and I think it is likely to block some optimizations that are currently possible with those targets.
>
> Right, it will block some of today's optimizations by default. I'm OK with this because we can add those optimizations back by checking if the personality is Itanium-family (sjlj, arm, or dwarf), and optimizing EH codepaths is not usually performance critical.

Leaving aside the rest of the thread, I feel the need to refute this point in isolation.  I've found that optimizing (usually simplifying and eliminating) exception paths ends up being *extremely* important for my workloads.  Failing to optimize exception paths sufficiently tends to indirectly hurt things like inlining for example.  Any design which starts with the assumption that optimizing exception paths isn't important is going to be extremely problematic for me.
 




Re: New EH representation for MSVC compatibility

Kaylor, Andrew

> optimizing EH codepaths is not usually performance critical.

>> Leaving aside the rest of the thread, I feel the need to refute this point in isolation.  I've found that optimizing (usually simplifying and eliminating) exception paths ends up being *extremely* important for my workloads.  Failing to optimize exception paths sufficiently tends to indirectly hurt things like inlining for example.  Any design which starts with the assumption that optimizing exception paths isn't important is going to be extremely problematic for me. 

That’s interesting. 

 

I wasn’t thinking about performance so much as code size in my original comment.  I’ve been looking at IR recently where code from multiple exception handlers was combined while still maintaining the basic control flow of the EH code.  This kind of optimization is wreaking havoc on our current MSVC-compatible EH implementation (hence the redesign), but I guess the Itanium ABI scheme doesn’t have a problem with it.

 

I suppose that is closely related to your concerns about inlining, I just hadn’t made the connection.

 

In theory the funclets should be able to share code blocks without any problem.  The entry and exit points are the critical parts that make them funclets.  I’m just not sure how we can get the optimization passes to recognize this fact while still meeting the MSVC runtime constraints.  Reid’s proposal of separate catch blocks should help with that, but I’m still not sure we’ll want to use this representation for targets that don’t need it.

 



Re: New EH representation for MSVC compatibility

Reid Kleckner-2
On Mon, May 18, 2015 at 4:36 PM, Kaylor, Andrew <[hidden email]> wrote:

> optimizing EH codepaths is not usually performance critical.

>> Leaving aside the rest of the thread, I feel the need to refute this point in isolation.  I've found that optimizing (usually simplifying and eliminating) exception paths ends up being *extremely* important for my workloads.  Failing to optimize exception paths sufficiently tends to indirectly hurt things like inlining for example.  Any design which starts with the assumption that optimizing exception paths isn't important is going to be extremely problematic for me. 


On the whole, the reason we've gone down this path is to support stronger analysis of EH paths, but I always think about it in terms of supporting simplification of the normal control flow path. Consider unique_ptr:

void f() {
  std::unique_ptr<int> p(new int(42));
  g(p.get());
}

This representation should support removing the heap allocation here by inlining the destructor on the normal path and EH path and promoting the heap allocation to a stack allocation. If our representation required early outlining, this would not be possible, or at least it would require inter-procedural analysis.
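
To make the hoped-for result concrete, here is a hand-written sketch of the before and after shapes. It is not compiler output: `fAfterPromotion` and `lastSeen` are illustrative names, and the `g` below is a stand-in, since the real `g` is not shown in the thread. The transformation is only valid if the optimizer can prove that `g` does not keep the pointer beyond the call.

```cpp
#include <cassert>
#include <memory>

static int lastSeen = 0;

// Stand-in for g(); it reads through the pointer and does not retain it.
void g(int *p) {
  assert(p != nullptr);
  lastSeen = *p;
}

// The function as written in the example above.
void f() {
  std::unique_ptr<int> p(new int(42));
  g(p.get());
}

// What f() could reduce to once the destructor has been inlined on both the
// normal path and the EH path and the allocation has been promoted.
void fAfterPromotion() {
  int storage = 42;  // was: new int(42)
  g(&storage);       // no delete is needed on either path
}
```

With the destructor visible on both paths, the `new`/`delete` pair can be matched up inside a single function, which is what early outlining would prevent.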

> In theory the funclets should be able to share code blocks without any problem.  The entry and exit points are the critical parts that make them funclets.  I’m just not sure how we can get the optimization passes to recognize this fact while still meeting the MSVC runtime constraints.  Reid’s proposal of separate catch blocks should help with that, but I’m still not sure we’ll want to use this representation for targets that don’t need it.


I think sharing code between funclets would require some extreme gymnastics to generate the right pdata and xdata, but I suppose it's not too different from what MSVC 2015 requires for coroutines.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
12