unwinds to in the CFG

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

unwinds to in the CFG

Nick Lewycky
I have a new plan for handling 'unwinds to' in the control flow graph
and dominance.

Just as a quick recap the problem I encountered is how to deal
instructions in a block being used as operands in the unwind dest. Such
as this:

bb1: unwinds to %cleanup
   call void @foo() ; might throw, might not
   %x = add i32 %y, %z
   call void @foo() ; might throw, might not
   ret void
cleanup:
   call void @use(i32 %x)

The problem is that %x might not have been executed before we enter
%cleanup.

I won't reiterate what my previous plan was. The problem with it was
that a lot of optimizations use the dominance tree to decide where it's
safe to insert new instructions and it always assumes that if A dom B
then an instruction in A is always accessible in B.

Here's the new plan. Please poke holes in it.

A. redefine the CFG a bit.
  i.   pred_iterator stays the same.
  ii.  succ_iterator only iterates over successors
  iii. outedge_iterator iterates over successors and the unwind dest

There's still some code which refers to outedges by TerminatorInst +
unsigned. I can't think of any reason not to replace all of them with
outedge_iterator.

B. redefine the dominator tree by modifying the GraphTraits
  i.  A dom B means that all instructions in A are guaranteed to execute
before any instructions in B.
  ii. the domtree may have multiple roots.

Multiple roots occurs when the entry block 'unwinds to' another block.
Domtree getRoot() should just return the root represented by the
Function entry block, and if anyone needs it would could provide
getAllRoots(). We would also need to add an "isReachable" method to
domtree and replace all users who currently test for reachability by
checking whether the block is dominated by the root.

Before I start investing time implementing these changes, does anyone
foresee any problems that I missed?

Nick

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Nick Lewycky
Sorry -- Small change.

Nick Lewycky wrote:
> A. redefine the CFG a bit.
>   i.   pred_iterator stays the same.

pred_iterator becomes an inverse of outedge_iterator. That is, edges
that lead to the execution of this block, regardless of whether it's an
unwind-edge or not.

Nick

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Owen Anderson-2
Note that changing the definition of pred_iterator may have  
consequences for post-dom tree construction.  You may need to split  
pred_iterator as you are splitting succ_iterator.

--Owen

On Mar 28, 2008, at 1:20 AM, Nick Lewycky wrote:

> Sorry -- Small change.
>
> Nick Lewycky wrote:
>> A. redefine the CFG a bit.
>>  i.   pred_iterator stays the same.
>
> pred_iterator becomes an inverse of outedge_iterator. That is, edges
> that lead to the execution of this block, regardless of whether it's  
> an
> unwind-edge or not.
>
> Nick
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

smime.p7s (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Duncan Sands
In reply to this post by Nick Lewycky
Hi Nick,

> Just as a quick recap the problem I encountered is how to deal
> instructions in a block being used as operands in the unwind dest. Such
> as this:
>
> bb1: unwinds to %cleanup
>    call void @foo() ; might throw, might not
>    %x = add i32 %y, %z
>    call void @foo() ; might throw, might not
>    ret void
> cleanup:
>    call void @use(i32 %x)
>
> The problem is that %x might not have been executed before we enter
> %cleanup.

how about just declaring this illegal?  i.e. require the first call
to be in a different basic block to the second, making it possible
to use a phi node in %cleanup.

Ciao,

Duncan.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Gordon Henriksen-3
In reply to this post by Nick Lewycky
Hi Nick,

On Mar 28, 2008, at 02:15, Nick Lewycky wrote:

Before I start investing time implementing these changes, does anyone foresee any problems that I missed?

Stepping back from the nuts and bolts for a moment, could you precisely define what constitutes a predecessor in this model?

What blocks would a phi node in %catch require for a case like this?

define i8 @f(i1 %b) {
entry:
  b label %try
try: unwinds to %catch
  b i1 %b, label %then, label %else
then: unwinds to %catch
  ret void
else: unwinds to %catch
  ret void
catch:  ; What are my predecessors?
  ret void
}

B. redefine the dominator tree by modifying the GraphTraits
 i.  A dom B means that all instructions in A are guaranteed to execute before any instructions in B.
 ii. the domtree may have multiple roots.

Multiple roots occurs when the entry block 'unwinds to' another block.

It seems highly problematical that static allocas might not dominate their uses. The entry block is already special in that it cannot be used as a branch target. Why not also forbid 'unwinds to' on the entry block?

— Gordon


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Devang Patel
In reply to this post by Nick Lewycky

On Mar 27, 2008, at 11:15 PM, Nick Lewycky wrote:

> I have a new plan for handling 'unwinds to' in the control flow graph
> and dominance.
>
> Just as a quick recap the problem I encountered is how to deal
> instructions in a block being used as operands in the unwind dest.  
> Such
> as this:
>
> bb1: unwinds to %cleanup
>   call void @foo() ; might throw, might not
>   %x = add i32 %y, %z
>   call void @foo() ; might throw, might not
>   ret void
> cleanup:
>   call void @use(i32 %x)
>
> The problem is that %x might not have been executed before we enter
> %cleanup.

This means bb1 has multiple exit points, which defeats the "single  
entry, single exit" idea. Did I miss anything here ?

-
Devang
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Nick Lewycky
In reply to this post by Duncan Sands
Duncan Sands wrote:

> Hi Nick,
>
>> Just as a quick recap the problem I encountered is how to deal
>> instructions in a block being used as operands in the unwind dest. Such
>> as this:
>>
>> bb1: unwinds to %cleanup
>>    call void @foo() ; might throw, might not
>>    %x = add i32 %y, %z
>>    call void @foo() ; might throw, might not
>>    ret void
>> cleanup:
>>    call void @use(i32 %x)
>>
>> The problem is that %x might not have been executed before we enter
>> %cleanup.
>
> how about just declaring this illegal?  i.e. require the first call
> to be in a different basic block to the second, making it possible
> to use a phi node in %cleanup.

Because it's extremely difficult for an optimization pass to maintain
that guarantee, it's expensive to scan through the instruction list
sequentially, and it leads to additional basic blocks.

I agree that the snippet should be illegal, but I don't think this is
the way to do it.

Nick
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Nick Lewycky
In reply to this post by Gordon Henriksen-3
Gordon Henriksen wrote:

> What blocks would a phi node in %catch require for a case like this?
>
>     define i8 @f(i1 %b) {
>
>     entry:
>
>       b label %try
>
>     try: unwinds to %catch
>
>       b i1 %b, label %then, label %else
>
>     then: unwinds to %catch
>
>       ret void
>
>     else: unwinds to %catch
>
>       ret void
>
>     catch:  ; What are my predecessors?
>
>       ret void
>
>     }

'catch' has 3 preds, %try, %then and %else.

>
>> B. redefine the dominator tree by modifying the GraphTraits
>>  i.  A dom B means that all instructions in A are guaranteed to
>> execute before any instructions in B.
>>  ii. the domtree may have multiple roots.
>>
>> Multiple roots occurs when the entry block 'unwinds to' another block.
>
> It seems highly problematical that static allocas might not dominate
> their uses.

I'm not sure what you mean by that. It would be invalid to "alloca" in a
BB then use the pointer in the unwind dest. You can't escape that.

  The entry block is already special in that it cannot be used
> as a branch target. Why not also forbid 'unwinds to' on the entry block?

Yes, we could also do that. I'm all for hearing arguments for and opposed.

Nick
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Gordon Henriksen-3
On 2008-03-29, at 00:57, Nick Lewycky wrote:

> Gordon Henriksen wrote:
>
>> What blocks would a phi node in %catch require for a case like this?
>>
>>    define i8 @f(i1 %b) {
>>    entry:
>>      b label %try
>>    try: unwinds to %catch
>>      b i1 %b, label %then, label %else
>>    then: unwinds to %catch
>>      ret void
>>    else: unwinds to %catch
>>      ret void
>>    catch:  ; What are my predecessors?
>>      ret void
>>    }
>
> 'catch' has 3 preds, %try, %then and %else.

Would it be more natural to claim %entry and %try as the predecessors?  
This is similar to how a high level language disallows variable  
declarations from a try block to be visible from the catch.

object x;
try { x = null; }
catch { use(x); } // error, x is uninitialized

Also, consider a phi node:

phi i32 [%x, %bb1] ; %x defined in %bb1.

Depending on what type of predecessor %bb1 is, some of the models  
you've mentioned declare this phi node illegal.

>>> B. redefine the dominator tree by modifying the GraphTraits
>>> i.  A dom B means that all instructions in A are guaranteed to  
>>> execute before any instructions in B.
>>> ii. the domtree may have multiple roots.
>>>
>>> Multiple roots occurs when the entry block 'unwinds to' another  
>>> block.
>>
>> It seems highly problematical that static allocas might not  
>> dominate their uses.
>
> I'm not sure what you mean by that. It would be invalid to "alloca"  
> in a BB then use the pointer in the unwind dest. You can't escape  
> that.

I'm just giving an example of something that breaks if the entry-block-
dominates-all-blocks property is violated. Front-ends and  
transformations rely upon this property by unconditionally placing  
their 'local variable declaration' allocas in the entry block.

— Gordon


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Nick Lewycky
Gordon Henriksen wrote:

> On 2008-03-29, at 00:57, Nick Lewycky wrote:
>
>> Gordon Henriksen wrote:
>>
>>> What blocks would a phi node in %catch require for a case like this?
>>>
>>>    define i8 @f(i1 %b) {
>>>    entry:
>>>      b label %try
>>>    try: unwinds to %catch
>>>      b i1 %b, label %then, label %else
>>>    then: unwinds to %catch
>>>      ret void
>>>    else: unwinds to %catch
>>>      ret void
>>>    catch:  ; What are my predecessors?
>>>      ret void
>>>    }
>> 'catch' has 3 preds, %try, %then and %else.
>
> Would it be more natural to claim %entry and %try as the predecessors?  
> This is similar to how a high level language disallows variable  
> declarations from a try block to be visible from the catch.

In LLVM the rule is that an instruction must be dominated by its
operands. Predecessors and successors don't enter into it, except to
define the dominance tree.

> object x;
> try { x = null; }
> catch { use(x); } // error, x is uninitialized
>

Yes, I know. I'm planning to accomplish this by defining pred/succ
sensibly such that we get a domtree where the equivalent LLVM code would
be invalid.

> Also, consider a phi node:
>
> phi i32 [%x, %bb1] ; %x defined in %bb1.
>
> Depending on what type of predecessor %bb1 is, some of the models  
> you've mentioned declare this phi node illegal.

Oh. Now that's a very good point.

bb1: unwinds to %cleanup
   %x = add i32 @foo, @bar
   ret i32 %x
cleanup:
   %y = phi i32 [%x, bb1]

If %cleanup has %bb1 as a predecessor then the above is legal. (PHI
nodes being the only Instruction that works on pred/succ and not
dominators.)

And you're right, the fix for that it to make the predecessor be bb1's
preds (if it has any). This is nastier than I was expecting...

Nick
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Nick Lewycky
Hi Gordon, I've been thinking about this problem a bit.

Nick Lewycky wrote:

> Gordon Henriksen wrote:
>> Also, consider a phi node:
>>
>> phi i32 [%x, %bb1] ; %x defined in %bb1.
>>
>> Depending on what type of predecessor %bb1 is, some of the models  
>> you've mentioned declare this phi node illegal.
>
> Oh. Now that's a very good point.
>
> bb1: unwinds to %cleanup
>    %x = add i32 @foo, @bar
>    ret i32 %x
> cleanup:
>    %y = phi i32 [%x, bb1]
>
> If %cleanup has %bb1 as a predecessor then the above is legal. (PHI
> nodes being the only Instruction that works on pred/succ and not
> dominators.)

We already have this issue with invoke. Consider:

bb1:
   %x = invoke i32 @f() to label %normal unwind label %unwind
normal:
   phi i32 [%x, %bb1]
   ret i32 %x
unwind:
   phi i32 [%x, %bb1]  ; illegal
   ret i32 %x

The PHI in %unwind must mention %bb1, but may not refer to %x.

In the code sample I pasted before (bb1: unwinds to %cleanup) it would
mean that the PHI node in %cleanup must reference %bb1 but may not
contain any instructions inside of %bb1. Yes this rule will require
fixing up some optimizations, but I'm prepared to do that.

This only applies to PHI nodes so it'll be fast to check for. The usual
dominance test will take care of the case where %unwind branches out to
other blocks that try to use %x.

Any other issues I should keep in mind?

Nick
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Duncan Sands
In reply to this post by Devang Patel
Hi Devang,

> > Just as a quick recap the problem I encountered is how to deal
> > instructions in a block being used as operands in the unwind dest.  
> > Such
> > as this:
> >
> > bb1: unwinds to %cleanup
> >   call void @foo() ; might throw, might not
> >   %x = add i32 %y, %z
> >   call void @foo() ; might throw, might not
> >   ret void
> > cleanup:
> >   call void @use(i32 %x)
> >
> > The problem is that %x might not have been executed before we enter
> > %cleanup.
>
> This means bb1 has multiple exit points, which defeats the "single  
> entry, single exit" idea. Did I miss anything here ?

you are correct, this fundamental property of basic blocks is being
discarded.  Very nasty!  This is why I argued against this approach
in favour of the mini-basic-blocks approach, in which you have lots
of basic blocks which under the hood share common info to reduce memory
usage.  However Chris convinced me that in fact not that many places
really use that there is a single exit, and that only a wimpy quiche
eater would shrink at the idea of auditing all of LLVM! :)

Ciao,

Duncan.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Duncan Sands
In reply to this post by Nick Lewycky
> In LLVM the rule is that an instruction must be dominated by its
> operands.

...

> We already have this issue with invoke. Consider:
>
> bb1:
>    %x = invoke i32 @f() to label %normal unwind label %unwind
> normal:
>    phi i32 [%x, %bb1]
>    ret i32 %x
> unwind:
>    phi i32 [%x, %bb1]  ; illegal
>    ret i32 %x
>
> The PHI in %unwind must mention %bb1, but may not refer to %x.

In some sense this PHI is not dominated by its operands.  There
is an edge to %unwind coming out of the invoke, but it is coming
out of the RHS, i.e. before the assignment to %x.  If I wrote
the invoke as:
  invoke i32 @f() <= this may branch to %unwind
  place the result in %x
then it is clear that the phi is dominated by the call part of
the invoke but not the result copying part.  Likewise for something
like

bb1: unwinds to %cleanup
   %x = udiv i32 @foo, @bar <= The udiv is always executed before %cleanup, but not the assignment to %x
   ret i32 %x
cleanup:
   %y = phi i32 [%x, bb1]

So maybe it is helpful to think of instructions as corresponding to
a LHS and a RHS node in the cfg, with an edge from RHS to LHS, and
with exceptional edges coming out of the RHS node.

Ciao,

Duncan.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Chris Lattner
On Thu, 3 Apr 2008, Duncan Sands wrote:

>> bb1:
>>    %x = invoke i32 @f() to label %normal unwind label %unwind
>> normal:
>>    phi i32 [%x, %bb1]
>>    ret i32 %x
>> unwind:
>>    phi i32 [%x, %bb1]  ; illegal
>>    ret i32 %x
>>
>> The PHI in %unwind must mention %bb1, but may not refer to %x.
>
> In some sense this PHI is not dominated by its operands.  There
> is an edge to %unwind coming out of the invoke, but it is coming
> out of the RHS, i.e. before the assignment to %x.  If I wrote
> the invoke as:

On IRC, I suggested that Nicholas consider making the unwind block not be
dominated by *any* instruction in the unwinding block.  This gets us out
of weird partial dominance cases and makes the CFG checks simpler.  For
example, this would be illegal:

foo:  unwinds to bar:
   %a = add ...
   call ...
   ...


bar:
   use %a


Because foo doesn't dominate bar at all.  In practice, unwind blocks (at
least in C++) almost never refer to computation in the unwinding block, so
I think that this is just fine in pratice.  For cases where a reference
does exist, it is sufficient to split the block:


foo:
   %a = add ..
   br foo2

foo2: unwinds to bar
   call ...

bar:
   use %a

This means that simplifycfg would need to know that it can't merge
foo/foo2, etc.

I think this would significantly simplify the dominance issues with this
approach.  Maybe even the quiche eaters will be able to handle it! ;-)

What do you think?

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Dale Johannesen
In reply to this post by Duncan Sands
I'm chasing a wrong-codegen bug that looks very much like this, except
that I do not have a second call to foo; I have %x being referenced in  
the
unwinding code, where it hasn't been set.  Was there a resolution for  
this?

On Apr 3, 2008, at 10:58 AM, Duncan Sands wrote:

> Hi Devang,
>>> Just as a quick recap the problem I encountered is how to deal
>>> instructions in a block being used as operands in the unwind dest.
>>> Such
>>> as this:
>>>
>>> bb1: unwinds to %cleanup
>>>  call void @foo() ; might throw, might not
>>>  %x = add i32 %y, %z
>>>  call void @foo() ; might throw, might not
>>>  ret void
>>> cleanup:
>>>  call void @use(i32 %x)
>>>
>>> The problem is that %x might not have been executed before we enter
>>> %cleanup.
>>
>> This means bb1 has multiple exit points, which defeats the "single
>> entry, single exit" idea. Did I miss anything here ?
>
> you are correct, this fundamental property of basic blocks is being
> discarded.  Very nasty!  This is why I argued against this approach
> in favour of the mini-basic-blocks approach, in which you have lots
> of basic blocks which under the hood share common info to reduce  
> memory
> usage.  However Chris convinced me that in fact not that many places
> really use that there is a single exit, and that only a wimpy quiche
> eater would shrink at the idea of auditing all of LLVM! :)
>
> Ciao,
>
> Duncan.
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: unwinds to in the CFG

Chris Lattner
On Fri, 4 Apr 2008, Dale Johannesen wrote:
> I'm chasing a wrong-codegen bug that looks very much like this, except
> that I do not have a second call to foo; I have %x being referenced in
> the
> unwinding code, where it hasn't been set.  Was there a resolution for
> this?

I don't think the 'unwinds to' stuff is used on mainline by llvm-gcc.

-Chris

> On Apr 3, 2008, at 10:58 AM, Duncan Sands wrote:
>> Hi Devang,
>>>> Just as a quick recap the problem I encountered is how to deal
>>>> instructions in a block being used as operands in the unwind dest.
>>>> Such
>>>> as this:
>>>>
>>>> bb1: unwinds to %cleanup
>>>>  call void @foo() ; might throw, might not
>>>>  %x = add i32 %y, %z
>>>>  call void @foo() ; might throw, might not
>>>>  ret void
>>>> cleanup:
>>>>  call void @use(i32 %x)
>>>>
>>>> The problem is that %x might not have been executed before we enter
>>>> %cleanup.
>>>
>>> This means bb1 has multiple exit points, which defeats the "single
>>> entry, single exit" idea. Did I miss anything here ?
>>
>> you are correct, this fundamental property of basic blocks is being
>> discarded.  Very nasty!  This is why I argued against this approach
>> in favour of the mini-basic-blocks approach, in which you have lots
>> of basic blocks which under the hood share common info to reduce
>> memory
>> usage.  However Chris convinced me that in fact not that many places
>> really use that there is a single exit, and that only a wimpy quiche
>> eater would shrink at the idea of auditing all of LLVM! :)
>>
>> Ciao,
>>
>> Duncan.
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email]         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev