[llvm-dev] RFC: Killing undef and spreading poison

19 messages
Open this post in threaded view
|
Report Content as Inappropriate

[llvm-dev] RFC: Killing undef and spreading poison

 Sanjoy,            My answer is that step 3, commuting the multiply with the sign-extends, is invalid,As this is what causes the `udiv` to fault.  I think it is invalid with or without the “freeze”,why do you think step 3, the commute, without the “freeze" is valid ?Also, do you think you can come up with an example that does not depend on signedoverflow being “undefined” ?Peter Lawrence.  ```Hi John, On 11/11/16, 1:58 PM, Sanjoy Das wrote: >> I'm saying that you still get reasonable behavior from reasoning about >> counterfactuals for these things, and I haven't been convinced that you >> lose optimization power except in code that actually exhibits undefined >> behavior. > > I'm not sure I understood your concern fully, but you could have cases like: > > for (...) { > if (condition){ > i32 prod = x *nsw y > i64 prod.sext = sext prod to i64 > i64 t = K `udiv` (-1 + (sum.prod >> 32)) > use(t) > } > } I'm sure it is obvious to you, but there's a typo here -- it should have been "i64 t = K `udiv` (-1 + (prod.sext >> 32))" > The C program this came from is well defined in two cases (that are relevant here): > > 1. condition is false > 2. condition is true and x * y does not sign overflow > > We'd like to LICM the entire expression tree out of control flow in a way that it helps the well defined case (2) > without introducing a fault in the well defined case (1). To be a 100% explicit, by (1) I mean a situation where: a. condition is false b. (say) x is 1 << 20, and y is 1 << 12 so "x *nsw y" does sign overflow, and ((sext(x) * sext(y)) >> 32) is 1 I'll also spell out what I was trying to say more explicitly, since we can easily sink too much time talking about subtly different things otherwise (has happened in the past :) ): Firstly consider this sequence of transforms: for (...) { if (condition){ i32 prod = x *nsw y i64 prod.sext = sext prod to i64 i64 t = K `udiv` (-1 + (sum.prod >> 32)) use(t) } } ==> (step 1) multiplication and sext can be speculated i32 prod = x *nsw y i64 prod.sext = sext prod to i64 for (...) { if (condition){ i64 t = K `udiv` (-1 + (prod.sext >> 32)) use(t) } } ==> (step 2) known bits analysis on sext to prove divisor is non-zero i32 prod = x *nsw y i64 prod.sext = sext prod to i64 i64 t = K `udiv` (-1 + (prod.sext >> 32)) for (...) { if (condition){ use(t) } } ==> (step 3) commute sext through nsw multiplicationa ;; i32 prod = x *nsw y i64 prod.sext = (sext x) *nsw (sext y) i64 t = K `udiv` (-1 + (prod.sext >> 32)) for (...) { if (condition){ use(t) } } Now we've managed to introduce a fault if we were in case 1 -- the source program was well defined but the target program divides by zero. This means whatever scheme / semantics / "rule book" LLVM implements has to outlaw at least one of the steps above. For the freeze/poison scheme that step is "step 2". Under the freeze/poison scheme step 2 will be i32 prod = x *nsw y i64 prod.sext = sext prod to i64 for (...) { if (condition){ i64 t = K `udiv` (-1 + (prod.sext >> 32)) use(t) } } ==> (step 2) known bits analysis on sext to prove divisor is non-zero i32 prod = x *nsw y i32 prod.frozen = freeze(prod) i64 prod.sext = sext prod.frozen to i64 i64 t = K `udiv` (-1 + (prod.sext >> 32)) for (...) { if (condition){ use(t) } } Which would further outlaw step 3. If we go with counterfactual reasoning, then what is that step above that is illegal? -- Sanjoy ``` _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Hi Peter, On Thu, Jun 1, 2017 at 6:18 PM, Peter Lawrence via llvm-dev <[hidden email]> wrote: >             My answer is that step 3, commuting the multiply with the > sign-extends, is invalid, > As this is what causes the `udiv` to fault.  I think it is invalid with or > without the “freeze”, > why do you think step 3, the commute, without the “freeze" is valid ? I think it is valid by definition -- since "A *nsw B" does not sign wrap, "sext(A *nsw B)" == (sext A) * (sext B).  In other words, these two expressions are inequal only if "A *nsw B" sign wraps. > Also, do you think you can come up with an example that does not depend on > signed > overflow being “undefined” ? Can you be more specific?  Example that does what? -- Sanjoy _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Sanjay,            Your original example showed end-to-end-miscompilation,After a sequence of transformations,Where “undefined behavior” is a part of the problem.My question is can you show some additional examples that do not rely on “nsw” or “nuw” ?Thanks,Peter Lawrence.On Jun 4, 2017, at 9:13 PM, Sanjoy Das <[hidden email]> wrote:Hi Peter,On Thu, Jun 1, 2017 at 6:18 PM, Peter Lawrence via llvm-dev<[hidden email]> wrote:Also, do you think you can come up with an example that does not depend onsignedoverflow being “undefined” ?Can you be more specific?  Example that does what?-- Sanjoy_______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Hi Peter, On Mon, Jun 5, 2017 at 3:56 PM, Peter Lawrence <[hidden email]> wrote: > Sanjay, >             Your original example showed end-to-end-miscompilation, > After a sequence of transformations, > Where “undefined behavior” is a part of the problem. > > > My question is can you show some additional examples that do not rely on > “nsw” or “nuw” ? I still don't understand you -- I don't see how an example of a miscompile that does not involve nsw/nuw is relevant here -- while I'm sure LLVM has generic bugs, such a miscompile will not illustrate anything about the nature of poison as it is implemented in LLVM today. If you're asking for an example where the original program does not have UB, that is precisely what I was trying to illustrate in the original example.  In  > for (...) {  > if (condition){  > i32 prod = x *nsw y  > i64 prod.sext = sext prod to i64  > i64 t = K `udiv` (-1 + (sum.prod >> 32))  > use(t)  > }  > } the program does *not* have UB if "condition" implies that "x * y does not sign overflow".  For instance "condition" could be "__builtin_smul_overflow(x, y, &res)", in which case  a. The program does not have UB  b. Even if prod was not marked nsw, it would have been legal for LLVM to infer it -- Sanjoy _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

In reply to this post by Gerolf Hoflehner via llvm-dev
Sanjoy,
In that case I change my answer and say that the illegal
optimization was (step 1) hoisting the “nsw” out of the if-statement,

When inside the protection of the if-clause the “nsw” was conditional on
the if-clause, but when you hoisted it out of the if-statement there is no
reason to continue assuming “nsw” is true. (This is true regardless of how
the original “nsw” got there, either by language definition or by analysis).

“Nsw” isn’t an opcode, it is analysis information, and as we all know
Analysis information can and does get invalidated by transformations.
This is a case in point.

I think the “nsw” got hoisted out of the if-statement by accident, assuming
it was part of the opcode, but we have just seen that this is incorrect.

- - - - - - - - - - - - - - - - - - - -

If the above analysis is correct (I am pretty sure it is, but have been known
to make mistakes!), then my concern is that since this example was
incorrectly root-caused I am not sure we are ready to commit to “poison”
and “freeze” changes, and need to see more original source causes of
end-to-end-miscompilation to be sure we haven’t also mis-diagnosed those
as well.

Peter Lawrence.

On Jun 5, 2017, at 12:02 PM, Sanjoy Das <[hidden email]> wrote:

Hi Peter,

On Mon, Jun 5, 2017 at 9:43 AM, Peter Lawrence
<[hidden email]> wrote:
I think it is valid by definition -- since "A *nsw B" does not sign
wrap, "sext(A *nsw B)" == (sext A) * (sext B).  In other words, these
two expressions are inequal only if "A *nsw B" sign wraps.

The way you use “nsw” it seems like an assertion or an assumption.

Yes, that is correct.

Where did the “nsw” come from, is it from the source language definition,
Or compiler derived analysis information ?

Both are possible.  In C/C++, as you know, most (all?) signed
arithmetic can be marked nsw.  In many cases the optimizer will also
infer nsw if it can prove the arithmetic does not overflow (e.g. if (a
!= INT_SMAX) b = a + 1;  ==> if (a != INT_SMAX) b = a+nsw 1;).

-- Sanjoy

Here’s the original source example for context
———————————————————————————————

Sanjoy Das via llvm-dev [hidden email]

Fri Nov 11 23:55:32 PST 2016

```Firstly consider this sequence of transforms:

for (...) {
if (condition){
i32 prod = x *nsw y
i64 prod.sext = sext prod to i64
i64 t = K `udiv` (-1 + (sum.prod >> 32))
use(t)
}
}

==> (step 1) multiplication and sext can be speculated

i32 prod = x *nsw y
i64 prod.sext = sext prod to i64
for (...) {
if (condition){
i64 t = K `udiv` (-1 + (prod.sext >> 32))
use(t)
}
}

==> (step 2) known bits analysis on sext to prove divisor is non-zero

i32 prod = x *nsw y
i64 prod.sext = sext prod to i64
i64 t = K `udiv` (-1 + (prod.sext >> 32))
for (...) {
if (condition){
use(t)
}
}

==> (step 3) commute sext through nsw multiplicationa

;; i32 prod = x *nsw y
i64 prod.sext = (sext x) *nsw (sext y)
i64 t = K `udiv` (-1 + (prod.sext >> 32))
for (...) {
if (condition){
use(t)
}
}

Now we've managed to introduce a fault if we were in case 1 -- the
source program was well defined but the target program divides by
zero.```

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. Sure, we could drop nsw when moving these instructions, but there are still many other problems left.  Please read more about the topic here: https://blog.regehr.org/archives/1496For example, doing loop unswitching, followed by inline (just to illustrate that the reasoning cannot be local), plus GVN gives you end-to-end miscompilations easily. Nuno -----Original Message----- From: Peter Lawrence via llvm-dev Sent: Wednesday, June 7, 2017 8:48 PM Subject: Re: [llvm-dev] RFC: Killing undef and spreading poison Sanjoy,             In that case I change my answer and say that the illegal optimization was (step 1) hoisting the “nsw” out of the if-statement, When inside the protection of the if-clause the “nsw” was conditional on the if-clause, but when you hoisted it out of the if-statement there is no reason to continue assuming “nsw” is true. (This is true regardless of how the original “nsw” got there, either by language definition or by analysis). “Nsw” isn’t an opcode, it is analysis information, and as we all know Analysis information can and does get invalidated by transformations. This is a case in point. I think the “nsw” got hoisted out of the if-statement by accident, assuming it was part of the opcode, but we have just seen that this is incorrect.    - - - - - - - - - - - - - - - - - - - - If the above analysis is correct (I am pretty sure it is, but have been known to make mistakes!), then my concern is that since this example was incorrectly root-caused I am not sure we are ready to commit to “poison” and “freeze” changes, and need to see more original source causes of end-to-end-miscompilation to be sure we haven’t also mis-diagnosed those as well. Peter Lawrence. On Jun 5, 2017, at 12:02 PM, Sanjoy Das <[hidden email]> wrote: Hi Peter, On Mon, Jun 5, 2017 at 9:43 AM, Peter Lawrence <[hidden email]> wrote: I think it is valid by definition -- since "A *nsw B" does not sign wrap, "sext(A *nsw B)" == (sext A) * (sext B).  In other words, these two expressions are inequal only if "A *nsw B" sign wraps. The way you use “nsw” it seems like an assertion or an assumption. Yes, that is correct. Where did the “nsw” come from, is it from the source language definition, Or compiler derived analysis information ? Both are possible.  In C/C++, as you know, most (all?) signed arithmetic can be marked nsw.  In many cases the optimizer will also infer nsw if it can prove the arithmetic does not overflow (e.g. if (a != INT_SMAX) b = a + 1;  ==> if (a != INT_SMAX) b = a+nsw 1;). -- Sanjoy Here’s the original source example for context ——————————————————————————————— [llvm-dev] RFC: Killing undef and spreading poison Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org Fri Nov 11 23:55:32 PST 2016 Firstly consider this sequence of transforms: for (...) { if (condition){ i32 prod = x *nsw y i64 prod.sext = sext prod to i64 i64 t = K `udiv` (-1 + (sum.prod >> 32)) use(t) } } ==> (step 1) multiplication and sext can be speculated i32 prod = x *nsw y i64 prod.sext = sext prod to i64 for (...) { if (condition){ i64 t = K `udiv` (-1 + (prod.sext >> 32)) use(t) } } ==> (step 2) known bits analysis on sext to prove divisor is non-zero i32 prod = x *nsw y i64 prod.sext = sext prod to i64 i64 t = K `udiv` (-1 + (prod.sext  >> 32)) for (...) { if (condition){ use(t) } } ==> (step 3) commute sext through nsw multiplicationa ;; i32 prod = x *nsw y i64 prod.sext = (sext x) *nsw (sext y) i64 t = K `udiv` (-1 + (prod.sext >> 32)) for (...) { if (condition){ use(t) } } Now we've managed to introduce a fault if we were in case 1 -- the source program was well defined but the target program divides by zero. _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev  _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 In reply to this post by Gerolf Hoflehner via llvm-dev Peter, it looks to me like you've misunderstood how nsw works.  You are saying that the transformation below is incorrect.  If it is incorrect, there needs to be a valuation of x and y (and also a choice of "..." and "condition") that causes the transformed code to call use(t) with a different t than the original code would have.  Please tell us those values. John >    for (...) { >      if (condition){ >        i32 prod = x *nsw y >        i64 prod.sext = sext prod to i64 >        i64 t = K `udiv` (-1 + (sum.prod >> 32)) >        use(t) >      } >    } > > ==> (step 1) multiplication and sext can be speculated > >    i32 prod = x *nsw y >    i64 prod.sext = sext prod to i64 >    for (...) { >      if (condition){ >        i64 t = K `udiv` (-1 + (prod.sext >> 32)) >        use(t) >      } >    } _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 John,          I’m not saying the transformation is illegal with respect to ’t’, I am saying the transformation is illegal with respect to ‘prod’ I wil use Sanjoy’s example         ‘x' = (1 << 16),   'y' = (1 << 16),   and ‘condition' precludes those values Before the transformation ‘nsw’ says the the multiply will not overflow, But after the transformation the multiply *will* overflow, and ‘nsw’ is no longer valid. The point is not that you can’t hoist the multiply, but rather that the ‘nsw’ attribute no longer applies after that hoisting. I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ? Peter Lawrence. > On Jun 7, 2017, at 2:27 PM, John Regehr <[hidden email]> wrote: > > Peter, it looks to me like you've misunderstood how nsw works.  You are saying that the transformation below is incorrect.  If it is incorrect, there needs to be a valuation of x and y (and also a choice of "..." and "condition") that causes the transformed code to call use(t) with a different t than the original code would have.  Please tell us those values. > > John > > >>   for (...) { >>     if (condition){ >>       i32 prod = x *nsw y >>       i64 prod.sext = sext prod to i64 >>       i64 t = K `udiv` (-1 + (sum.prod >> 32)) >>       use(t) >>     } >>   } >> >> ==> (step 1) multiplication and sext can be speculated >> >>   i32 prod = x *nsw y >>   i64 prod.sext = sext prod to i64 >>   for (...) { >>     if (condition){ >>       i64 t = K `udiv` (-1 + (prod.sext >> 32)) >>       use(t) >>     } >>   } _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Hi Peter, On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence <[hidden email]> wrote: > >> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <[hidden email]> wrote: >> >> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. > > Nuno, >           I’m not saying the operations can’t be moved, > I’m saying that once you move one the ‘nsw’ attribute no longer applies, > unless you can mathematically prove that it still does, > otherwise an “add nsw” has to be converted to plain “add”. > > It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement > that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example. That would be correct (and we do this for some constructs: for instance when we have !dereferenceable attached to a load instruction we will strip the !dereferenceable when hoisting it out of control flow).  However, with poison we want to have our cake and eat it too (perhaps eating is not the best analogy with poison :) ) -- we want to (whenever correct) exploit the fact that a certain operation does not overflow when possible even when hoisting it above control flow.  For instance, if we have: if (cond) {   t = a +nsw b;   print(t); } Now if once we hoist t out of the control block: t = a +nsw b; if (cond) {   print(t); } in the transformed program, t itself may sign overflow.  In LLVM IR (or at least in the semantics we'd like), this has no correctness implications -- t becomes "poison" (which is basically deferred undefined behavior), and the program is undefined only if the generated poison value is used in a "side effecting" manner.  Assuming that print is a "side effect", this means at print, we can assume t isn't poison (and thus a + b did not sign overflow).  This is a weaker model than C/C++; and the difficult bits are getting the poison propagation rules correct, and to have a sound definition of a "side effect" (i.e. the points at which poison == deferred UB actually becomes UB). > I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ? FYI, I think it is poor form to insinuate such things when you clearly haven't made an effort to dig back and understand the all of prior discussions we've had in this area (hint: we've discussed and explicitly decided to not implement the semantics you're suggesting). Of course, fresh ideas are always welcome but I suggest you start by first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdfand some of the mailing list discussions we've had in the past on this topic. Thanks! -- Sanjoy _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 > On Jun 8, 2017, at 10:33 AM, Sanjoy Das <[hidden email]> wrote: > > Hi Peter, > > On Thu, Jun 8, 2017 at 9:41 AM, Peter Lawrence > <[hidden email]> wrote: >> >>> On Jun 7, 2017, at 2:23 PM, Nuno Lopes <[hidden email]> wrote: >>> >>> Since most add/sub operations compiled from C have the nsw attribute, we cannot simply restrict movement of these instructions. >> >> Nuno, >>          I’m not saying the operations can’t be moved, >> I’m saying that once you move one the ‘nsw’ attribute no longer applies, >> unless you can mathematically prove that it still does, >> otherwise an “add nsw” has to be converted to plain “add”. >> >> It is only by illegally retaining the ‘nsw’ after hoisting the multiply out of the ‘if’ statement >> that subsequent transformations yield end-to-end-miscompilation in Sanjoy’s example. > > That would be correct (and we do this for some constructs: for > instance when we have !dereferenceable attached to a load instruction > we will strip the !dereferenceable when hoisting it out of control > flow).   In other words you are agreeing with me. And once we’ve agreed on that, why do you insist on illegally hoisting the ‘nsw’ Out of the if-statement, since adding “poison” is clearly not necessary (once The ‘nsw’ is stripped off the multiply the ‘sext’ can no longer be commuted). In other words “poison” isn’t cake, it is a bandaid over an illegal transformation, It has no benefit. Don’t make the illegal transformation and ‘poison’ isn’t necessary. Peter Lawrence. > However, with poison we want to have our cake and eat it too > (perhaps eating is not the best analogy with poison :) ) -- we want to > (whenever correct) exploit the fact that a certain operation does not > overflow when possible even when hoisting it above control flow.  For > instance, if we have: > > if (cond) { >  t = a +nsw b; >  print(t); > } > > Now if once we hoist t out of the control block: > > t = a +nsw b; > if (cond) { >  print(t); > } > > in the transformed program, t itself may sign overflow.  In LLVM IR > (or at least in the semantics we'd like), this has no correctness > implications -- t becomes "poison" (which is basically deferred > undefined behavior), and the program is undefined only if the > generated poison value is used in a "side effecting" manner.  Assuming > that print is a "side effect", this means at print, we can assume t > isn't poison (and thus a + b did not sign overflow).  This is a weaker > model than C/C++; and the difficult bits are getting the poison > propagation rules correct, and to have a sound definition of a "side > effect" (i.e. the points at which poison == deferred UB actually > becomes UB). > >> I think the LLVM community in general has misunderstood and misused ‘nsw’, don’t you agree now ? > > FYI, I think it is poor form to insinuate such things when you clearly > haven't made an effort to dig back and understand the all of prior > discussions we've had in this area (hint: we've discussed and > explicitly decided to not implement the semantics you're suggesting). > Of course, fresh ideas are always welcome but I suggest you start by > first reading http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf> and some of the mailing list discussions we've had in the past on this > topic. > > Thanks! > -- Sanjoy _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Sanjoy,            in your blog post  https://www.playingwithpointers.com/problem-with-undef.htmlyou describe a problem with LLVM “undef”,yet in your paper http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdfyou do not suggest fixing this problem, even though in chpt. 9 you identify othercompilers that do attempt to avoid it.It seems to me that since transforming a program with multiple uses of an “undef” variableinto one with multiple distinct “undef”s results in behavior that is inconsistent (and thatviolates users expectations) we should not allow it.This seems to me to be one of those situations where the phrase “extraordinary claimsrequire extraordinary evidence” applies, and I’m not seeing any, are you ?IIUC, a “freeze” operation can always be inserted over an “undef” (now “poison”)that forces a single consistent value, but why the extra overhead of “freeze”, asthere doesn’t seem to be a convincing argument for ever allowing the same variableto have inconsistent values, or am I missing something ?Peter Lawrence._______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 Sanjoy,            Sorry, I guess I wasn’t specific enough, I’ll try again.  Excerpt  from your blog post: "Another way to look at this is that `undef` isn’t a normal SSA value, and uses of an `undef`value are also its defs.”IMHO this was a bad definition from the get-go.It seems to me that transforming a program with multiple uses of a single undefined variableinto one with multiple distinct IR “undef”s results in behavior that is inconsistent, and thatviolates users expectations, and therefore we should not do it.In other words the aspect of the current definition of “undef” in LLVM-IR that seems to forceprograms with undefined variables to be mis-represented is broken, and the definition of “undef"needs to be fixed.Your proposal with “freeze” makes it possible to force all uses of an undefined variable to beconsistent, but why not just do the simpler fix to “undef” / “poison” and allow them to havemultiple uses, as opposed to “use once only, by definition of the IR”, which does not seemto have any benefit.So again, “freeze” seems to be a workaround, rather than fixing the underlying problem.Thoughts ?Peter Lawrence.On Jun 8, 2017, at 1:29 PM, Sanjoy Das <[hidden email]> wrote:Hi,On Thu, Jun 8, 2017 at 12:29 PM, Peter Lawrence<[hidden email]> wrote:Sanjoy,            in your blog posthttps://www.playingwithpointers.com/problem-with-undef.htmlyou describe a problem with LLVM “undef”,yet in your paper http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdfyou do not suggest fixing this problem, even though in chpt. 9 you identifyothercompilers that do attempt to avoid it.That problem is mentioned in section 3.2 (which quotes the sameexample as in the blog post), and section 4 (the paragraph startingwith "Defining branching on poison to be UB further enables analysesto assume that...") addresses how the new semantics plugs this gap.The reason I keep asking for additional examples is that this one whereYou have illegally transformed the ‘nsw’ isn’t convincing. So please,Maybe we're talking past each other.  Let me first discuss the framingof the problem as I see it: 1. Whether a transform is illegal or not is decided by the semantics    we've assigned to the IR.  In this case, whether hoisting the    multiplication with the nsw intact is illegal or not is decided by    the semantics we've assigned to nsw. 2. We want certain transforms to be legal, which limits the kinds of    semantics we can possibly have for the IR.  For instance, we    certainly don't want a semantics for IR that disallows constant    folding `add i32 1, 2`.The example I gave was to demonstrate why the current (albeit not veryclearly specified) semantics is incorrect.So when you ask for an example, are you asking why the set oftransforms we want to be correct in the new semantics (i.e. (2))includes "hoist nsw-arithmetic out of control flow while still keepingthe flags"?  IOW, in the language used in the paper I linked to, areyou asking "why can't we have a overflowing nsw addition be immediateUB"?  If so, that is covered in section 2.2 (but we can get into moredetail if you want).-- Sanjoy_______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 In reply to this post by Gerolf Hoflehner via llvm-dev On Jun 8, 2017, at 10:33 AM, Sanjoy Das <[hidden email]> wrote:(hint: we've discussed andexplicitly decided to not implement the semantics you're suggesting).Sanjoy,            I’ve search the entire archives of 2017 and 2016 and have not found anything remotelyresembling a discussion about hoisting ‘nsw’ / ‘nuw’, can you be more specific ?Peter Lawrence._______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 In reply to this post by Gerolf Hoflehner via llvm-dev Sanjoy,            From your paper   http://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf  we have this proposal1.  remove “undef” and use “poison” instead2. Introduce a new instruction:  “freeze” …3. All operations over “poison” unconditionally return “poison” except “phi", “select", and "freeze"4. Branching on “poison” is immediate UB.3:::   then  (X & 0)   does not translate to  (0)   when X is “poison” ?,  that seems strange, can you explain / elaborate ?4:::   hmm, I know of no machine where this is what actually happens[*],  can you explain / elaborate ?Peter Lawrence.[*. I am assuming you are using consistent terminology, where in the beginning of the paper you make a distinction between immediate and deferred undefined, with immediate being a trap like divide-by-zero.]_______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Open this post in threaded view
|
Report Content as Inappropriate

Re: [llvm-dev] RFC: Killing undef and spreading poison

 > 3:::   then  (X & 0)   does not translate to  (0)   when X is “poison” > ?,  that seems strange, can you explain / elaborate ? This is how poison already works, please see the LangRef. John _______________________________________________ LLVM Developers mailing list [hidden email] http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev