[llvm-dev] A tagged architecture, the elephant in the undef / poison room

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[llvm-dev] A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev

Here’s what seems to really be going on

        “undef”  ===  models an uninitialized register,  but

        “poison”  ===  turns the entire IR into a tagged architecture


Is this really the way to go ?

It seems like a odd choice given that none of our current targets
are tagged architectures, all of this tagged IR has to somehow be
reduced back down to normal target machine instructions.



This question is meant to be thought provoking,
I’m not looking for quick rhetorical answers.

Correct me if I am wrong, but it seems no one has outlined a proof
that C programs can be translated into IR+”poison", optimized according
to the definition of “poison”, and always be guaranteed to still
be the same C program. Right now all we have are beliefs
based on intuition that this can be done, but no actual proof.

In other words,  everyone believed that “undef” was the way to go until
counter examples started showing up, and now everyone believes
that “poison” is the way to go, which will be true until counter examples
start showing up. We are jumping from one shaky precipice to another,
we still are not yet on solid ground.


Peter Lawrence.
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
On 6/13/2017 12:27 PM, Peter Lawrence via llvm-dev wrote:
> Here’s what seems to really be going on
>
> “undef”  ===  models an uninitialized register,  but
>
> “poison”  ===  turns the entire IR into a tagged architecture

The problem with undef was that while "cmp eq i32 %x, %x" was always
true, if %x turned out to be undef, then we end up with "cmp eq i32
undef, undef", which may be folded either way.

Poison is meant to avoid that issue by saying that once you establish a
certain property of it, that property of it remains.  Maybe there is an
interpretation of it that matches a tagged architecture, but I don't see
how that's significant.  Do you have any particular concern regarding
it? (Other than about the ability to represent a C program.)

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
> It seems like a odd choice given that none of our current targets
> are tagged architectures, all of this tagged IR has to somehow be
> reduced back down to normal target machine instructions.

I agree with Krzysztof that an analogy with tagged architectures is
strained. One way to see this is to look at the interpreter mode of lli,
which is a proper LLVM interpreter that doesn't need to tag values as
poison or not. It completely ignores poison afaik.

> Correct me if I am wrong, but it seems no one has outlined a proof
> that C programs can be translated into IR+”poison", optimized according
> to the definition of “poison”, and always be guaranteed to still
> be the same C program. Right now all we have are beliefs
> based on intuition that this can be done, but no actual proof.

It isn't clear that producing such a proof is a good value proposition,
since Clang isn't a major source of miscompilation bugs, as far as I
know. The problems under discussion (e.g, in the freeze paper) are in
the middle end, not the front end.

> In other words,  everyone believed that “undef” was the way to go until
> counter examples started showing up, and now everyone believes
> that “poison” is the way to go, which will be true until counter examples
> start showing up. We are jumping from one shaky precipice to another,
> we still are not yet on solid ground.

I disagree with this characterization. Allow me to propose a different one.

- In the beginning, there was no undef and no poison. This was a
workable situation, a perfectly suitable basis for a correct compiler,
but certain desirable optimizations could not be performed.

- Undef was added to allow the IR to express don't-care values, with the
result that writing some optimizations got a bit harder but also more
optimizations could be performed.

- Poison was added to allow the IR to express additional optimizations
that cannot be justified by undef, with the result that writing some
optimizations got a bit harder still, but again more optimizations could
be performed. This is the current situation and there are three
problems. (1) People weren't always sure which of poison or undef to
use, this confusion persists and needs to get fixed. The common case (I
think) is a comment or documentation saying undef where it should be
poison. (2) Undef and poison interact in tricky ways. (3) Conflicting
assumptions were sometimes made about how UB works, leading to problems
such as ones described in the freeze paper.

But let's be clear: UB problems are in corner cases and they don't
affect very many developers or users. The sense of the community (as far
as I can tell) is that a fundamental overhaul isn't called for. Various
proposals have been made, including ours, about how to fix the UB model,
but none has yet been adopted.

Peter, you are obviously entitled to your views, but my take is that it
probably isn't productive to start multiple UB-related threads in a
short time period, nor is it productive to sort of throw stones at all
aspects of the model. We can't debate everything at once and we're not
going to throw out a huge base of optimizations built around the current
model. I'll also repeat Daniel's request that you tone down the
open-ended requests for proofs/counterexamples/whatever that would
generate a lot of work for anyone trying to discuss these matters with
you. Instead of producing abstract arguments, please provide concrete
examples, ideally accompanied by code.

John
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
John,
         Here’s what I’m getting at

“Poison” is an attribute of a “value”, not a “value” itself.

“Poison” is an analysis result, and we should think about implementing it as such,
just like we do constant and range analysis.

Turning “poison” into a “value” means all “values” now have in addition to
a bit pattern an extra attribute (IE a tag).


Peter Lawrence.
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
I don't disagree with this but I don't see the concrete proposal yet.

John


On 6/16/17 9:01 AM, Peter Lawrence wrote:

> John,
>          Here’s what I’m getting at
>
> “Poison” is an attribute of a “value”, not a “value” itself.
>
> “Poison” is an analysis result, and we should think about implementing it as such,
> just like we do constant and range analysis.
>
> Turning “poison” into a “value” means all “values” now have in addition to
> a bit pattern an extra attribute (IE a tag).
>
>
> Peter Lawrence.
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
On 6/16/2017 10:01 AM, Peter Lawrence via llvm-dev wrote:
> “Poison” is an attribute of a “value”, not a “value” itself.
>
> “Poison” is an analysis result, and we should think about implementing it as such,
> just like we do constant and range analysis.

You can think of poison as of just an extra element added to the type
(e.g. to the set represented by i32 or i1). Only freezing it will
replace it with something concrete such that if x is poison then
freeze(x) == freeze(x), etc.

Why do you insist on treating it as an attribute? It's more like NaN for
floating point: "not a value".

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
On 6/16/2017 10:36 AM, Krzysztof Parzyszek via llvm-dev wrote:
> On 6/16/2017 10:01 AM, Peter Lawrence via llvm-dev wrote:
>> “Poison” is an attribute of a “value”, not a “value” itself.
>>
>> “Poison” is an analysis result, and we should think about implementing
>> it as such,
>> just like we do constant and range analysis.
>
> Why do you insist on treating it as an attribute? It's more like NaN for
> floating point: "not a value".

I'm guessing here---are you thinking of "could-be-poison" as opposed to
"is-poison"?

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
As I understand (others have better thoughts on this than I do) it is
difficult to make the "definitely not poison" analysis return useful
answers very much of the time.  Instead, we try to write transformations
that work regardless of whether a value is poisoned.

John


On 6/16/17 9:41 AM, Krzysztof Parzyszek via llvm-dev wrote:

> On 6/16/2017 10:36 AM, Krzysztof Parzyszek via llvm-dev wrote:
>> On 6/16/2017 10:01 AM, Peter Lawrence via llvm-dev wrote:
>>> “Poison” is an attribute of a “value”, not a “value” itself.
>>>
>>> “Poison” is an analysis result, and we should think about
>>> implementing it as such,
>>> just like we do constant and range analysis.
>>
>> Why do you insist on treating it as an attribute? It's more like NaN
>> for floating point: "not a value".
>
> I'm guessing here---are you thinking of "could-be-poison" as opposed to
> "is-poison"?
>
> -Krzysztof
>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
> Only freezing it will
> replace it with something concrete such that if x is poison then
> freeze(x) == freeze(x), etc.

Nit: it's not true that freeze(x) == freeze(x) in the current proposal.
Each freeze can choose its own value.

John

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev

> On Jun 16, 2017, at 12:31 PM, John Regehr via llvm-dev <[hidden email]> wrote:
>
>> Only freezing it will
>> replace it with something concrete such that if x is poison then
>> freeze(x) == freeze(x), etc.
>
> Nit: it's not true that freeze(x) == freeze(x) in the current proposal. Each freeze can choose its own value.
Just for my understanding: Freeze is an instruction isn't it? So this example has to read either:

%a = freeze(poison)
%b = freeze(poison)
%cond = icmp eq %a, %b

then %cond isn't necessarily true (and also not necessarily false).

While:

%a = freeze(poison)
%cond = icmp eq %a, %a

is always true. Right?

- Matthias

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] a tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
Right!

John


On 6/16/17 2:05 PM, Matthias Braun wrote:

>
>> On Jun 16, 2017, at 12:31 PM, John Regehr via llvm-dev <[hidden email]> wrote:
>>
>>> Only freezing it will
>>> replace it with something concrete such that if x is poison then
>>> freeze(x) == freeze(x), etc.
>>
>> Nit: it's not true that freeze(x) == freeze(x) in the current proposal. Each freeze can choose its own value.
> Just for my understanding: Freeze is an instruction isn't it? So this example has to read either:
>
> %a = freeze(poison)
> %b = freeze(poison)
> %cond = icmp eq %a, %b
>
> then %cond isn't necessarily true (and also not necessarily false).
>
> While:
>
> %a = freeze(poison)
> %cond = icmp eq %a, %a
>
> is always true. Right?
>
> - Matthias
>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
Hi Peter,

On Tue, Jun 13, 2017 at 10:27 AM, Peter Lawrence via llvm-dev
<[hidden email]> wrote:
> Here’s what seems to really be going on
>
>         “undef”  ===  models an uninitialized register,  but

No, it specifically does not.  You yourself have mentioned the reason
many times -- every "read" of undef returns a new value, which is
different from, say, %rax with garbage in it.

>         “poison”  ===  turns the entire IR into a tagged architecture
>
>
> Is this really the way to go ?
>
> It seems like a odd choice given that none of our current targets
> are tagged architectures, all of this tagged IR has to somehow be
> reduced back down to normal target machine instructions.

No architecture (that I know of) supports PHI nodes either, yet they
are an enormously useful concept.  Comparing LLVM IR to machine ISAs
is a mistake -- the mid level IR should be evaluated in terms of how
well it can do what it is supposed to do, which is enable mid level
optimizations.  There's a reason why we don't use MCInsts all the way
through.

Thanks!
-- Sanjoy
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev

> On Jun 16, 2017, at 8:23 PM, Sanjoy Das <[hidden email]> wrote:
>
> Hi Peter,
>
> On Tue, Jun 13, 2017 at 10:27 AM, Peter Lawrence via llvm-dev
> <[hidden email]> wrote:
>> Here’s what seems to really be going on
>>
>>        “undef”  ===  models an uninitialized register,  but
>
> No, it specifically does not.  You yourself have mentioned the reason
> many times -- every "read" of undef returns a new value, which is
> different from, say, %rax with garbage in it.
>


Sanjoy,
            You’ve hit the nail on the head !, but I don’t think we’ve correctly
identified the root cause of the problem yet.

Rather the problem is with how we incorrectly cse and copy-propagate it:
        x = undef
        if (x == x)
        —————>    this is an illegal copy-propagation   —————>
        if (undef == undef)

If you don’t believe this is illegal then we end up with the absurdity of the
function-inlining example [1], and the argument against the function-inlining
example is so compelling that John decided to drop out of the argument,
IE he gave up because it is indefensible.

Apparently this copy-propagation has been justified by thinking of "undef"
as an IR "constant” and that any optimization you can do to a “constant”
can also be done to “undef” without further thought.

Instead each ‘undef’ should be thought of as a live on entry register, IE an
incoming argument physical register, and “x = undef” cannot be optimized
any more than “x = PhysReg0”, in particular multiple uses of X does not mean
multiple incoming argument registers, and separate instances of “undef”
cannot be equated any more than distinct incoming argument registers.

To the argument that this may create unnecessary register pressure I say
that is a register allocator issue not an IR issue, the register allocator can
and should figure this out and do the right thing.


Peter Lawrence.


[1. this function *always* executes statement S,
        F(a) {
           If (a == a) S;
        }
   but in llvm if you inline it and “a” happens to be “undef” then nothing can
   be said about whether statement S is executed.  This is indefensible.]










>>        “poison”  ===  turns the entire IR into a tagged architecture
>>
>>
>> Is this really the way to go ?
>>
>> It seems like a odd choice given that none of our current targets
>> are tagged architectures, all of this tagged IR has to somehow be
>> reduced back down to normal target machine instructions.
>
> No architecture (that I know of) supports PHI nodes either, yet they
> are an enormously useful concept.  Comparing LLVM IR to machine ISAs
> is a mistake -- the mid level IR should be evaluated in terms of how
> well it can do what it is supposed to do, which is enable mid level
> optimizations.  There's a reason why we don't use MCInsts all the way
> through.
>
> Thanks!
> -- Sanjoy

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
Hi Peter,

On Mon, Jun 19, 2017 at 8:36 AM, Peter Lawrence
<[hidden email]> wrote:

>             You’ve hit the nail on the head !, but I don’t think we’ve correctly
> identified the root cause of the problem yet.
>
> Rather the problem is with how we incorrectly cse and copy-propagate it:
>         x = undef
>         if (x == x)
>         —————>    this is an illegal copy-propagation   —————>
>         if (undef == undef)
>
> If you don’t believe this is illegal then we end up with the absurdity of the
> function-inlining example [1], and the argument against the function-inlining

I don't think [1] is absurd.  LLVM IR is not a programming language,
and it is okay for it to have semantics that would seem odd in a
programming language.

> example is so compelling that John decided to drop out of the argument,
> IE he gave up because it is indefensible.

I think he "gave up" because he has better things to do than argue with you. :)

> Apparently this copy-propagation has been justified by thinking of "undef"
> as an IR "constant” and that any optimization you can do to a “constant”
> can also be done to “undef” without further thought.
>
> Instead each ‘undef’ should be thought of as a live on entry register, IE an

Since you're talking about "each" undef, I presume you want to have an
undef instruction?  As I've said before, this would be equivalent to
"freeze(poison)" in the new semantics.  It does not address the
problem of optimizing things like `a s< (a +nsw 1)` (so you'll need a
poison-like thing for that anyway).

> incoming argument physical register, and “x = undef” cannot be optimized
> any more than “x = PhysReg0”, in particular multiple uses of X does not mean
> multiple incoming argument registers, and separate instances of “undef”
> cannot be equated any more than distinct incoming argument registers.
>
> To the argument that this may create unnecessary register pressure I say
> that is a register allocator issue not an IR issue, the register allocator can
> and should figure this out and do the right thing.

Sure, that's a consistent proposal.  However, practically, the onus is
on you to prove this is reasonable from a performance standpoint.

-- Sanjoy
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
Sanjoy,
            The point is this, you have to take a stand one way or
the other on the function-inlining issue:


[1. this function *always* executes statement S,
        F(a) {
           If (a == a) S;
        }
   but in llvm if you inline it and “a” happens to be “undef” then nothing can
   be said about whether statement S is executed. This is indefensible.]


My belief is this: that llvm exists for a utilitarian purpose,
and that llvm currently violates that utilitarian goal by violating
the users expectations in the function-inlining example.


So the question is, where do you stand ?


Peter Lawrence.



> On Jun 19, 2017, at 9:35 AM, Sanjoy Das <[hidden email]> wrote:
>
> Hi Peter,
>
> On Mon, Jun 19, 2017 at 8:36 AM, Peter Lawrence
> <[hidden email]> wrote:
>>            You’ve hit the nail on the head !, but I don’t think we’ve correctly
>> identified the root cause of the problem yet.
>>
>> Rather the problem is with how we incorrectly cse and copy-propagate it:
>>        x = undef
>>        if (x == x)
>>        —————>    this is an illegal copy-propagation   —————>
>>        if (undef == undef)
>>
>> If you don’t believe this is illegal then we end up with the absurdity of the
>> function-inlining example [1], and the argument against the function-inlining
>
> I don't think [1] is absurd.  LLVM IR is not a programming language,
> and it is okay for it to have semantics that would seem odd in a
> programming language.
>
>> example is so compelling that John decided to drop out of the argument,
>> IE he gave up because it is indefensible.
>
> I think he "gave up" because he has better things to do than argue with you. :)
>
>> Apparently this copy-propagation has been justified by thinking of "undef"
>> as an IR "constant” and that any optimization you can do to a “constant”
>> can also be done to “undef” without further thought.
>>
>> Instead each ‘undef’ should be thought of as a live on entry register, IE an
>
> Since you're talking about "each" undef, I presume you want to have an
> undef instruction?  As I've said before, this would be equivalent to
> "freeze(poison)" in the new semantics.  It does not address the
> problem of optimizing things like `a s< (a +nsw 1)` (so you'll need a
> poison-like thing for that anyway).
>
>> incoming argument physical register, and “x = undef” cannot be optimized
>> any more than “x = PhysReg0”, in particular multiple uses of X does not mean
>> multiple incoming argument registers, and separate instances of “undef”
>> cannot be equated any more than distinct incoming argument registers.
>>
>> To the argument that this may create unnecessary register pressure I say
>> that is a register allocator issue not an IR issue, the register allocator can
>> and should figure this out and do the right thing.
>
> Sure, that's a consistent proposal.  However, practically, the onus is
> on you to prove this is reasonable from a performance standpoint.
>
> -- Sanjoy

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
This may not apply at all here as I really don't understand the intricacies of undef/poison (and especially "freeze") - I'm just following the discussion trying to understand what the argument is.
So what I mean is that with floating point values, code within an `if (a == a)` condition is obviously not guaranteed to execute. Of course, you guys aren't discussing floating point here - you're discussing integers. But it seems to me like this type of optimization seems to treat an integer "undef" similarly to a floating point NaN. With my very limited understanding of the problem at hand, this seems like a logical thing.

I apologize if this is a naive view or if it distracts from the discussion.

On Mon, Jun 19, 2017 at 7:02 PM, Peter Lawrence via llvm-dev <[hidden email]> wrote:
Sanjoy,
            The point is this, you have to take a stand one way or
the other on the function-inlining issue:


[1. this function *always* executes statement S,
        F(a) {
           If (a == a) S;
        }
   but in llvm if you inline it and “a” happens to be “undef” then nothing can
   be said about whether statement S is executed. This is indefensible.]


My belief is this: that llvm exists for a utilitarian purpose,
and that llvm currently violates that utilitarian goal by violating
the users expectations in the function-inlining example.


So the question is, where do you stand ?


Peter Lawrence.



> On Jun 19, 2017, at 9:35 AM, Sanjoy Das <[hidden email]> wrote:
>
> Hi Peter,
>
> On Mon, Jun 19, 2017 at 8:36 AM, Peter Lawrence
> <[hidden email]> wrote:
>>            You’ve hit the nail on the head !, but I don’t think we’ve correctly
>> identified the root cause of the problem yet.
>>
>> Rather the problem is with how we incorrectly cse and copy-propagate it:
>>        x = undef
>>        if (x == x)
>>        —————>    this is an illegal copy-propagation   —————>
>>        if (undef == undef)
>>
>> If you don’t believe this is illegal then we end up with the absurdity of the
>> function-inlining example [1], and the argument against the function-inlining
>
> I don't think [1] is absurd.  LLVM IR is not a programming language,
> and it is okay for it to have semantics that would seem odd in a
> programming language.
>
>> example is so compelling that John decided to drop out of the argument,
>> IE he gave up because it is indefensible.
>
> I think he "gave up" because he has better things to do than argue with you. :)
>
>> Apparently this copy-propagation has been justified by thinking of "undef"
>> as an IR "constant” and that any optimization you can do to a “constant”
>> can also be done to “undef” without further thought.
>>
>> Instead each ‘undef’ should be thought of as a live on entry register, IE an
>
> Since you're talking about "each" undef, I presume you want to have an
> undef instruction?  As I've said before, this would be equivalent to
> "freeze(poison)" in the new semantics.  It does not address the
> problem of optimizing things like `a s< (a +nsw 1)` (so you'll need a
> poison-like thing for that anyway).
>
>> incoming argument physical register, and “x = undef” cannot be optimized
>> any more than “x = PhysReg0”, in particular multiple uses of X does not mean
>> multiple incoming argument registers, and separate instances of “undef”
>> cannot be equated any more than distinct incoming argument registers.
>>
>> To the argument that this may create unnecessary register pressure I say
>> that is a register allocator issue not an IR issue, the register allocator can
>> and should figure this out and do the right thing.
>
> Sure, that's a consistent proposal.  However, practically, the onus is
> on you to prove this is reasonable from a performance standpoint.
>
> -- Sanjoy

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
> ... the argument against the function-inlining
> example is so compelling that John decided to drop out of the argument,
> IE he gave up because it is indefensible.

Peter, it is extremely impolite to mischaracterize someone else's
position.  Up to this point I assumed that you were operating in good
faith but what I quoted above looks more like trolling.

John

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
John,
         this is the issue at hand

[1. this function *always* executes statement S,
F(a) {
   If (a == a) S;
}
   but in llvm if you inline it and “a” happens to be “undef” then nothing can
   be said about whether statement S is executed. This is indefensible.]

I am trying to force this issue because when I challenge folks to question
their assumptions regarding CSE’ing and CP’ing of “undef” all I get back
is a recitation of the usual catechism on the subject. I can’t get them
to think outside the box without this example of the consequences
of those assumptions. Unfortunately so far no one has responded to it.


The VP of engineering at my last company had a way of bringing
clarity to issues like this, he said now that robotic surgery is
becoming a reality do you want the compiler to do this if it is being
used to compile the robot software and you are scheduled for
surgery.


I also believe that the C-standard is flawed in that it uses the
term “undefined behavior” in many place where it should only
use “unspecified”, and that reading the letter rather than the spirit 
of the standard is doing a disservice to our users.  We should not
look to the standard for guidance, rather the llvm community
needs to take a stand and guide the standard.


Will you help in challenging folks to question their assumptions ?
Will you help guide the standard ?
What say you ?


Peter Lawrence.




On Jun 19, 2017, at 9:28 PM, John Regehr <[hidden email]> wrote:

... the argument against the function-inlining
example is so compelling that John decided to drop out of the argument,
IE he gave up because it is indefensible.

Peter, it is extremely impolite to mischaracterize someone else's position.  Up to this point I assumed that you were operating in good faith but what I quoted above looks more like trolling.

John



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
Hi Peter,

On Mon, Jun 19, 2017 at 10:02 AM, Peter Lawrence
<[hidden email]> wrote:

>             The point is this, you have to take a stand one way or
> the other on the function-inlining issue:
>
>
> [1. this function *always* executes statement S,
>         F(a) {
>            If (a == a) S;
>         }
>    but in llvm if you inline it and “a” happens to be “undef” then nothing can
>    be said about whether statement S is executed. This is indefensible.]
>
>
> My belief is this: that llvm exists for a utilitarian purpose,

Agreed on this front.

> and that llvm currently violates that utilitarian goal by violating

You've assumed that LLVM's utility primarily lies in providing
programming-language like semantics that are obvious and easy to
program.  That isn't the case -- LLVM is a compiler framework, and its
IR semantics are only somewhat related to source language semantics.

> the users expectations in the function-inlining example.
>
> So the question is, where do you stand ?

As I said, I disagree with your characterization of LLVM's utility.

You can always write an LLVM frontend that does not generate IR that
will have this kind of "odd" semantics (in fact, I *am* a proponent of
using "safe" languages wherever possible).  However, the safety /
"obviousness" of the programming language is not directly related to
the safety / "obviousness" of LLVM IR.

-- Sanjoy
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [llvm-dev] the root cause is CP, was: A tagged architecture, the elephant in the undef / poison room

Tim Northover via llvm-dev
In reply to this post by Tim Northover via llvm-dev
Hi Peter,

On Mon, Jun 19, 2017 at 11:59 PM, Peter Lawrence
<[hidden email]> wrote:

> I am trying to force this issue because when I challenge folks to question
> their assumptions regarding CSE’ing and CP’ing of “undef” all I get back
> is a recitation of the usual catechism on the subject. I can’t get them
> to think outside the box without this example of the consequences
> of those assumptions. Unfortunately so far no one has responded to it.
>
> The VP of engineering at my last company had a way of bringing
> clarity to issues like this, he said now that robotic surgery is
> becoming a reality do you want the compiler to do this if it is being
> used to compile the robot software and you are scheduled for
> surgery.

Then don't use C or C++ for such systems?

> I also believe that the C-standard is flawed in that it uses the

This (and what follows) seems off topic -- I thought we were
discussing the semantics of LLVM IR, not C.  As I've said in my
previous email, LLVM IR semantics is not directly related to the
semantics of C.  You can implement a safe programming language (like
Java or Python) in LLVM IR.  This is covered in the paper in section
2.1.

-- Sanjoy
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
12
Loading...