Add a 'notrap' function attribute?

Add a 'notrap' function attribute?

Pekka Jääskeläinen-3
Hello,

OpenCL C specifies that instructions should not trap (it is "discouraged"
in the specs). If they do, it's vendor-specific how the hardware
exceptions are handled.

This might also be the case with some other (future) languages targeting
"streamlined" parallel accelerators in a heterogeneous setting.
At least CUDA comes to mind. What about OpenACC and the new OpenMP,
does anyone know offhand?

It would help several optimizations if they could assume certain
instructions do not trap. E.g., I was looking at the if-conversion of
the loop vectorizer, and it seems to not support speculating stores,
divs, etc. which could be done if we knew it's safe to speculatively
execute them.
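
To make the if-conversion point concrete, here is a minimal sketch in Python. It is a toy model and not LLVM code: ZeroDivisionError stands in for a hardware trap, and every name in it (branchy, if_converted, trapping_div, nontrapping_div) is made up for illustration. The value returned by nontrapping_div for a zero divisor is an arbitrary placeholder for the "undef value output" mentioned below.

```python
def branchy(a, b):
    """Original loop body: the division only runs under its guard."""
    out = []
    for x, y in zip(a, b):
        if y != 0:
            out.append(x // y)
        else:
            out.append(0)
    return out


def if_converted(a, b, div):
    """If-converted body: the division runs unconditionally (it is
    speculated) and a select picks the right value afterwards."""
    out = []
    for x, y in zip(a, b):
        q = div(x, y)                    # executes even when y == 0
        out.append(q if y != 0 else 0)   # select on the original condition
    return out


def trapping_div(x, y):
    # Raises ZeroDivisionError for y == 0; this models a trapping divide.
    return x // y


def nontrapping_div(x, y):
    # Assumed 'nodivtrap' behaviour: some unspecified value instead of a trap.
    return x // y if y != 0 else -1


a, b = [6, 7, 8], [3, 0, 2]
```

With nontrapping_div the if-converted loop computes the same result as the branchy one, because the select discards the speculated value. With trapping_div the speculated division raises for the element whose divisor is zero, which is exactly why the transformation is illegal unless traps can be ruled out.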

[In this particular if-conversion case proper predicated execution
(not speculative) would require predicates to be added for all LLVM
instructions so they could be squashed. I think this was discussed
several years ago in the context of a generic IR-level if-conversion
pass, but it seems such a thing never materialized.]

Anyways, "speculative" if-conversion is just one example where knowing
that traps need not be considered in the function at hand
would help the optimizations. Other speculative code motion
optimizations, e.g., LICM, could benefit from it.
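
The LICM case can be sketched the same way. This is again a toy Python model with hypothetical names, where ZeroDivisionError plays the role of a trap. The loop may run zero times, so hoisting the loop-invariant division into the preheader executes it on a path where the original program never did.

```python
def loop_original(n, x, y):
    """The invariant division sits inside the loop body."""
    total = 0
    for _ in range(n):
        total += x // y      # only evaluated if the loop runs at least once
    return total


def loop_hoisted(n, x, y, div):
    """After LICM: the division is hoisted to the loop preheader."""
    q = div(x, y)            # now runs even when n == 0
    total = 0
    for _ in range(n):
        total += q
    return total


def trapping_div(x, y):
    return x // y            # raises ZeroDivisionError, the modelled trap


def nontrapping_div(x, y):
    # Assumed non-trapping divide: unspecified value for a zero divisor.
    return x // y if y != 0 else -1
```

For n == 0 and y == 0 the original loop returns 0 without ever dividing. The hoisted version gives the same answer only if the division cannot trap.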

One way would be to introduce a new function attribute. Functions (e.g.,
OpenCL C or CUDA kernels) could be marked with an attribute that states
that the instructions can be assumed not to trap -- it's a programmer's or
the runtime's mistake if they do. The runtime should change the fp
computation mode to the non-trapping one before calling such
a function (this is actually stated in the OpenCL specs). If such
handling is not supported by the target, then the attribute should not
be added in the first place.

The attribute could be called 'notrap', which would cover any trap
caused by any instruction. Or it could be
split, in case the hardware is known not to support one of the
features. Three could suffice: 'nofptrap' (no IEEE FP exceptions),
'nodivtrap' (no divide by zero exceptions, undef value output instead),
'nomemtrap' (no mem exceptions).

What do you think of the general idea? Or is there something similar
already that can accomplish this?

Thanks in advance,
--
Pekka
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Add a 'notrap' function attribute?

Nadav Rotem
Hi Pekka,

The motivation for the ’notrap’ bit is clear. Domain-specific languages can set this bit to enable more aggressive optimizations. I don’t think that the Loop Vectorizer is a good example, because it is not designed to vectorize data-parallel languages, which have completely different semantics. In OpenCL/CUDA you would want to vectorize the outermost loop, and the language guarantees that it is safe to do so.

A function attribute is one possible solution. Another solution would be to use metadata. I think that we need to explore both solutions and estimate their effect on the rest of the compiler. Can you estimate which parts of the compiler would need to be changed in order to support this new piece of information? We need to think about what happens when we merge or hoist loads/stores. Will we need to review and change every single memory optimization in the compiler?

Thanks,
Nadav  

On Oct 31, 2013, at 7:38 AM, Pekka Jääskeläinen <[hidden email]> wrote:

> [...]



Re: Add a 'notrap' function attribute?

Hal Finkel
----- Original Message -----
> Hi Pekka,
>
> The motivation for the ’notrap’ bit is clear.

I'd also like to see this functionality.

>  Domain specific
> languages can set this bit to enable more aggressive optimizations.
>  I don’t think that the Loop Vectorizer is a good example because it
> is not designed to vectorize data-parallel languages which have a
> completely different semantics.  In OpenCL/Cuda you would want to
> vectorize the outermost loop, and the language guarantees that it is
> safe to so.
>
> Function attribute is one possible solution.  Another solution would
> be to use metadata. I think that we need to explore both solutions
> and estimate their effect on the rest of the compiler.  Can you
> estimate which parts of the compiler would need to be changed in
> order to support this new piece of information ?  We need to think
> about what happens when we merge or hoist load/stores.  Will we need
> to review and change every single memory optimization in the
> compiler ?

My hope is that almost all of the benefit can be obtained by changing only isSafeToSpeculativelyExecute. For memory operations, we already preserve TBAA metadata, so auditing passes to preserve notrap seems like it should be reasonable. For floating-point, we preserve fpmath metadata, so that should not be too hard.
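
A rough sketch of what such a check might look like, written as a toy Python model and not against LLVM's actual API. The opcode table, the function name is_safe_to_speculate, and the decision to treat fdiv as potentially trapping are illustrative assumptions; the attribute names are the ones from Pekka's proposal. Stores are left out on purpose, because speculating a store changes memory regardless of whether it traps.

```python
# Which proposed attribute would cover which kind of potentially trapping
# operation. This table is an assumption made for the sketch.
TRAP_KIND = {
    "sdiv": "nodivtrap",
    "udiv": "nodivtrap",
    "srem": "nodivtrap",
    "urem": "nodivtrap",
    "load": "nomemtrap",
    "fdiv": "nofptrap",   # assumes a strict model where FP ops may trap
}


def is_safe_to_speculate(opcode, fn_attrs):
    """Return True if `opcode` may be executed speculatively inside a
    function that carries the attribute set `fn_attrs`."""
    kind = TRAP_KIND.get(opcode)
    if kind is None:
        return True                      # e.g. add, and, select never trap
    # The blanket 'notrap' attribute covers every kind of trap.
    return "notrap" in fn_attrs or kind in fn_attrs
```

Without any attribute the predicate stays as conservative as before, so passes that never look at the new attributes are unaffected.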

 -Hal


--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


Re: Add a 'notrap' function attribute?

Pekka Jääskeläinen-3
In reply to this post by Nadav Rotem
Hi Nadav,

On 10/31/2013 08:53 PM, Nadav Rotem wrote:
> data-parallel languages which have a completely different semantics.  In
> OpenCL/Cuda you would want to vectorize the outermost loop, and the
> language guarantees that it is safe to so.

Yeah. This is a separate (old) discussion and not strictly related to
the problem at hand. Better if-conversion benefits more than OpenCL C
work-item loops.



[For reference, here's an email in the thread from Spring. This discussion
led to the parallel loop metadata to mark the data-parallel loops:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html

The current status of this work is that there's now also effectively
loop interchange functionality in pocl so the inner (sequential) loops
in the OpenCL C kernels are interchanged with the implicit parallel
work-item (outer) loops when it's semantically legal. After this the
inner loop vectorizer can be used efficiently also for kernels with
sequential loops.]

> Function attribute is one possible solution.  Another solution would be to
> use metadata. I think that we need to explore both solutions and estimate
> their effect on the rest of the compiler.  Can you estimate which parts of
> the compiler would need to be changed in order to support this new piece
> of information ?  We need to think about what happens when we merge or
> hoist load/stores.  Will we need to review and change every single memory
> optimization in the compiler ?

The original idea was that if the function is marked notrap, it only
loosens the previous restrictions for the optimizations. Thus, if the
old code still assumes trapping semantics, it should be still safe (only
worse optimizations might result).

Anyways, this has at least one problem that I see: functions that have
the notrap attribute cannot be safely inlined into functions without that
attribute. Otherwise, code that has been optimized under the
assumption of not trapping (and thus speculates an instruction that might
trap) might trap again because the attribute is dropped (and the runtime
does not know it has to switch off the trapping behavior). Thus, perhaps
notrap should simply always imply noinline to avoid this issue.
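
The inlining constraint can be written down as a small legality check. The following Python sketch is only an illustration of the rule stated above and not a concrete proposal: the names expand and can_inline are hypothetical, and the subset rule for the fine-grained attributes is an assumption that generalizes "callee is notrap, caller is not".

```python
FINE_GRAINED = frozenset({"nofptrap", "nodivtrap", "nomemtrap"})


def expand(attrs):
    """Normalize an attribute set: 'notrap' implies all fine-grained ones."""
    attrs = set(attrs)
    if "notrap" in attrs:
        attrs |= FINE_GRAINED
    return attrs & FINE_GRAINED


def can_inline(callee_attrs, caller_attrs):
    """Inlining is allowed only if every no-trap guarantee the callee was
    optimized under also holds in the caller."""
    return expand(callee_attrs) <= expand(caller_attrs)
```

For example, can_inline({"notrap"}, set()) is False, which is the problematic case described above, while a function without the attribute can still be inlined into a notrap caller. Making notrap imply noinline is the blunter alternative that forbids both.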

Another way is to add 'notrap' metadata to all possibly trapping
instructions. This should be safe and perhaps work across inlining,
but it requires more maintenance code and it might not work very
well in practice: the runtime might want to (or be able to) switch
the trapping semantics of e.g. the FP hardware on function basis, not
per instruction. If that's not the case, the code generator has
to support the instructions separately, injecting instructions that
switch on/off the trapping behavior.

The metadata approach has a benefit that there can be optimizations,
unrelated to the input language, that intelligently prove whether a
particular instruction instance can trap or not. E.g., if it's known
from the code that the divisor of a division is never zero, one can set
this metadata on a single DIV instruction, perhaps helping later optimizations.

IMHO, the attribute approach is easier and makes more sense in
this particular case where the trapping behavior is dictated
by the input language, but OTOH the metadata approach seems to be
more in line with how this has been done previously (fpmath) and might
open the door for separate non-language-specific optimizations.

BR,
--
--Pekka


Re: Add a 'notrap' function attribute?

Hal Finkel
----- Original Message -----

> [...]
>
> The another way is to add 'notrap' metadata to all possibly trapping
> instructions. This should be safe and perhaps work across inlining,
> but it requires more maintenance code and it might not work very
> well in practice: the runtime might want to (or be able to) switch
> the trapping semantics of e.g. the FP hardware on function basis, not
> per instruction. If that's not the case, the code generator has
> to support the instructions separately, injecting instructions that
> switch on/off the trapping behavior.
>
> The metadata approach has a benefit that there can be optimizations,
> unrelated to the input language, that intelligently prove whether a
> particular instruction instance can trap or not. E.g., if it's known
> from code that a divider of a division is never zero, one can set
> this metadata to a single DIV instruction, perhaps helping later
> optimizations.
>
> IMHO, the attribute approach is easier and makes more sense in
> this particular case where the trapping behavior is dictated
> by the input language, but OTOH the metadata approach seems to go
> better along how it has been done previously (fpmath) and might
> open the door for separate non-language-specific optimizations.

The large complication that you end up with in a scheme like this is maintaining control dependencies. For example:

 if (z_is_never_zero()) {
   x = y / z !notrap
   ...
 }

the !notrap asserts that the division won't trap, which is good, but also makes it safe to speculatively execute. That's the desired effect, but not in this instance, because it will allow hoisting outside of the current block:

 x = y / z !notrap
 if (z_is_never_zero()) {
   ...
 }

and that obviously won't work correctly. This seems to leave us with three options:

 1. Add logic to all passes that might do this to prevent it (in which case, we might as well add some new (subclass data) flags instead of metadata).

 2. Assert that !notrap cannot be used where its validity might be affected by control dependencies.

 3. Represent the control dependencies explicitly in the metadata. Andy, Arnold (CC'd) and I have been discussing this in a slightly-different context, and briefly, this means adding all of the relevant conditional branch inputs to the metadata, and ensuring dominance before the metadata is respected. For example:

  if (i1 %c = call z_is_never_zero()) {
    %x = %y / %z !notrap !{ %c }
    ...
  }

and so if we run across this situation:

  %x = %y / %z !notrap !{ %c }
  if (i1 %c = call z_is_never_zero()) {
    ...
  }

  we can test that the %c does not dominate %x, and so the metadata needs to be ignored. The complication here is that you may need to encode all conditional branch inputs along all paths from the entry to the value, and the scheme also needs to deal with maythrow functions.
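
The dominance test from option 3 can be sketched as follows. This is a toy Python model with made-up names and data structures (positions as (block, index) pairs, an immediate-dominator map), not LLVM code. It mirrors only the check described above, namely that every value listed in the metadata must dominate the tagged instruction; a complete scheme would have more to verify, as noted.

```python
def block_dominates(idom, a, b):
    """True if block `a` dominates block `b`. `idom` maps each block to its
    immediate dominator; the entry block maps to None."""
    while b is not None:
        if a == b:
            return True
        b = idom[b]
    return False


def dominates(idom, d, u):
    """True if the definition at position `d` dominates the use at `u`.
    Positions are (block, index-within-block) pairs."""
    (d_block, d_index), (u_block, u_index) = d, u
    if d_block == u_block:
        return d_index < u_index         # same block: program order decides
    return block_dominates(idom, d_block, u_block)


def notrap_is_valid(idom, inst_pos, guard_positions):
    """Respect !notrap only if all recorded guard values dominate the
    instruction; otherwise the metadata has to be ignored."""
    return all(dominates(idom, g, inst_pos) for g in guard_positions)


# CFG of the example: 'entry' computes %c and branches to 'then'.
idom = {"entry": None, "then": "entry"}

# Original placement: %c at entry[0], the division inside the guarded block.
ok = notrap_is_valid(idom, ("then", 0), [("entry", 0)])

# After hoisting: the division sits at entry[0], above %c at entry[1].
hoisted_ok = notrap_is_valid(idom, ("entry", 0), [("entry", 1)])
```

Here ok is True and hoisted_ok is False, so a pass that consults the check would stop trusting the metadata once the division has been moved above its guard.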

Given that the common use case for this seems like it will be for some language frontend to add !notrap to *all* instances of some kind of instruction (divisions, load, etc.), I think that adding a new flag (like the nsw flag) may be more appropriate for efficiency reasons. Even easier, add some more fine-grained function attributes (as you had suggested).

Also, I think that being able to tag a memory access as non-trapping could be a big win for C++ too, because we could tag all loads/stores that come from C++ reference types as not trapping. Because of the way that iterators are defined, I suspect this would have a lot of positive benefits in terms of LICM and other optimizations.

 -Hal


--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Re: Add a 'notrap' function attribute?

Pekka Jääskeläinen-3
On 11/01/2013 01:48 PM, Hal Finkel wrote:
> The large complication that you end up with a scheme like this is
> maintaining control dependencies. [...]

Good point.

> 2. Assert that !notrap cannot be used where its validity might be affected
> by control dependencies.

Thus, I propose that if this ends up being a per-instruction property,
its semantics would include this restriction.

> 3. Represent the control dependencies explicitly in the metadata. [...]

This could work but might get quite complex (especially its maintenance)
for the received benefit.

> Given that the common use case for this seems like it will be for some
> language frontend to add !notrap to *all* instances of some kind of
> instruction (divisions, load, etc.), I think that adding a new flag (like
> the nsw flag) may be more appropriate for efficiency reasons. Even easier,
> add some more fine-grained function attributes (as you had suggested).
>
> Also, I think that being able to tag a memory access as no trapping could
> be a big win for C++ too, because we could tag all loads/stores that come
> from C++ reference types as not trapping. Because of the way that iterators
> are defined, I suspect this would have a lot of positive benefits in terms
> of LICM and other optimizations.

True. So, it seems there would be benefit from per-instruction
flags or metadata. Which one (an instruction flag or MD) is better,
I'm not sure. Is the flag too intrusive (it affects the bitcode format,
doesn't it)?

The MD seems more trivial. The main drawback I see is maintaining it
across optimizations.

The question is whether the basic metadata principle applies: Optimizations
that do not maintain it (and thus might drop it), or do not understand
its semantics, should not break.

A problem is that notrap has two use cases. The first is to communicate to
the optimizations that an instruction instance never traps (known from
the language or static analysis), and thus it's safe to speculate it.

The second meaning (my original one) is that the language (or maybe
a compiler switch) dictates that any instruction instance _should_ not
trap (the behavior is undefined if it does, or NaN propagation is used). In this
case the instructions might trap unless hardware is instructed not
to (switch off FP exceptions, use a special dummy signal handler for
divbyzero) before executing the instruction (or function).

If the latter is implemented using MD and some optimization drops it,
it might break programs that assume (due to the language/switch) that
there are no traps, but propagate NaNs from illegal fp operations,
because the instructions work as intended only if the FP exceptions
are switched off.
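
As an analogy for this second meaning, Python's decimal module has per-context trap flags that behave much like a switchable FP exception mode. The sketch below is only an analogy under that assumption: kernel and call_notrap are hypothetical names, and the shim stands in for whatever the runtime would do before calling a function marked notrap.

```python
import decimal


def kernel(x, y):
    """Stand-in for a 'notrap' function: it is written under the assumption
    that the caller has already disabled trapping."""
    return x / y


def call_notrap(fn, *args):
    """Hypothetical runtime shim: switch to non-trapping mode for the
    duration of the call, then restore the previous context."""
    with decimal.localcontext() as ctx:
        ctx.traps[decimal.DivisionByZero] = False
        ctx.traps[decimal.InvalidOperation] = False
        return fn(*args)


one, zero = decimal.Decimal(1), decimal.Decimal(0)
inf_result = call_notrap(kernel, one, zero)    # Infinity instead of a trap
nan_result = call_notrap(kernel, zero, zero)   # NaN propagates
```

Calling kernel(one, zero) directly under the default context raises decimal.DivisionByZero. That is the failure mode described above: the code is only correct if whoever calls it knows that the trapping mode has to be switched off first, which is information a dropped metadata node can no longer convey.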

So, perhaps both, a function attribute, and an MD (that can be
safely removed) are called for as their use cases and applicability
are different.

--
Pekka

Re: Add a 'notrap' function attribute?

Hal Finkel
----- Original Message -----

> On 11/01/2013 01:48 PM, Hal Finkel wrote:
> > The large complication that you end up with a scheme like this is
> > maintaining control dependencies. [...]
>
> Good point.
>
> > 2. Assert that !notrap cannot be used where its validity might be
> > affected
> > by control dependencies.
>
> Thus, I propose that if this ends up being per-instruction property,
> its semantics would include this restriction.
>
> > 3. Represent the control dependencies explicitly in the metadata.
> > [...]
>
> This could work but might get quite complex (especially its
> maintenance)
> for the received benefit.
>
> > Given that the common use case for this seems like it will be for
> > some
> > language frontend to add !notrap to *all* instances of some kind of
> > instruction (divisions, load, etc.), I think that adding a new flag
> > (like
> > the nsw flag) may be more appropriate for efficiency reasons. Even
> > easier,
> > add some more fine-grained function attributes (as you had
> > suggested).
> >
> > Also, I think that being able to tag a memory access as no trapping
> > could
> > be a big win for C++ too, because we could tag all loads/stores
> > that come
> > from C++ reference types as not trapping. Because of the way that
> > iterators
> > are defined, I suspect this would have a lot of positive benefits
> > in terms
> > of LICM and other optimizations.
>
> True. So, it seems there would be benefit from per-instruction
> flags or metadata. Which one (an instruction flag or MD) is better,
> I'm not sure. Is the flag too intrusive (it affects the bitcode
> format,
> doesn't it?).

I think this depends on whether or not we have any free bits in the relevant instructions.

>
> The MD seems more trivial. The main drawback I see is maintaining it
> across optimizations.

Depending on how this is done, it is either more intrusive or less intrusive than the alternative. If we encode control dependencies into the MD, then it is less intrusive (although it could be slow and bloat the IR size); if we explicitly try to maintain it everywhere, then we might as well have a flag.

>
> The question is whether the basic metadata principle applies:
> Optimizations
> that do not maintain it (and thus might drop it), or do not
> understand
> its semantics, should not break.

It would need to be engineered that way, otherwise we can't use MD.

>
> A problem is that the notrap has two use cases: to communicate to the
> optimizations that an instruction instance does never trap (known
> from
> the language or static analysis), thus it's safe to speculate it.
>
> The second meaning (my original one) is that the language (or maybe
> a compiler switch) dictates that any instruction instance _should_
> not
> trap (is undefined if it does or use NaN propagation). In this
> case the instructions might trap unless hardware is instructed not
> to (switch off FP exceptions, use a special dummy signal handler for
> divbyzero) before executing the instruction (or function).
>
> If the latter is implemented using MD and some optimization drops it,
> it might break programs that assume (due to the language/switch) that
> there are no traps, but propagate NaNs from illegal fp operations,
> because the instructions work as intended only if the FP exceptions
> are switched off.

I don't think that, in general, we're SNAN-safe.

>
> So, perhaps both, a function attribute, and an MD (that can be
> safely removed) are called for as their use cases and applicability
> are different.

Agreed.

 -Hal


--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Re: Add a 'notrap' function attribute?

Nick Lewycky-2
In reply to this post by Pekka Jääskeläinen-3
FYI, see also the previous discussion about "speculatable":


I think such an attribute should be added.

In the thread which led up to that thread, I proposed using more fine-grained attributes and Michael rightly pointed out the problem with that: you'd need one for every possible form of undefined behaviour. You listed "nofptrap", "nodivtrap" and "nomemtrap", but you didn't say "nounreachabletrap". Whoops!

Nick


On 31 October 2013 07:38, Pekka Jääskeläinen <[hidden email]> wrote:
[...]



Re: Add a 'notrap' function attribute?

Filip Pizlo
In reply to this post by Hal Finkel

On Nov 1, 2013, at 4:48 AM, Hal Finkel <[hidden email]> wrote:

----- Original Message -----
Hi Nadav,

On 10/31/2013 08:53 PM, Nadav Rotem wrote:
data-parallel languages which have completely different semantics. In
OpenCL/Cuda you would want to vectorize the outermost loop, and the
language guarantees that it is safe to do so.

Yeah. This is the separate (old) discussion and not strictly related to
the problem at hand. Better if-conversion benefits more than OpenCL C
work-item loops.



[For reference, here's an email in the thread from Spring. This discussion
led to the parallel loop metadata to mark the data-parallel loops:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html

The current status of this work is that there's now also effectively
loop interchange functionality in pocl so the inner (sequential) loops
in the OpenCL C kernels are interchanged with the implicit parallel
work-item (outer) loops when it's semantically legal. After this the
inner loop vectorizer can be used efficiently also for kernels with
sequential loops.]

Function attribute is one possible solution. Another solution would be to
use metadata. I think that we need to explore both solutions and estimate
their effect on the rest of the compiler. Can you estimate which parts of
the compiler would need to be changed in order to support this new piece
of information? We need to think about what happens when we merge or
hoist load/stores. Will we need to review and change every single memory
optimization in the compiler?

The original idea was that if the function is marked notrap, it only
loosens the previous restrictions for the optimizations. Thus, if the
old code still assumes trapping semantics, it should still be safe (only
worse optimizations might result).

Anyways, this has at least one problem that I see: functions that have
the notrap attribute cannot be safely inlined into functions without that
attribute. Otherwise a function which has possibly been optimized with the
assumption of not trapping (and speculates an instruction that might trap)
might again trap due to dropping the attribute (and the runtime not
knowing it has to switch off the trapping behavior). Thus, perhaps
notrap should simply always imply noinline to avoid this issue.
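
The inlining restriction described above can be sketched as a legality check. This is a toy Python model, not LLVM's inliner; `can_inline` and the attribute sets are hypothetical names used only for this example.

```python
TRAP_ATTRS = frozenset({"notrap"})


def can_inline(caller_attrs, callee_attrs):
    # A callee optimized under notrap may already contain speculated
    # instructions, so it is inlined only into a caller that gives the
    # same guarantee. A callee without notrap can go anywhere.
    callee_needs = TRAP_ATTRS & set(callee_attrs)
    return callee_needs <= set(caller_attrs)
```

Making notrap imply noinline outright would be the simpler, stricter version of the same rule.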

Another way is to add 'notrap' metadata to all possibly trapping
instructions. This should be safe and perhaps work across inlining,
but it requires more maintenance code and it might not work very
well in practice: the runtime might want to (or be able to) switch
the trapping semantics of e.g. the FP hardware on a per-function basis,
not per instruction. If that's not the case, the code generator has
to support the instructions separately, injecting instructions that
switch on/off the trapping behavior.

The metadata approach has the benefit that there can be optimizations,
unrelated to the input language, that intelligently prove whether a
particular instruction instance can trap or not. E.g., if it's known
from the code that the divisor of a division is never zero, one can set
this metadata on a single DIV instruction, perhaps helping later
optimizations.

IMHO, the attribute approach is easier and makes more sense in
this particular case where the trapping behavior is dictated
by the input language, but OTOH the metadata approach seems to fit
better with how it has been done previously (fpmath) and might
open the door for separate non-language-specific optimizations.

The large complication that you end up with in a scheme like this is maintaining control dependencies. For example:

if (z_is_never_zero()) {
  x = y / z !notrap
  ...
}

the !notrap asserts that the division won't trap, which is good, but also makes it safe to speculatively execute. That's the desired effect, but not in this instance, because it will allow hoisting outside of the current block:

x = y / z !notrap
if (z_is_never_zero()) {
  ...
} 

and that obviously won't work correctly. This seems to leave us with three options:

1. Add logic to all passes that might do this to prevent it (in which case, we might as well add some new (subclass data) flags instead of metadata). 

2. Assert that !notrap cannot be used where its validity might be affected by control dependencies.

3. Represent the control dependencies explicitly in the metadata. Andy, Arnold (CC'd) and I have been discussing this in a slightly-different context, and briefly, this means adding all of the relevant conditional branch inputs to the metadata, and ensuring dominance before the metadata is respected. For example:

 if (i1 %c = call z_is_never_zero()) {
   %x = %y / %z !notrap !{ %c }
   ...
 }

Does this !{%c} reference to %c obey the same rules that other uses of a value would obey in LLVM?

If it does, then the following transformation would be trivially valid:

 if (i1 %c = call z_is_never_zero()) {
   %x = %y / %z !notrap !{ 1 }
   ...
 }

Because we can prove that %c must have the value 1 inside the then case.  But, now you have a !notrap !{1}, which means you can do:

 %x = %y / %z !notrap !{ 1 }
 if (i1 %c = call z_is_never_zero()) {
   ...
 }

... and the world just broke.  So clearly, the !{ %c } reference cannot obey all of the same rules as other uses of a value would obey in LLVM IR.  Can you describe exactly what rules such a use of %c would have?  What would replaceAllUsesWith do for it?  How should other phases treat it?  What can they do to it?


and so if we run across this situation:

 %x = %y / %z !notrap !{ %c }
 if (i1 %c = call z_is_never_zero()) {
   ...
 } 

 we can test that the %c does not dominate %x, and so the metadata needs to be ignored. The complication here is that you may need to encode all conditional branch inputs along all paths from the entry to the value, and the scheme also needs to deal with maythrow functions.
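
The dominance test described above can be sketched with a toy CFG in Python. This is not LLVM's DominatorTree; the block names, `dominators`, and `metadata_is_valid` are assumptions for illustration, and dominance is checked per block only, ignoring instruction order inside a block.

```python
def dominators(cfg, entry):
    """Return {block: set of blocks dominating it} for a CFG given as
    {block: [successors]}. Plain iterative data-flow algorithm."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            incoming = [dom[p] for p in preds[b]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def metadata_is_valid(cfg, entry, cond_block, use_block):
    # The !notrap is honored only if the block defining the condition
    # dominates the block containing the tagged instruction.
    return cond_block in dominators(cfg, entry)[use_block]
```

With the blocks entry -> check -> {then, exit}, a division sitting in "then" keeps its metadata, while the same division hoisted to "entry" does not, because "check" no longer dominates it.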

Given that the common use case for this seems like it will be for some language frontend to add !notrap to *all* instances of some kind of instruction (divisions, load, etc.), I think that adding a new flag (like the nsw flag) may be more appropriate for efficiency reasons. Even easier, add some more fine-grained function attributes (as you had suggested).

Also, I think that being able to tag a memory access as non-trapping could be a big win for C++ too, because we could tag all loads/stores that come from C++ reference types as not trapping. Because of the way that iterators are defined, I suspect this would have a lot of positive benefits in terms of LICM and other optimizations.

-Hal


BR,
--
--Pekka



-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Re: Add a 'notrap' function attribute?

Andrew Trick
In reply to this post by Hal Finkel

On Nov 1, 2013, at 4:48 AM, Hal Finkel <[hidden email]> wrote:

3. Represent the control dependencies explicitly in the metadata. Andy, Arnold (CC'd) and I have been discussing this in a slightly-different context, and briefly, this means adding all of the relevant conditional branch inputs to the metadata, and ensuring dominance before the metadata is respected. For example:

 if (i1 %c = call z_is_never_zero()) {
   %x = %y / %z !notrap !{ %c }
   ...
 }

and so if we run across this situation:

 %x = %y / %z !notrap !{ %c }
 if (i1 %c = call z_is_never_zero()) {
   ...
 } 

 we can test that the %c does not dominate %x, and so the metadata needs to be ignored. The complication here is that you may need to encode all conditional branch inputs along all paths from the entry to the value, and the scheme also needs to deal with maythrow functions.

This does not need to be resolved to move forward with Pekka’s proposal, but since we’re talking about it...

- The control dependent metadata looks like it could work, I like the idea (although we’re lacking a strong motivation)

- I’m not sure why divide-by-zero would motivate this (probably just missing something). LLVM doesn’t model it as a trap currently. And if it did, an explicit nonzero-divisor check would be easy to reason about without any metadata.

- The semantics should be that control dependent metadata is guaranteed valid whenever the encoded conditions alone can be proven to hold, independent of surrounding control flow. So we would never use this if we needed to encode all branch conditions from entry to home block. E.g., a single nonzero-divisor check is sufficient.

- We would need to encode the sense of the condition (true or false). The metadata is still valid if we can see that the condition controls a branch, and the corresponding branch target (non-critical-edge) dominates the operation. This is cool because optimizations can be totally oblivious, but the information will still be preserved most of the time.
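
A toy Python sketch of the validity rule in the last bullet, assuming the metadata is encoded as a (condition, sense) pair and that predecessor and dominator maps are already available. The function name and data layout are hypothetical, not an LLVM API.

```python
def control_dependent_md_valid(md, branches, preds, dom, op_block):
    """md is (condition, sense); branches is a list of
    (condition, true_target, false_target); preds maps a block to its
    predecessors; dom maps a block to the set of blocks dominating it."""
    cond, sense = md
    for br_cond, true_target, false_target in branches:
        if br_cond != cond:
            continue
        target = true_target if sense else false_target
        # An edge out of a conditional branch is non-critical only if
        # its target has a single predecessor.
        non_critical = len(preds[target]) == 1
        if non_critical and target in dom[op_block]:
            return True
    return False
```

A pass that knows nothing about the metadata can move code freely; the check simply fails afterwards if the operation has left the region the branch target dominates.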

and the scheme also needs to deal with maythrow functions.

I guess maythrow is implicitly the inverse of nounwind. notrap is similar but actually makes much more sense to me as an attribute. (I always thought that maythrow should be modeled with an invoke). longjmp isn’t clean but could be modeled as writing all memory and maybe-trapping.

Adding a flag for every subtle behavior gets pedantic, as seen in this thread:
http://llvm.1065342.n5.nabble.com/Does-nounwind-have-semantics-td59631.html

The much more interesting question to me is what are the semantics of traps? Conservatively, we now assume they are well defined calls to abort(). I think that is way too conservative for most uses. It would be great to have another flavor of trap that can be reordered with certain side effects, particularly other “floating traps”. Nuno Lopes ran into this problem with bounds check elimination. I don’t have a link to the discussion.

Given that the common use case for this seems like it will be for some language frontend to add !notrap to *all* instances of some kind of instruction (divisions, load, etc.), I think that adding a new flag (like the nsw flag) may be more appropriate for efficiency reasons. Even easier, add some more fine-grained function attributes (as you had suggested).

Good point. Seems like a future optimization though.

Also, I think that being able to tag a memory access as no trapping could be a big win for C++ too, because we could tag all loads/stores that come from C++ reference types as not trapping. Because of the way that iterators are defined, I suspect this would have a lot of positive benefits in terms of LICM and other optimizations.

I didn’t follow this, but obviously it would be a huge benefit for type-checked languages. Teaching optimizations about it would be nontrivial work though.

-Andy


Re: Add a 'notrap' function attribute?

Hal Finkel
In reply to this post by Filip Pizlo
----- Original Message -----

>
>
>
> On Nov 1, 2013, at 4:48 AM, Hal Finkel < [hidden email] > wrote:
>
>
>
> ----- Original Message -----
>
>
> Hi Nadav,
>
> On 10/31/2013 08:53 PM, Nadav Rotem wrote:
>
>
> data-parallel languages which have a completely different
> semantics. In
> OpenCL/Cuda you would want to vectorize the outermost loop, and the
> language guarantees that it is safe to so.
>
> Yeah. This is the separate (old) discussion and not strictly related
> to
> the problem at hand. Better if-conversion benefits more than OpenCL C
> work-item loops.
>
>
>
> [For reference, here's an email in the thread from Spring. This
> discussion
> lead to the parallel loop metadata to mark the data-parallel loops:
>
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html
>
> The current status of this work is that there's now also effectively
> loop interchange functionality in pocl so the inner (sequential)
> loops
> in the OpenCL C kernels are interchanged with the implicit parallel
> work-item (outer) loops when it's semantically legal. After this the
> inner loop vectorizer can be used efficiently also for kernels with
> sequential loops.]
>
>
>
> Function attribute is one possible solution. Another solution
> would be to
> use metadata. I think that we need to explore both solutions and
> estimate
> their effect on the rest of the compiler. Can you estimate which
> parts of
> the compiler would need to be changed in order to support this new
> piece
> of information ? We need to think about what happens when we merge
> or
> hoist load/stores. Will we need to review and change every single
> memory
> optimization in the compiler ?
>
> The original idea was that if the function is marked notrap, it only
> loosens the previous restrictions for the optimizations. Thus, if the
> old code still assumes trapping semantics, it should be still safe
> (only
> worse optimizations might result).
>
> Anyways, this has at least one problem that I see: functions that
> have
> the notrap attribute cannot be safely inlined to functions without
> that
> attribute. Otherwise a function which has possibly been optimized
> with the
> assumption of not trapping (and speculate an instruction that might
> trap),
> might again trap due to dropping the attribute (and the runtime not
> knowing it has to switch off the trapping behavior). Thus, perhaps
> notrap should simply always imply noinline to avoid this issue.
>
> The another way is to add 'notrap' metadata to all possibly trapping
> instructions. This should be safe and perhaps work across inlining,
> but it requires more maintenance code and it might not work very
> well in practice: the runtime might want to (or be able to) switch
> the trapping semantics of e.g. the FP hardware on function basis, not
> per instruction. If that's not the case, the code generator has
> to support the instructions separately, injecting instructions that
> switch on/off the trapping behavior.
>
> The metadata approach has a benefit that there can be optimizations,
> unrelated to the input language, that intelligently prove whether a
> particular instruction instance can trap or not. E.g., if it's known
> from code that a divider of a division is never zero, one can set
> this metadata to a single DIV instruction, perhaps helping later
> optimizations.
>
> IMHO, the attribute approach is easier and makes more sense in
> this particular case where the trapping behavior is dictated
> by the input language, but OTOH the metadata approach seems to go
> better along how it has been done previously (fpmath) and might
> open the door for separate non-language-specific optimizations.
>
> The large complication that you end up with a scheme like this is
> maintaining control dependencies. For example:
>
> if (z_is_never_zero()) {
> x = y / z !notrap
> ...
> }
>
> the !notrap asserts that the division won't trap, which is good, but
> also makes it safe to speculatively execute. That's the desired
> effect, but not in this instance, because it will allow hoisting
> outside of the current block:
>
> x = y / z !notrap
> if (z_is_never_zero()) {
> ...
> }
>
> and that obviously won't work correctly. This seems to leave us with
> three options:
>
> 1. Add logic to all passes that might do this to prevent it (in which
> case, we might as well add some new (subclass data) flags instead of
> metadata).
>
> 2. Assert that !notrap cannot be used where its validity might be
> affected by control dependencies.
>
> 3. Represent the control dependencies explicitly in the metadata.
> Andy, Arnold (CC'd) and I have been discussing this in a
> slightly-different context, and briefly, this means adding all of
> the relevant conditional branch inputs to the metadata, and ensuring
> dominance before the metadata is respected. For example:
>
> if (i1 %c = call z_is_never_zero()) {
> %x = %y / %z !notrap !{ %c }
> ...
> }
>
>
>
> Does this !{%c} reference to %c obey the same rules that other uses
> of a value would obey in LLVM?

Only in the operational sense, not in the semantic sense. Operationally, the metadata node has a weak reference to the value.

>
>
> If it does, then the following transformation would be trivially
> valid:
>
>
>
> if (i1 %c = call z_is_never_zero()) {
> %x = %y / %z !notrap !{ 1 }
> ...
> }
>
>
> Because we can prove that %c must have the value 1 inside the then
> case. But, now you have a !notrap !{1}, which means you can do:
>
> %x = %y / %z !notrap !{ 1 }
>
> if (i1 %c = call z_is_never_zero()) {
> ...
> }
>
>
> ... and the world just broke. So clearly, the !{ %c } reference
> cannot obey all of the same rules as other uses of a value would
> obey in LLVM IR.

Correct.

> Can you describe exactly what rules such a use of
> %c would have?

I don't think that there are any special rules, as such, but there is only a limited set of things that you can meaningfully do with it.

> What would replaceAllUsesWith do for it?

I think that it is replaced with the new value.

> How should
> other phases treat it? What can they do to it?
>

Other passes can ignore it (although hopefully passes that tend to hoist values, like LICM, will be taught to limit themselves based on this metadata). It can be used only for checking dominance for the purpose of ensuring metadata validity.

 -Hal


--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Re: Add a 'notrap' function attribute?

Andrew Trick
In reply to this post by Filip Pizlo

On Nov 1, 2013, at 1:45 PM, Filip Pizlo <[hidden email]> wrote:


On Nov 1, 2013, at 4:48 AM, Hal Finkel <[hidden email]> wrote:

----- Original Message -----
Hi Nadav,

On 10/31/2013 08:53 PM, Nadav Rotem wrote:
data-parallel languages which have a completely different
semantics.  In
OpenCL/Cuda you would want to vectorize the outermost loop, and the
language guarantees that it is safe to so.

Yeah. This is the separate (old) discussion and not strictly related
to
the problem at hand. Better if-conversion benefits more than OpenCL C
work-item loops.



[For reference, here's an email in the thread from Spring. This
discussion
lead to the parallel loop metadata to mark the data-parallel loops:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html

The current status of this work is that there's now also effectively
loop interchange functionality in pocl so the inner (sequential)
loops
in the OpenCL C kernels are interchanged with the implicit parallel
work-item (outer) loops when it's semantically legal. After this the
inner loop vectorizer can be used efficiently also for kernels with
sequential loops.]

Function attribute is one possible solution.  Another solution
would be to
use metadata. I think that we need to explore both solutions and
estimate
their effect on the rest of the compiler.  Can you estimate which
parts of
the compiler would need to be changed in order to support this new
piece
of information ?  We need to think about what happens when we merge
or
hoist load/stores.  Will we need to review and change every single
memory
optimization in the compiler ?

The original idea was that if the function is marked notrap, it only
loosens the previous restrictions for the optimizations. Thus, if the
old code still assumes trapping semantics, it should be still safe
(only
worse optimizations might result).

Anyways, this has at least one problem that I see: functions that
have
the notrap attribute cannot be safely inlined to functions without
that
attribute. Otherwise a function which has possibly been optimized
with the
assumption of not trapping (and speculate an instruction that might
trap),
might again trap due to dropping the attribute (and the runtime not
knowing it has to switch off the trapping behavior). Thus, perhaps
notrap should simply always imply noinline to avoid this issue.

The another way is to add 'notrap' metadata to all possibly trapping
instructions. This should be safe and perhaps work across inlining,
but it requires more maintenance code and it might not work very
well in practice: the runtime might want to (or be able to) switch
the trapping semantics of e.g. the FP hardware on function basis, not
per instruction. If that's not the case, the code generator has
to support the instructions separately, injecting instructions that
switch on/off the trapping behavior.

The metadata approach has a benefit that there can be optimizations,
unrelated to the input language, that intelligently prove whether a
particular instruction instance can trap or not. E.g., if it's known
from code that a divider of a division is never zero, one can set
this metadata to a single DIV instruction, perhaps helping later
optimizations.

IMHO, the attribute approach is easier and makes more sense in
this particular case where the trapping behavior is dictated
by the input language, but OTOH the metadata approach seems to go
better along how it has been done previously (fpmath) and might
open the door for separate non-language-specific optimizations.

The large complication that you end up with a scheme like this is maintaining control dependencies. For example:

if (z_is_never_zero()) {
  x = y / z !notrap
  ...
}

the !notrap asserts that the division won't trap, which is good, but also makes it safe to speculatively execute. That's the desired effect, but not in this instance, because it will allow hoisting outside of the current block:

x = y / z !notrap
if (z_is_never_zero()) {
  ...
} 

and that obviously won't work correctly. This seems to leave us with three options:

1. Add logic to all passes that might do this to prevent it (in which case, we might as well add some new (subclass data) flags instead of metadata). 

2. Assert that !notrap cannot be used where its validity might be affected by control dependencies.

3. Represent the control dependencies explicitly in the metadata. Andy, Arnold (CC'd) and I have been discussing this in a slightly-different context, and briefly, this means adding all of the relevant conditional branch inputs to the metadata, and ensuring dominance before the metadata is respected. For example:

 if (i1 %c = call z_is_never_zero()) {
   %x = %y / %z !notrap !{ %c }
   ...
 }

Does this !{%c} reference to %c obey the same rules that other uses of a value would obey in LLVM?

If it does, then the following transformation would be trivially valid:

 if (i1 %c = call z_is_never_zero()) {
   %x = %y / %z !notrap !{ 1 }
   ...
 }

Because we can prove that %c must have the value 1 inside the then case.  But, now you have a !notrap !{1}, which means you can do:

 %x = %y / %z !notrap !{ 1 }
 if (i1 %c = call z_is_never_zero()) {
   ...
 }

... and the world just broke.  So clearly, the !{ %c } reference cannot obey all of the same rules as other uses of a value would obey in LLVM IR.  Can you describe exactly what rules such a use of %c would have?  What would replaceAllUsesWith do for it?  How should other phases treat it?  What can they do to it?

We have to be very clear about answering these questions *if* we actually implement the control dependent metadata, but I don’t see this as a new problem.

In general, metadata uses cannot be considered SSA uses. We are not going to do SSA update on metadata, ever. LLVM will not have metadata phis.

Optimizations that walk uses and correlate their values with control flow should also definitely not process metadata.

I am concerned that the compare itself would be processed by CorrelatedValuePropagation, which would call replaceAllUsesWith. Metadata uses are currently updated with RAUW, but it isn’t clear that’s the right thing. Either the RAUW interface could be extended, or, worst case, we could make these “control dependent uses” a special ValueHandle that doesn’t automatically update on RAUW.
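
To make the RAUW concern concrete, here is a toy Python model (not LLVM's Value/Use machinery; the `Instr` class, `notrap_deps` field, and `update_metadata` switch are all hypothetical). It shows how replacing %c with a constant erases the link between the !notrap claim and the branch, and how a handle that does not follow RAUW would keep it.

```python
class Instr:
    def __init__(self, name, operands, notrap_deps=()):
        self.name = name
        self.operands = list(operands)
        # Values whose dominance guards the !notrap claim (hypothetical).
        self.notrap_deps = list(notrap_deps)


def rauw(instrs, old, new, update_metadata=True):
    """Toy replaceAllUsesWith over a flat instruction list."""
    for i in instrs:
        i.operands = [new if o == old else o for o in i.operands]
        if update_metadata:
            i.notrap_deps = [new if d == old else d for d in i.notrap_deps]


def is_constant(v):
    return isinstance(v, int)


def guard_survives(instr):
    # Once every dependency is a constant, nothing ties the claim
    # to the branch any more.
    return any(not is_constant(d) for d in instr.notrap_deps)
```

With `update_metadata=False` the dependency on %c is left alone, which is roughly what a non-updating handle would give.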

-Andy



Re: Add a 'notrap' function attribute?

Hal Finkel
In reply to this post by Andrew Trick
----- Original Message -----

>
>
>
> On Nov 1, 2013, at 4:48 AM, Hal Finkel < [hidden email] > wrote:
>
>
> 3. Represent the control dependencies explicitly in the metadata.
> Andy, Arnold (CC'd) and I have been discussing this in a
> slightly-different context, and briefly, this means adding all of
> the relevant conditional branch inputs to the metadata, and ensuring
> dominance before the metadata is respected. For example:
>
> if (i1 %c = call z_is_never_zero()) {
> %x = %y / %z !notrap !{ %c }
> ...
> }
>
> and so if we run across this situation:
>
> %x = %y / %z !notrap !{ %c }
> if (i1 %c = call z_is_never_zero()) {
> ...
> }
>
> we can test that the %c does not dominate %x, and so the metadata
> needs to be ignored. The complication here is that you may need to
> encode all conditional branch inputs along all paths from the entry
> to the value, and the scheme also needs to deal with maythrow
> functions.
>
>
>
> This does not need to be resolved to move forward with Pekka’s
> proposal, but since we’re talking about it...
>

*If* we're going to introduce metadata with implied control dependencies, then we need to figure this out. Otherwise, yes.

>
> - The control dependent metadata looks like it could work, I like the
> idea (although we’re lacking a strong motivation)
>
>
> - I’m not sure why divide-by-zero would motivate this (probably just
> missing something). LLVM doesn’t model it as a trap currently. And
> if it did, an explicit nonzero-divisor check would be easy to reason
> about without any metadata.

I think that the proposal is meant to be more general, also covering things like ensuring that loads/stores don't trap. Specifically being able to indicate that certain loads (and perhaps stores) are safe to speculatively execute is my primary interest in this.

>
>
> - The semantics should be that control dependent metadata is
> guaranteed if only the encoded conditions can be proven to hold,
> independent of surrounding control flow. So we would never use this
> if we needed to encode all branch conditions from entry to home
> block. e.g. only a single nonzero divisor check is sufficient.

Sounds good (I've not thought of a counter-example).

>
>
> - We would need to encode the sense of the condition (true or false).

Good point. More generally, it would need to encode which branch index we need (in the case of a switch instruction).

> The metadata is still valid if we can see that the condition
> controls a branch, and the corresponding branch target
> (non-critical-edge) dominates the operation. This is cool because
> optimizations can be totally oblivious, but the information will
> still be preserved most of the time.
>
>
>
>
>
> and the scheme also needs to deal with maythrow functions.

I think that you misunderstood what I meant. If we have:

 check_something();
 %x = load %ptr, !notrap

Then it could be that the load from %ptr will never trap because check_something() will throw in all cases where the load would trap. As a result, we also need to encode a dependency between all functions that may throw (or longjmp or whatever) and the notrap metadata.
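
A toy Python model of this situation (all names are made up; `Trap` stands in for a hardware trap and a dict stands in for memory). It shows that the load is safe only because the call ahead of it throws first, so hoisting the load above the call changes behavior.

```python
class Trap(Exception):
    """Stands in for a hardware trap on a bad memory access."""


def load(memory, ptr):
    if ptr not in memory:
        raise Trap(ptr)
    return memory[ptr]


def check_something(memory, ptr):
    # Throws in exactly the cases where the load would trap.
    if ptr not in memory:
        raise ValueError("bad pointer")


def original_order(memory, ptr):
    check_something(memory, ptr)
    return load(memory, ptr)  # never traps: the call above threw first


def load_hoisted(memory, ptr):
    x = load(memory, ptr)  # hoisted above the call: can now trap
    check_something(memory, ptr)
    return x
```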

Thanks again,
Hal

>
>
> I guess maythrow is implicitly the inverse of nounwind. notrap is
> similar but actually makes much more sense to me as an attribute. (I
> always thought that maythrow should be modeled with an invoke).
> longjmp isn’t clean but could be modeled as writing all memory and
> maybe-trapping.
>
>
> Adding a flag for every subtle behavior gets pedantic, as seen in
> this thread:
> http://llvm.1065342.n5.nabble.com/Does-nounwind-have-semantics-td59631.html
>
>
> The much more interesting question to me is what are the semantics of
> traps? Conservatively, we now assume they are well defined calls to
> abort(). I think that is way too conservative for most uses. It
> would be great to have another flavor of trap that can be reordered
> with certain side effects, particularly other “floating traps". Nuno
> Lopes ran into this problem with bounds check elimination. I don’t
> have a link to the discussion.
>
>
>
>
> Given that the common use case for this seems like it will be for
> some language frontend to add !notrap to *all* instances of some
> kind of instruction (divisions, load, etc.), I think that adding a
> new flag (like the nsw flag) may be more appropriate for efficiency
> reasons. Even easier, add some more fine-grained function attributes
> (as you had suggested).
>
>
>
> Good point. Seems like a future optimization though.
>
>
> Also, I think that being able to tag a memory access as no trapping
> could be a big win for C++ too, because we could tag all
> loads/stores that come from C++ reference types as not trapping.
> Because of the way that iterators are defined, I suspect this would
> have a lot of positive benefits in terms of LICM and other
> optimizations.
>
>
> I didn’t follow this, but obviously it would be a huge benefit for
> type-checked languages. Teaching optimizations about it would be
> nontrivial work though.
>
>
> -Andy

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Add a 'notrap' function attribute?

Andrew Trick

On Nov 1, 2013, at 3:13 PM, Hal Finkel <[hidden email]> wrote:

> I think that you misunderstood what I meant. If we have:
>
> check_something();
> %x = load %ptr, !notrap
>
> Then it could be that %ptr will never trap because check_something() will throw in all cases where the load will trap. As a result, we also need to encode a dependency between all functions that may throw (or longjmp or whatever) and the notrap metadata.

Oh, of course. I missed that. I don’t have any clever ideas. Brute force would be adding a metadata ID to the call.

We’re already talking about adding TBAA to calls though. It doesn’t solve the problem, but might handle some of the use-cases.

-Andy


Re: Add a 'notrap' function attribute?

Filip Pizlo
In reply to this post by Hal Finkel

On Nov 1, 2013, at 2:44 PM, Hal Finkel <[hidden email]> wrote:

----- Original Message -----



On Nov 1, 2013, at 4:48 AM, Hal Finkel < [hidden email] > wrote:



----- Original Message -----


Hi Nadav,

On 10/31/2013 08:53 PM, Nadav Rotem wrote:


data-parallel languages which have completely different semantics. In
OpenCL/Cuda you would want to vectorize the outermost loop, and the
language guarantees that it is safe to do so.

Yeah. This is the separate (old) discussion and not strictly related to
the problem at hand. Better if-conversion benefits more than OpenCL C
work-item loops.



[For reference, here's an email in the thread from spring. This discussion
led to the parallel loop metadata to mark the data-parallel loops:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-January/058710.html

The current status of this work is that there's now also effectively
loop interchange functionality in pocl so the inner (sequential) loops
in the OpenCL C kernels are interchanged with the implicit parallel
work-item (outer) loops when it's semantically legal. After this the
inner loop vectorizer can be used efficiently also for kernels with
sequential loops.]



A function attribute is one possible solution. Another solution would be to
use metadata. I think that we need to explore both solutions and estimate
their effect on the rest of the compiler. Can you estimate which parts of
the compiler would need to be changed in order to support this new piece
of information? We need to think about what happens when we merge or
hoist loads/stores. Will we need to review and change every single memory
optimization in the compiler?

The original idea was that if the function is marked notrap, it only
loosens the previous restrictions for the optimizations. Thus, if the
old code still assumes trapping semantics, it should still be safe (only
worse optimizations might result).

Anyways, this has at least one problem that I see: functions that have
the notrap attribute cannot be safely inlined into functions without that
attribute. Otherwise a function which has possibly been optimized with the
assumption of not trapping (and which speculates an instruction that might
trap) might again trap due to dropping the attribute (and the runtime not
knowing it has to switch off the trapping behavior). Thus, perhaps
notrap should simply always imply noinline to avoid this issue.

Another way is to add 'notrap' metadata to all possibly trapping
instructions. This should be safe and perhaps work across inlining,
but it requires more maintenance code and it might not work very
well in practice: the runtime might want to (or be able to) switch
the trapping semantics of e.g. the FP hardware on a function basis, not
per instruction. If that's not the case, the code generator has
to support the instructions separately, injecting instructions that
switch the trapping behavior on and off.

The metadata approach has the benefit that there can be optimizations,
unrelated to the input language, that intelligently prove whether a
particular instruction instance can trap or not. E.g., if it's known
from the code that the divisor of a division is never zero, one can set
this metadata on a single DIV instruction, perhaps helping later
optimizations.

IMHO, the attribute approach is easier and makes more sense in
this particular case where the trapping behavior is dictated
by the input language, but OTOH the metadata approach seems to go
better along how it has been done previously (fpmath) and might
open the door for separate non-language-specific optimizations.

The large complication that you end up with in a scheme like this is
maintaining control dependencies. For example:

if (z_is_never_zero()) {
x = y / z !notrap
...
}

the !notrap asserts that the division won't trap, which is good, but
also makes it safe to speculatively execute. That's the desired
effect, but not in this instance, because it will allow hoisting
outside of the current block:

x = y / z !notrap
if (z_is_never_zero()) {
...
}

and that obviously won't work correctly. This seems to leave us with
three options:

1. Add logic to all passes that might do this to prevent it (in which
case, we might as well add some new (subclass data) flags instead of
metadata).

2. Assert that !notrap cannot be used where its validity might be
affected by control dependencies.

3. Represent the control dependencies explicitly in the metadata.
Andy, Arnold (CC'd) and I have been discussing this in a
slightly-different context, and briefly, this means adding all of
the relevant conditional branch inputs to the metadata, and ensuring
dominance before the metadata is respected. For example:

if (i1 %c = call z_is_never_zero()) {
%x = %y / %z !notrap !{ %c }
...
}



Does this !{%c} reference to %c obey the same rules that other uses
of a value would obey in LLVM?

Only in the operational sense, not in the semantic sense. Operationally, the metadata node has a weak reference to the value.



If it does, then the following transformation would be trivially
valid:



if (i1 %c = call z_is_never_zero()) {
%x = %y / %z !notrap !{ 1 }
...
}


Because we can prove that %c must have the value 1 inside the then
case. But, now you have a !notrap !{1}, which means you can do:

%x = %y / %z !notrap !{ 1 }

if (i1 %c = call z_is_never_zero()) {
...
}


... and the world just broke. So clearly, the !{ %c } reference
cannot obey all of the same rules as other uses of a value would
obey in LLVM IR.

Correct.

Can you describe exactly what rules such a use of
%c would have? 

I don't think that there are any special rules, as such, but there are only a limited set of things that you can meaningfully do with it.

What would replaceAllUsesWith do for it?

I think that it is replaced with the new value.

I don't think this will work.  If it did, what about:

%c = load %p
if (%c) {
    %c2 = load %p
    if (%c2)
        %t = load %q !{ %c2 }
}

It might be strange, but it would certainly be fine to write an optimization that would realize that %c2 is 1 and call replaceAllUsesWith.  And then you end up with the breakage.
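
A toy Python model of that breakage, as a sketch only. The "honour the guard if it is a constant or its definition dominates the op" rule is an assumption made for illustration, and `Op`, `IDOM`, `DEF_BLOCK`, `naive_guard_ok` and `try_hoist` are made-up names, not LLVM APIs:

```python
from dataclasses import dataclass, field

# Immediate-dominator tree for a straight chain: entry -> then -> inner
IDOM = {"entry": None, "then": "entry", "inner": "then"}

def dominates(a, b):
    """True if block a dominates block b (a block dominates itself)."""
    while b is not None:
        if a == b:
            return True
        b = IDOM[b]
    return False

@dataclass
class Op:
    name: str
    block: str                                   # block holding the op
    guards: list = field(default_factory=list)   # the !{...} operands

DEF_BLOCK = {"%c2": "then"}                      # where each guard is defined

def replace_all_uses_with(ops, old, new):
    """Naive RAUW that rewrites metadata operands like any other use."""
    for op in ops:
        op.guards = [new if g == old else g for g in op.guards]

def naive_guard_ok(op):
    """Assumed rule: a guard is honoured if it is a constant, or if its
    defining block dominates the op. Constants 'dominate' everything."""
    return all(isinstance(g, int) or dominates(DEF_BLOCK[g], op.block)
               for g in op.guards)

def try_hoist(op, target):
    """Would the naive rule still honour the guards if op sat in target?
    The op itself is left where it is."""
    return naive_guard_ok(Op(op.name, target, list(op.guards)))
```

Before the RAUW, hoisting `%t` to `entry` fails the check because `%c2` is defined in `then`. After `replace_all_uses_with(ops, "%c2", 1)` the guard is the constant 1, the check passes anywhere, and the load looks hoistable above both branches.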


How should
other phases treat it? What can they do to it?


Other passes can ignore it (although hopefully passes that tend to hoist values, like LICM, will be taught to limit themselves based on this metadata). It can be used only for checking dominance for the purpose of ensuring metadata validity.

I guess I don't follow what it is about !{ %c } that controls when it is safe to execute a load.

- If it's that %c must dominate the load, then that's clearly wrong, since I could have:

%x = ...
if (%b) {
    %c = add %x, 1
    if (%c)
        %t = load %p !{ %c }
}

and then I could hoist %c over the branch on %b, and then the load would seem to be hoistable above the %b check, which could be wrong.

- If it's that %c must be true at the load, then you need to be careful to ensure that any transformation that moves %c or proves anything about it using reasoning about control flow also removes the meta-data.  This feels awkward.

-Filip



-Hal




and so if we run across this situation:

%x = %y / %z !notrap !{ %c }
if (i1 %c = call z_is_never_zero()) {
...
}

we can test that the %c does not dominate %x, and so the metadata
needs to be ignored. The complication here is that you may need to
encode all conditional branch inputs along all paths from the entry
to the value, and the scheme also needs to deal with maythrow
functions.
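
A minimal sketch of such a validity test, using the variant mentioned elsewhere in the thread where the branch target taken when the condition holds must dominate the operation. The diamond CFG and the names `IDOM`, `dominates` and `notrap_is_valid` are hypothetical, and the guarded edge is assumed not to be a critical edge:

```python
# Immediate dominators for a diamond:
#   entry -> then -> merge,  entry -> else -> merge
IDOM = {"entry": None, "then": "entry", "else": "entry", "merge": "entry"}

def dominates(a, b):
    """True if block a dominates block b (a block dominates itself)."""
    while b is not None:
        if a == b:
            return True
        b = IDOM[b]
    return False

def notrap_is_valid(guarded_target, op_block):
    """Respect !notrap only while the block reached when the condition
    holds still dominates the block containing the operation."""
    return dominates(guarded_target, op_block)
```

With the division in `then`, guarded by the branch into `then`, the metadata is respected. If some pass has moved the division to `entry` or to `merge`, the check fails and the division must be treated as possibly trapping again.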

Given that the common use case for this seems like it will be for
some language frontend to add !notrap to *all* instances of some
kind of instruction (divisions, load, etc.), I think that adding a
new flag (like the nsw flag) may be more appropriate for efficiency
reasons. Even easier, add some more fine-grained function attributes
(as you had suggested).

Also, I think that being able to tag a memory access as no trapping
could be a big win for C++ too, because we could tag all
loads/stores that come from C++ reference types as not trapping.
Because of the way that iterators are defined, I suspect this would
have a lot of positive benefits in terms of LICM and other
optimizations.

-Hal




BR,
--
--Pekka



--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



Re: Add a 'notrap' function attribute?

Pekka Jääskeläinen-3
In reply to this post by Nick Lewycky-2
On 11/01/2013 10:26 PM, Nick Lewycky wrote:

> FYI, see also the previous discussion about "speculatable":
>
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064426.html
>
> I think such an attribute should be added.
>
> In the thread which led up to that thread, I proposed using more
> fine-grained attributes and Michael rightly pointed out the problem with
> that: you'd need one for every possible form of undefined behaviour. You
> listed "nofptrap", "nodivtrap" and "nomemtrap", but you didn't say
> "nounreachabletrap". Whoops!

Thanks for the link. I'm not sure if you meant to merge 'speculatable'
with 'notrap' or just to take it into account when implementing optimizations
based on 'notrap'.

The notrap attribute(s) will not work as a replacement for 'speculatable'
when it comes to speculatively calling the function itself, but would work
for optimizations inside the function.

"Speculatable" implies also more of the semantic information of the
program's logic (e.g. the halting problem mentioned in the above thread).
Also, 'speculatable' does not imply 'notrap' because the function itself
might have instructions that should not be speculated separately. That is,
it won't trap if the original control dependencies are respected, but
might trap if some (illegal) speculative execution is done inside
the function (for example past tests for special cases in FP computation
or NULL ptr checks).

The connection here is that optimizations based on 'notrap' might
make a previously 'speculatable' function 'non-speculatable' unless the
'notrap' is respected (by the runtime so it disables exceptions). It
should be ok, the notrap should just not be dropped. This also indicates
that it should be an attribute, not MD.

On the question of whether there should be one or more fine-grained
'notrap' attributes: if there is only a single 'notrap', it would state
that none of the instructions inside the function cause traps regardless
of how they were speculatively executed. This can be true because of
the hardware (e.g. one that does not implement the exceptions in the
first place or does not have an MMU/mem protection) or if the exceptions
can be and are switched off before calling the function.

E.g., a load inside a NULL pointer check can be hoisted above
the check, etc. These are creepy-looking optimizations for memory-protected
environments that trap on illegal accesses. If there were more fine-grained
attributes, one could state the non-trapping property for just a subset of cases.
That is more future-proof, in the sense that a monolithic 'notrap' implies it is
possible to switch off any type of potential exception, which might not be the
case for some platforms, while the finer-granularity ones assume only partial support.

It should be emphasized that the 'no*trap' attributes would be a contract:
they do not state that the instructions are known not to trap,
but that inside the function there might be speculated instructions
that do trap in case the trap(s) are not switched off.
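
A rough analogy for that contract, sketched with Python's `decimal` module, whose contexts have per-signal trap enables. `kernel` and `call_notrap` are hypothetical names, and the shim only stands in for a runtime that switches the trapping mode before entering a function marked 'nodivtrap':

```python
import decimal

def kernel(y, z):
    # May divide by zero. Correct only if the caller honours the
    # contract and has switched the divide-by-zero trap off.
    return y / z

def call_notrap(fn, *args):
    """Hypothetical runtime shim: disable the trap, then call fn."""
    with decimal.localcontext() as ctx:
        ctx.traps[decimal.DivisionByZero] = False
        return fn(*args)
```

Called through the shim, `kernel(Decimal(1), Decimal(0))` yields `Decimal('Infinity')`. Called directly under a context where the trap is still enabled, the same division raises `decimal.DivisionByZero`, which is the situation the attribute is meant to rule out.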

--
Pekka

Re: Add a 'notrap' function attribute?

Philip Reames-4
In reply to this post by Andrew Trick
On 11/1/13 2:07 PM, Andrew Trick wrote:

> - I’m not sure why divide-by-zero would motivate this (probably just missing something). LLVM doesn’t model it as a trap currently. And if it did, an explicit nonzero-divisor check would be easy to reason about without any metadata.

This is far enough off topic to not be worth discussing now, but I want to throw out that there would be interested parties in defining trapping versions of LLVM instructions which currently use undef semantics for edge cases. We haven't dug into this much yet, but it is something we're probably going to be interested in long term. Our initial approach is to simply use explicit conditional guards to handle things like div-by-zero and null-dereference. Depending on the performance we see with that approach, we may be interested in implementing support for implicit checks, which would likely require explicit trapping semantics.

Philip



Re: Add a 'notrap' function attribute?

Nadav Rotem
In reply to this post by Nick Lewycky-2
Nick, 

I like the simplicity of the attribute approach. However, one of the problems of using the attribute approach is that you lose them when you inline the function.  I am not sure if this problem disqualifies this approach for the proposed uses or not. 

Thanks,
Nadav

On Nov 1, 2013, at 1:26 PM, Nick Lewycky <[hidden email]> wrote:

FYI, see also the previous discussion about "speculatable":

http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/064426.html

I think such an attribute should be added.

In the thread which led up to that thread, I proposed using more fine-grained attributes and Michael rightly pointed out the problem with that: you'd need one for every possible form of undefined behaviour. You listed "nofptrap", "nodivtrap" and "nomemtrap", but you didn't say "nounreachabletrap". Whoops!

Nick


On 31 October 2013 07:38, Pekka Jääskeläinen <[hidden email]> wrote:
Hello,

OpenCL C specifies that instructions should not trap (it is "discouraged"
in the specs). If they do, it's vendor-specific how the hardware
exceptions are handled.

It might be also the case with some other (future) languages targeting "streamlined" parallel accelerators in an heterogeneous setting.
At least CUDA comes to mind. What about OpenACC and the new OpenMP,
does someone know offhand?

It would help several optimizations if they could assume certain
instructions do not trap. E.g., I was looking at the if-conversion of
the loop vectorizer, and it seems to not support speculating stores,
divs, etc. which could be done if we knew it's safe to speculatively
execute them.

[In this particular if-conversion case proper predicated execution
(not speculative) would require predicates to be added for all LLVM
instructions so they could be squashed. I think this was discussed
several years ago in the context of a generic IR-level if-conversion
pass, but it seems such a thing did not realize eventually.]

Anyways, "speculative" if-conversion is just one example where knowing
that traps need not to be considered in the function at hand
would help the optimizations. Also other speculative code motion
optimizations, e.g., LICM, could benefit from it.

One way would be to introduce a new function attribute. Functions (e.g.,
OpenCL C or CUDA kernels) could be marked with an attribute that states
that the instructions can be assumed not to trap -- it's a programmer's or
the runtime's mistake if they do. The runtime should change the fp
computation mode to the non-trapping one before calling such
a function (this is actually stated in the OpenCL specs). If such
handling is not supported by the target, then the attribute should not
be added the first place.

The attribute could be called 'notrap' which would include the
semantics of any trap caused by any instruction.  Or that could be
split, just in case the hardware is known not to support one of the
features. Three could suffice: 'nofptrap' (no IEEE FP exceptions),
'nodivtrap' (no divide by zero exceptions, undef value output instead),
'nomemtrap' (no mem exceptions).

What do you think of the general idea? Or is there something similar
already that can accomplish this?

Thanks in advance,
--
Pekka