[llvm-dev] Proposed new min and max intrinsics

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
Hi everyone,

I have implemented a pair of new target-independent intrinsics and it would be great if I could get some feedback about whether we want to merge them in. The intrinsics are @llvm.minimum and @llvm.maximum, and are IEEE 754-2018 analogues to @llvm.minnum and @llvm.minnum. The differences are that the new intrinsics propagate NaNs and consider -0.0 to be strictly less than 0.0. The names are taken from the corresponding operations in the IEEE 754-2018 draft specification. You can find the patches implementing the intrinsics in the stack starting at https://reviews.llvm.org/D52764.

The context for this proposed change is that the WebAssembly backend needs these intrinsics to make its min and max instructions available at the IR level, since they follow the draft IEEE 754-2018 semantics. Originally I was thinking of implementing the intrinsics as @wasm.min and @wasm.max, but since these operations are in the draft standard, I thought they might be more appropriate as target-independent. If the community does not want to merge these intrinsics now, I can always go back to making them wasm-specific.

The way I have currently implemented the intrinsics, they lower directly to the fminnan and fmaxnan ISel DAG nodes, which already existed. As far as I can tell, these DAG nodes are only used in the ARM, AArch64, SystemZ, and WebAssembly backends and lower to instructions that follow the draft IEEE 754-2018 semantics, so these targets do not need to take any action to support the new intrinsics. Before this change, the only way an ISel DAG could contain fminnan or fmaxnan nodes was if they were produced by a select pattern. The select pattern matching code does not produce mins or maxes if it would be possible that both arguments could be zeros, so tightening the semantics of fminnan and fmaxnan to match IEEE 754-2018 will not invalidate any existing code.

Best,

Thomas Lively

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
I just wanted to bump this to see if anyone has any input. I would really like to get these landed soon if there are no objections. 

On Tue, Oct 2, 2018 at 10:59 AM Thomas Lively <[hidden email]> wrote:
Hi everyone,

I have implemented a pair of new target-independent intrinsics and it would be great if I could get some feedback about whether we want to merge them in. The intrinsics are @llvm.minimum and @llvm.maximum, and are IEEE 754-2018 analogues to @llvm.minnum and @llvm.minnum. The differences are that the new intrinsics propagate NaNs and consider -0.0 to be strictly less than 0.0. The names are taken from the corresponding operations in the IEEE 754-2018 draft specification. You can find the patches implementing the intrinsics in the stack starting at https://reviews.llvm.org/D52764.

The context for this proposed change is that the WebAssembly backend needs these intrinsics to make its min and max instructions available at the IR level, since they follow the draft IEEE 754-2018 semantics. Originally I was thinking of implementing the intrinsics as @wasm.min and @wasm.max, but since these operations are in the draft standard, I thought they might be more appropriate as target-independent. If the community does not want to merge these intrinsics now, I can always go back to making them wasm-specific.

The way I have currently implemented the intrinsics, they lower directly to the fminnan and fmaxnan ISel DAG nodes, which already existed. As far as I can tell, these DAG nodes are only used in the ARM, AArch64, SystemZ, and WebAssembly backends and lower to instructions that follow the draft IEEE 754-2018 semantics, so these targets do not need to take any action to support the new intrinsics. Before this change, the only way an ISel DAG could contain fminnan or fmaxnan nodes was if they were produced by a select pattern. The select pattern matching code does not produce mins or maxes if it would be possible that both arguments could be zeros, so tightening the semantics of fminnan and fmaxnan to match IEEE 754-2018 will not invalidate any existing code.

Best,

Thomas Lively

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
On Thu, 11 Oct 2018 at 00:28, Thomas Lively via llvm-dev
<[hidden email]> wrote:
>
> I just wanted to bump this to see if anyone has any input. I would really like to get these landed soon if there are no objections.

Hi Thomas,

With ISD::FMINNAN and ISD::FMAXNAN now easy to produce for any target
due to these newly exposed intrinsics, I think these nodes should be
handled in at least SelectionDAGLegalize::ExpandNode (for when the
float type is legal but the operation is not) and
DAGTypeLegalizer::SoftenFloatResult (for when the float type is not
legal).

Best,

Alex
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
Sounds good, I'll take a look.

On Thu, Nov 1, 2018 at 9:45 AM Alex Bradbury <[hidden email]> wrote:
On Thu, 11 Oct 2018 at 00:28, Thomas Lively via llvm-dev
<[hidden email]> wrote:
>
> I just wanted to bump this to see if anyone has any input. I would really like to get these landed soon if there are no objections.

Hi Thomas,

With ISD::FMINNAN and ISD::FMAXNAN now easy to produce for any target
due to these newly exposed intrinsics, I think these nodes should be
handled in at least SelectionDAGLegalize::ExpandNode (for when the
float type is legal but the operation is not) and
DAGTypeLegalizer::SoftenFloatResult (for when the float type is not
legal).

Best,

Alex

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
Alex,

After looking into this a bit, it looks to me like the best thing to do for targets that do not natively support ISD::MINIMUM and ISD::MAXIMUM would be to fall back to a libcall, since implementing these operations in terms of existing operations is actually rather complicated. Do you think it would make sense to add builtin functions to compiler-rt to implement these operations, or is there a better way of handling this?

Thanks,

Thomas

On Thu, Nov 1, 2018 at 11:49 AM Thomas Lively <[hidden email]> wrote:
Sounds good, I'll take a look.

On Thu, Nov 1, 2018 at 9:45 AM Alex Bradbury <[hidden email]> wrote:
On Thu, 11 Oct 2018 at 00:28, Thomas Lively via llvm-dev
<[hidden email]> wrote:
>
> I just wanted to bump this to see if anyone has any input. I would really like to get these landed soon if there are no objections.

Hi Thomas,

With ISD::FMINNAN and ISD::FMAXNAN now easy to produce for any target
due to these newly exposed intrinsics, I think these nodes should be
handled in at least SelectionDAGLegalize::ExpandNode (for when the
float type is legal but the operation is not) and
DAGTypeLegalizer::SoftenFloatResult (for when the float type is not
legal).

Best,

Alex

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
What is so complicated about these? Shouldn't they just correspond to
two compares + selects?

To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
respectively, correspond exactly to

   MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
   MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")

IIRC, MINIMUM and MAXIMUM have the added requirement that they should
return NaN if _either_ input is NaN, whereas the above will return NaN
if the second input (i.e. b) is NaN, but not if the first is.

So we need to explicitly catch the case where a is NaN as well. For
minimum, that works out to something like:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %a, %a ; true if !isNaN(a)
   %6 = select i1 %5, float %4, float %a ; if a was NaN, return a

for the entire operation. The logic here is that if isNaN(a) ||
isNaN(b), the initial comparison will evaluate to false and %4 ends up
being b. If isNaN(b), this is a NaN value (as required). The case we are
missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN; the second
compare + select fixes that one up.

The first pair of these corresponds to a single (x86-style) MIN/MAX, and
the second turns into a compare followed by (depending on target
instruction set) either a BLEND or some logic ops.

For minimumNumber/maximumNumber, you should be able to use a similar
construction. Showing the example for minimumNumber here:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %b, %b ; true if !isNaN(b)
   %6 = select i1 %5, float %4, float %a ; if b was NaN, return a

Starts out the same as before. Here, the tricky case is !isNaN(a) &&
isNaN(b). The initial select %4 will result in the (NaN) b in that case,
and the second compare/select pair switches the result to a instead; we
will only get a NaN result if both inputs were NaN.

I might be missing something here, but that seems like a fairly harmless
expansion, as such things go, and going to compiler-rt feels like overkill.

-Fabian

On 11/7/2018 8:12 PM, Thomas Lively via llvm-dev wrote:

> Alex,
>
> After looking into this a bit, it looks to me like the best thing to do
> for targets that do not natively support ISD::MINIMUM and ISD::MAXIMUM
> would be to fall back to a libcall, since implementing these operations
> in terms of existing operations is actually rather complicated. Do you
> think it would make sense to add builtin functions to compiler-rt to
> implement these operations, or is there a better way of handling this?
>
> Thanks,
>
> Thomas
>
> On Thu, Nov 1, 2018 at 11:49 AM Thomas Lively <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Sounds good, I'll take a look.
>
>     On Thu, Nov 1, 2018 at 9:45 AM Alex Bradbury <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         On Thu, 11 Oct 2018 at 00:28, Thomas Lively via llvm-dev
>         <[hidden email] <mailto:[hidden email]>> wrote:
>          >
>          > I just wanted to bump this to see if anyone has any input. I
>         would really like to get these landed soon if there are no
>         objections.
>
>         Hi Thomas,
>
>         With ISD::FMINNAN and ISD::FMAXNAN now easy to produce for any
>         target
>         due to these newly exposed intrinsics, I think these nodes should be
>         handled in at least SelectionDAGLegalize::ExpandNode (for when the
>         float type is legal but the operation is not) and
>         DAGTypeLegalizer::SoftenFloatResult (for when the float type is not
>         legal).
>
>         Best,
>
>         Alex
>
>
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev <[hidden email]> wrote:
What is so complicated about these? Shouldn't they just correspond to
two compares + selects?

To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
respectively, correspond exactly to

   MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
   MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")

IIRC, MINIMUM and MAXIMUM have the added requirement that they should
return NaN if _either_ input is NaN, whereas the above will return NaN
if the second input (i.e. b) is NaN, but not if the first is.

So we need to explicitly catch the case where a is NaN as well. For
minimum, that works out to something like:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %a, %a ; true if !isNaN(a)
   %6 = select i1 %5, float %4, float %a ; if a was NaN, return a

for the entire operation. The logic here is that if isNaN(a) ||
isNaN(b), the initial comparison will evaluate to false and %4 ends up
being b. If isNaN(b), this is a NaN value (as required). The case we are
missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN; the second
compare + select fixes that one up.

The first pair of these corresponds to a single (x86-style) MIN/MAX, and
the second turns into a compare followed by (depending on target
instruction set) either a BLEND or some logic ops.

For minimumNumber/maximumNumber, you should be able to use a similar
construction. Showing the example for minimumNumber here:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %b, %b ; true if !isNaN(b)
   %6 = select i1 %5, float %4, float %a ; if b was NaN, return a

Starts out the same as before. Here, the tricky case is !isNaN(a) &&
isNaN(b). The initial select %4 will result in the (NaN) b in that case,
and the second compare/select pair switches the result to a instead; we
will only get a NaN result if both inputs were NaN.

I might be missing something here, but that seems like a fairly harmless
expansion, as such things go, and going to compiler-rt feels like overkill.

+1 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
I would agree, but the expansion also has to properly treat negative zero as less than zero, which leads to something like the following:

;; Find minimum if both zero
%a_sign = fgetsign %a
%b_sign = fgetsign %b
%a_is_lesser_zero = icmp ugt %a_sign, %b_sign
%minimum_zeros = select %a_is_lesser_zero, %a, %b

;; Find minimum if not both zero (from above)
%no_nan_and_a_lesser = fcmp olt %a, %b
%lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
%a_not_nan = fcmp ord %a, %a
%minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a

;; Choose between zeros and not-zeros
%a_is_zero = fcmp oeq %a, 0
%b_is_zero = fcmp oeq %b, 0
%both_zero = and %a_is_zero, %b_is_zero
%minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros

Which is considerably less reasonable.



On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev <[hidden email]> wrote:
On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev <[hidden email]> wrote:
What is so complicated about these? Shouldn't they just correspond to
two compares + selects?

To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
respectively, correspond exactly to

   MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
   MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")

IIRC, MINIMUM and MAXIMUM have the added requirement that they should
return NaN if _either_ input is NaN, whereas the above will return NaN
if the second input (i.e. b) is NaN, but not if the first is.

So we need to explicitly catch the case where a is NaN as well. For
minimum, that works out to something like:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %a, %a ; true if !isNaN(a)
   %6 = select i1 %5, float %4, float %a ; if a was NaN, return a

for the entire operation. The logic here is that if isNaN(a) ||
isNaN(b), the initial comparison will evaluate to false and %4 ends up
being b. If isNaN(b), this is a NaN value (as required). The case we are
missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN; the second
compare + select fixes that one up.

The first pair of these corresponds to a single (x86-style) MIN/MAX, and
the second turns into a compare followed by (depending on target
instruction set) either a BLEND or some logic ops.

For minimumNumber/maximumNumber, you should be able to use a similar
construction. Showing the example for minimumNumber here:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %b, %b ; true if !isNaN(b)
   %6 = select i1 %5, float %4, float %a ; if b was NaN, return a

Starts out the same as before. Here, the tricky case is !isNaN(a) &&
isNaN(b). The initial select %4 will result in the (NaN) b in that case,
and the second compare/select pair switches the result to a instead; we
will only get a NaN result if both inputs were NaN.

I might be missing something here, but that seems like a fairly harmless
expansion, as such things go, and going to compiler-rt feels like overkill.

+1 
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
On Fri, Nov 9, 2018 at 1:57 PM Thomas Lively <[hidden email]> wrote:
I would agree, but the expansion also has to properly treat negative zero as less than zero,

I don't think libm does this. I'm also under the impression that IEEE-754 doesn't specify this either. Could you point me to the clause that does specify it?
 
which leads to something like the following:

;; Find minimum if both zero
%a_sign = fgetsign %a
%b_sign = fgetsign %b
%a_is_lesser_zero = icmp ugt %a_sign, %b_sign
%minimum_zeros = select %a_is_lesser_zero, %a, %b

;; Find minimum if not both zero (from above)
%no_nan_and_a_lesser = fcmp olt %a, %b
%lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
%a_not_nan = fcmp ord %a, %a
%minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a

;; Choose between zeros and not-zeros
%a_is_zero = fcmp oeq %a, 0
%b_is_zero = fcmp oeq %b, 0
%both_zero = and %a_is_zero, %b_is_zero
%minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros

Which is considerably less reasonable.



On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev <[hidden email]> wrote:
On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev <[hidden email]> wrote:
What is so complicated about these? Shouldn't they just correspond to
two compares + selects?

To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
respectively, correspond exactly to

   MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
   MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")

IIRC, MINIMUM and MAXIMUM have the added requirement that they should
return NaN if _either_ input is NaN, whereas the above will return NaN
if the second input (i.e. b) is NaN, but not if the first is.

So we need to explicitly catch the case where a is NaN as well. For
minimum, that works out to something like:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %a, %a ; true if !isNaN(a)
   %6 = select i1 %5, float %4, float %a ; if a was NaN, return a

for the entire operation. The logic here is that if isNaN(a) ||
isNaN(b), the initial comparison will evaluate to false and %4 ends up
being b. If isNaN(b), this is a NaN value (as required). The case we are
missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN; the second
compare + select fixes that one up.

The first pair of these corresponds to a single (x86-style) MIN/MAX, and
the second turns into a compare followed by (depending on target
instruction set) either a BLEND or some logic ops.

For minimumNumber/maximumNumber, you should be able to use a similar
construction. Showing the example for minimumNumber here:

   %3 = fcmp olt float %a, %b
   %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
   %5 = fcmp ord float %b, %b ; true if !isNaN(b)
   %6 = select i1 %5, float %4, float %a ; if b was NaN, return a

Starts out the same as before. Here, the tricky case is !isNaN(a) &&
isNaN(b). The initial select %4 will result in the (NaN) b in that case,
and the second compare/select pair switches the result to a instead; we
will only get a NaN result if both inputs were NaN.

I might be missing something here, but that seems like a fairly harmless
expansion, as such things go, and going to compiler-rt feels like overkill.

+1 
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
In reply to this post by David Jones via llvm-dev
Ah, forget about the +-0 issue!

Slightly different way to handle them: XOR the two arguments, AND to
keep only the sign bit, then OR it back into the result at the end (for
minimum; maximum would do a bit clear/and-not).

The argument goes as follows:
1) If the two sign bits agree, this does nothing
2) If the two sign bits disagree:
   2.1) If the comparison was unordered, we already return a NaN; this
makes us return a NaN with sign bit set. (But still a NaN. I don't see
anything in the rules for minimum/maximum guaranteeing a specific NaN.)
   2.2) If the comparison was ordered and not between two 0s, the value
with the sign bit set is the smaller one, so the OR ends up doing nothing.
   2.3) If the comparison was ordered and between two 0s, the OR
guarantees we return a negative 0.

If you write this out with the required bitcasts it probably still gets
lengthy in LLVM IR form (so might still want to put it in compiler-rt),
but this seems like a preferable implementation provided the target lets
you do the required bitwise ops on FP registers.

-Fabian

On 11/9/18 10:56 AM, Thomas Lively wrote:

> I would agree, but the expansion also has to properly treat negative
> zero as less than zero, which leads to something like the following:
>
> ;; Find minimum if both zero
> %a_sign = fgetsign %a
> %b_sign = fgetsign %b
> %a_is_lesser_zero = icmp ugt %a_sign, %b_sign
> %minimum_zeros = select %a_is_lesser_zero, %a, %b
>
> ;; Find minimum if not both zero (from above)
> %no_nan_and_a_lesser = fcmp olt %a, %b
> %lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
> %a_not_nan = fcmp ord %a, %a
> %minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a
>
> ;; Choose between zeros and not-zeros
> %a_is_zero = fcmp oeq %a, 0
> %b_is_zero = fcmp oeq %b, 0
> %both_zero = and %a_is_zero, %b_is_zero
> %minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros
>
> Which is considerably less reasonable.
>
>
>
> On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev
>     <[hidden email] <mailto:[hidden email]>> wrote:
>
>         What is so complicated about these? Shouldn't they just
>         correspond to
>         two compares + selects?
>
>         To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
>         respectively, correspond exactly to
>
>            MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
>            MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")
>
>         IIRC, MINIMUM and MAXIMUM have the added requirement that they
>         should
>         return NaN if _either_ input is NaN, whereas the above will
>         return NaN
>         if the second input (i.e. b) is NaN, but not if the first is.
>
>         So we need to explicitly catch the case where a is NaN as
>         well. For
>         minimum, that works out to something like:
>
>            %3 = fcmp olt float %a, %b
>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>            %5 = fcmp ord float %a, %a ; true if !isNaN(a)
>            %6 = select i1 %5, float %4, float %a ; if a was NaN, return a
>
>         for the entire operation. The logic here is that if isNaN(a) ||
>         isNaN(b), the initial comparison will evaluate to false and %4
>         ends up
>         being b. If isNaN(b), this is a NaN value (as required). The
>         case we are
>         missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN;
>         the second
>         compare + select fixes that one up.
>
>         The first pair of these corresponds to a single (x86-style)
>         MIN/MAX, and
>         the second turns into a compare followed by (depending on target
>         instruction set) either a BLEND or some logic ops.
>
>         For minimumNumber/maximumNumber, you should be able to use a
>         similar
>         construction. Showing the example for minimumNumber here:
>
>            %3 = fcmp olt float %a, %b
>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>            %5 = fcmp ord float %b, %b ; true if !isNaN(b)
>            %6 = select i1 %5, float %4, float %a ; if b was NaN, return a
>
>         Starts out the same as before. Here, the tricky case is
>         !isNaN(a) &&
>         isNaN(b). The initial select %4 will result in the (NaN) b in
>         that case,
>         and the second compare/select pair switches the result to a
>         instead; we
>         will only get a NaN result if both inputs were NaN.
>
>         I might be missing something here, but that seems like a
>         fairly harmless
>         expansion, as such things go, and going to compiler-rt feels
>         like overkill.
>
>
>     +1
>     _______________________________________________
>     LLVM Developers mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
Unfortunately the draft spec in section 6.2.3 does specify that the output NaN should be one of the input NaNs, and this property is important for WebAssembly as well. It's good to know that the bar for unreasonable expansions is higher than I would have expected.

On Fri, Nov 9, 2018 at 2:22 PM Fabian Giesen <[hidden email]> wrote:
Ah, forget about the +-0 issue!

Slightly different way to handle them: XOR the two arguments, AND to
keep only the sign bit, then OR it back into the result at the end (for
minimum; maximum would do a bit clear/and-not).

The argument goes as follows:
1) If the two sign bits agree, this does nothing
2) If the two sign bits disagree:
   2.1) If the comparison was unordered, we already return a NaN; this
makes us return a NaN with sign bit set. (But still a NaN. I don't see
anything in the rules for minimum/maximum guaranteeing a specific NaN.)
   2.2) If the comparison was ordered and not between two 0s, the value
with the sign bit set is the smaller one, so the OR ends up doing nothing.
   2.3) If the comparison was ordered and between two 0s, the OR
guarantees we return a negative 0.

If you write this out with the required bitcasts it probably still gets
lengthy in LLVM IR form (so might still want to put it in compiler-rt),
but this seems like a preferable implementation provided the target lets
you do the required bitwise ops on FP registers.

-Fabian

On 11/9/18 10:56 AM, Thomas Lively wrote:

> I would agree, but the expansion also has to properly treat negative
> zero as less than zero, which leads to something like the following:
>
> ;; Find minimum if both zero
> %a_sign = fgetsign %a
> %b_sign = fgetsign %b
> %a_is_lesser_zero = icmp ugt %a_sign, %b_sign
> %minimum_zeros = select %a_is_lesser_zero, %a, %b
>
> ;; Find minimum if not both zero (from above)
> %no_nan_and_a_lesser = fcmp olt %a, %b
> %lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
> %a_not_nan = fcmp ord %a, %a
> %minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a
>
> ;; Choose between zeros and not-zeros
> %a_is_zero = fcmp oeq %a, 0
> %b_is_zero = fcmp oeq %b, 0
> %both_zero = and %a_is_zero, %b_is_zero
> %minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros
>
> Which is considerably less reasonable.
>
>
>
> On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev
>     <[hidden email] <mailto:[hidden email]>> wrote:
>
>         What is so complicated about these? Shouldn't they just
>         correspond to
>         two compares + selects?
>
>         To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
>         respectively, correspond exactly to
>
>            MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
>            MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")
>
>         IIRC, MINIMUM and MAXIMUM have the added requirement that they
>         should
>         return NaN if _either_ input is NaN, whereas the above will
>         return NaN
>         if the second input (i.e. b) is NaN, but not if the first is.
>
>         So we need to explicitly catch the case where a is NaN as
>         well. For
>         minimum, that works out to something like:
>
>            %3 = fcmp olt float %a, %b
>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>            %5 = fcmp ord float %a, %a ; true if !isNaN(a)
>            %6 = select i1 %5, float %4, float %a ; if a was NaN, return a
>
>         for the entire operation. The logic here is that if isNaN(a) ||
>         isNaN(b), the initial comparison will evaluate to false and %4
>         ends up
>         being b. If isNaN(b), this is a NaN value (as required). The
>         case we are
>         missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN;
>         the second
>         compare + select fixes that one up.
>
>         The first pair of these corresponds to a single (x86-style)
>         MIN/MAX, and
>         the second turns into a compare followed by (depending on target
>         instruction set) either a BLEND or some logic ops.
>
>         For minimumNumber/maximumNumber, you should be able to use a
>         similar
>         construction. Showing the example for minimumNumber here:
>
>            %3 = fcmp olt float %a, %b
>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>            %5 = fcmp ord float %b, %b ; true if !isNaN(b)
>            %6 = select i1 %5, float %4, float %a ; if b was NaN, return a
>
>         Starts out the same as before. Here, the tricky case is
>         !isNaN(a) &&
>         isNaN(b). The initial select %4 will result in the (NaN) b in
>         that case,
>         and the second compare/select pair switches the result to a
>         instead; we
>         will only get a NaN result if both inputs were NaN.
>
>         I might be missing something here, but that seems like a
>         fairly harmless
>         expansion, as such things go, and going to compiler-rt feels
>         like overkill.
>
>
>     +1
>     _______________________________________________
>     LLVM Developers mailing list
>     [hidden email] <mailto:[hidden email]>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Proposed new min and max intrinsics

David Jones via llvm-dev
In reply to this post by David Jones via llvm-dev
One more idea: (if this is getting spammy, tell me and I'll stop!)

If %a and %b are ordered equal, return %a | %b (gives -0.0 if either is
-0.0; for maximum, it would be %a & %b, only returning -0.0 if both are
-0.0). The only two distinct bit patterns that compare ordered equal
should be -0 and +0, which are identical everywhere except for the sign bit.

Gets a bit bit-casty but not that bad:

   ;; Work out result for when %a == %b (ordered)
   %a_bits = bitcast float %a to i32
   %b_bits = bitcast float %b to i32
   %a_bitor_b_bits = or i32 %a_bits, %b_bits
   %a_bitor_b = bitcast i32 %a_bitor_b_bits to float

   ;; Find minimum if not both zero (from above)
   %no_nan_and_a_lesser = fcmp olt %a, %b
   %lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
   %a_not_nan = fcmp ord %a, %a
   %minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a

   ;; Handle +-0
   %a_and_b_ordered_equal = fcmp oeq %a, %b
   %final_minimum = select %a_and_b_ordered_equal, %a_bitor_b,
%minimum_not_zeros

For x86-64, we can do a bit better by combining the OR with the final
select, which I haven't tried to express in LLVM IR:

   ; input: %a in xmm0, %b in xmm1
   movaps   xmm2, xmm0
   cmpeqss  xmm2, xmm1 ; %a_or_b_ordered_equal
   andps    xmm2, xmm0 ; %a_or_b_ordered_equal & %a

   movaps   xmm3, xmm0
   minss    xmm3, xmm1 ; %lesser_or_b_if_nan
   movaps   xmm4, xmm0
   cmpordss xmm0, xmm0 ; %a_not_nan
   blendvps xmm3, xmm4 ; %minimum_not_zeros
   ; (blendvps turns into andps / andnps / orps pre-SSE4.1)

   orps     xmm2, xmm3 ; %final_minimum

(if %a_or_b_ordered_equal, then the result of %minimum_not_zeros is
guaranteed to be %b, hence we only need to conditionally OR in %a in
that case.)

For the maximum case, you would use "cmpneqss", "orps xmm2, xmm0", then
"andps xmm2, xmm3" for the final combine.

-Fabian

On 11/9/2018 2:22 PM, Fabian Giesen wrote:

> Ah, forget about the +-0 issue!
>
> Slightly different way to handle them: XOR the two arguments, AND to
> keep only the sign bit, then OR it back into the result at the end (for
> minimum; maximum would do a bit clear/and-not).
>
> The argument goes as follows:
> 1) If the two sign bits agree, this does nothing
> 2) If the two sign bits disagree:
>    2.1) If the comparison was unordered, we already return a NaN; this
> makes us return a NaN with sign bit set. (But still a NaN. I don't see
> anything in the rules for minimum/maximum guaranteeing a specific NaN.)
>    2.2) If the comparison was ordered and not between two 0s, the value
> with the sign bit set is the smaller one, so the OR ends up doing nothing.
>    2.3) If the comparison was ordered and between two 0s, the OR
> guarantees we return a negative 0.
>
> If you write this out with the required bitcasts it probably still gets
> lengthy in LLVM IR form (so might still want to put it in compiler-rt),
> but this seems like a preferable implementation provided the target lets
> you do the required bitwise ops on FP registers.
>
> -Fabian
>
> On 11/9/18 10:56 AM, Thomas Lively wrote:
>
>> I would agree, but the expansion also has to properly treat negative
>> zero as less than zero, which leads to something like the following:
>>
>> ;; Find minimum if both zero
>> %a_sign = fgetsign %a
>> %b_sign = fgetsign %b
>> %a_is_lesser_zero = icmp ugt %a_sign, %b_sign
>> %minimum_zeros = select %a_is_lesser_zero, %a, %b
>>
>> ;; Find minimum if not both zero (from above)
>> %no_nan_and_a_lesser = fcmp olt %a, %b
>> %lesser_or_b_if_nan = select %no_nan_and_a_lesser, %a, %b
>> %a_not_nan = fcmp ord %a, %a
>> %minimum_not_zeros = select %a_not_nan, %lesser_or_b_if_nan, %a
>>
>> ;; Choose between zeros and not-zeros
>> %a_is_zero = fcmp oeq %a, 0
>> %b_is_zero = fcmp oeq %b, 0
>> %both_zero = and %a_is_zero, %b_is_zero
>> %minimum = select %both_zero, %minimum_zeros, %minimum_not_zeros
>>
>> Which is considerably less reasonable.
>>
>>
>>
>> On Fri, Nov 9, 2018 at 7:00 AM Cameron McInally via llvm-dev
>> <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>     On Thu, Nov 8, 2018 at 11:35 PM Fabian Giesen via llvm-dev
>>     <[hidden email] <mailto:[hidden email]>> wrote:
>>
>>         What is so complicated about these? Shouldn't they just
>>         correspond to
>>         two compares + selects?
>>
>>         To give a concrete example, x86 MIN[SP][SD] and MAX[SP][SD],
>>         respectively, correspond exactly to
>>
>>            MIN*: select(a < b, a, b) (i.e. "a < b ? a : b")
>>            MAX*: select(a > b, a, b) (i.e. "a > b ? a : b")
>>
>>         IIRC, MINIMUM and MAXIMUM have the added requirement that they
>>         should
>>         return NaN if _either_ input is NaN, whereas the above will
>>         return NaN
>>         if the second input (i.e. b) is NaN, but not if the first is.
>>
>>         So we need to explicitly catch the case where a is NaN as
>>         well. For
>>         minimum, that works out to something like:
>>
>>            %3 = fcmp olt float %a, %b
>>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>>            %5 = fcmp ord float %a, %a ; true if !isNaN(a)
>>            %6 = select i1 %5, float %4, float %a ; if a was NaN, return a
>>
>>         for the entire operation. The logic here is that if isNaN(a) ||
>>         isNaN(b), the initial comparison will evaluate to false and %4
>>         ends up
>>         being b. If isNaN(b), this is a NaN value (as required). The
>>         case we are
>>         missing is when isNaN(a) && !isNaN(b), where %4 is not a NaN;
>>         the second
>>         compare + select fixes that one up.
>>
>>         The first pair of these corresponds to a single (x86-style)
>>         MIN/MAX, and
>>         the second turns into a compare followed by (depending on target
>>         instruction set) either a BLEND or some logic ops.
>>
>>         For minimumNumber/maximumNumber, you should be able to use a
>>         similar
>>         construction. Showing the example for minimumNumber here:
>>
>>            %3 = fcmp olt float %a, %b
>>            %4 = select i1 %3, float %a, float %b ; (a < b) ? a : b
>>            %5 = fcmp ord float %b, %b ; true if !isNaN(b)
>>            %6 = select i1 %5, float %4, float %a ; if b was NaN, return a
>>
>>         Starts out the same as before. Here, the tricky case is
>>         !isNaN(a) &&
>>         isNaN(b). The initial select %4 will result in the (NaN) b in
>>         that case,
>>         and the second compare/select pair switches the result to a
>>         instead; we
>>         will only get a NaN result if both inputs were NaN.
>>
>>         I might be missing something here, but that seems like a
>>         fairly harmless
>>         expansion, as such things go, and going to compiler-rt feels
>>         like overkill.
>>
>>
>>     +1
>>     _______________________________________________
>>     LLVM Developers mailing list
>>     [hidden email] <mailto:[hidden email]>
>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev