[llvm-dev] [FP] Constant folding math library functions


[llvm-dev] [FP] Constant folding math library functions

Kaylor, Andrew via llvm-dev

Hi everyone,

 

I noticed today that LLVM’s constant folding of math library functions can lead to minor differences in results. A colleague sent me the following test case which demonstrates the issue:

 

#include <stdio.h>
#include <math.h>

typedef union {
  double d;
  unsigned long long i;
} my_dbl;

int main(void) {
  my_dbl res, x;
  x.i = 0x3feeb39556255de2ull;
  res.d = tanh(x.d);
  printf("tanh(%f) = %f = %016LX\n", x.d, res.d, res.i);
  return 0;
}

 

Compiling with “clang -O2 -g0 -emit-llvm” I get this:

 

define dso_local i32 @main() local_unnamed_addr #0 {
  %1 = tail call double @tanh(double 0x3FEEB39556255DE2) #2
  %2 = tail call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([24 x i8], [24 x i8]* @.str, i64 0, i64 0),
                                        double 0x3FEEB39556255DE2, double 0x3FE7CF009CE7F169,
                                        i64 4604876745549017449)
  ret i32 0
}

 

We’re still calling ‘tanh’, but all of the values passed to printf have been constant folded. The folded value comes from a call to tanh made by the compiler itself, using its own host math library. The problem is that if my program links against a different version of the math library than the one the compiler used, I may get a different result.

 

I can prevent this constant folding with either the ‘nobuiltin’ or ‘strictfp’ attribute. However, it seems to me like this optimization should really be checking the ‘afn’ fast math flag.

 

Opinions?

 

Thanks,

Andy

 



Re: [llvm-dev] [FP] Constant folding math library functions

Finkel, Hal J. via llvm-dev
Hi, Andy,

This is somewhat tricky. 'afn' is for approximate functions, to "allow substitution of approximate calculations for functions", but in this case the answers aren't any more approximate than the original function calls. Different, but likely no less accurate. This has long caused these kinds of subtle differences when cross compiling, etc., but it's not clear what the best thing to do actually is. Users often want the constant folding, and I've certainly seen code where the performance depends critically on it, and yet the compiler will likely never be able to exactly replicate the behavior of whatever libm implementation is used at runtime. Maybe having a dedicated flag to disable just this behavior, aside from suggesting that users use -fno-builtin=..., would be useful for users who depend on the compiler not folding these kinds of expressions in ways that might differ from their runtime libm behavior?
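
For the test case above, something along these lines should keep the call from being folded at the source level (a sketch only; the spelling below assumes clang's per-function -fno-builtin-<name> form, and how faithfully the optimizer honors it can vary by version):

clang -O2 -g0 -S -emit-llvm -fno-builtin-tanh test.c

With tanh no longer treated as a builtin, the arguments to printf stay unfolded and the result comes from whatever libm the program links against at runtime.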

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [llvm-dev] [FP] Constant folding math library functions

Kaylor, Andrew via llvm-dev

Thanks, Hal.

 

I hear what you are saying about the accuracy. The problem, from my perspective, is trying to explain to users what they are going to get. The constant folding may be as accurate as the lib call would have been, but it isn’t necessarily value safe. I’ve been operating on the assumption that LLVM’s FP optimizations are value safe unless fast math flags are used. For the most part that appears to be true. This case breaks my assumption.

 

I realize that any call to a library function puts claims of value safety on shaky ground, but the standard I’m going for is that you’ll get the same bitwise results compiling at -O0 as you will at -O2 (for instance).
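
As a concrete check with the test case from my first mail (illustrative commands only; the file name is made up, and -lm matters when linking against glibc):

clang -O0 test.c -lm -o t0 && ./t0
clang -O2 test.c -lm -o t2 && ./t2

If the two runs print different hex patterns for the tanh result, the -O2 value came from the compiler's fold rather than from the libm the binary actually links against.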

 

That said, I agree that the difference between constant folding a library call and substituting an approximate calculation is significant. Most users would probably prefer to have this optimization enabled by default. It just leads to a kind of murky answer to the question of whether or not we’re value safe by default for the users who do care about that.

I guess what I’m saying is that I do like the idea of a separate flag for this, though as I recall we’re running out of bits for fast math flags. I’m also not sure whether it should be on by default. If we want to permit this transformation by default, then it shouldn’t be a fast math flag. Probably an attribute on the call site would be better? And in that case it feels like we’d be circling back toward “nobuiltin”, but can the front end identify which call sites would need that?

 

-Andy

 


Re: [llvm-dev] [FP] Constant folding math library functions

Amara Emerson via llvm-dev


Could this not be a function attribute if it’s intended to be consistent across entire functions/programs?

I agree with Hal that afn doesn’t sound like the right approach, and in terms of how the compiler actually treats these calls (I’m thinking about more than constant folding here) it seems to me that this is the same as -fno-builtin. For example, can the optimizer assume some properties about the result value of a call if it knows some (partial) information about the argument (e.g. its sign)? If we prevent constant folding, then that also precludes this kind of optimization. Perhaps an umbrella flag like -fno-builtin-math-lib that would turn on -fno-builtin for all of the libm functions?
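
To make the value-property point concrete, here is a contrived sketch (not taken from the optimizer's tests; whether the branch actually gets deleted depends on the compiler recognizing exp as the builtin):

#include <math.h>

/* If the compiler knows this is the standard exp(), it can assume the
   result is never negative and remove the branch below; with
   -fno-builtin the call is just an opaque external function and the
   comparison has to stay. */
double clamp_nonneg(double x) {
  double y = exp(x);
  if (y < 0.0)   /* provably false for the builtin exp */
    return 0.0;
  return y;
}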

Amara
 

Re: [llvm-dev] [FP] Constant folding math library functions

Finkel, Hal J. via llvm-dev


I certainly think that the umbrella flag is a good approach. The tricky part is implementing it so that Clang does not need to have a list of the relevant functions that LLVM's optimizer knows about (thus, I don't think that we can implement this just by having Clang add metadata to a predefined list of math functions).

 -Hal




Re: [llvm-dev] [FP] Constant folding math library functions

Scott Manley via llvm-dev
Hi, 

I don't mean to hijack this thread, but after reading through some discussion here and poking around in the constant folder myself just now, is my understanding correct that LLVM only uses host routines for constant folding? Has there been discussion in the past about using MPFR or something like it instead? (I'm aware that doing so wouldn't solve the original problem in the general case -- I'm asking out of curiosity.)

What happens currently when cross compiling and endianness does not match between host/target? What about folding types that are unsupported on the host (such as __float128)? Does LLVM reject the fold in those cases? 

Regards,

Scott


Re: [llvm-dev] [FP] Constant folding math library functions

Finkel, Hal J. via llvm-dev
Scott,

Yes, LLVM uses host routines for constant-folding math functions (e.g., sin, cos). For arithmetic operations, LLVM has its own builtin implementations (called APFloat). I know people have talked about having our own implementations of trigonometric functions, etc., but I don't believe that anyone has viewed this as a high-priority item. We wouldn't use MPFR as a core dependency for licensing reasons, but in general, we shy away from any external dependencies that would affect the output of the compiler. In any case, as you say, this doesn't solve the problem of matching the libm at runtime.

This can cause issues for cross compiling for the same reason it can cause issues when the runtime libm doesn't match the one used by LLVM itself. The values are properly encoded for the target regardless, however, so there's no endianness concern. We only ever fold the float and double variants of these calls, so we don't need to worry about host support for 128-bit float formats (see lib/Analysis/ConstantFolding.cpp).
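
If it helps to see what that amounts to in practice, the effect is roughly this (an illustrative sketch only; the real logic is in lib/Analysis/ConstantFolding.cpp and goes through APFloat rather than a union):

#include <math.h>
#include <stdio.h>

int main(void) {
  union { double d; unsigned long long i; } in, out;
  in.i = 0x3feeb39556255de2ull;   /* the operand from Andy's test case */
  out.d = tanh(in.d);             /* evaluated with the *host* libm    */
  /* These are the bits the folder bakes into the IR; whether they match
     0x3FE7CF009CE7F169 depends on which libm the compiler itself uses. */
  printf("folded bits: %016llX\n", out.i);
  return 0;
}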

 -Hal

Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory



Re: [llvm-dev] [FP] Constant folding math library functions

Scott Manley via llvm-dev
Thanks for clarifying, Hal. 

To follow up on the 128-bit float issue (understanding that it's not pressing): with the upcoming inclusion of Fortran into the project, would you consider it the front end's responsibility to fold REAL(16) elemental intrinsics? The Fortran standard requires that the constant evaluation be done at compile time if the kind type is implemented. A way around this would be to *not* support KIND 16 in order to stay "compliant", but using KIND 16 in Fortran is not uncommon.


Re: [llvm-dev] [FP] Constant folding math library functions

Joerg Sonnenberger via llvm-dev
On Wed, Apr 17, 2019 at 12:35:16PM -0700, Scott Manley via llvm-dev wrote:
> I don't mean to hijack this thread but after reading through some
> discussion here and poking around in the constant folder myself just now,
> is my understanding correct in that LLVM only uses host routines for
> constant folding? Has there been discussion in the past about using MPFR or
> something like it instead? (I'm aware that doing so wouldn't solve the
> original problem in the general case -- I'm asking out of curiosity).

Using MPFR doesn't solve anything, in fact it will often be worse as it
won't agree with even the native libm.

Joerg

Re: [llvm-dev] [FP] Constant folding math library functions

Troy via llvm-dev
> -----Original Message-----
> From: llvm-dev <[hidden email]> On Behalf Of Joerg
> Sonnenberger via llvm-dev
> Sent: Wednesday, April 17, 2019 4:05 PM
> To: [hidden email]
> Subject: Re: [llvm-dev] [FP] Constant folding math library functions
>
> Using MPFR doesn't solve anything, in fact it will often be worse as it won't
> agree with even the native libm.

That depends on your goal.  If your goal is to match the native libm, then it doesn't help.  If your goal is to ensure that the compiler's folds are at least as precise as the native libm, then it helps.  In other words, people generally don't complain if a compiler folds with greater precision than a runtime call, but if the compiler folds with less precision then they will be upset.

-Troy

Re: [llvm-dev] [FP] Constant folding math library functions


> Using MPFR doesn't solve anything, in fact it will often be worse as it
> won't agree with even the native libm.

Libraries like MPFR solve problems, just not the one mentioned in this thread. There are tradeoffs to both approaches if you need a cross compiler. It's common in HPC, for instance, to compile and run code on different architectures, and even with the same base architecture (x86) there could be subtle differences between, say, Haswell and Skylake host libraries. I'm not sure I agree that it's "often worse", though; I've found it to be quite the opposite.
