Strange behavior in fp arithmetics on x86 (bug possibly)


Dmitry Borisenkov

Hello everyone.

I’m not an expert in LLVM, x86, or the IEEE standard for floating-point numbers, so any of my assumptions below may be wrong. If so, I would be grateful if you could explain what is going wrong. But if my guesses are correct, we may have a bug in FP arithmetic on x86.

I have the following ir:

  @g = constant i64 1

  define i32 @main() {
    %gval = load i64* @g
    %gvalfp = bitcast i64 %gval to double
    %fmul = fmul double %gvalfp, -5.000000e-01
    %fcmp = fcmp ueq double %fmul, -0.000000e+00
    %ret = select i1 %fcmp, i32 1, i32 0
    ret i32 %ret
  }

I expected the smallest positive denormalized double times -0.5 to equal -0.0, so the correct exit code would be 1.
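The expectation can be checked outside LLVM. The sketch below is in Python rather than LLVM IR and is not part of the original post; it assumes Python floats are IEEE 754 binary64 with round-to-nearest-even (true on mainstream CPython builds), and the helper name bits_to_double is made up for illustration. The exact product, -2^-1075, lies halfway between -0.0 and the smallest negative denormal, and the tie goes to the even significand, which is -0.0.

```python
import math
import struct


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit integer as a double, like the IR's bitcast."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# Bit pattern 1 is the smallest positive denormal, 2**-1074.
smallest_denormal = bits_to_double(1)

# The exact result -2**-1075 is not representable as a double; it is a tie
# between -0.0 and -2**-1074 and rounds to -0.0 under ties-to-even.
product = smallest_denormal * -0.5

# -0.0 compares equal to 0.0, so the sign is checked separately.
is_negative_zero = product == 0.0 and math.copysign(1.0, product) < 0

# Mirrors the select in the IR: 1 if the product equals -0.0, else 0.
exit_code = 1 if product == -0.0 else 0
print(product, is_negative_zero, exit_code)
```

This only models pure double-precision evaluation; it says nothing about what a particular backend does with wider intermediate registers.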

llvm-3.4.2 targeting x86 Linux produced the following assembly:

      .file "fpfail.ll"
      .section    .rodata.cst8,"aM",@progbits,8
      .align      8
.LCPI0_0:
      .quad -4620693217682128896    # double -0.5
.LCPI0_1:
      .quad -9223372036854775808    # double -0
      .text
      .globl      main
      .align      16, 0x90
      .type main,@function
main:                                   # @main
      .cfi_startproc
# BB#0:
      vmovsd      g, %xmm0
      vmulsd      .LCPI0_0, %xmm0, %xmm0
      vucomisd    .LCPI0_1, %xmm0
      sete  %al
      movzbl      %al, %eax
      ret
.Ltmp0:
      .size main, .Ltmp0-main
      .cfi_endproc

      .type g,@object               # @g
      .section    .rodata,"a",@progbits
      .globl      g
      .align      8
g:
      .quad 1                       # 0x1
      .size g, 8

      .section    ".note.GNU-stack","",@progbits

 

./llc -march=x86 fpfail.ll; g++ fpfail.s; ./a.out; echo $?

returns 1 as expected.
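The .quad constants in the listing above can be sanity-checked by reinterpreting the printed signed 64-bit integers as doubles. This is a small Python sketch, not something from the original post; the helper name quad_to_double is hypothetical.

```python
import struct


def quad_to_double(value: int) -> float:
    """Reinterpret a signed 64-bit integer (as printed after .quad) as a double."""
    return struct.unpack("<d", struct.pack("<q", value))[0]


minus_half = quad_to_double(-4620693217682128896)  # .LCPI0_0
minus_zero = quad_to_double(-9223372036854775808)  # .LCPI0_1
g_value = quad_to_double(1)                        # g

print(minus_half, minus_zero, g_value)  # prints: -0.5 -0.0 5e-324
```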

 

But llvm-3.5 (on the same target) lowers the same IR using x87 floating-point instructions, as follows:

      .text
      .file "fpfail.ll"
      .section    .rodata.cst4,"aM",@progbits,4
      .align      4
.LCPI0_0:
      .long 3204448256              # float -0.5
      .text
      .globl      main
      .align      16, 0x90
      .type main,@function
main:                                   # @main
      .cfi_startproc
# BB#0:
      fldl  g
      fmuls .LCPI0_0
      fldz
      fchs
      fxch  %st(1)
      fucompp
      fnstsw      %ax
                                        # kill: AX<def> AX<kill> EAX<def>
                                        # kill: AH<def> AH<kill> EAX<kill>
      sahf
      sete  %al
      movzbl      %al, %eax
      retl
.Ltmp0:
      .size main, .Ltmp0-main
      .cfi_endproc

      .type g,@object               # @g
      .section    .rodata,"a",@progbits
      .globl      g
      .align      8
g:
      .quad 1                       # 0x1
      .size g, 8

      .section    ".note.GNU-stack","",@progbits

 

First, it doesn’t assemble with g++ (4.8):

fpfail.s:26: Error: invalid instruction suffix for `ret'

I downloaded the Intel manual and found no mention of a retl instruction, so I manually replaced it with ret and reassembled:

g++ fpfail.s; ./a.out; echo $?

The exit code is 0. This is correct for Intel 80-bit floats but wrong for doubles. Am I doing something wrong, is this actually a bug, or, even worse, is this correct behavior?

 

--

Kind regards, Dmitry Borisenkov

 

 


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Tim Northover-2
Hi Dmitry,

On 7 October 2014 10:50, Dmitry Borisenkov <[hidden email]> wrote:
> fpfail.s:26: Error: invalid instruction suffix for `ret'
>
> I downloaded Intel manual and haven’t found any mention of retl instruction,

"retl" is the AT&T syntax for the normal "ret" instruction in the
Intel manual, which makes it mostly undocumented.

> The exit code is 0. This is correct for Intel 80-bit floats but wrong for
> doubles. What am I do wrong or this is actually a bug or even worse –
> correct behavior?

I think the default CPU used by llc was changed between 3.4 and 3.5.
Before, we defaulted to the host's CPU (from memory), but now we pick
a lowest common denominator "generic", which doesn't support SSE.

When the IR comes from Clang, I believe we define the
"FLT_EVAL_METHOD" macro to be 2 in this case (see C99 5.2.4.2.2),
which signals that operations are performed at "long double" precision
and the outcome you see is permitted.
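To make the difference concrete, here is a rough Python model (not from the thread). It uses exact rational arithmetic to stand in for the x87 register, assuming the x87 precision control is left at extended precision so that the product is held exactly; the variable names are illustrative only.

```python
from fractions import Fraction

# Smallest positive double denormal (bit pattern 1) and the multiplier.
g = Fraction(1, 2**1074)
minus_half = Fraction(-1, 2)

# Exact product: -2**-1075, a single power of two.
exact = g * minus_half

# The x87 extended format has a 15-bit exponent; its smallest normal
# magnitude is 2**-16382. A power of two above that bound needs only one
# significand bit, so it is held exactly and stays nonzero.
X87_MIN_NORMAL = Fraction(1, 2**16382)
held_exactly_in_x87 = abs(exact) >= X87_MIN_NORMAL

# Comparing the unrounded value against -0.0 fails: exit code 0.
exit_code_x87 = 1 if exact == 0 else 0

# Evaluated in double precision, the product rounds (ties-to-even) to
# -0.0, which compares equal to zero: exit code 1.
exit_code_double = 1 if 5e-324 * -0.5 == 0.0 else 0

print(held_exactly_in_x87, exit_code_x87, exit_code_double)
```

The two exit codes match what was reported for the x87 and SSE lowerings respectively.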

So I *think* this is OK, unless I'm misunderstanding one of the specs involved.

Cheers.

Tim.


Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Joerg Sonnenberger
In reply to this post by Dmitry Borisenkov
On Tue, Oct 07, 2014 at 09:50:37PM +0400, Dmitry Borisenkov wrote:
> I'm not an expert neither in llvm nor in x86 nor in IEEE standard for
> floating point numbers, thus any of my following assumptions maybe wrong. If
> so, I will be grateful if you clarify me what's goes wrong. But if my
> guesses are correct we possibly have a bug in fp arithmetics on x86.

Are you targeting the same backend? i386 (32-bit mode) uses FPU registers
for argument passing and return values, x86_64 / amd64 (64-bit mode) uses
SSE registers for float/double values and FPU registers for long double.
The error on retl makes me think the second example is compiled for
i386, while the first example looks more like x86_64.

Joerg

Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Dmitry Borisenkov
Hi, Joerg

Both of the examples were compiled with ./llc -march=x86 -O3 fpfail.ll (i386).
I've double checked it.


Kind regards, Dmitry Borisenkov



Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Stephen Checkoway
In reply to this post by Tim Northover-2

On Oct 7, 2014, at 2:26 PM, Tim Northover <[hidden email]> wrote:

> Hi Dmitry,
>
> On 7 October 2014 10:50, Dmitry Borisenkov <[hidden email]> wrote:
>> fpfail.s:26: Error: invalid instruction suffix for `ret'
>>
>> I downloaded Intel manual and haven’t found any mention of retl instruction,
>
> "retl" is the AT&T syntax for the normal "ret" instruction in the
> Intel manual, which makes it mostly undocumented.

Are you sure about that? I don't recall ever seeing retl before. A while back a reference for AT&T was mentioned and, as I recall, this was the best anyone had <http://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf>. It contains no mention of retl.

This seems to be the commit that added support for it <http://lists.cs.uiuc.edu/pipermail/llvm-branch-commits/2010-May/003229.html>.

I'm not sure I understand the distinction between retl/retq. x86 has four return instructions (cribbing from the Intel manual):

C3      RET             Near return
CB      RET             Far return
C2 iw   RET imm16       Near return + pop imm16 bytes
CA iw   RET imm16       Far return + pop imm16 bytes

(And I think that's been true since the 8086.)
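As a quick illustration of the table above, the four opcodes can be put in a lookup and decoded. This is only a Python sketch for this thread; the names RET_OPCODES and decode_ret are made up, and prefixes are not handled.

```python
# Opcode -> (kind, takes a 16-bit immediate), per the table above.
RET_OPCODES = {
    0xC3: ("near", False),
    0xCB: ("far", False),
    0xC2: ("near", True),
    0xCA: ("far", True),
}


def decode_ret(code: bytes):
    """Return (kind, bytes_popped) for an unprefixed RET encoding."""
    kind, has_imm = RET_OPCODES[code[0]]
    # The iw operand is a little-endian 16-bit immediate after the opcode.
    popped = int.from_bytes(code[1:3], "little") if has_imm else 0
    return kind, popped


print(decode_ret(b"\xc3"))          # plain near return
print(decode_ret(b"\xc2\x08\x00"))  # near return that also pops 8 bytes
```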

Distinguishing between near and far (e.g., ret vs. lret in AT&T or retn vs. retf with some other assemblers) makes sense, but what would an l or q suffix denote?

But more to the point, even if there's a good reason to accept retl/retq as input, is there any reason to emit it ever?

--
Stephen Checkoway





Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Craig Topper
r198756 seems to be related too. That would explain why the difference appears in 3.5 relative to 3.4.



--
~Craig


Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Pasi Parviainen
In reply to this post by Stephen Checkoway
On 10.10.2014 9:48, Stephen Checkoway wrote:

>
> On Oct 7, 2014, at 2:26 PM, Tim Northover <[hidden email]> wrote:
>
>> Hi Dmitry,
>>
>> On 7 October 2014 10:50, Dmitry Borisenkov <[hidden email]> wrote:
>>> fpfail.s:26: Error: invalid instruction suffix for `ret'
>>>
>>> I downloaded Intel manual and haven’t found any mention of retl instruction,
>>
>> "retl" is the AT&T syntax for the normal "ret" instruction in the
>> Intel manual, which makes it mostly undocumented.
>
> Are you sure about that? I don't recall ever seeing retl before. A while back a reference for AT&T was mentioned and, as I recall, this was the best anyone had <http://docs.oracle.com/cd/E19253-01/817-5477/817-5477.pdf>. It contains no mention of retl.
>
> This seems to be the commit that added support for it <http://lists.cs.uiuc.edu/pipermail/llvm-branch-commits/2010-May/003229.html>.
>
> I'm not sure I understand the distinction between retl/retq. x86 has 4 return instruction (cribbing from the Intel manual):
>
> C3 RET Near return
> CB RET Far return
> C2 iw RET imm16 Near return + pop imm16 bytes
> CA iw RET imm16 Far return + pop imm16 bytes
>
> (And I think that's been true since the 8086.)
>
> Distinguishing between near and far (e.g., ret vs. lret in AT&T or retn vs. retf with some other assemblers) makes sense, but what would a l or q suffix denote?
>
> But more to the point, even if there's a good reason to accept retl/retq as input, is there any reason to emit it ever?
>

Since x86 lets you mix 16-bit and 32-bit code, you must be able to
distinguish between a 16-bit and a 32-bit return. That is where the
w and l suffixes for the return instruction come from.

code16:
ret = retw => C3
retl => 66 C3

code32:
ret = retl => C3
retw => 66 C3

As for the q suffix, it is either there for consistency or it just got
cargo-culted.
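The w/l encodings listed above follow one rule: emit the 0x66 operand-size prefix only when the requested return width differs from the current code mode. A minimal Python sketch of that rule, with a hypothetical function name and covering only the 16-bit and 32-bit modes discussed here:

```python
OPERAND_SIZE_PREFIX = 0x66
NEAR_RET = 0xC3


def encode_near_ret(mode_bits: int, operand_bits: int) -> bytes:
    """Encode a near RET of the given width for .code16 or .code32."""
    if mode_bits not in (16, 32) or operand_bits not in (16, 32):
        raise ValueError("only 16-bit and 32-bit are modeled here")
    if mode_bits == operand_bits:
        # ret == retw in 16-bit code, ret == retl in 32-bit code.
        return bytes([NEAR_RET])
    # The other width needs the operand-size override prefix.
    return bytes([OPERAND_SIZE_PREFIX, NEAR_RET])


print(encode_near_ret(16, 32).hex())  # retl in 16-bit code
print(encode_near_ret(32, 32).hex())  # retl in 32-bit code
```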

Pasi

Re: Strange behavior in fp arithmetics on x86 (bug possibly)

Stephen Checkoway

On Oct 10, 2014, at 7:23 AM, Pasi Parviainen <[hidden email]> wrote:

> On 10.10.2014 9:48, Stephen Checkoway wrote:
>>
>> But more to the point, even if there's a good reason to accept retl/retq as input, is there any reason to emit it ever?
>>
>
> Since in x86 you can mix 16-bit and 32-bit code, therefore you must be able to distinguish between 16-bit and 32-bit return. And from there comes the w and l suffix for the return instruction.

Makes total sense. I didn't think about using the operand size override. (I didn't even realize that was legal for ret.)

Thanks,

--
Stephen Checkoway





