[llvm-dev] LLVM behavior different depending on function symbol name

Andrew Kelley via llvm-dev
Greetings,

I have a Zig implementation of ceil which is emitted into LLVM IR like this:

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil(float) unnamed_addr #3 !dbg !644 {
Entry:
  %x = alloca float, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !649, metadata !494), !dbg !651
  %1 = load float, float* %x, !dbg !652
  %2 = call fastcc float @ceil32(float %1) #8, !dbg !656
  ret float %2, !dbg !657
}

Test case:

test "math.ceil" {
    assert(ceil(f32(0.0)) == ceil32(0.0));
    assert(ceil(f64(0.0)) == ceil64(0.0));
}


When I compile with optimizations on, this test case fails. The optimized code for the test case ends up being a call to panic (assertion failure), which means that LLVM determined the test failed at compile-time.

What's strange about this is that if I change the function name from @ceil to @ceil_asdf (and change the callers) then the test passes.

So I think LLVM is doing some kind of string comparison on the symbol name and detecting that it is "ceil" and then having different, undesired behavior.

I tried putting `nobuiltin` in the function attributes and at the callsite, but that did not change anything.

Any ideas what's going on?

Downstream issue: https://github.com/zig-lang/zig/issues/393

Regards,
Andrew

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] LLVM behavior different depending on function symbol name

Tom Stellard via llvm-dev
On 06/19/2017 11:45 AM, Andrew Kelley via llvm-dev wrote:

> Greetings,
>
> I have a Zig implementation of ceil which is emitted into LLVM IR like this:
>
> ; Function Attrs: nobuiltin nounwind
> define internal fastcc float @ceil(float) unnamed_addr #3 !dbg !644 {
> Entry:
>   %x = alloca float, align 4
>   store float %0, float* %x
>   call void @llvm.dbg.declare(metadata float* %x, metadata !649, metadata !494), !dbg !651
>   %1 = load float, float* %x, !dbg !652
>   %2 = call fastcc float @ceil32(float %1) #8, !dbg !656
>   ret float %2, !dbg !657
> }
>

What does the declaration of @ceil32() look like?



Re: [llvm-dev] LLVM behavior different depending on function symbol name

Andrew Kelley via llvm-dev


On Mon, Jun 19, 2017 at 11:58 AM, Tom Stellard <[hidden email]> wrote:
> What does the declaration of @ceil32() look like?

LLVM IR follows; source follows after that.

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil32(float) unnamed_addr #3 !dbg !658 {
Entry:
  %x = alloca float, align 4
  %u = alloca i32, align 4
  %e = alloca i32, align 4
  %m = alloca i32, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !660, metadata !494), !dbg !670
  %1 = load float, float* %x, !dbg !671
  %2 = bitcast float %1 to i32, !dbg !672
  store i32 %2, i32* %u, !dbg !673
  call void @llvm.dbg.declare(metadata i32* %u, metadata !661, metadata !494), !dbg !673
  %3 = load i32, i32* %u, !dbg !674
  %4 = lshr i32 %3, 23, !dbg !675
  %5 = and i32 %4, 255, !dbg !676
  %6 = sub nsw i32 %5, 127, !dbg !677
  store i32 %6, i32* %e, !dbg !678
  call void @llvm.dbg.declare(metadata i32* %e, metadata !665, metadata !494), !dbg !678
  call void @llvm.dbg.declare(metadata i32* %m, metadata !668, metadata !494), !dbg !679
  %7 = load i32, i32* %e, !dbg !680
  %8 = icmp sge i32 %7, 23, !dbg !682
  br i1 %8, label %Then, label %Else, !dbg !682

Then:                                             ; preds = %Entry
  %9 = load float, float* %x, !dbg !683
  ret float %9, !dbg !685

Else:                                             ; preds = %Entry
  %10 = load i32, i32* %e, !dbg !686
  %11 = icmp sge i32 %10, 0, !dbg !687
  br i1 %11, label %Then1, label %Else2, !dbg !687

Then1:                                            ; preds = %Else
  %12 = load i32, i32* %e, !dbg !688
  %13 = lshr i32 8388607, %12, !dbg !690
  store i32 %13, i32* %m, !dbg !691
  %14 = load i32, i32* %u, !dbg !692
  %15 = load i32, i32* %m, !dbg !693
  %16 = and i32 %14, %15, !dbg !694
  %17 = icmp eq i32 %16, 0, !dbg !695
  br i1 %17, label %Then3, label %Else4, !dbg !695

Else2:                                            ; preds = %Else
  %18 = load float, float* %x, !dbg !696
  %19 = fadd fast float %18, 0x4770000000000000, !dbg !698
  call fastcc void @forceEval(float %19), !dbg !699
  %20 = load i32, i32* %u, !dbg !700
  %21 = lshr i32 %20, 31, !dbg !701
  %22 = icmp ne i32 %21, 0, !dbg !702
  br i1 %22, label %Then5, label %Else6, !dbg !702

Then3:                                            ; preds = %Then1
  %23 = load float, float* %x, !dbg !703
  ret float %23, !dbg !705

Else4:                                            ; preds = %Then1
  br label %EndIf, !dbg !706

Then5:                                            ; preds = %Else2
  ret float -0.000000e+00, !dbg !707

Else6:                                            ; preds = %Else2
  br label %EndIf7, !dbg !709

EndIf:                                            ; preds = %Else4
  %24 = load float, float* %x, !dbg !710
  %25 = fadd fast float %24, 0x4770000000000000, !dbg !711
  call fastcc void @forceEval(float %25), !dbg !712
  %26 = load i32, i32* %u, !dbg !713
  %27 = lshr i32 %26, 31, !dbg !714
  %28 = icmp eq i32 %27, 0, !dbg !715
  br i1 %28, label %Then8, label %Else9, !dbg !715

EndIf7:                                           ; preds = %Else6
  br label %EndIf11, !dbg !716

Then8:                                            ; preds = %EndIf
  %29 = load i32, i32* %u, !dbg !717
  %30 = load i32, i32* %m, !dbg !719
  %31 = add nuw i32 %29, %30, !dbg !720
  store i32 %31, i32* %u, !dbg !720
  br label %EndIf10, !dbg !721

Else9:                                            ; preds = %EndIf
  br label %EndIf10, !dbg !721

EndIf10:                                          ; preds = %Else9, %Then8
  %32 = load i32, i32* %u, !dbg !722
  %33 = load i32, i32* %m, !dbg !723
  %34 = xor i32 %33, -1, !dbg !724
  %35 = and i32 %32, %34, !dbg !725
  store i32 %35, i32* %u, !dbg !725
  %36 = load i32, i32* %u, !dbg !726
  %37 = bitcast i32 %36 to float, !dbg !727
  br label %EndIf11, !dbg !716

EndIf11:                                          ; preds = %EndIf10, %EndIf7
  %38 = phi float [ %37, %EndIf10 ], [ 1.000000e+00, %EndIf7 ], !dbg !716
  ret float %38, !dbg !728
}
; Function Attrs: nobuiltin nounwind
define internal fastcc void @forceEval(float) unnamed_addr #3 !dbg !840 {
Entry:
  %value = alloca float, align 4
  %x = alloca float, align 4
  %p = alloca float*, align 8
  store float %0, float* %value
  call void @llvm.dbg.declare(metadata float* %value, metadata !844, metadata !494), !dbg !854
  call void @llvm.dbg.declare(metadata float* %x, metadata !846, metadata !494), !dbg !855
  store float* %x, float** %p, !dbg !856
  call void @llvm.dbg.declare(metadata float** %p, metadata !851, metadata !494), !dbg !856
  %1 = load float*, float** %p, !dbg !857
  %2 = load float, float* %x, !dbg !859
  store volatile float %2, float* %1, !dbg !860
  ret void, !dbg !861
}
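
A side note on reading the IR above: the operand 0x4770000000000000 in the two fadd instructions is LLVM's hexadecimal floating-point notation, which spells out the bit pattern of an IEEE 754 double even when the value has type float. The small Python sketch below decodes it and checks that it equals 2^120, i.e. the 0x1.0p120 literal in the Zig source. The helper name decode_llvm_hex_fp is made up for illustration and is not part of any LLVM tooling.

```python
import struct


def decode_llvm_hex_fp(text):
    """Decode an LLVM-style hex float such as '0x4770000000000000'.

    Hypothetical helper: it treats the 16 hex digits as the big-endian
    bit pattern of an IEEE 754 double and returns the Python float.
    """
    digits = text[2:] if text.lower().startswith("0x") else text
    if len(digits) != 16:
        raise ValueError("expected 16 hex digits, got %r" % text)
    return struct.unpack(">d", bytes.fromhex(digits))[0]


value = decode_llvm_hex_fp("0x4770000000000000")
# The constant is exactly 2**120, the same value as the C-style
# hex float literal 0x1.0p120 used in the Zig source.
same_as_source_literal = value == float.fromhex("0x1.0p120")
```

The biased exponent field is 0x477 = 1143, and 1143 - 1023 = 120, with an all-zero mantissa, so the value is a pure power of two.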


Source:
fn ceil32(x: f32) -> f32 {
    var u = @bitCast(u32, x);
    var e = i32((u >> 23) & 0xFF) - 0x7F;
    var m: u32 = undefined;

    if (e >= 23) {
        return x;
    }
    else if (e >= 0) {
        m = 0x007FFFFF >> u32(e);
        if (u & m == 0) {
            return x;
        }
        math.forceEval(x + 0x1.0p120);
        if (u >> 31 == 0) {
            u += m;
        }
        u &= ~m;
        @bitCast(f32, u)
    } else {
        math.forceEval(x + 0x1.0p120);
        if (u >> 31 != 0) {
            return -0.0;
        } else {
            1.0
        }
    }
}
pub fn forceEval(value: var) {
    const T = @typeOf(value);
    switch (T) {
        f32 => {
            var x: f32 = undefined;
            const p = @ptrCast(&volatile f32, &x);
            *p = x;
        },
        f64 => {
            var x: f64 = undefined;
            const p = @ptrCast(&volatile f64, &x);
            *p = x;
        },
        else => {
            @compileError("forceEval not implemented for " ++ @typeName(T));
        },
    }
}
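
For readers who don't read Zig, here is a rough Python port of ceil32 as posted above. It is only a sketch: the helper names are invented, the f32 bit casts are emulated with struct, and the forceEval calls are dropped on the assumption that they matter only for side effects (forceEval returns nothing). If the port is faithful, an input of +0.0 has e = -127, so it falls through to the last branch and comes back as 1.0 rather than 0.0, which may be worth keeping in mind when looking at the failing comparison.

```python
import struct

MASK32 = 0xFFFFFFFF


def f32_to_bits(x):
    """Reinterpret a value as the 32-bit pattern of an IEEE 754 float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(u):
    """Reinterpret a 32-bit pattern as an IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<I", u & MASK32))[0]


def ceil32(x):
    """Line-by-line port of the posted Zig ceil32 (forceEval omitted)."""
    u = f32_to_bits(x)
    e = ((u >> 23) & 0xFF) - 0x7F      # unbiased exponent

    if e >= 23:
        return x                       # already integral (or inf/NaN)
    elif e >= 0:
        m = 0x007FFFFF >> e            # mask of the fractional bits
        if (u & m) == 0:
            return x                   # no fractional bits set
        if (u >> 31) == 0:
            u = (u + m) & MASK32       # positive: bump past the fraction
        u &= ~m & MASK32               # clear the fractional bits
        return bits_to_f32(u)
    else:
        if (u >> 31) != 0:
            return -0.0                # sign bit set, |x| < 1
        return 1.0                     # sign bit clear, |x| < 1 (and +0.0)
```

Calling ceil32(1.5) gives 2.0 and ceil32(-1.5) gives -1.0, as expected, while ceil32(0.0) takes the same path as any other non-negative input below 1.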

 



Re: [llvm-dev] LLVM behavior different depending on function symbol name

Mehdi AMINI via llvm-dev
Hi,

2017-06-19 8:45 GMT-07:00 Andrew Kelley via llvm-dev <[hidden email]>:
> Any ideas what's going on?

I think it'd be a lot easier to figure out if you provide a standalone repro.

-- 
Mehdi



Re: [llvm-dev] LLVM behavior different depending on function symbol name

Andrew Kelley via llvm-dev


On Mon, Jun 19, 2017 at 12:06 PM, Mehdi AMINI <[hidden email]> wrote:

> I think it'd be a lot easier to figure out if you provide a standalone repro.

Standalone repro:

; ModuleID = 'test'
source_filename = "test"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%"[]u8" = type { i8*, i64 }

@__zig_panic_implementation_provided = internal unnamed_addr constant i1 true, align 1

; Function Attrs: nounwind
declare void @llvm.debugtrap() #0

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i32, i1) #1

; Function Attrs: nobuiltin nounwind
define i1 @do_test() #2 !dbg !16 {
Entry:
  %0 = call fastcc float @ceil(float 0.000000e+00) #6, !dbg !21
  %1 = call fastcc float @ceil32(float 0.000000e+00) #6, !dbg !23
  %2 = fcmp fast oeq float %0, %1, !dbg !24
  ret i1 %2, !dbg !25
}

; Function Attrs: cold nobuiltin noreturn nounwind
define linkonce coldcc void @__zig_panic(i8* nonnull readonly, i64) #3 !dbg !26 {
Entry:
  %2 = alloca %"[]u8", align 8
  %message_ptr = alloca i8*, align 8
  %message_len = alloca i64, align 8
  store i8* %0, i8** %message_ptr
  call void @llvm.dbg.declare(metadata i8** %message_ptr, metadata !34, metadata !37), !dbg !38
  store i64 %1, i64* %message_len
  call void @llvm.dbg.declare(metadata i64* %message_len, metadata !35, metadata !37), !dbg !39
  %3 = load i64, i64* %message_len, !dbg !40
  %4 = load i8*, i8** %message_ptr, !dbg !44
  %5 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 0, !dbg !44
  %6 = getelementptr inbounds i8, i8* %4, i64 0, !dbg !44
  store i8* %6, i8** %5, !dbg !44
  %7 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 1, !dbg !44
  %8 = sub nsw i64 %3, 0, !dbg !44
  store i64 %8, i64* %7, !dbg !44
  call fastcc void @panic(%"[]u8"* byval %2) #6, !dbg !45
  unreachable, !dbg !45
}

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil(float) unnamed_addr #2 !dbg !46 {
Entry:
  %x = alloca float, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !51, metadata !37), !dbg !53
  %1 = load float, float* %x, !dbg !54
  %2 = call fastcc float @ceil32(float %1) #7, !dbg !58
  ret float %2, !dbg !59
}

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil32(float) unnamed_addr #2 !dbg !60 {
Entry:
  %x = alloca float, align 4
  %u = alloca i32, align 4
  %e = alloca i32, align 4
  %m = alloca i32, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !62, metadata !37), !dbg !72
  %1 = load float, float* %x, !dbg !73
  %2 = bitcast float %1 to i32, !dbg !74
  store i32 %2, i32* %u, !dbg !75
  call void @llvm.dbg.declare(metadata i32* %u, metadata !63, metadata !37), !dbg !75
  %3 = load i32, i32* %u, !dbg !76
  %4 = lshr i32 %3, 23, !dbg !77
  %5 = and i32 %4, 255, !dbg !78
  %6 = sub nsw i32 %5, 127, !dbg !79
  store i32 %6, i32* %e, !dbg !80
  call void @llvm.dbg.declare(metadata i32* %e, metadata !67, metadata !37), !dbg !80
  call void @llvm.dbg.declare(metadata i32* %m, metadata !70, metadata !37), !dbg !81
  %7 = load i32, i32* %e, !dbg !82
  %8 = icmp sge i32 %7, 23, !dbg !84
  br i1 %8, label %Then, label %Else, !dbg !84

Then:                                             ; preds = %Entry
  %9 = load float, float* %x, !dbg !85
  ret float %9, !dbg !87

Else:                                             ; preds = %Entry
  %10 = load i32, i32* %e, !dbg !88
  %11 = icmp sge i32 %10, 0, !dbg !89
  br i1 %11, label %Then1, label %Else2, !dbg !89

Then1:                                            ; preds = %Else
  %12 = load i32, i32* %e, !dbg !90
  %13 = lshr i32 8388607, %12, !dbg !92
  store i32 %13, i32* %m, !dbg !93
  %14 = load i32, i32* %u, !dbg !94
  %15 = load i32, i32* %m, !dbg !95
  %16 = and i32 %14, %15, !dbg !96
  %17 = icmp eq i32 %16, 0, !dbg !97
  br i1 %17, label %Then3, label %Else4, !dbg !97

Else2:                                            ; preds = %Else
  %18 = load float, float* %x, !dbg !98
  %19 = fadd fast float %18, 0x4770000000000000, !dbg !100
  call fastcc void @forceEval(float %19) #6, !dbg !101
  %20 = load i32, i32* %u, !dbg !102
  %21 = lshr i32 %20, 31, !dbg !103
  %22 = icmp ne i32 %21, 0, !dbg !104
  br i1 %22, label %Then5, label %Else6, !dbg !104

Then3:                                            ; preds = %Then1
  %23 = load float, float* %x, !dbg !105
  ret float %23, !dbg !107

Else4:                                            ; preds = %Then1
  br label %EndIf, !dbg !108

Then5:                                            ; preds = %Else2
  ret float -0.000000e+00, !dbg !109

Else6:                                            ; preds = %Else2
  br label %EndIf7, !dbg !111

EndIf:                                            ; preds = %Else4
  %24 = load float, float* %x, !dbg !112
  %25 = fadd fast float %24, 0x4770000000000000, !dbg !113
  call fastcc void @forceEval(float %25) #6, !dbg !114
  %26 = load i32, i32* %u, !dbg !115
  %27 = lshr i32 %26, 31, !dbg !116
  %28 = icmp eq i32 %27, 0, !dbg !117
  br i1 %28, label %Then8, label %Else9, !dbg !117

EndIf7:                                           ; preds = %Else6
  br label %EndIf11, !dbg !118

Then8:                                            ; preds = %EndIf
  %29 = load i32, i32* %u, !dbg !119
  %30 = load i32, i32* %m, !dbg !121
  %31 = add nuw i32 %29, %30, !dbg !122
  store i32 %31, i32* %u, !dbg !122
  br label %EndIf10, !dbg !123

Else9:                                            ; preds = %EndIf
  br label %EndIf10, !dbg !123

EndIf10:                                          ; preds = %Else9, %Then8
  %32 = load i32, i32* %u, !dbg !124
  %33 = load i32, i32* %m, !dbg !125
  %34 = xor i32 %33, -1, !dbg !126
  %35 = and i32 %32, %34, !dbg !127
  store i32 %35, i32* %u, !dbg !127
  %36 = load i32, i32* %u, !dbg !128
  %37 = bitcast i32 %36 to float, !dbg !129
  br label %EndIf11, !dbg !118

EndIf11:                                          ; preds = %EndIf10, %EndIf7
  %38 = phi float [ %37, %EndIf10 ], [ 1.000000e+00, %EndIf7 ], !dbg !118
  ret float %38, !dbg !130
}

; Function Attrs: nobuiltin noreturn nounwind
define internal fastcc void @panic(%"[]u8"* byval nonnull readonly) unnamed_addr #4 !dbg !131 {
Entry:
  call void @llvm.dbg.declare(metadata %"[]u8"* %0, metadata !141, metadata !37), !dbg !142
  call void @llvm.debugtrap(), !dbg !143
  br label %WhileCond, !dbg !146

WhileCond:                                        ; preds = %WhileCond, %Entry
  br label %WhileCond, !dbg !146
}

; Function Attrs: nobuiltin nounwind
define internal fastcc void @forceEval(float) unnamed_addr #2 !dbg !147 {
Entry:
  %value = alloca float, align 4
  %x = alloca float, align 4
  %p = alloca float*, align 8
  store float %0, float* %value
  call void @llvm.dbg.declare(metadata float* %value, metadata !151, metadata !37), !dbg !158
  call void @llvm.dbg.declare(metadata float* %x, metadata !152, metadata !37), !dbg !159
  store float* %x, float** %p, !dbg !160
  call void @llvm.dbg.declare(metadata float** %p, metadata !155, metadata !37), !dbg !160
  %1 = load float*, float** %p, !dbg !161
  %2 = load float, float* %x, !dbg !163
  store volatile float %2, float* %1, !dbg !164
  ret void, !dbg !165
}

; Function Attrs: nounwind readnone
declare void @llvm.dbg.declare(metadata, metadata, metadata) #5

attributes #0 = { nounwind }
attributes #1 = { argmemonly nounwind }
attributes #2 = { nobuiltin nounwind }
attributes #3 = { cold nobuiltin noreturn nounwind }
attributes #4 = { nobuiltin noreturn nounwind }
attributes #5 = { nounwind readnone }
attributes #6 = { nobuiltin }
attributes #7 = { alwaysinline nobuiltin }

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "zig 0.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !12)
!2 = !DIFile(filename: "test", directory: ".")
!3 = !{!4}
!4 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "GlobalLinkage", scope: !5, file: !5, line: 126, baseType: !6, size: 8, align: 8, elements: !7)
!5 = !DIFile(filename: "builtin.zig", directory: "/home/andy/dev/zig/build/zig-cache")
!6 = !DIBasicType(name: "u8", size: 8, encoding: DW_ATE_unsigned_char)
!7 = !{!8, !9, !10, !11}
!8 = !DIEnumerator(name: "Internal", value: 0)
!9 = !DIEnumerator(name: "Strong", value: 1)
!10 = !DIEnumerator(name: "Weak", value: 2)
!11 = !DIEnumerator(name: "LinkOnce", value: 3)
!12 = !{!13}
!13 = !DIGlobalVariableExpression(var: !14)
!14 = distinct !DIGlobalVariable(name: "__zig_panic_implementation_provided", linkageName: "__zig_panic_implementation_provided", scope: !5, file: !5, line: 189, type: !15, isLocal: true, isDefinition: true)
!15 = !DIBasicType(name: "bool", size: 8, encoding: DW_ATE_boolean)
!16 = distinct !DISubprogram(name: "do_test", scope: !17, file: !17, line: 46, type: !18, isLocal: false, isDefinition: true, scopeLine: 46, isOptimized: true, unit: !1, variables: !20)
!17 = !DIFile(filename: "test.zig", directory: "/home/andy/dev/zig/build")
!18 = !DISubroutineType(types: !19)
!19 = !{!15}
!20 = !{}
!21 = !DILocation(line: 47, column: 16, scope: !22)
!22 = distinct !DILexicalBlock(scope: !16, file: !17, line: 46, column: 29)
!23 = !DILocation(line: 47, column: 36, scope: !22)
!24 = !DILocation(line: 47, column: 27, scope: !22)
!25 = !DILocation(line: 47, column: 5, scope: !22)
!26 = distinct !DISubprogram(name: "__zig_panic", scope: !27, file: !27, line: 7, type: !28, isLocal: false, isDefinition: true, scopeLine: 7, isOptimized: true, unit: !1, variables: !33)
!27 = !DIFile(filename: "zigrt.zig", directory: "/home/andy/dev/zig/build/lib/zig/std/special")
!28 = !DISubroutineType(types: !29)
!29 = !{!30, !31, !32}
!30 = !DIBasicType(name: "void", encoding: DW_ATE_unsigned)
!31 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const u8", baseType: !6, size: 64, align: 64)
!32 = !DIBasicType(name: "usize", size: 64, encoding: DW_ATE_unsigned)
!33 = !{!34, !35}
!34 = !DILocalVariable(name: "message_ptr", arg: 1, scope: !26, file: !27, line: 7, type: !31)
!35 = !DILocalVariable(name: "message_len", arg: 2, scope: !36, file: !27, line: 7, type: !32)
!36 = distinct !DILexicalBlock(scope: !26, file: !27, line: 7, column: 30)
!37 = !DIExpression()
!38 = !DILocation(line: 7, column: 30, scope: !26)
!39 = !DILocation(line: 7, column: 54, scope: !36)
!40 = !DILocation(line: 12, column: 48, scope: !41)
!41 = distinct !DILexicalBlock(scope: !42, file: !27, line: 11, column: 54)
!42 = distinct !DILexicalBlock(scope: !43, file: !27, line: 7, column: 86)
!43 = distinct !DILexicalBlock(scope: !36, file: !27, line: 7, column: 54)
!44 = !DILocation(line: 12, column: 43, scope: !41)
!45 = !DILocation(line: 12, column: 31, scope: !41)
!46 = distinct !DISubprogram(name: "ceil", scope: !17, file: !17, line: 3, type: !47, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: true, unit: !1, variables: !50)
!47 = !DISubroutineType(types: !48)
!48 = !{!49, !49}
!49 = !DIBasicType(name: "f32", size: 32, encoding: DW_ATE_float)
!50 = !{!51}
!51 = !DILocalVariable(name: "x", arg: 1, scope: !52, file: !17, line: 3, type: !49)
!52 = distinct !DILexicalBlock(scope: !46, file: !17, line: 3, column: 13)
!53 = !DILocation(line: 3, column: 13, scope: !52)
!54 = !DILocation(line: 6, column: 36, scope: !55)
!55 = distinct !DILexicalBlock(scope: !56, file: !17, line: 4, column: 5)
!56 = distinct !DILexicalBlock(scope: !57, file: !17, line: 3, column: 35)
!57 = distinct !DILexicalBlock(scope: !52, file: !17, line: 3, column: 13)
!58 = !DILocation(line: 6, column: 16, scope: !55)
!59 = !DILocation(line: 5, column: 5, scope: !57)
!60 = distinct !DISubprogram(name: "ceil32", scope: !17, file: !17, line: 11, type: !47, isLocal: true, isDefinition: true, scopeLine: 11, isOptimized: true, unit: !1, variables: !61)
!61 = !{!62, !63, !67, !70}
!62 = !DILocalVariable(name: "x", arg: 1, scope: !60, file: !17, line: 11, type: !49)
!63 = !DILocalVariable(name: "u", scope: !64, file: !17, line: 12, type: !66)
!64 = distinct !DILexicalBlock(scope: !65, file: !17, line: 11, column: 26)
!65 = distinct !DILexicalBlock(scope: !60, file: !17, line: 11, column: 11)
!66 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!67 = !DILocalVariable(name: "e", scope: !68, file: !17, line: 13, type: !69)
!68 = distinct !DILexicalBlock(scope: !64, file: !17, line: 12, column: 5)
!69 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!70 = !DILocalVariable(name: "m", scope: !71, file: !17, line: 14, type: !66)
!71 = distinct !DILexicalBlock(scope: !68, file: !17, line: 13, column: 5)
!72 = !DILocation(line: 11, column: 11, scope: !60)
!73 = !DILocation(line: 12, column: 27, scope: !64)
!74 = !DILocation(line: 12, column: 13, scope: !64)
!75 = !DILocation(line: 12, column: 5, scope: !64)
!76 = !DILocation(line: 13, column: 18, scope: !68)
!77 = !DILocation(line: 13, column: 20, scope: !68)
!78 = !DILocation(line: 13, column: 27, scope: !68)
!79 = !DILocation(line: 13, column: 35, scope: !68)
!80 = !DILocation(line: 13, column: 5, scope: !68)
!81 = !DILocation(line: 14, column: 5, scope: !71)
!82 = !DILocation(line: 16, column: 9, scope: !83)
!83 = distinct !DILexicalBlock(scope: !71, file: !17, line: 14, column: 5)
!84 = !DILocation(line: 16, column: 11, scope: !83)
!85 = !DILocation(line: 17, column: 16, scope: !86)
!86 = distinct !DILexicalBlock(scope: !83, file: !17, line: 16, column: 18)
!87 = !DILocation(line: 17, column: 9, scope: !86)
!88 = !DILocation(line: 19, column: 14, scope: !83)
!89 = !DILocation(line: 19, column: 16, scope: !83)
!90 = !DILocation(line: 20, column: 31, scope: !91)
!91 = distinct !DILexicalBlock(scope: !83, file: !17, line: 19, column: 22)
!92 = !DILocation(line: 20, column: 24, scope: !91)
!93 = !DILocation(line: 20, column: 11, scope: !91)
!94 = !DILocation(line: 21, column: 13, scope: !91)
!95 = !DILocation(line: 21, column: 17, scope: !91)
!96 = !DILocation(line: 21, column: 15, scope: !91)
!97 = !DILocation(line: 21, column: 19, scope: !91)
!98 = !DILocation(line: 31, column: 19, scope: !99)
!99 = distinct !DILexicalBlock(scope: !83, file: !17, line: 30, column: 12)
!100 = !DILocation(line: 31, column: 21, scope: !99)
!101 = !DILocation(line: 31, column: 18, scope: !99)
!102 = !DILocation(line: 32, column: 13, scope: !99)
!103 = !DILocation(line: 32, column: 15, scope: !99)
!104 = !DILocation(line: 32, column: 21, scope: !99)
!105 = !DILocation(line: 22, column: 20, scope: !106)
!106 = distinct !DILexicalBlock(scope: !91, file: !17, line: 21, column: 25)
!107 = !DILocation(line: 22, column: 13, scope: !106)
!108 = !DILocation(line: 21, column: 9, scope: !91)
!109 = !DILocation(line: 33, column: 13, scope: !110)
!110 = distinct !DILexicalBlock(scope: !99, file: !17, line: 32, column: 27)
!111 = !DILocation(line: 32, column: 9, scope: !99)
!112 = !DILocation(line: 24, column: 19, scope: !91)
!113 = !DILocation(line: 24, column: 21, scope: !91)
!114 = !DILocation(line: 24, column: 18, scope: !91)
!115 = !DILocation(line: 25, column: 13, scope: !91)
!116 = !DILocation(line: 25, column: 15, scope: !91)
!117 = !DILocation(line: 25, column: 21, scope: !91)
!118 = !DILocation(line: 19, column: 10, scope: !83)
!119 = !DILocation(line: 26, column: 13, scope: !120)
!120 = distinct !DILexicalBlock(scope: !91, file: !17, line: 25, column: 27)
!121 = !DILocation(line: 26, column: 18, scope: !120)
!122 = !DILocation(line: 26, column: 15, scope: !120)
!123 = !DILocation(line: 25, column: 9, scope: !91)
!124 = !DILocation(line: 28, column: 9, scope: !91)
!125 = !DILocation(line: 28, column: 15, scope: !91)
!126 = !DILocation(line: 28, column: 14, scope: !91)
!127 = !DILocation(line: 28, column: 11, scope: !91)
!128 = !DILocation(line: 29, column: 23, scope: !91)
!129 = !DILocation(line: 29, column: 9, scope: !91)
!130 = !DILocation(line: 16, column: 5, scope: !65)
!131 = distinct !DISubprogram(name: "panic", scope: !17, file: !17, line: 1, type: !132, isLocal: true, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !1, variables: !140)
!132 = !DISubroutineType(types: !133)
!133 = !{!30, !134}
!134 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const []const u8", baseType: !135, size: 64, align: 64)
!135 = !DICompositeType(tag: DW_TAG_structure_type, name: "[]u8", size: 128, align: 128, elements: !136)
!136 = !{!137, !139}
!137 = !DIDerivedType(tag: DW_TAG_member, name: "ptr", scope: !135, baseType: !138, size: 64, align: 64)
!138 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&u8", baseType: !6, size: 64, align: 64)
!139 = !DIDerivedType(tag: DW_TAG_member, name: "len", scope: !135, baseType: !32, size: 64, align: 64, offset: 64)
!140 = !{!141}
!141 = !DILocalVariable(name: "msg", arg: 1, scope: !131, file: !17, line: 1, type: !135)
!142 = !DILocation(line: 1, column: 14, scope: !131)
!143 = !DILocation(line: 1, column: 45, scope: !144)
!144 = distinct !DILexicalBlock(scope: !145, file: !17, line: 1, column: 43)
!145 = distinct !DILexicalBlock(scope: !131, file: !17, line: 1, column: 14)
!146 = !DILocation(line: 1, column: 60, scope: !144)
!147 = distinct !DISubprogram(name: "forceEval", scope: !17, file: !17, line: 39, type: !148, isLocal: true, isDefinition: true, scopeLine: 39, isOptimized: true, unit: !1, variables: !150)
!148 = !DISubroutineType(types: !149)
!149 = !{!30, !49}
!150 = !{!151, !152, !155}
!151 = !DILocalVariable(name: "value", arg: 1, scope: !147, file: !17, line: 39, type: !49)
!152 = !DILocalVariable(name: "x", scope: !153, file: !17, line: 40, type: !49)
!153 = distinct !DILexicalBlock(scope: !154, file: !17, line: 39, column: 30)
!154 = distinct !DILexicalBlock(scope: !147, file: !17, line: 39, column: 18)
!155 = !DILocalVariable(name: "p", scope: !156, file: !17, line: 41, type: !157)
!156 = distinct !DILexicalBlock(scope: !153, file: !17, line: 40, column: 5)
!157 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&volatile f32", baseType: !49, size: 64, align: 64)
!158 = !DILocation(line: 39, column: 18, scope: !147)
!159 = !DILocation(line: 40, column: 5, scope: !153)
!160 = !DILocation(line: 41, column: 5, scope: !156)
!161 = !DILocation(line: 42, column: 5, scope: !162)
!162 = distinct !DILexicalBlock(scope: !156, file: !17, line: 41, column: 5)
!163 = !DILocation(line: 42, column: 10, scope: !162)
!164 = !DILocation(line: 42, column: 8, scope: !162)
!165 = !DILocation(line: 39, column: 30, scope: !154)

source:
pub fn panic(msg: []const u8) -> noreturn { @breakpoint(); while (true) {} }

pub fn ceil(x: var) -> @typeOf(x) {
    const T = @typeOf(x);
    switch (T) {
        f32 => @inlineCall(ceil32, x),
        else => @compileError("ceil not implemented for " ++ @typeName(T)),
    }
}

fn ceil32(x: f32) -> f32 {
    var u = @bitCast(u32, x);
    var e = i32((u >> 23) & 0xFF) - 0x7F;
    var m: u32 = undefined;

    if (e >= 23) {
        return x;
    }
    else if (e >= 0) {
        m = 0x007FFFFF >> u32(e);
        if (u & m == 0) {
            return x;
        }
        forceEval(x + 0x1.0p120);
        if (u >> 31 == 0) {
            u += m;
        }
        u &= ~m;
        @bitCast(f32, u)
    } else {
        forceEval(x + 0x1.0p120);
        if (u >> 31 != 0) {
            return -0.0;
        } else {
            1.0
        }
    }
}
pub fn forceEval(value: f32) {
    var x: f32 = undefined;
    const p = @ptrCast(&volatile f32, &x);
    *p = x;
}


export fn do_test() -> bool {
    return ceil(f32(0.0)) == ceil32(0.0);
}
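
To make the source above easier to trace by hand, here is a rough Python port of `ceil32`. This is only a sketch: the helper names `f32_to_bits` and `bits_to_f32` are invented for illustration, the `forceEval` calls are left out because they do not affect the returned value, and inputs are assumed to be representable as f32.

```python
import struct


def f32_to_bits(x):
    """Reinterpret a float as its IEEE-754 single-precision bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(u):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


def ceil32(x):
    u = f32_to_bits(x)
    e = ((u >> 23) & 0xFF) - 0x7F      # unbiased exponent
    if e >= 23:
        return x                        # no fractional bits left (also inf/NaN)
    if e >= 0:
        m = 0x007FFFFF >> e             # mask of the fractional mantissa bits
        if (u & m) == 0:
            return x                    # already an integer
        if (u >> 31) == 0:
            u = (u + m) & 0xFFFFFFFF    # positive: carry into the integer part
        u &= ~m & 0xFFFFFFFF            # clear the fractional bits
        return bits_to_f32(u)
    # e < 0, i.e. |x| < 1. Note that zero is not special-cased here.
    if (u >> 31) != 0:
        return -0.0
    return 1.0
```

With this port, `ceil32(1.5)` gives 2.0 and `ceil32(-1.5)` gives -1.0. An input of 0.0 has an exponent field of zero, so `e` is -127, control falls through to the last branch, and the result is 1.0 rather than 0.0. That may be worth keeping in mind when reading the optimized output.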


-----------------------

With no optimizations, the do_test function returns true, which is expected. With -O3, the module gets rewritten to:



 


Re: [llvm-dev] LLVM behavior different depending on function symbol name

Tim Northover via llvm-dev
Oops, I pressed send by accident.

With -O3, the module gets rewritten to:
; Function Attrs: nobuiltin nounwind
define i1 @do_test() local_unnamed_addr #0 !dbg !16 {
Entry:
  %x.sroa.0.i.i = alloca i32, align 4
  %x.sroa.0.i.i.i = alloca i32, align 4
  tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !21, metadata !28) #3, !dbg !29
  tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !32, metadata !28) #3, !dbg !44
  tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !35, metadata !28) #3, !dbg !49
  tail call void @llvm.dbg.value(metadata i32 -127, i64 0, metadata !39, metadata !28) #3, !dbg !50
  %x.sroa.0.i.i.i.0.sroa_cast = bitcast i32* %x.sroa.0.i.i.i to i8*, !dbg !51
  call void @llvm.lifetime.start(i64 4, i8* nonnull %x.sroa.0.i.i.i.0.sroa_cast), !dbg !51
  tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !57, metadata !28) #3, !dbg !51
  tail call void @llvm.dbg.value(metadata float* undef, i64 0, metadata !61, metadata !28) #3, !dbg !67
  %x.sroa.0.i.i.i.0.x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i.i = load i32, i32* %x.sroa.0.i.i.i, align 4, !dbg !68
  store volatile i32 %x.sroa.0.i.i.i.0.x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i.i, i32* %x.sroa.0.i.i.i, align 4, !dbg !70
  call void @llvm.lifetime.end(i64 4, i8* nonnull %x.sroa.0.i.i.i.0.sroa_cast), !dbg !71
  tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !32, metadata !28) #3, !dbg !72
  tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !35, metadata !28) #3, !dbg !75
  tail call void @llvm.dbg.value(metadata i32 -127, i64 0, metadata !39, metadata !28) #3, !dbg !76
  %x.sroa.0.i.i.0.sroa_cast = bitcast i32* %x.sroa.0.i.i to i8*, !dbg !77
  call void @llvm.lifetime.start(i64 4, i8* nonnull %x.sroa.0.i.i.0.sroa_cast), !dbg !77
  tail call void @llvm.dbg.value(metadata float 0.000000e+00, i64 0, metadata !57, metadata !28) #3, !dbg !77
  tail call void @llvm.dbg.value(metadata float* undef, i64 0, metadata !61, metadata !28) #3, !dbg !79
  %x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i = load i32, i32* %x.sroa.0.i.i, align 4, !dbg !80
  store volatile i32 %x.sroa.0.i.i.0.x.sroa.0.i.0.x.sroa.0.0.x.sroa.0.0.x.0.1.i.i, i32* %x.sroa.0.i.i, align 4, !dbg !81
  call void @llvm.lifetime.end(i64 4, i8* nonnull %x.sroa.0.i.i.0.sroa_cast), !dbg !82
  ret i1 false, !dbg !83
}


Note the `ret i1 false` at the end. I expected it to return true.



Re: [llvm-dev] LLVM behavior different depending on function symbol name

Tim Northover via llvm-dev
Using `opt --print-after-all -O3`, I see that EarlyCSE is interpreting the call to `ceil` and constant-folding it:

*** IR Dump After Early CSE ***
; Function Attrs: nobuiltin nounwind
define i1 @do_test() #2 {
Entry:
  %0 = call fastcc float @ceil(float 0.000000e+00) #6
  %1 = call fastcc float @ceil32(float 0.000000e+00) #6
  %2 = fcmp fast oeq float 0.000000e+00, %1
  ret i1 %2
}

So just running `opt -early-cse -debug` seems to be enough to reproduce it:

EarlyCSE Simplify:   %0 = call fastcc float @ceil(float 0.000000e+00) #6  to: float 0.000000e+00

I suspect it is not correct for EarlyCSE to do that.
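
A toy model of that symptom, written in Python, may help show why the comparison flips. This is not LLVM's code and makes no claim about the actual code path inside EarlyCSE. The table `KNOWN_LIBM`, the `call` helper, and the `fold_by_name` flag are all hypothetical, and `user_ceil32` is a stand-in that only models the |x| < 1 branch of the `ceil32` posted in this thread.

```python
import math
import struct

# Hypothetical table of libm names that a folder might recognise by symbol name.
KNOWN_LIBM = {"ceil": math.ceil}


def user_ceil32(x):
    """Stand-in for the thread's ceil32, restricted to inputs with |x| < 1."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = ((bits >> 23) & 0xFF) - 0x7F
    if exponent >= 0:
        raise ValueError("this sketch only models inputs with |x| < 1")
    return -0.0 if (bits >> 31) else 1.0


# In the module from the thread, both @ceil and @ceil32 end up in the same code.
DEFINITIONS = {"ceil": user_ceil32, "ceil32": user_ceil32}


def call(name, arg, fold_by_name):
    """Evaluate a call, optionally assuming libm semantics based on the name alone."""
    if fold_by_name and name in KNOWN_LIBM:
        return float(KNOWN_LIBM[name](arg))
    return DEFINITIONS[name](arg)


def do_test(fold_by_name):
    return call("ceil", 0.0, fold_by_name) == call("ceil32", 0.0, fold_by_name)
```

In this model `do_test(False)` is true because both sides run the same definition, while `do_test(True)` is false because the `ceil` side is replaced by the libm result 0.0 and the `ceil32` side still evaluates the user's code. That matches the observed difference between the unoptimized and -O3 builds, and is consistent with the rename to `ceil_asdf` making the test pass.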

-- 
Mehdi





On Mon, Jun 19, 2017 at 12:26 PM, Andrew Kelley <[hidden email]> wrote:


On Mon, Jun 19, 2017 at 12:06 PM, Mehdi AMINI <[hidden email]> wrote:
Hi,

2017-06-19 8:45 GMT-07:00 Andrew Kelley via llvm-dev <[hidden email]>:
Greetings,

I have a Zig implementation of ceil which is emitted into LLVM IR like this:

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil(float) unnamed_addr #3 !dbg !644 {
Entry:
  %x = alloca float, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !649, metadata !494), !dbg !651
  %1 = load float, float* %x, !dbg !652
  %2 = call fastcc float @ceil32(float %1) #8, !dbg !656
  ret float %2, !dbg !657
}

Test case:

test "math.ceil" {
    assert(ceil(f32(0.0)) == ceil32(0.0));
    assert(ceil(f64(0.0)) == ceil64(0.0));
}


When I compile with optimizations on, this test case fails. The optimized code for the test case ends up being a call to panic (assertion failure), which means that LLVM determined the test failed at compile-time.

What's strange about this is that if I change the function name from @ceil to @ceil_asdf (and change the callers) then the test passes.

So I think LLVM is doing some kind of string comparison on the symbol name and detecting that it is "ceil" and then having different, undesired behavior.

I tried putting `nobuiltin` in the function attributes and at the callsite, but that did not change anything.

Any ideas what's going on?

I think it'd be a lot easier to figure if you provide a standalone repro.

Standalone repro:

; ModuleID = 'test'
source_filename = "test"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%"[]u8" = type { i8*, i64 }

@__zig_panic_implementation_provided = internal unnamed_addr constant i1 true, align 1

; Function Attrs: nounwind
declare void @llvm.debugtrap() #0

; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly, i8* nocapture readonly, i64, i32, i1) #1

; Function Attrs: argmemonly nounwind
declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i32, i1) #1

; Function Attrs: nobuiltin nounwind
define i1 @do_test() #2 !dbg !16 {
Entry:
  %0 = call fastcc float @ceil(float 0.000000e+00) #6, !dbg !21
  %1 = call fastcc float @ceil32(float 0.000000e+00) #6, !dbg !23
  %2 = fcmp fast oeq float %0, %1, !dbg !24
  ret i1 %2, !dbg !25
}

; Function Attrs: cold nobuiltin noreturn nounwind
define linkonce coldcc void @__zig_panic(i8* nonnull readonly, i64) #3 !dbg !26 {
Entry:
  %2 = alloca %"[]u8", align 8
  %message_ptr = alloca i8*, align 8
  %message_len = alloca i64, align 8
  store i8* %0, i8** %message_ptr
  call void @llvm.dbg.declare(metadata i8** %message_ptr, metadata !34, metadata !37), !dbg !38
  store i64 %1, i64* %message_len
  call void @llvm.dbg.declare(metadata i64* %message_len, metadata !35, metadata !37), !dbg !39
  %3 = load i64, i64* %message_len, !dbg !40
  %4 = load i8*, i8** %message_ptr, !dbg !44
  %5 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 0, !dbg !44
  %6 = getelementptr inbounds i8, i8* %4, i64 0, !dbg !44
  store i8* %6, i8** %5, !dbg !44
  %7 = getelementptr inbounds %"[]u8", %"[]u8"* %2, i32 0, i32 1, !dbg !44
  %8 = sub nsw i64 %3, 0, !dbg !44
  store i64 %8, i64* %7, !dbg !44
  call fastcc void @panic(%"[]u8"* byval %2) #6, !dbg !45
  unreachable, !dbg !45
}

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil(float) unnamed_addr #2 !dbg !46 {
Entry:
  %x = alloca float, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !51, metadata !37), !dbg !53
  %1 = load float, float* %x, !dbg !54
  %2 = call fastcc float @ceil32(float %1) #7, !dbg !58
  ret float %2, !dbg !59
}

; Function Attrs: nobuiltin nounwind
define internal fastcc float @ceil32(float) unnamed_addr #2 !dbg !60 {
Entry:
  %x = alloca float, align 4
  %u = alloca i32, align 4
  %e = alloca i32, align 4
  %m = alloca i32, align 4
  store float %0, float* %x
  call void @llvm.dbg.declare(metadata float* %x, metadata !62, metadata !37), !dbg !72
  %1 = load float, float* %x, !dbg !73
  %2 = bitcast float %1 to i32, !dbg !74
  store i32 %2, i32* %u, !dbg !75
  call void @llvm.dbg.declare(metadata i32* %u, metadata !63, metadata !37), !dbg !75
  %3 = load i32, i32* %u, !dbg !76
  %4 = lshr i32 %3, 23, !dbg !77
  %5 = and i32 %4, 255, !dbg !78
  %6 = sub nsw i32 %5, 127, !dbg !79
  store i32 %6, i32* %e, !dbg !80
  call void @llvm.dbg.declare(metadata i32* %e, metadata !67, metadata !37), !dbg !80
  call void @llvm.dbg.declare(metadata i32* %m, metadata !70, metadata !37), !dbg !81
  %7 = load i32, i32* %e, !dbg !82
  %8 = icmp sge i32 %7, 23, !dbg !84
  br i1 %8, label %Then, label %Else, !dbg !84

Then:                                             ; preds = %Entry
  %9 = load float, float* %x, !dbg !85
  ret float %9, !dbg !87

Else:                                             ; preds = %Entry
  %10 = load i32, i32* %e, !dbg !88
  %11 = icmp sge i32 %10, 0, !dbg !89
  br i1 %11, label %Then1, label %Else2, !dbg !89

Then1:                                            ; preds = %Else
  %12 = load i32, i32* %e, !dbg !90
  %13 = lshr i32 8388607, %12, !dbg !92
  store i32 %13, i32* %m, !dbg !93
  %14 = load i32, i32* %u, !dbg !94
  %15 = load i32, i32* %m, !dbg !95
  %16 = and i32 %14, %15, !dbg !96
  %17 = icmp eq i32 %16, 0, !dbg !97
  br i1 %17, label %Then3, label %Else4, !dbg !97

Else2:                                            ; preds = %Else
  %18 = load float, float* %x, !dbg !98
  %19 = fadd fast float %18, 0x4770000000000000, !dbg !100
  call fastcc void @forceEval(float %19) #6, !dbg !101
  %20 = load i32, i32* %u, !dbg !102
  %21 = lshr i32 %20, 31, !dbg !103
  %22 = icmp ne i32 %21, 0, !dbg !104
  br i1 %22, label %Then5, label %Else6, !dbg !104

Then3:                                            ; preds = %Then1
  %23 = load float, float* %x, !dbg !105
  ret float %23, !dbg !107

Else4:                                            ; preds = %Then1
  br label %EndIf, !dbg !108

Then5:                                            ; preds = %Else2
  ret float -0.000000e+00, !dbg !109

Else6:                                            ; preds = %Else2
  br label %EndIf7, !dbg !111

EndIf:                                            ; preds = %Else4
  %24 = load float, float* %x, !dbg !112
  %25 = fadd fast float %24, 0x4770000000000000, !dbg !113
  call fastcc void @forceEval(float %25) #6, !dbg !114
  %26 = load i32, i32* %u, !dbg !115
  %27 = lshr i32 %26, 31, !dbg !116
  %28 = icmp eq i32 %27, 0, !dbg !117
  br i1 %28, label %Then8, label %Else9, !dbg !117

EndIf7:                                           ; preds = %Else6
  br label %EndIf11, !dbg !118

Then8:                                            ; preds = %EndIf
  %29 = load i32, i32* %u, !dbg !119
  %30 = load i32, i32* %m, !dbg !121
  %31 = add nuw i32 %29, %30, !dbg !122
  store i32 %31, i32* %u, !dbg !122
  br label %EndIf10, !dbg !123

Else9:                                            ; preds = %EndIf
  br label %EndIf10, !dbg !123

EndIf10:                                          ; preds = %Else9, %Then8
  %32 = load i32, i32* %u, !dbg !124
  %33 = load i32, i32* %m, !dbg !125
  %34 = xor i32 %33, -1, !dbg !126
  %35 = and i32 %32, %34, !dbg !127
  store i32 %35, i32* %u, !dbg !127
  %36 = load i32, i32* %u, !dbg !128
  %37 = bitcast i32 %36 to float, !dbg !129
  br label %EndIf11, !dbg !118

EndIf11:                                          ; preds = %EndIf10, %EndIf7
  %38 = phi float [ %37, %EndIf10 ], [ 1.000000e+00, %EndIf7 ], !dbg !118
  ret float %38, !dbg !130
}

; Function Attrs: nobuiltin noreturn nounwind
define internal fastcc void @panic(%"[]u8"* byval nonnull readonly) unnamed_addr #4 !dbg !131 {
Entry:
  call void @llvm.dbg.declare(metadata %"[]u8"* %0, metadata !141, metadata !37), !dbg !142
  call void @llvm.debugtrap(), !dbg !143
  br label %WhileCond, !dbg !146

WhileCond:                                        ; preds = %WhileCond, %Entry
  br label %WhileCond, !dbg !146
}

; Function Attrs: nobuiltin nounwind
define internal fastcc void @forceEval(float) unnamed_addr #2 !dbg !147 {
Entry:
  %value = alloca float, align 4
  %x = alloca float, align 4
  %p = alloca float*, align 8
  store float %0, float* %value
  call void @llvm.dbg.declare(metadata float* %value, metadata !151, metadata !37), !dbg !158
  call void @llvm.dbg.declare(metadata float* %x, metadata !152, metadata !37), !dbg !159
  store float* %x, float** %p, !dbg !160
  call void @llvm.dbg.declare(metadata float** %p, metadata !155, metadata !37), !dbg !160
  %1 = load float*, float** %p, !dbg !161
  %2 = load float, float* %x, !dbg !163
  store volatile float %2, float* %1, !dbg !164
  ret void, !dbg !165
}

; Function Attrs: nounwind readnone
declare void @llvm.dbg.declare(metadata, metadata, metadata) #5

attributes #0 = { nounwind }
attributes #1 = { argmemonly nounwind }
attributes #2 = { nobuiltin nounwind }
attributes #3 = { cold nobuiltin noreturn nounwind }
attributes #4 = { nobuiltin noreturn nounwind }
attributes #5 = { nounwind readnone }
attributes #6 = { nobuiltin }
attributes #7 = { alwaysinline nobuiltin }

!llvm.module.flags = !{!0}
!llvm.dbg.cu = !{!1}

!0 = !{i32 2, !"Debug Info Version", i32 3}
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "zig 0.0.0", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !12)
!2 = !DIFile(filename: "test", directory: ".")
!3 = !{!4}
!4 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "GlobalLinkage", scope: !5, file: !5, line: 126, baseType: !6, size: 8, align: 8, elements: !7)
!5 = !DIFile(filename: "builtin.zig", directory: "/home/andy/dev/zig/build/zig-cache")
!6 = !DIBasicType(name: "u8", size: 8, encoding: DW_ATE_unsigned_char)
!7 = !{!8, !9, !10, !11}
!8 = !DIEnumerator(name: "Internal", value: 0)
!9 = !DIEnumerator(name: "Strong", value: 1)
!10 = !DIEnumerator(name: "Weak", value: 2)
!11 = !DIEnumerator(name: "LinkOnce", value: 3)
!12 = !{!13}
!13 = !DIGlobalVariableExpression(var: !14)
!14 = distinct !DIGlobalVariable(name: "__zig_panic_implementation_provided", linkageName: "__zig_panic_implementation_provided", scope: !5, file: !5, line: 189, type: !15, isLocal: true, isDefinition: true)
!15 = !DIBasicType(name: "bool", size: 8, encoding: DW_ATE_boolean)
!16 = distinct !DISubprogram(name: "do_test", scope: !17, file: !17, line: 46, type: !18, isLocal: false, isDefinition: true, scopeLine: 46, isOptimized: true, unit: !1, variables: !20)
!17 = !DIFile(filename: "test.zig", directory: "/home/andy/dev/zig/build")
!18 = !DISubroutineType(types: !19)
!19 = !{!15}
!20 = !{}
!21 = !DILocation(line: 47, column: 16, scope: !22)
!22 = distinct !DILexicalBlock(scope: !16, file: !17, line: 46, column: 29)
!23 = !DILocation(line: 47, column: 36, scope: !22)
!24 = !DILocation(line: 47, column: 27, scope: !22)
!25 = !DILocation(line: 47, column: 5, scope: !22)
!26 = distinct !DISubprogram(name: "__zig_panic", scope: !27, file: !27, line: 7, type: !28, isLocal: false, isDefinition: true, scopeLine: 7, isOptimized: true, unit: !1, variables: !33)
!27 = !DIFile(filename: "zigrt.zig", directory: "/home/andy/dev/zig/build/lib/zig/std/special")
!28 = !DISubroutineType(types: !29)
!29 = !{!30, !31, !32}
!30 = !DIBasicType(name: "void", encoding: DW_ATE_unsigned)
!31 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const u8", baseType: !6, size: 64, align: 64)
!32 = !DIBasicType(name: "usize", size: 64, encoding: DW_ATE_unsigned)
!33 = !{!34, !35}
!34 = !DILocalVariable(name: "message_ptr", arg: 1, scope: !26, file: !27, line: 7, type: !31)
!35 = !DILocalVariable(name: "message_len", arg: 2, scope: !36, file: !27, line: 7, type: !32)
!36 = distinct !DILexicalBlock(scope: !26, file: !27, line: 7, column: 30)
!37 = !DIExpression()
!38 = !DILocation(line: 7, column: 30, scope: !26)
!39 = !DILocation(line: 7, column: 54, scope: !36)
!40 = !DILocation(line: 12, column: 48, scope: !41)
!41 = distinct !DILexicalBlock(scope: !42, file: !27, line: 11, column: 54)
!42 = distinct !DILexicalBlock(scope: !43, file: !27, line: 7, column: 86)
!43 = distinct !DILexicalBlock(scope: !36, file: !27, line: 7, column: 54)
!44 = !DILocation(line: 12, column: 43, scope: !41)
!45 = !DILocation(line: 12, column: 31, scope: !41)
!46 = distinct !DISubprogram(name: "ceil", scope: !17, file: !17, line: 3, type: !47, isLocal: true, isDefinition: true, scopeLine: 3, isOptimized: true, unit: !1, variables: !50)
!47 = !DISubroutineType(types: !48)
!48 = !{!49, !49}
!49 = !DIBasicType(name: "f32", size: 32, encoding: DW_ATE_float)
!50 = !{!51}
!51 = !DILocalVariable(name: "x", arg: 1, scope: !52, file: !17, line: 3, type: !49)
!52 = distinct !DILexicalBlock(scope: !46, file: !17, line: 3, column: 13)
!53 = !DILocation(line: 3, column: 13, scope: !52)
!54 = !DILocation(line: 6, column: 36, scope: !55)
!55 = distinct !DILexicalBlock(scope: !56, file: !17, line: 4, column: 5)
!56 = distinct !DILexicalBlock(scope: !57, file: !17, line: 3, column: 35)
!57 = distinct !DILexicalBlock(scope: !52, file: !17, line: 3, column: 13)
!58 = !DILocation(line: 6, column: 16, scope: !55)
!59 = !DILocation(line: 5, column: 5, scope: !57)
!60 = distinct !DISubprogram(name: "ceil32", scope: !17, file: !17, line: 11, type: !47, isLocal: true, isDefinition: true, scopeLine: 11, isOptimized: true, unit: !1, variables: !61)
!61 = !{!62, !63, !67, !70}
!62 = !DILocalVariable(name: "x", arg: 1, scope: !60, file: !17, line: 11, type: !49)
!63 = !DILocalVariable(name: "u", scope: !64, file: !17, line: 12, type: !66)
!64 = distinct !DILexicalBlock(scope: !65, file: !17, line: 11, column: 26)
!65 = distinct !DILexicalBlock(scope: !60, file: !17, line: 11, column: 11)
!66 = !DIBasicType(name: "u32", size: 32, encoding: DW_ATE_unsigned)
!67 = !DILocalVariable(name: "e", scope: !68, file: !17, line: 13, type: !69)
!68 = distinct !DILexicalBlock(scope: !64, file: !17, line: 12, column: 5)
!69 = !DIBasicType(name: "i32", size: 32, encoding: DW_ATE_signed)
!70 = !DILocalVariable(name: "m", scope: !71, file: !17, line: 14, type: !66)
!71 = distinct !DILexicalBlock(scope: !68, file: !17, line: 13, column: 5)
!72 = !DILocation(line: 11, column: 11, scope: !60)
!73 = !DILocation(line: 12, column: 27, scope: !64)
!74 = !DILocation(line: 12, column: 13, scope: !64)
!75 = !DILocation(line: 12, column: 5, scope: !64)
!76 = !DILocation(line: 13, column: 18, scope: !68)
!77 = !DILocation(line: 13, column: 20, scope: !68)
!78 = !DILocation(line: 13, column: 27, scope: !68)
!79 = !DILocation(line: 13, column: 35, scope: !68)
!80 = !DILocation(line: 13, column: 5, scope: !68)
!81 = !DILocation(line: 14, column: 5, scope: !71)
!82 = !DILocation(line: 16, column: 9, scope: !83)
!83 = distinct !DILexicalBlock(scope: !71, file: !17, line: 14, column: 5)
!84 = !DILocation(line: 16, column: 11, scope: !83)
!85 = !DILocation(line: 17, column: 16, scope: !86)
!86 = distinct !DILexicalBlock(scope: !83, file: !17, line: 16, column: 18)
!87 = !DILocation(line: 17, column: 9, scope: !86)
!88 = !DILocation(line: 19, column: 14, scope: !83)
!89 = !DILocation(line: 19, column: 16, scope: !83)
!90 = !DILocation(line: 20, column: 31, scope: !91)
!91 = distinct !DILexicalBlock(scope: !83, file: !17, line: 19, column: 22)
!92 = !DILocation(line: 20, column: 24, scope: !91)
!93 = !DILocation(line: 20, column: 11, scope: !91)
!94 = !DILocation(line: 21, column: 13, scope: !91)
!95 = !DILocation(line: 21, column: 17, scope: !91)
!96 = !DILocation(line: 21, column: 15, scope: !91)
!97 = !DILocation(line: 21, column: 19, scope: !91)
!98 = !DILocation(line: 31, column: 19, scope: !99)
!99 = distinct !DILexicalBlock(scope: !83, file: !17, line: 30, column: 12)
!100 = !DILocation(line: 31, column: 21, scope: !99)
!101 = !DILocation(line: 31, column: 18, scope: !99)
!102 = !DILocation(line: 32, column: 13, scope: !99)
!103 = !DILocation(line: 32, column: 15, scope: !99)
!104 = !DILocation(line: 32, column: 21, scope: !99)
!105 = !DILocation(line: 22, column: 20, scope: !106)
!106 = distinct !DILexicalBlock(scope: !91, file: !17, line: 21, column: 25)
!107 = !DILocation(line: 22, column: 13, scope: !106)
!108 = !DILocation(line: 21, column: 9, scope: !91)
!109 = !DILocation(line: 33, column: 13, scope: !110)
!110 = distinct !DILexicalBlock(scope: !99, file: !17, line: 32, column: 27)
!111 = !DILocation(line: 32, column: 9, scope: !99)
!112 = !DILocation(line: 24, column: 19, scope: !91)
!113 = !DILocation(line: 24, column: 21, scope: !91)
!114 = !DILocation(line: 24, column: 18, scope: !91)
!115 = !DILocation(line: 25, column: 13, scope: !91)
!116 = !DILocation(line: 25, column: 15, scope: !91)
!117 = !DILocation(line: 25, column: 21, scope: !91)
!118 = !DILocation(line: 19, column: 10, scope: !83)
!119 = !DILocation(line: 26, column: 13, scope: !120)
!120 = distinct !DILexicalBlock(scope: !91, file: !17, line: 25, column: 27)
!121 = !DILocation(line: 26, column: 18, scope: !120)
!122 = !DILocation(line: 26, column: 15, scope: !120)
!123 = !DILocation(line: 25, column: 9, scope: !91)
!124 = !DILocation(line: 28, column: 9, scope: !91)
!125 = !DILocation(line: 28, column: 15, scope: !91)
!126 = !DILocation(line: 28, column: 14, scope: !91)
!127 = !DILocation(line: 28, column: 11, scope: !91)
!128 = !DILocation(line: 29, column: 23, scope: !91)
!129 = !DILocation(line: 29, column: 9, scope: !91)
!130 = !DILocation(line: 16, column: 5, scope: !65)
!131 = distinct !DISubprogram(name: "panic", scope: !17, file: !17, line: 1, type: !132, isLocal: true, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !1, variables: !140)
!132 = !DISubroutineType(types: !133)
!133 = !{!30, !134}
!134 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&const []const u8", baseType: !135, size: 64, align: 64)
!135 = !DICompositeType(tag: DW_TAG_structure_type, name: "[]u8", size: 128, align: 128, elements: !136)
!136 = !{!137, !139}
!137 = !DIDerivedType(tag: DW_TAG_member, name: "ptr", scope: !135, baseType: !138, size: 64, align: 64)
!138 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&u8", baseType: !6, size: 64, align: 64)
!139 = !DIDerivedType(tag: DW_TAG_member, name: "len", scope: !135, baseType: !32, size: 64, align: 64, offset: 64)
!140 = !{!141}
!141 = !DILocalVariable(name: "msg", arg: 1, scope: !131, file: !17, line: 1, type: !135)
!142 = !DILocation(line: 1, column: 14, scope: !131)
!143 = !DILocation(line: 1, column: 45, scope: !144)
!144 = distinct !DILexicalBlock(scope: !145, file: !17, line: 1, column: 43)
!145 = distinct !DILexicalBlock(scope: !131, file: !17, line: 1, column: 14)
!146 = !DILocation(line: 1, column: 60, scope: !144)
!147 = distinct !DISubprogram(name: "forceEval", scope: !17, file: !17, line: 39, type: !148, isLocal: true, isDefinition: true, scopeLine: 39, isOptimized: true, unit: !1, variables: !150)
!148 = !DISubroutineType(types: !149)
!149 = !{!30, !49}
!150 = !{!151, !152, !155}
!151 = !DILocalVariable(name: "value", arg: 1, scope: !147, file: !17, line: 39, type: !49)
!152 = !DILocalVariable(name: "x", scope: !153, file: !17, line: 40, type: !49)
!153 = distinct !DILexicalBlock(scope: !154, file: !17, line: 39, column: 30)
!154 = distinct !DILexicalBlock(scope: !147, file: !17, line: 39, column: 18)
!155 = !DILocalVariable(name: "p", scope: !156, file: !17, line: 41, type: !157)
!156 = distinct !DILexicalBlock(scope: !153, file: !17, line: 40, column: 5)
!157 = !DIDerivedType(tag: DW_TAG_pointer_type, name: "&volatile f32", baseType: !49, size: 64, align: 64)
!158 = !DILocation(line: 39, column: 18, scope: !147)
!159 = !DILocation(line: 40, column: 5, scope: !153)
!160 = !DILocation(line: 41, column: 5, scope: !156)
!161 = !DILocation(line: 42, column: 5, scope: !162)
!162 = distinct !DILexicalBlock(scope: !156, file: !17, line: 41, column: 5)
!163 = !DILocation(line: 42, column: 10, scope: !162)
!164 = !DILocation(line: 42, column: 8, scope: !162)
!165 = !DILocation(line: 39, column: 30, scope: !154)

source:
pub fn panic(msg: []const u8) -> noreturn { @breakpoint(); while (true) {} }

pub fn ceil(x: var) -> @typeOf(x) {
    const T = @typeOf(x);
    switch (T) {
        f32 => @inlineCall(ceil32, x),
        else => @compileError("ceil not implemented for " ++ @typeName(T)),
    }
}

fn ceil32(x: f32) -> f32 {
    var u = @bitCast(u32, x);
    var e = i32((u >> 23) & 0xFF) - 0x7F;
    var m: u32 = undefined;

    if (e >= 23) {
        return x;
    }
    else if (e >= 0) {
        m = 0x007FFFFF >> u32(e);
        if (u & m == 0) {
            return x;
        }
        forceEval(x + 0x1.0p120);
        if (u >> 31 == 0) {
            u += m;
        }
        u &= ~m;
        @bitCast(f32, u)
    } else {
        forceEval(x + 0x1.0p120);
        if (u >> 31 != 0) {
            return -0.0;
        } else {
            1.0
        }
    }
}
pub fn forceEval(value: f32) {
    var x: f32 = undefined;
    const p = @ptrCast(&volatile f32, &x);
    *p = x;
}


export fn do_test() -> bool {
    return ceil(f32(0.0)) == ceil32(0.0);
}
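
Hand-tracing `ceil32` above for an input of `0.0` suggests that it returns `1.0`: the exponent `e` comes out as -127, so control reaches the final `else` branch, and the sign bit is clear. That would explain why the comparison becomes false once the call to `@ceil` is replaced by a libm-style result of `0.0`, and why it is true when both sides run the same code. A minimal C++ sketch of that trace follows. It is a hypothetical port for illustration only: `ceil32_port`, `force_eval`, `do_test_folded`, and `do_test_unfolded` are names invented here, and `force_eval` stores `value`, whereas the Zig `forceEval` stores its own local `x`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for forceEval: a volatile store that keeps the
// floating-point addition from being optimized away.
static void force_eval(float value) {
    volatile float sink = value;
    (void)sink;
}

// Hypothetical C++ port of the Zig ceil32 shown above (illustration only).
static float ceil32_port(float x) {
    std::uint32_t u;
    std::memcpy(&u, &x, sizeof u);  // same role as @bitCast(u32, x)
    const std::int32_t e = static_cast<std::int32_t>((u >> 23) & 0xFF) - 0x7F;

    if (e >= 23) {
        return x;  // already integral (or inf/NaN)
    }
    if (e >= 0) {
        const std::uint32_t m = 0x007FFFFFu >> e;  // fractional mantissa bits
        if ((u & m) == 0) {
            return x;  // no fractional part
        }
        force_eval(x + std::ldexp(1.0f, 120));  // x + 0x1.0p120
        if ((u >> 31) == 0) {
            u += m;  // positive values round up
        }
        u &= ~m;
        float r;
        std::memcpy(&r, &u, sizeof r);
        return r;
    }
    // |x| < 1, which in this code includes +0.0 and -0.0.
    force_eval(x + std::ldexp(1.0f, 120));
    if ((u >> 31) != 0) {
        return -0.0f;
    }
    return 1.0f;  // reached for +0.0 as well as for 0 < x < 1
}

// Models do_test after the call to @ceil is folded to a libm-style result.
static bool do_test_folded() {
    return std::ceil(0.0f) == ceil32_port(0.0f);
}

// Models do_test when both sides really run the same implementation.
static bool do_test_unfolded() {
    return ceil32_port(0.0f) == ceil32_port(0.0f);
}
```

Under these assumptions `do_test_unfolded()` is true and `do_test_folded()` is false, which matches the unoptimized and `-O3` results described in this thread.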


-----------------------

With no optimizations, the do_test function returns true, which is expected. With -O3, the module gets rewritten to:



 

-- 
Mehdi






Re: [llvm-dev] LLVM behavior different depending on function symbol name

Tim Northover via llvm-dev


On Mon, Jun 19, 2017 at 12:34 PM, Mehdi AMINI via llvm-dev <[hidden email]> wrote:
Using `opt --print-after-all -O3`, I see that EarlyCSE interprets the call to `ceil` and constant-folds it:

*** IR Dump After Early CSE ***
; Function Attrs: nobuiltin nounwind
define i1 @do_test() #2 {
Entry:
  %0 = call fastcc float @ceil(float 0.000000e+00) #6
  %1 = call fastcc float @ceil32(float 0.000000e+00) #6
  %2 = fcmp fast oeq float 0.000000e+00, %1
  ret i1 %2
}

So just running `opt -early-cse -debug` seems to be enough to reproduce it:

EarlyCSE Simplify:   %0 = call fastcc float @ceil(float 0.000000e+00) #6  to: float 0.000000e+00

I suspect it is not correct for EarlyCSE to do that.
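
The kind of name-based folding suspected here can be sketched with a toy model. The C++ below is not LLVM's code, and nothing in it is taken from the actual fix: `Callee`, `fold_by_name`, and `fold_guarded` are hypothetical names, and the guard shown (refuse to fold when the module defines the function itself or the call is marked `nobuiltin`) is only one plausible rule, assumed for illustration.

```cpp
#include <cmath>
#include <string>

// Toy description of a call target. All fields are hypothetical.
struct Callee {
    std::string name;  // symbol name, e.g. "ceil"
    bool has_body;     // true when the module defines the function itself
    bool nobuiltin;    // true when the call or function is marked nobuiltin
};

// Naive folder: trusts the symbol name alone. On success it writes the
// folded constant to `out` and returns true.
bool fold_by_name(const Callee& f, float arg, float& out) {
    if (f.name == "ceil") {
        out = std::ceil(arg);  // libm semantics, whatever the body does
        return true;
    }
    return false;
}

// Guarded folder: declines when the name cannot be assumed to mean the
// library function.
bool fold_guarded(const Callee& f, float arg, float& out) {
    if (f.has_body || f.nobuiltin) {
        return false;
    }
    return fold_by_name(f, arg, out);
}
```

With a user-defined, `nobuiltin` function that happens to be called `ceil`, `fold_by_name` still produces `0.0` for an argument of `0.0`, while `fold_guarded` leaves the call alone. Renaming the function to `ceil_asdf` makes both decline, which mirrors the behavior reported at the start of the thread.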


This was actually _just_ fixed:


Re: [llvm-dev] LLVM behavior different depending on function symbol name

Tim Northover via llvm-dev


On Mon, Jun 19, 2017 at 12:47 PM, James Y Knight <[hidden email]> wrote:


This was actually _just_ fixed:

Excellent. Is the fix included in LLVM 4.0.1?
 

