[llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT


Nikodemus Siivola via llvm-dev
Emitting calls to these functions (written in an .ll file linked in) works fine, and does the right thing.

%Any = type { i8*, i32 }

define dllexport void @setGlobal(%Any* %ptr, %Any %value) {
  store %Any %value, %Any* %ptr
  ret void
}

define dllexport %Any @getGlobal(%Any* %ptr) {
  %val = load %Any, %Any* %ptr
  ret %Any %val
}

Trying to replace the setGlobal call with what should be equivalent

builder.CreateStore(value, ptr)

results in what should end up in the second (i32) slot being stored in the first (i8*).

I've added ::dump() calls where the CreateStore is, and this is what I get:

{ i8*, i32 } { i8* @FixnumClass, i32 32 } ; for value
@foo = external global { i8*, i32 } ; for ptr


Even more bizarrely trying to replace the getGlobal call with

builder.CreateLoad(val)

results in what has been stored in the first (i8*) slot being loaded correctly, but the second (i32) getting garbage out despite the correct value being stored in memory. Dump call there reports the @foo pointer identically.

This is using LLVM 4.0.0.

Just so I'm not leaving anything out, what follows are IR dumps of the sample functions using either the direct store / load or the setGlobal / getGlobal functions. As far as I can tell they should do exactly the same thing...

; Does NOT do the right thing
define { i8*, i32 } @"__anonToplevel/2"() {
entry:
  %.unpack = load i8*, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %0 = insertvalue { i8*, i32 } undef, i8* %.unpack, 0
  %.unpack1 = load i32, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %1 = insertvalue { i8*, i32 } %0, i32 %.unpack1, 1
  ret { i8*, i32 } %1
}

; Does NOT do the right thing
define { i8*, i32 } @"__anonToplevel/0"() {
entry:
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

; DOES the right thing
define { i8*, i32 } @"__anonToplevel/0"() {
entry:
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

; DOES the right thing
define { i8*, i32 } @"__anonToplevel/1"() {
entry:
  %0 = call { i8*, i32 } @getGlobal({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } %0
}

I'm at my wit's end. Any hints as to what I might be messing up would be much appreciated. I expect it is something ridiculously obvious...

Cheers,

 -- nikodemus



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Sean Silva via llvm-dev
This is a bit mystifying. Can you also show the assembly? What offsets are actually used for the stores in the "bad" versions? In other words, try verifying that the offsets that the getelementptr's should generate match your expectations (and if they deviate, in what ways they deviate).

-- Sean Silva

On Sun, Jun 4, 2017 at 1:39 PM, Nikodemus Siivola via llvm-dev <[hidden email]> wrote:
> [...]

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Nikodemus Siivola via llvm-dev
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
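For reference, the addresses those two GEPs are expected to produce can be modelled in a few lines of plain C++. This is only a sketch under an assumed 32-bit layout (4-byte pointer at offset 0, 4-byte integer at offset 4, struct size 8); `kFieldOffset`, `kStructSize` and `gepAddress` are hypothetical names made up for illustration and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of { i8*, i32 } on a 32-bit target:
// the pointer sits at offset 0 and the integer at offset 4.
constexpr std::uint32_t kFieldOffset[] = {0, 4};
constexpr std::uint32_t kStructSize = 8;

// Address expected from
//   getelementptr { i8*, i32 }, { i8*, i32 }* base, i32 idx, i32 field
// under the layout assumed above: step over `idx` whole structs,
// then add the offset of the selected field.
constexpr std::uint32_t gepAddress(std::uint32_t base, std::uint32_t idx,
                                   std::uint32_t field) {
    return base + idx * kStructSize + kFieldOffset[field];
}
```

With a base address of 0x03C10000, `gepAddress(base, 0, 0)` gives the base itself and `gepAddress(base, 0, 1)` gives the base plus 4, so the two GEPs should never yield the same address.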

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, i.e. address of class slot + 4 --> incorrect
x = 03C10000
# Yeah, both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B

Finally, I didn't manage nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to comes from an alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus



Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Nikodemus Siivola via llvm-dev
Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
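For readers who want to reproduce the workaround, an opaque identity function can be as small as the sketch below. It assumes the function lives outside the JITed module (for example in the host program), so that the optimizer cannot fold the GEPs into constant expressions on @foo. The struct `Any`, its field names and the helper `datumSlot` are illustrative assumptions, not the code actually used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirror of the IR type { i8*, i32 }; field names are illustrative only.
struct Any {
    void*        cls;    // first slot: class pointer
    std::int32_t datum;  // second slot: payload
};

// Identity function, assumed to be provided by the host rather than the
// JITed module, so the JITed code only sees an unknown pointer result.
extern "C" Any* identity(Any* p) { return p; }

// Address of the datum slot, computed at run time from the returned
// pointer, which is what the non-constant GEP in the IR above does.
std::int32_t* datumSlot(Any* p) { return &identity(p)->datum; }
```

Because the field address is then computed by ordinary pointer arithmetic at run time, it no longer depends on how a constant expression such as @foo+4 is resolved when the code is loaded.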

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
> [...]

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Sean Silva via llvm-dev


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <[hidden email]> wrote:
> Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.
>
> Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` used to calculate the address of the integer inside the struct. Your observation "The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!" at the level of the MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of the most basic relocation types, so I doubt that there's a bug in the lowering there or else "everything" would be broken.
Just to verify though, checking the assembly of a small example across 32-bit targets of all 3 object file formats looks fine at a glance (MC is getting the +4 addend, though you would need to run `llvm-objdump -d -r` to see the actual relocation in the binary).

Beyond MC, you already have your static object file. If that is fine, then in a JIT context you might be running into issues with RuntimeDyld. The actual GEP's that clang generates are identical to the ones in your code, further suggesting that this is JIT specific and that static links are unaffected (if you could verify that, it would help to narrow down the possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file generated from your IR and see where the relocation is handled in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform; grepping for the name of the relocation shown by llvm-objdump should find the right code to look at).
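To make the suspicion concrete, here is a toy model of what a loader has to do for a `foo+4` reference. It assumes a relocation format where the addend is stored in the relocated field itself, as is common on 32-bit x86. It is a guess at the class of bug, not RuntimeDyld's actual code and not a diagnosis; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy model of a 32-bit absolute relocation with an in-place addend.

// Correct handling: read the addend already stored in the field,
// add the resolved symbol address, and write the sum back.
void applyAbs32(std::uint8_t* loc, std::uint32_t symbolAddr) {
    std::uint32_t addend;
    std::memcpy(&addend, loc, sizeof addend);
    const std::uint32_t value = symbolAddr + addend;
    std::memcpy(loc, &value, sizeof value);
}

// Hypothetical faulty handling: the stored addend is ignored and the
// field is overwritten with the bare symbol address.
void applyAbs32DroppingAddend(std::uint8_t* loc, std::uint32_t symbolAddr) {
    std::memcpy(loc, &symbolAddr, sizeof symbolAddr);
}

// Reads the 32-bit field back after patching.
std::uint32_t read32(const std::uint8_t* loc) {
    std::uint32_t v;
    std::memcpy(&v, loc, sizeof v);
    return v;
}
```

With an addend of 4 and a symbol address of 0x03C10000, the first function yields 0x03C10004 while the second yields 0x03C10000, so a loader that loses the addend would make `@foo+4` resolve to `@foo`, which would look like the output reported earlier in the thread.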

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit target, and I suspect that the 32-bit support in the JIT infrastructure isn't as well tested / commonly used as the 64-bit code, possibly explaining why this sort of bug could sneak through.

-- Sean Silva
 

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, ie address of class slot + 4 --> incorrect
x = 03C10000
# Yeah, both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B
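The clobbering seen in this output can be reproduced with a small model of the two-word slot. This is only a sketch of the symptom, not of anything LLVM does internally; `store_pair` is a hypothetical helper, and the byte offsets passed to it stand in for the addresses the two stores end up using.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Model the global as two zero-initialized 32-bit words (class, datum) and
// perform the two stores in program order: class pointer first, then datum.
// The offsets are byte offsets from the start of the global.
inline std::array<std::uint32_t, 2> store_pair(std::uint32_t class_off,
                                               std::uint32_t datum_off,
                                               std::uint32_t cls,
                                               std::uint32_t datum) {
    assert(class_off % 4 == 0 && class_off < 8);
    assert(datum_off % 4 == 0 && datum_off < 8);
    std::array<std::uint32_t, 2> mem{};  // matches the "Before" dump: all zero
    mem[class_off / 4] = cls;
    mem[datum_off / 4] = datum;
    return mem;
}
```

Calling `store_pair(0, 0, 0x039D2E98, 0x7B)` leaves 0x0000007B in the class word and zero in the datum word, which is the pattern in the failing dumps; `store_pair(0, 4, 0x039D2E98, 0x7B)` gives the contents seen after the setGlobal call.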

Finally, I haven't managed to get a nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to comes from an alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus




_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) --> the relocations are different... different enough that when I loaded the bitcode back in and handed the single module to the JIT it worked fine.

I'll try to dump a case where the definition is in a different module tomorrow. 

Anyhow, below is what clang-cl turned the bitcode from my IR into -- probably not very useful though as this code does what it should...

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
       0:       00 00   addb    %al, (%eax)
                        00000000:  IMAGE_REL_I386_DIR32 _XEP:setfoo
       2:       00 00   addb    %al, (%eax)

_setfoo:
       4:       56      pushl   %esi
       5:       83 ec 40        subl    $64, %esp
       8:       89 e0   movl    %esp, %eax
       a:       c7 00 00 00 00 00       movl    $0, (%eax)
                        0000000c:  IMAGE_REL_I386_DIR32 _foo
      10:       e8 00 00 00 00  calll   0 <_setfoo+0x11>
                        00000011:  IMAGE_REL_I386_REL32 _debugPointer
      15:       89 e1   movl    %esp, %ecx
      17:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000019:  IMAGE_REL_I386_DIR32 _foo
      1d:       89 44 24 3c     movl    %eax, 60(%esp)
      21:       89 54 24 38     movl    %edx, 56(%esp)
      25:       e8 00 00 00 00  calll   0 <_setfoo+0x26>
                        00000026:  IMAGE_REL_I386_REL32 _debugInt
      2a:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        0000002c:  IMAGE_REL_I386_DIR32 _foo
                        00000030:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      34:       b9 00 00 00 00  movl    $0, %ecx
                        00000035:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      39:       89 e6   movl    %esp, %esi
      3b:       c7 06 04 00 00 00       movl    $4, (%esi)
                        0000003d:  IMAGE_REL_I386_DIR32 _foo
      41:       89 44 24 34     movl    %eax, 52(%esp)
      45:       89 54 24 30     movl    %edx, 48(%esp)
      49:       89 4c 24 2c     movl    %ecx, 44(%esp)
      4d:       e8 00 00 00 00  calll   0 <_setfoo+0x4E>
                        0000004e:  IMAGE_REL_I386_REL32 _debugInt
      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo
      5c:       89 e1   movl    %esp, %ecx
      5e:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000060:  IMAGE_REL_I386_DIR32 _foo
      64:       89 44 24 28     movl    %eax, 40(%esp)
      68:       89 54 24 24     movl    %edx, 36(%esp)
      6c:       e8 00 00 00 00  calll   0 <_setfoo+0x6D>
                        0000006d:  IMAGE_REL_I386_REL32 _debugPointer
      71:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        00000073:  IMAGE_REL_I386_DIR32 _foo
                        00000077:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      7b:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        0000007d:  IMAGE_REL_I386_DIR32 _foo
      85:       89 e1   movl    %esp, %ecx
      87:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000089:  IMAGE_REL_I386_DIR32 _foo
      8d:       89 44 24 20     movl    %eax, 32(%esp)
      91:       89 54 24 1c     movl    %edx, 28(%esp)
      95:       e8 00 00 00 00  calll   0 <_setfoo+0x96>
                        00000096:  IMAGE_REL_I386_REL32 _debugPointer
      9a:       89 e1   movl    %esp, %ecx
      9c:       c7 41 08 d5 00 00 00    movl    $213, 8(%ecx)
      a3:       c7 41 04 00 00 00 00    movl    $0, 4(%ecx)
                        000000a6:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      aa:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        000000ac:  IMAGE_REL_I386_DIR32 _foo
      b0:       89 44 24 18     movl    %eax, 24(%esp)
      b4:       89 54 24 14     movl    %edx, 20(%esp)
      b8:       e8 00 00 00 00  calll   0 <_setfoo+0xB9>
                        000000b9:  IMAGE_REL_I386_REL32 _setGlobal
      bd:       89 e0   movl    %esp, %eax
      bf:       c7 00 00 00 00 00       movl    $0, (%eax)
                        000000c1:  IMAGE_REL_I386_DIR32 _foo
      c5:       e8 00 00 00 00  calll   0 <_setfoo+0xC6>
                        000000c6:  IMAGE_REL_I386_REL32 _debugPointer
      ca:       b9 d5 00 00 00  movl    $213, %ecx
      cf:       8b 74 24 2c     movl    44(%esp), %esi
      d3:       89 44 24 10     movl    %eax, 16(%esp)
      d7:       89 f0   movl    %esi, %eax
      d9:       89 54 24 0c     movl    %edx, 12(%esp)
      dd:       89 ca   movl    %ecx, %edx
      df:       83 c4 40        addl    $64, %esp
      e2:       5e      popl    %esi
      e3:       c3      retl
      e4:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:setfoo:
      f0:       8b 44 24 04     movl    4(%esp), %eax
      f4:       83 f8 00        cmpl    $0, %eax
      f7:       0f 84 05 00 00 00       je      5 <_XEP:setfoo+0x12>
      fd:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x12>
                        000000fe:  IMAGE_REL_I386_REL32 _typeError
     102:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x17>
                        00000103:  IMAGE_REL_I386_REL32 _setfoo
     107:       c3      retl
     108:       0f 1f 84 00 00 00 00 00         nopl    (%eax,%eax)
     110:       00 00   addb    %al, (%eax)
                        00000110:  IMAGE_REL_I386_DIR32 _XEP:getfoo
     112:       00 00   addb    %al, (%eax)

_getfoo:
     114:       50      pushl   %eax
     115:       89 e0   movl    %esp, %eax
     117:       c7 00 00 00 00 00       movl    $0, (%eax)
                        00000119:  IMAGE_REL_I386_DIR32 _foo
     11d:       e8 00 00 00 00  calll   0 <_getfoo+0xE>
                        0000011e:  IMAGE_REL_I386_REL32 _getGlobal
     122:       59      popl    %ecx
     123:       c3      retl
     124:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:getfoo:
     130:       8b 44 24 04     movl    4(%esp), %eax
     134:       83 f8 00        cmpl    $0, %eax
     137:       0f 84 05 00 00 00       je      5 <_XEP:getfoo+0x12>
     13d:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x12>
                        0000013e:  IMAGE_REL_I386_REL32 _typeError
     142:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x17>
                        00000143:  IMAGE_REL_I386_REL32 _getfoo
     147:       c3      retl


On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <[hidden email]> wrote:


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <[hidden email]> wrote:
Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` used to calculate the address of the integer inside the struct. Your observation "The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!" at the level of the MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of the most basic relocation types, so I doubt that there's a bug in the lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit targets of all 3 object file formats looks fine at a glance (MC is getting the +4 addend, though you would need to run `llvm-objdump -d -r` to see the actual relocation in the binary).

Beyond MC, you already have your static object file. If that is fine, then in a JIT context you might be running into issues with RuntimeDyld. The actual GEP's that clang generates are identical to the ones in your code, further suggesting that this is JIT specific and that static links are unaffected (if you could verify that, it would help to narrow down the possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file generated from your IR and see where the relocation is handled in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform; grepping for the name of the relocation shown by llvm-objdump should find the right code to look at).

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit target, and I suspect that the 32-bit support in the JIT infrastructure isn't as well tested / commonly used as the 64-bit code, possibly explaining why this sort of bug could sneak through.

-- Sean Silva
 

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.







Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
It's useful to know that the static compilation code path works. Furthermore, as expected from that:

      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?
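To make concrete what "applied correctly" would mean here: COFF relocation records carry no separate addend field, so for `IMAGE_REL_I386_DIR32` the addend (the `4` above) lives in the four bytes at the relocation site, and the loader has to add the symbol address to it. The sketch below models that on the instruction bytes from the dump. It is a hypothetical illustration, not the code in RuntimeDyldCOFFI386.h, and the "dropping the addend" variant is just one assumed failure mode that would match the symptoms; the thread has not confirmed it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value from a code buffer.
inline std::uint32_t read32le(const std::vector<std::uint8_t>& code, std::size_t off) {
    assert(off + 4 <= code.size());
    return std::uint32_t(code[off]) | (std::uint32_t(code[off + 1]) << 8) |
           (std::uint32_t(code[off + 2]) << 16) | (std::uint32_t(code[off + 3]) << 24);
}

// Write a little-endian 32-bit value into a code buffer.
inline void write32le(std::vector<std::uint8_t>& code, std::size_t off, std::uint32_t v) {
    assert(off + 4 <= code.size());
    for (int i = 0; i < 4; ++i)
        code[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Expected handling: the bytes already at the site are the addend, so the
// final value is symbol address + addend.
inline void apply_dir32(std::vector<std::uint8_t>& code, std::size_t off,
                        std::uint32_t symbol_addr) {
    write32le(code, off, symbol_addr + read32le(code, off));
}

// Hypothetical faulty handling: overwrite the site with the symbol address,
// silently discarding the in-place addend.
inline void apply_dir32_dropping_addend(std::vector<std::uint8_t>& code, std::size_t off,
                                        std::uint32_t symbol_addr) {
    write32le(code, off, symbol_addr);
}
```

Applied to the bytes `c7 05 04 00 00 00 d5 00 00 00` at site offset 2 (the relocation at 0x54 inside the instruction at 0x52) with `_foo` at 0x03C10000, the first function yields 0x03C10004 and the second yields 0x03C10000, the same address for both fields as in the failing run.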

You may be able to repro this more easily using `lli`. It has a `-jit-kind` argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

-- Sean Silva


On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <[hidden email]> wrote:


On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <[hidden email]> wrote:


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <[hidden email]> wrote:
Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` used to calculate the address of the integer inside the struct. Your observation "The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!" at the level of the MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of the most basic relocation types, so I doubt that there's a bug in the lowering there or else "everything" would be broken.
Just to verify though, checking the assembly of a small example across 32-bit targets of all 3 object file formats looks fine at a glance (MC is getting the +4 addend, though you would need to run `llvm-objdump -d -r` to see the actual relocation in the binary).
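
To make the `foo+4` addend concrete, here is a minimal sketch in plain C++ (not LLVM API) of how a struct GEP with indices `0, N` reduces to a byte offset. It assumes a 32-bit target where both `i8*` and `i32` are 4 bytes with 4-byte alignment; the `Field` type and the `fieldOffset` and `gepAddress` helpers are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one struct member: size and alignment in bytes.
struct Field {
    std::uint32_t size;
    std::uint32_t align;
};

// Byte offset of member `index` in a non-packed struct: round the running
// offset up to each member's alignment, then advance past earlier members.
std::uint32_t fieldOffset(const std::vector<Field>& fields, std::size_t index) {
    assert(index < fields.size());
    std::uint32_t offset = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        const Field& f = fields[i];
        offset = (offset + f.align - 1) / f.align * f.align;
        if (i < index) offset += f.size;
    }
    return offset;
}

// Address produced by `getelementptr T, T* base, i32 0, i32 index`.
// The leading index 0 selects the struct itself, so only the member
// offset is added to the base address.
std::uint32_t gepAddress(std::uint32_t base, const std::vector<Field>& fields,
                         std::size_t index) {
    return base + fieldOffset(fields, index);
}
```

For `{ i8*, i32 }` described as `{{4, 4}, {4, 4}}`, member 1 sits at offset 4, so a base of `0x03C10000` should yield `0x03C10004` for the datum slot.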

Beyond MC, you already have your static object file. If that is fine, then in a JIT context you might be running into issues with RuntimeDyld. The actual GEP's that clang generates are identical to the ones in your code, further suggesting that this is JIT specific and that static links are unaffected (if you could verify that, it would help to narrow down the possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file generated from your IR and see where the relocation is handled in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform; grepping for the name of the relocation shown by llvm-objdump should find the right code to look at).

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit target, and I suspect that the 32-bit support in the JIT infrastructure isn't as well tested / commonly used as the 64-bit code, possibly explaining why this sort of bug could sneak through.

-- Sean Silva
 

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, i.e. address of class slot + 4 --> incorrect
x = 03C10000
# Yeah, both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B
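
As a toy model of the output above (plain C++, not the JITed code), the sketch below performs the same two stores against a two-slot array and takes the byte offset that the datum store actually resolves to as a parameter. The `Any` alias and `storeBoth` helper are hypothetical names; the only assumption is that each slot is 4 bytes wide.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of @foo: slot 0 is the class pointer, slot 1 is the datum.
using Any = std::array<std::uint32_t, 2>;

// Perform the two stores from the IR. `datumOffset` is the byte offset the
// datum store resolves to: 4 when correct, 0 in the failing run.
Any storeBoth(std::uint32_t classPtr, std::uint32_t datum,
              std::uint32_t datumOffset) {
    assert(datumOffset == 0 || datumOffset == 4);
    Any foo = {0, 0};
    foo[0] = classPtr;             // store i8* @FixnumClass, ... i32 0, i32 0
    foo[datumOffset / 4] = datum;  // store i32 123, ... i32 0, i32 1
    return foo;
}
```

With `datumOffset` equal to 0 the datum overwrites the class slot, giving class `0000007B` and datum `00000000` as in the failing dumps; with 4 both slots hold the expected values.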

Finally, I didn't manage nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to comes from an alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus






_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
I just managed a quick experiment today to dump and load the definition of the variable and the function that sets it into separate modules.

...loading those bitcode files into separate modules (and handing those modules to JIT) works as expected. What *should* be the same code going directly into JIT does not work.

Which smells like the problem may be in my JIT hookup and not in RuntimeDyld.

I'll try to sort out my codepaths before digging into RuntimeDyld, so I can be sure I'm doing the same things in "live" JIT and when dumping/loading bitcode.

I'll let you know what turns up.

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:16 AM, Sean Silva <[hidden email]> wrote:
That's useful to know that the static compilation code path works. Furthermore, as expected from that:

      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.
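
The following is a sketch, in plain C++ rather than actual RuntimeDyld code, of what applying an `IMAGE_REL_I386_DIR32` relocation to the instruction above involves: the 4 bytes at the relocation offset already hold the addend (`4`), and the loader is expected to add the symbol's address to them. The `applyDir32` helper and its `keepAddend` switch are hypothetical. The `keepAddend = false` path only illustrates one failure mode that would match the observed symptom; nothing in this thread confirms it as the actual cause.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value from `code` at byte `offset`.
std::uint32_t read32le(const std::vector<std::uint8_t>& code, std::size_t offset) {
    assert(offset + 4 <= code.size());
    return static_cast<std::uint32_t>(code[offset]) |
           static_cast<std::uint32_t>(code[offset + 1]) << 8 |
           static_cast<std::uint32_t>(code[offset + 2]) << 16 |
           static_cast<std::uint32_t>(code[offset + 3]) << 24;
}

// Write a little-endian 32-bit value into `code` at byte `offset`.
void write32le(std::vector<std::uint8_t>& code, std::size_t offset, std::uint32_t value) {
    assert(offset + 4 <= code.size());
    for (std::size_t i = 0; i < 4; ++i)
        code[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Patch an absolute 32-bit address. The addend lives in the bytes being
// patched, so a correct loader adds the symbol address to what is there.
// keepAddend = false models a loader that overwrites the field instead.
void applyDir32(std::vector<std::uint8_t>& code, std::size_t offset,
                std::uint32_t symbolAddress, bool keepAddend) {
    std::uint32_t addend = keepAddend ? read32le(code, offset) : 0;
    write32le(code, offset, symbolAddress + addend);
}

// Bytes of the instruction at 0x52 in the dump: `movl $213, 4`.
// The DIR32 relocation at 0x54 covers bytes 2..5 of this instruction.
std::vector<std::uint8_t> storeDatumInstruction() {
    return {0xc7, 0x05, 0x04, 0x00, 0x00, 0x00, 0xd5, 0x00, 0x00, 0x00};
}
```

If `_foo` were loaded at `0x03C10000`, keeping the addend patches the address field to `0x03C10004`, while dropping it yields `0x03C10000`, which is the address the failing run reported for the datum slot.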

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using `lli`. It has a `-jit-kind` argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

-- Sean Silva


On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <[hidden email]> wrote:
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) --> the relocations are different... different enough that when I loaded the bitcode back in and handed the single module to JIT it worked fine.

I'll try to dump a case where the definition is in a different module tomorrow. 

Anyhow, below is what clang-cl turned the bitcode from my IR into -- probably not very useful though as this code does what it should...

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
       0:       00 00   addb    %al, (%eax)
                        00000000:  IMAGE_REL_I386_DIR32 _XEP:setfoo
       2:       00 00   addb    %al, (%eax)

_setfoo:
       4:       56      pushl   %esi
       5:       83 ec 40        subl    $64, %esp
       8:       89 e0   movl    %esp, %eax
       a:       c7 00 00 00 00 00       movl    $0, (%eax)
                        0000000c:  IMAGE_REL_I386_DIR32 _foo
      10:       e8 00 00 00 00  calll   0 <_setfoo+0x11>
                        00000011:  IMAGE_REL_I386_REL32 _debugPointer
      15:       89 e1   movl    %esp, %ecx
      17:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000019:  IMAGE_REL_I386_DIR32 _foo
      1d:       89 44 24 3c     movl    %eax, 60(%esp)
      21:       89 54 24 38     movl    %edx, 56(%esp)
      25:       e8 00 00 00 00  calll   0 <_setfoo+0x26>
                        00000026:  IMAGE_REL_I386_REL32 _debugInt
      2a:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        0000002c:  IMAGE_REL_I386_DIR32 _foo
                        00000030:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      34:       b9 00 00 00 00  movl    $0, %ecx
                        00000035:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      39:       89 e6   movl    %esp, %esi
      3b:       c7 06 04 00 00 00       movl    $4, (%esi)
                        0000003d:  IMAGE_REL_I386_DIR32 _foo
      41:       89 44 24 34     movl    %eax, 52(%esp)
      45:       89 54 24 30     movl    %edx, 48(%esp)
      49:       89 4c 24 2c     movl    %ecx, 44(%esp)
      4d:       e8 00 00 00 00  calll   0 <_setfoo+0x4E>
                        0000004e:  IMAGE_REL_I386_REL32 _debugInt
      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo
      5c:       89 e1   movl    %esp, %ecx
      5e:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000060:  IMAGE_REL_I386_DIR32 _foo
      64:       89 44 24 28     movl    %eax, 40(%esp)
      68:       89 54 24 24     movl    %edx, 36(%esp)
      6c:       e8 00 00 00 00  calll   0 <_setfoo+0x6D>
                        0000006d:  IMAGE_REL_I386_REL32 _debugPointer
      71:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        00000073:  IMAGE_REL_I386_DIR32 _foo
                        00000077:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      7b:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        0000007d:  IMAGE_REL_I386_DIR32 _foo
      85:       89 e1   movl    %esp, %ecx
      87:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000089:  IMAGE_REL_I386_DIR32 _foo
      8d:       89 44 24 20     movl    %eax, 32(%esp)
      91:       89 54 24 1c     movl    %edx, 28(%esp)
      95:       e8 00 00 00 00  calll   0 <_setfoo+0x96>
                        00000096:  IMAGE_REL_I386_REL32 _debugPointer
      9a:       89 e1   movl    %esp, %ecx
      9c:       c7 41 08 d5 00 00 00    movl    $213, 8(%ecx)
      a3:       c7 41 04 00 00 00 00    movl    $0, 4(%ecx)
                        000000a6:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      aa:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        000000ac:  IMAGE_REL_I386_DIR32 _foo
      b0:       89 44 24 18     movl    %eax, 24(%esp)
      b4:       89 54 24 14     movl    %edx, 20(%esp)
      b8:       e8 00 00 00 00  calll   0 <_setfoo+0xB9>
                        000000b9:  IMAGE_REL_I386_REL32 _setGlobal
      bd:       89 e0   movl    %esp, %eax
      bf:       c7 00 00 00 00 00       movl    $0, (%eax)
                        000000c1:  IMAGE_REL_I386_DIR32 _foo
      c5:       e8 00 00 00 00  calll   0 <_setfoo+0xC6>
                        000000c6:  IMAGE_REL_I386_REL32 _debugPointer
      ca:       b9 d5 00 00 00  movl    $213, %ecx
      cf:       8b 74 24 2c     movl    44(%esp), %esi
      d3:       89 44 24 10     movl    %eax, 16(%esp)
      d7:       89 f0   movl    %esi, %eax
      d9:       89 54 24 0c     movl    %edx, 12(%esp)
      dd:       89 ca   movl    %ecx, %edx
      df:       83 c4 40        addl    $64, %esp
      e2:       5e      popl    %esi
      e3:       c3      retl
      e4:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:setfoo:
      f0:       8b 44 24 04     movl    4(%esp), %eax
      f4:       83 f8 00        cmpl    $0, %eax
      f7:       0f 84 05 00 00 00       je      5 <_XEP:setfoo+0x12>
      fd:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x12>
                        000000fe:  IMAGE_REL_I386_REL32 _typeError
     102:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x17>
                        00000103:  IMAGE_REL_I386_REL32 _setfoo
     107:       c3      retl
     108:       0f 1f 84 00 00 00 00 00         nopl    (%eax,%eax)
     110:       00 00   addb    %al, (%eax)
                        00000110:  IMAGE_REL_I386_DIR32 _XEP:getfoo
     112:       00 00   addb    %al, (%eax)

_getfoo:
     114:       50      pushl   %eax
     115:       89 e0   movl    %esp, %eax
     117:       c7 00 00 00 00 00       movl    $0, (%eax)
                        00000119:  IMAGE_REL_I386_DIR32 _foo
     11d:       e8 00 00 00 00  calll   0 <_getfoo+0xE>
                        0000011e:  IMAGE_REL_I386_REL32 _getGlobal
     122:       59      popl    %ecx
     123:       c3      retl
     124:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:getfoo:
     130:       8b 44 24 04     movl    4(%esp), %eax
     134:       83 f8 00        cmpl    $0, %eax
     137:       0f 84 05 00 00 00       je      5 <_XEP:getfoo+0x12>
     13d:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x12>
                        0000013e:  IMAGE_REL_I386_REL32 _typeError
     142:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x17>
                        00000143:  IMAGE_REL_I386_REL32 _getfoo
     147:       c3      retl


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
My code was hinky, but only in the sense that I was accidentally duplicating the definition of the variable in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue.

Managed an lli reproduction:

$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = global { i8*, i32 } undef

$ cat jit-1-clobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 13, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  ret void
}

$ cat jit-1-noclobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  ret void
}

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-clobber.ll main.ll; echo $?
13

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-noclobber.ll main.ll; echo $?
42

(Same happens with -jit-kind=mcjit.)
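
As a rough illustration of what the two exit codes imply, the sketch below models @foo as an 8-byte buffer on a 32-bit target. It assumes that main.ll (not shown in this message) calls setfoo and returns the first field of @foo. The names `Foo`, `storeLE32`, `loadLE32` and `runSetfoo` are made up for this sketch, and the idea that the second store lands at offset 0 is only what the observed results suggest, not something confirmed from the JIT itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// { i8*, i32 } with 32-bit pointers occupies 8 bytes.
using Foo = std::array<std::uint8_t, 8>;

// Store a 32-bit little-endian value at a byte offset inside the buffer.
void storeLE32(Foo& foo, std::size_t offset, std::uint32_t value) {
    assert(offset + 4 <= foo.size());
    for (std::size_t i = 0; i < 4; ++i)
        foo[offset + i] = static_cast<std::uint8_t>(value >> (8 * i));
}

// Load a 32-bit little-endian value from a byte offset inside the buffer.
std::uint32_t loadLE32(const Foo& foo, std::size_t offset) {
    assert(offset + 4 <= foo.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(foo[offset + i]) << (8 * i);
    return value;
}

// Model of setfoo from jit-1-clobber.ll. secondFieldOffset is 4 when the
// address of field 1 is resolved correctly, and 0 when the +4 is lost.
std::uint32_t runSetfoo(std::size_t secondFieldOffset) {
    Foo foo{};
    storeLE32(foo, 0, 42);                  // store i8* (inttoptr 42) into field 0
    storeLE32(foo, secondFieldOffset, 13);  // store i32 13 into field 1
    return loadLE32(foo, 0);                // assumed: main returns field 0
}
```

With the second field at offset 4, `runSetfoo` returns 42; with the offset collapsed to 0 it returns 13, matching the clobber run.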

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:41 AM, Nikodemus Siivola <[hidden email]> wrote:
I just managed a quick experiment today to dump and load the definition of the variable and the function that sets it into separate modules.

...loading those bitcode files into separate modules (and handing those modules to JIT) works as expected. What *should* be the same code going directly into JIT does not work.

Which smells like the problem may be in my JIT hookup and not in RuntimeDyld.

I'll try to sort out my codepaths before digging into RuntimeDyld, so I can be sure I'm doing the same things in "live" JIT and when dumping/loading bitcode.

I'll let you know what turns up.

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:16 AM, Sean Silva <[hidden email]> wrote:
That's useful to know that the static compilation code path works. Furthermore, as expected from that:

      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.
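
For readers who want to check that reading of the bytes, here is a small sketch that decodes the instruction quoted above. It assumes the usual x86 encoding of `movl imm32, disp32` (opcode `c7`, ModRM `05`, then a 32-bit displacement and a 32-bit immediate); the helper names `readLE32` and `movlBytes` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 32-bit little-endian value starting at `offset`.
std::uint32_t readLE32(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(bytes[offset + i]) << (8 * i);
    return value;
}

// The ten bytes at 0x52 in the objdump output: movl $213, 4
std::vector<std::uint8_t> movlBytes() {
    return {0xc7, 0x05, 0x04, 0x00, 0x00, 0x00, 0xd5, 0x00, 0x00, 0x00};
}
```

Bytes 2 to 5 hold the displacement, which is where the relocation at 0x54 (0x52 + 2) points, so `readLE32(movlBytes(), 2)` yields the addend 4 and `readLE32(movlBytes(), 6)` yields the immediate 213.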

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using `lli`. It has a `-jit-kind` argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

-- Sean Silva


On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <[hidden email]> wrote:
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) --> the relocations are different... different enough that when I loaded the bitcode back in and handed the single module to JIT it worked fine.

I'll try to dump a case where the definition is in a different module tomorrow. 

Anyhow, below is what clang-cl turned the bitcode from my IR into -- probably not very useful though as this code does what it should...

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
       0:       00 00   addb    %al, (%eax)
                        00000000:  IMAGE_REL_I386_DIR32 _XEP:setfoo
       2:       00 00   addb    %al, (%eax)

_setfoo:
       4:       56      pushl   %esi
       5:       83 ec 40        subl    $64, %esp
       8:       89 e0   movl    %esp, %eax
       a:       c7 00 00 00 00 00       movl    $0, (%eax)
                        0000000c:  IMAGE_REL_I386_DIR32 _foo
      10:       e8 00 00 00 00  calll   0 <_setfoo+0x11>
                        00000011:  IMAGE_REL_I386_REL32 _debugPointer
      15:       89 e1   movl    %esp, %ecx
      17:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000019:  IMAGE_REL_I386_DIR32 _foo
      1d:       89 44 24 3c     movl    %eax, 60(%esp)
      21:       89 54 24 38     movl    %edx, 56(%esp)
      25:       e8 00 00 00 00  calll   0 <_setfoo+0x26>
                        00000026:  IMAGE_REL_I386_REL32 _debugInt
      2a:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        0000002c:  IMAGE_REL_I386_DIR32 _foo
                        00000030:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      34:       b9 00 00 00 00  movl    $0, %ecx
                        00000035:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      39:       89 e6   movl    %esp, %esi
      3b:       c7 06 04 00 00 00       movl    $4, (%esi)
                        0000003d:  IMAGE_REL_I386_DIR32 _foo
      41:       89 44 24 34     movl    %eax, 52(%esp)
      45:       89 54 24 30     movl    %edx, 48(%esp)
      49:       89 4c 24 2c     movl    %ecx, 44(%esp)
      4d:       e8 00 00 00 00  calll   0 <_setfoo+0x4E>
                        0000004e:  IMAGE_REL_I386_REL32 _debugInt
      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo
      5c:       89 e1   movl    %esp, %ecx
      5e:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000060:  IMAGE_REL_I386_DIR32 _foo
      64:       89 44 24 28     movl    %eax, 40(%esp)
      68:       89 54 24 24     movl    %edx, 36(%esp)
      6c:       e8 00 00 00 00  calll   0 <_setfoo+0x6D>
                        0000006d:  IMAGE_REL_I386_REL32 _debugPointer
      71:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        00000073:  IMAGE_REL_I386_DIR32 _foo
                        00000077:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      7b:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        0000007d:  IMAGE_REL_I386_DIR32 _foo
      85:       89 e1   movl    %esp, %ecx
      87:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000089:  IMAGE_REL_I386_DIR32 _foo
      8d:       89 44 24 20     movl    %eax, 32(%esp)
      91:       89 54 24 1c     movl    %edx, 28(%esp)
      95:       e8 00 00 00 00  calll   0 <_setfoo+0x96>
                        00000096:  IMAGE_REL_I386_REL32 _debugPointer
      9a:       89 e1   movl    %esp, %ecx
      9c:       c7 41 08 d5 00 00 00    movl    $213, 8(%ecx)
      a3:       c7 41 04 00 00 00 00    movl    $0, 4(%ecx)
                        000000a6:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      aa:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        000000ac:  IMAGE_REL_I386_DIR32 _foo
      b0:       89 44 24 18     movl    %eax, 24(%esp)
      b4:       89 54 24 14     movl    %edx, 20(%esp)
      b8:       e8 00 00 00 00  calll   0 <_setfoo+0xB9>
                        000000b9:  IMAGE_REL_I386_REL32 _setGlobal
      bd:       89 e0   movl    %esp, %eax
      bf:       c7 00 00 00 00 00       movl    $0, (%eax)
                        000000c1:  IMAGE_REL_I386_DIR32 _foo
      c5:       e8 00 00 00 00  calll   0 <_setfoo+0xC6>
                        000000c6:  IMAGE_REL_I386_REL32 _debugPointer
      ca:       b9 d5 00 00 00  movl    $213, %ecx
      cf:       8b 74 24 2c     movl    44(%esp), %esi
      d3:       89 44 24 10     movl    %eax, 16(%esp)
      d7:       89 f0   movl    %esi, %eax
      d9:       89 54 24 0c     movl    %edx, 12(%esp)
      dd:       89 ca   movl    %ecx, %edx
      df:       83 c4 40        addl    $64, %esp
      e2:       5e      popl    %esi
      e3:       c3      retl
      e4:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:setfoo:
      f0:       8b 44 24 04     movl    4(%esp), %eax
      f4:       83 f8 00        cmpl    $0, %eax
      f7:       0f 84 05 00 00 00       je      5 <_XEP:setfoo+0x12>
      fd:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x12>
                        000000fe:  IMAGE_REL_I386_REL32 _typeError
     102:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x17>
                        00000103:  IMAGE_REL_I386_REL32 _setfoo
     107:       c3      retl
     108:       0f 1f 84 00 00 00 00 00         nopl    (%eax,%eax)
     110:       00 00   addb    %al, (%eax)
                        00000110:  IMAGE_REL_I386_DIR32 _XEP:getfoo
     112:       00 00   addb    %al, (%eax)

_getfoo:
     114:       50      pushl   %eax
     115:       89 e0   movl    %esp, %eax
     117:       c7 00 00 00 00 00       movl    $0, (%eax)
                        00000119:  IMAGE_REL_I386_DIR32 _foo
     11d:       e8 00 00 00 00  calll   0 <_getfoo+0xE>
                        0000011e:  IMAGE_REL_I386_REL32 _getGlobal
     122:       59      popl    %ecx
     123:       c3      retl
     124:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:getfoo:
     130:       8b 44 24 04     movl    4(%esp), %eax
     134:       83 f8 00        cmpl    $0, %eax
     137:       0f 84 05 00 00 00       je      5 <_XEP:getfoo+0x12>
     13d:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x12>
                        0000013e:  IMAGE_REL_I386_REL32 _typeError
     142:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x17>
                        00000143:  IMAGE_REL_I386_REL32 _getfoo
     147:       c3      retl


On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <[hidden email]> wrote:


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <[hidden email]> wrote:
Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` used to calculate the address of the integer inside the struct. Your observation "The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!" at the level of the MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of the most basic relocation types, so I doubt that there's a bug in the lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit targets of all 3 object file formats looks fine at a glance (MC is getting the +4 addend, though you would need to run `llvm-objdump -d -r` to see the actual relocation in the binary).

Beyond MC, you already have your static object file. If that is fine, then in a JIT context you might be running into issues with RuntimeDyld. The actual GEP's that clang generates are identical to the ones in your code, further suggesting that this is JIT specific and that static links are unaffected (if you could verify that, it would help to narrow down the possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file generated from your IR and see where the relocation is handled in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform; grepping for the name of the relocation shown by llvm-objdump should find the right code to look at).
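
To make the suspected failure mode concrete, here is a minimal model of resolving a 32-bit absolute relocation whose addend is stored in the patched slot. This is not RuntimeDyld's code, and whether the loader really drops the addend is only a hypothesis at this point in the thread; `resolveDir32` and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Value written into a 32-bit absolute relocation slot.
//   symbolAddress: where the loader placed the symbol (e.g. @foo).
//   inPlaceAddend: the constant the compiler left in the slot (4 for field 1).
//   honorAddend:   false models a loader that overwrites the slot with the
//                  bare symbol address and forgets the addend.
std::uint32_t resolveDir32(std::uint32_t symbolAddress,
                           std::uint32_t inPlaceAddend,
                           bool honorAddend) {
    assert(symbolAddress <= UINT32_MAX - inPlaceAddend);  // no wrap-around
    return honorAddend ? symbolAddress + inPlaceAddend : symbolAddress;
}
```

Using the address from the runtime output in this thread, `resolveDir32(0x03C10000, 4, true)` gives 0x03C10004, while passing `false` gives 0x03C10000, which is the same address debugInt printed for both fields.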

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit target, and I suspect that the 32-bit support in the JIT infrastructure isn't as well tested / commonly used as the 64-bit code, possibly explaining why this sort of bug could sneak through.

-- Sean Silva
 

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
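
The workaround can be pictured in C++ roughly as follows. This is only a sketch: `Any`, `identity` and `setViaIdentity` are illustrative names, and in the thread the identity function lives outside the module so LLVM cannot see through it, whereas here it is defined inline to keep the example self-contained. The point is that the field addresses are computed at run time from the returned pointer rather than being baked into relocations against @foo.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the { i8*, i32 } value type used in the thread.
struct Any {
    void* cls;
    std::int32_t datum;
};

// Returns its argument unchanged. When this is opaque to the optimizer,
// the pointer is no longer a known constant at the store sites.
Any* identity(Any* p) { return p; }

// Store both fields through the pointer returned by identity().
void setViaIdentity(Any* global, void* cls, std::int32_t datum) {
    assert(global != nullptr);
    Any* p = identity(global);
    p->cls = cls;      // address computed as p + 0 at run time
    p->datum = datum;  // address computed as p + offset of datum at run time
}
```

Calling `setViaIdentity(&a, &tag, 123)` on a zero-initialized `Any a` leaves `a.cls == &tag` and `a.datum == 123`, mirroring the class/datum pair shown in the output above.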

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
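
For reference, the addresses those two GEPs are expected to produce can be worked out with plain struct offsets. The sketch below assumes the 32-bit layout from this thread and models the pointer field as a `uint32_t`; `Any32` and `fieldAddress` are illustrative names, not anything from LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for { i8*, i32 } on the 32-bit target (pointer modelled as uint32_t).
struct Any32 {
    std::uint32_t cls;
    std::int32_t datum;
};
static_assert(sizeof(Any32) == 8, "two 4-byte fields, no padding");

// Address produced by `getelementptr { i8*, i32 }, { i8*, i32 }* base, i32 0, i32 field`.
std::uint32_t fieldAddress(std::uint32_t base, unsigned field) {
    assert(field < 2);
    const std::size_t offset =
        field == 0 ? offsetof(Any32, cls) : offsetof(Any32, datum);
    return base + static_cast<std::uint32_t>(offset);
}
```

So for a base of 0x03C10000 the first GEP should give 0x03C10000 and the second 0x03C10004; two GEPs with different last indices should never yield the same address.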

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, ie address of class slot + 4 --> incorrect
x = 03C10000
# Yeah, both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B

Finally, I didn't manage nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to is from @alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus









Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
Great work!

This is ready to post into a bug on llvm.org/bugs. If you're feeling a bit adventurous, feel free to also try to debug it and post any clues (setting breakpoints in the functions in  lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h is how I would start).

Lang (CC'd) may have some other tips for where to look. (I'm actually not very familiar with the JIT infrastructure myself, so take my advice with a grain of salt)

-- Sean Silva

On Tue, Jun 6, 2017 at 10:30 PM, Nikodemus Siivola <[hidden email]> wrote:
My code was hinky, but only in the sense that I was accidentally duplicating the definition of the variable in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue.

Managed an lli reproduction:

$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = global { i8*, i32 } undef

$ cat jit-1-clobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 13, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  ret void
}

$ cat jit-1-noclobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  ret void
}

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-clobber.ll main.ll; echo $?
13

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-noclobber.ll main.ll; echo $?
42

(Same happens with -jit-kind=mcjit.)

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:41 AM, Nikodemus Siivola <[hidden email]> wrote:
I just managed a quick experiment today to dump and load the definition of the variable and the function that sets it into separate modules.

...loading those bitcode files into separate modules (and handing those modules to JIT) works as expected. What *should* be the same code going directly into JIT does not work.

Which smells like the problem may be in my JIT hookup and not in RuntimeDyld.

I'll try to sort out my codepaths before digging into RuntimeDyld, so I can be sure I'm doing the same things in "live" JIT and when dumping/loading bitcode.

I'll let you know what turns up.

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:16 AM, Sean Silva <[hidden email]> wrote:
That's useful to know that the static compilation code path works. Furthermore, as expected from that:

      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using `lli`. It has a `-jit-kind` argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

-- Sean Silva


On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <[hidden email]> wrote:
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) --> the relocations are different... different enough that when I loaded the bitcode back in and handed the single module to JIT it worked fine.

I'll try to dump a case where the definition is in a different module tomorrow. 

Anyhow, below is what clang-cl turned the bitcode from my IR into -- probably not very useful though as this code does what it should...

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
       0:       00 00   addb    %al, (%eax)
                        00000000:  IMAGE_REL_I386_DIR32 _XEP:setfoo
       2:       00 00   addb    %al, (%eax)

_setfoo:
       4:       56      pushl   %esi
       5:       83 ec 40        subl    $64, %esp
       8:       89 e0   movl    %esp, %eax
       a:       c7 00 00 00 00 00       movl    $0, (%eax)
                        0000000c:  IMAGE_REL_I386_DIR32 _foo
      10:       e8 00 00 00 00  calll   0 <_setfoo+0x11>
                        00000011:  IMAGE_REL_I386_REL32 _debugPointer
      15:       89 e1   movl    %esp, %ecx
      17:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000019:  IMAGE_REL_I386_DIR32 _foo
      1d:       89 44 24 3c     movl    %eax, 60(%esp)
      21:       89 54 24 38     movl    %edx, 56(%esp)
      25:       e8 00 00 00 00  calll   0 <_setfoo+0x26>
                        00000026:  IMAGE_REL_I386_REL32 _debugInt
      2a:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        0000002c:  IMAGE_REL_I386_DIR32 _foo
                        00000030:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      34:       b9 00 00 00 00  movl    $0, %ecx
                        00000035:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      39:       89 e6   movl    %esp, %esi
      3b:       c7 06 04 00 00 00       movl    $4, (%esi)
                        0000003d:  IMAGE_REL_I386_DIR32 _foo
      41:       89 44 24 34     movl    %eax, 52(%esp)
      45:       89 54 24 30     movl    %edx, 48(%esp)
      49:       89 4c 24 2c     movl    %ecx, 44(%esp)
      4d:       e8 00 00 00 00  calll   0 <_setfoo+0x4E>
                        0000004e:  IMAGE_REL_I386_REL32 _debugInt
      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo
      5c:       89 e1   movl    %esp, %ecx
      5e:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000060:  IMAGE_REL_I386_DIR32 _foo
      64:       89 44 24 28     movl    %eax, 40(%esp)
      68:       89 54 24 24     movl    %edx, 36(%esp)
      6c:       e8 00 00 00 00  calll   0 <_setfoo+0x6D>
                        0000006d:  IMAGE_REL_I386_REL32 _debugPointer
      71:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        00000073:  IMAGE_REL_I386_DIR32 _foo
                        00000077:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      7b:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        0000007d:  IMAGE_REL_I386_DIR32 _foo
      85:       89 e1   movl    %esp, %ecx
      87:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000089:  IMAGE_REL_I386_DIR32 _foo
      8d:       89 44 24 20     movl    %eax, 32(%esp)
      91:       89 54 24 1c     movl    %edx, 28(%esp)
      95:       e8 00 00 00 00  calll   0 <_setfoo+0x96>
                        00000096:  IMAGE_REL_I386_REL32 _debugPointer
      9a:       89 e1   movl    %esp, %ecx
      9c:       c7 41 08 d5 00 00 00    movl    $213, 8(%ecx)
      a3:       c7 41 04 00 00 00 00    movl    $0, 4(%ecx)
                        000000a6:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      aa:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        000000ac:  IMAGE_REL_I386_DIR32 _foo
      b0:       89 44 24 18     movl    %eax, 24(%esp)
      b4:       89 54 24 14     movl    %edx, 20(%esp)
      b8:       e8 00 00 00 00  calll   0 <_setfoo+0xB9>
                        000000b9:  IMAGE_REL_I386_REL32 _setGlobal
      bd:       89 e0   movl    %esp, %eax
      bf:       c7 00 00 00 00 00       movl    $0, (%eax)
                        000000c1:  IMAGE_REL_I386_DIR32 _foo
      c5:       e8 00 00 00 00  calll   0 <_setfoo+0xC6>
                        000000c6:  IMAGE_REL_I386_REL32 _debugPointer
      ca:       b9 d5 00 00 00  movl    $213, %ecx
      cf:       8b 74 24 2c     movl    44(%esp), %esi
      d3:       89 44 24 10     movl    %eax, 16(%esp)
      d7:       89 f0   movl    %esi, %eax
      d9:       89 54 24 0c     movl    %edx, 12(%esp)
      dd:       89 ca   movl    %ecx, %edx
      df:       83 c4 40        addl    $64, %esp
      e2:       5e      popl    %esi
      e3:       c3      retl
      e4:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:setfoo:
      f0:       8b 44 24 04     movl    4(%esp), %eax
      f4:       83 f8 00        cmpl    $0, %eax
      f7:       0f 84 05 00 00 00       je      5 <_XEP:setfoo+0x12>
      fd:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x12>
                        000000fe:  IMAGE_REL_I386_REL32 _typeError
     102:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x17>
                        00000103:  IMAGE_REL_I386_REL32 _setfoo
     107:       c3      retl
     108:       0f 1f 84 00 00 00 00 00         nopl    (%eax,%eax)
     110:       00 00   addb    %al, (%eax)
                        00000110:  IMAGE_REL_I386_DIR32 _XEP:getfoo
     112:       00 00   addb    %al, (%eax)

_getfoo:
     114:       50      pushl   %eax
     115:       89 e0   movl    %esp, %eax
     117:       c7 00 00 00 00 00       movl    $0, (%eax)
                        00000119:  IMAGE_REL_I386_DIR32 _foo
     11d:       e8 00 00 00 00  calll   0 <_getfoo+0xE>
                        0000011e:  IMAGE_REL_I386_REL32 _getGlobal
     122:       59      popl    %ecx
     123:       c3      retl
     124:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:getfoo:
     130:       8b 44 24 04     movl    4(%esp), %eax
     134:       83 f8 00        cmpl    $0, %eax
     137:       0f 84 05 00 00 00       je      5 <_XEP:getfoo+0x12>
     13d:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x12>
                        0000013e:  IMAGE_REL_I386_REL32 _typeError
     142:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x17>
                        00000143:  IMAGE_REL_I386_REL32 _getfoo
     147:       c3      retl


On Tue, Jun 6, 2017 at 3:18 AM, Sean Silva <[hidden email]> wrote:


On Mon, Jun 5, 2017 at 1:34 PM, Nikodemus Siivola <[hidden email]> wrote:
Uh. Turns out that if I hide the pointer to @foo from LLVM by passing it through an opaque identity function ... then everything works fine.

Is this a bug in LLVM or is there some magic involving globals I'm misunderstanding?

This looks like a bug in the handling of constant GEP's. Specifically the `getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)` used to calculate the address of the integer inside the struct. Your observation "The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!" at the level of the MachineInstr dump rules out problems before that.

After MachineInstr comes MC to emit the object file, but `foo+4` is one of the most basic relocation types, so I doubt that there's a bug in the lowering there or else "everything" would be broken.
Just to verify though, checking assembly of a small example across 32-bit targets of all 3 object file formats looks fine at a glance (MC is getting the +4 addend, though you would need to run `llvm-objdump -d -r` to see the actual relocation in the binary).

Beyond MC, you already have your static object file. If that is fine, then in a JIT context you might be running into issues with RuntimeDyld. The actual GEP's that clang generates are identical to the ones in your code, further suggesting that this is JIT specific and that static links are unaffected (if you could verify that, it would help to narrow down the possibilities).
Maybe look at the output of `llvm-objdump -d -r` on a static .o file generated from your IR and see where the relocation is handled in lib/ExecutionEngine/RuntimeDyld (this will depend on your platform; grepping for the name of the relocation shown by llvm-objdump should find the right code to look at).

By the way, what platform are you JIT'ing on? I noticed that it is a 32-bit target, and I suspect that the 32-bit support in the JIT infrastructure isn't as well tested / commonly used as the 64-bit code, possibly explaining why this sort of bug could sneak through.

-- Sean Silva
 

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 }* @identity({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  %2 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 0
  %3 = ptrtoint { i8*, i32 }* %0 to i32
  %4 = call { i8*, i32 } @debugInt(i32 %3)
  store i8* @FixnumClass, i8** %2, align 4
  %5 = getelementptr { i8*, i32 }, { i8*, i32 }* %0, i32 0, i32 1
  %6 = ptrtoint i32* %5 to i32
  %7 = call { i8*, i32 } @debugInt(i32 %6)
  store i32 123, i32* %5, align 4
  %8 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  store i8* @FixnumClass, i8** %2, align 4
  store i32 123, i32* %5, align 4
  %9 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  call void @setGlobal({ i8*, i32 }* %0, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %10 = call { i8*, i32 } @debugPointer({ i8*, i32 }* %0)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Output, now with correct addresses out of the GEPs, and memory being modified as expected:

p = 02F80000
  class: 00000000
  datum: 00000000
x = 02F80000
x = 02F80004
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
p = 02F80000
  class: 028D3E98
  datum: 0000007B
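
For comparison, here is a rough C++ analogue of the two access patterns (a sketch, not the code from the thread). The names `Any`, `setDirect` and `setViaIdentity` are made up for illustration, and `identity` only mirrors the opaque helper mentioned above. One plausible reading of why the workaround helps: with a direct access, the field offset is folded into a constant address that the linker or loader must resolve, whereas after the opaque call the offset is added by an ordinary instruction at run time.

```cpp
#include <cstdint>

// Hypothetical C++ counterpart of the { i8*, i32 } struct in the thread.
struct Any {
    void* klass;
    std::int32_t datum;
};

Any foo = {nullptr, 0};

#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Asks the compiler not to inline the call, so the returned pointer is
// treated as a run-time value instead of the constant address of `foo`.
NOINLINE Any* identity(Any* p) { return p; }

// Direct access: the compiler may encode &foo.datum as "foo + offset",
// leaving the offset to be applied when the address of foo is resolved.
void setDirect(void* klass, std::int32_t datum) {
    foo.klass = klass;
    foo.datum = datum;
}

// Access through the opaque pointer: the field offset is added to the
// returned pointer by normal address arithmetic at run time.
void setViaIdentity(void* klass, std::int32_t datum) {
    Any* p = identity(&foo);
    p->klass = klass;
    p->datum = datum;
}
```

With a correct linker or loader both functions leave `foo` in the same state; they differ only in who applies the field offset.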

Cheers,

 -- nikodemus


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
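
As a sanity check on what these two GEPs should denote, here is a small C++ model of the struct layout. It is only a sketch: `Any32` and `fieldAddress` are illustrative names, and the pointer slot is modelled as a 32-bit integer to match the 32-bit target discussed in the thread, independent of the host.

```cpp
#include <cstddef>
#include <cstdint>

// Model of { i8*, i32 } on a target with 32-bit pointers. The pointer slot
// is a 32-bit integer here so the layout does not depend on the host.
struct Any32 {
    std::uint32_t klass;  // stands in for the i8* slot
    std::int32_t datum;   // the i32 slot
};

static_assert(sizeof(Any32) == 8, "two 4-byte slots, no padding expected");

// Address that a GEP with indices (0, field) on this struct should denote:
// field 0 is the class slot, field 1 is the datum slot.
constexpr std::uint32_t fieldAddress(std::uint32_t base, unsigned field) {
    return base + static_cast<std::uint32_t>(
                      field == 0 ? offsetof(Any32, klass)
                                 : offsetof(Any32, datum));
}
```

Under this model the two constant expressions above should differ by exactly 4 bytes, so seeing the same address for both points at something after IR generation.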

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, ie address of class slot + 4 --> incorrect
x = 03C10000
# Both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B
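
The memory contents above are what you would expect if the second store used offset 0 instead of offset 4. A tiny C++ model makes the arithmetic explicit. It is illustrative only: `Slots` and `storeBoth` are invented names, and the class value is the one printed in the last dump.

```cpp
#include <array>
#include <cstdint>

// Memory of @foo modelled as two 32-bit slots: [0] = class, [1] = datum.
using Slots = std::array<std::uint32_t, 2>;

// Performs the two field stores in order. `datumOffset` is the byte offset
// that the second store actually ends up using (4 if all is well).
inline Slots storeBoth(std::uint32_t classValue, std::uint32_t datum,
                       std::uint32_t datumOffset) {
    Slots mem{{0, 0}};
    mem[0] = classValue;           // store to the class slot
    mem[datumOffset / 4] = datum;  // store to wherever the offset points
    return mem;
}
```

With offset 4 the model gives class 039D2E98 and datum 0000007B. With offset 0 it gives class 0000007B and datum 00000000, which matches the clobbered dumps.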

Finally, I didn't manage nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: the debugInt is called first with @foo, then @foo+4, and the stores seem to be going to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to is from @alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [newbie] trouble with global variables and CreateLoad/Store in JIT

Hal Finkel via llvm-dev
Done: https://bugs.llvm.org//show_bug.cgi?id=33344

Thanks for assistance!

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 8:40 AM, Sean Silva <[hidden email]> wrote:
Great work!

This is ready to post as a bug on llvm.org/bugs. If you're feeling a bit adventurous, feel free to also try to debug it and post any clues (setting breakpoints in the functions in lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h is how I would start).

Lang (CC'd) may have some other tips for where to look. (I'm actually not very familiar with the JIT infrastructure myself, so take my advice with a grain of salt)

-- Sean Silva

On Tue, Jun 6, 2017 at 10:30 PM, Nikodemus Siivola <[hidden email]> wrote:
My code was hinky, but only in the sense that I was accidentally duplicating the definition of the variable in the module where the function was. With only the declaration in the second module, loading the bitcode reproduces the issue.

Managed an lli reproduction:

$ cat jit-0.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = global { i8*, i32 } undef

$ cat jit-1-clobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 13, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  ret void
}

$ cat jit-1-noclobber.ll
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

@foo = external global { i8*, i32 }

define void @setfoo() {
entry:
  %p = inttoptr i32 42 to i8*
  store i8* %p, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  ret void
}

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-clobber.ll main.ll; echo $?
13

$ lli -jit-kind=orc-mcjit -extra-module=jit-0.ll -extra-module=jit-1-noclobber.ll main.ll; echo $?
42

(Same happens with -jit-kind=mcjit.)

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:41 AM, Nikodemus Siivola <[hidden email]> wrote:
I just managed a quick experiment today to dump and load the definition of the variable and the function that sets it into separate modules.

...loading those bitcode files into separate modules (and handing those modules to JIT) works as expected. What *should* be the same code going directly into JIT does not work.

Which smells like the problem may be in my JIT hookup and not in RuntimeDyld.

I'll try to sort out my codepaths before digging into RuntimeDyld, so I can be sure I'm doing the same things in "live" JIT and when dumping/loading bitcode.

I'll let you know what turns up.

Cheers,

 -- nikodemus


On Wed, Jun 7, 2017 at 12:16 AM, Sean Silva <[hidden email]> wrote:
It's useful to know that the static compilation code path works. Furthermore, as expected from that:

      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo

It looks like the offset `4` of the second field of your struct is correct in the object file, so this does seem to be a problem in the JIT-specific linking/loading.
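
To make the role of that inline `4` concrete, here is a minimal C++ sketch of how a COFF `IMAGE_REL_I386_DIR32` fixup is expected to be resolved: the 32-bit value already stored at the fixup location acts as the addend and is added to the symbol's address. This is a simplified model, not the RuntimeDyld code. All function names are hypothetical, and the "dropping addend" variant is only one guess at a failure mode that would match the symptoms reported in this thread.

```cpp
#include <cstdint>

// Read a 32-bit little-endian value (i386 is little-endian).
inline std::uint32_t readLE32(const std::uint8_t* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Write a 32-bit little-endian value.
inline void writeLE32(std::uint8_t* p, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) p[i] = std::uint8_t(v >> (8 * i));
}

// Expected behaviour: final value = symbol address + in-place addend.
inline void applyDir32(std::uint8_t* fixup, std::uint32_t symbolAddr) {
    writeLE32(fixup, symbolAddr + readLE32(fixup));
}

// Hypothetical faulty behaviour: the in-place addend is overwritten, so
// every reference to foo+N collapses to the address of foo itself.
inline void applyDir32DroppingAddend(std::uint8_t* fixup,
                                     std::uint32_t symbolAddr) {
    writeLE32(fixup, symbolAddr);
}
```

Applied to the bytes `c7 05 04 00 00 00 d5 00 00 00` with the fixup at byte 2 and `_foo` at 0x03C10000, the first function yields 0x03C10004, while the second yields 0x03C10000, the same address as the first field.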

Can you try debugging into lib/ExecutionEngine/RuntimeDyld/Targets/RuntimeDyldCOFFI386.h to see if the relocation is getting applied correctly in the context of your JIT?

You may be able to repro this more easily using `lli`. It has a `-jit-kind` argument that should get you into the JIT codepath. (see test/ExecutionEngine/{MCJIT,ORCMCJIT}/)

-- Sean Silva


On Tue, Jun 6, 2017 at 1:09 AM, Nikodemus Siivola <[hidden email]> wrote:
This is on Windows 10: didn't yet manage to get a 64-bit toolchain set up that agreed on everything necessary.

Dumped bitcode, but when I did that everything landed in the same module (normally the global is defined in a different module than its uses) --> the relocations are different... different enough that when I loaded the bitcode back in and handed the single module to JIT it worked fine.

I'll try to dump a case where the definition is in a different module tomorrow. 

Anyhow, below is what clang-cl turned the bitcode from my IR into -- probably not very useful though as this code does what it should...

$ llvm-objdump.exe -r -d test.o

test.o: file format COFF-i386

Disassembly of section .text:
.text:
       0:       00 00   addb    %al, (%eax)
                        00000000:  IMAGE_REL_I386_DIR32 _XEP:setfoo
       2:       00 00   addb    %al, (%eax)

_setfoo:
       4:       56      pushl   %esi
       5:       83 ec 40        subl    $64, %esp
       8:       89 e0   movl    %esp, %eax
       a:       c7 00 00 00 00 00       movl    $0, (%eax)
                        0000000c:  IMAGE_REL_I386_DIR32 _foo
      10:       e8 00 00 00 00  calll   0 <_setfoo+0x11>
                        00000011:  IMAGE_REL_I386_REL32 _debugPointer
      15:       89 e1   movl    %esp, %ecx
      17:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000019:  IMAGE_REL_I386_DIR32 _foo
      1d:       89 44 24 3c     movl    %eax, 60(%esp)
      21:       89 54 24 38     movl    %edx, 56(%esp)
      25:       e8 00 00 00 00  calll   0 <_setfoo+0x26>
                        00000026:  IMAGE_REL_I386_REL32 _debugInt
      2a:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        0000002c:  IMAGE_REL_I386_DIR32 _foo
                        00000030:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      34:       b9 00 00 00 00  movl    $0, %ecx
                        00000035:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      39:       89 e6   movl    %esp, %esi
      3b:       c7 06 04 00 00 00       movl    $4, (%esi)
                        0000003d:  IMAGE_REL_I386_DIR32 _foo
      41:       89 44 24 34     movl    %eax, 52(%esp)
      45:       89 54 24 30     movl    %edx, 48(%esp)
      49:       89 4c 24 2c     movl    %ecx, 44(%esp)
      4d:       e8 00 00 00 00  calll   0 <_setfoo+0x4E>
                        0000004e:  IMAGE_REL_I386_REL32 _debugInt
      52:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        00000054:  IMAGE_REL_I386_DIR32 _foo
      5c:       89 e1   movl    %esp, %ecx
      5e:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000060:  IMAGE_REL_I386_DIR32 _foo
      64:       89 44 24 28     movl    %eax, 40(%esp)
      68:       89 54 24 24     movl    %edx, 36(%esp)
      6c:       e8 00 00 00 00  calll   0 <_setfoo+0x6D>
                        0000006d:  IMAGE_REL_I386_REL32 _debugPointer
      71:       c7 05 00 00 00 00 00 00 00 00   movl    $0, 0
                        00000073:  IMAGE_REL_I386_DIR32 _foo
                        00000077:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      7b:       c7 05 04 00 00 00 d5 00 00 00   movl    $213, 4
                        0000007d:  IMAGE_REL_I386_DIR32 _foo
      85:       89 e1   movl    %esp, %ecx
      87:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        00000089:  IMAGE_REL_I386_DIR32 _foo
      8d:       89 44 24 20     movl    %eax, 32(%esp)
      91:       89 54 24 1c     movl    %edx, 28(%esp)
      95:       e8 00 00 00 00  calll   0 <_setfoo+0x96>
                        00000096:  IMAGE_REL_I386_REL32 _debugPointer
      9a:       89 e1   movl    %esp, %ecx
      9c:       c7 41 08 d5 00 00 00    movl    $213, 8(%ecx)
      a3:       c7 41 04 00 00 00 00    movl    $0, 4(%ecx)
                        000000a6:  IMAGE_REL_I386_DIR32 _JazzFixnumClass
      aa:       c7 01 00 00 00 00       movl    $0, (%ecx)
                        000000ac:  IMAGE_REL_I386_DIR32 _foo
      b0:       89 44 24 18     movl    %eax, 24(%esp)
      b4:       89 54 24 14     movl    %edx, 20(%esp)
      b8:       e8 00 00 00 00  calll   0 <_setfoo+0xB9>
                        000000b9:  IMAGE_REL_I386_REL32 _setGlobal
      bd:       89 e0   movl    %esp, %eax
      bf:       c7 00 00 00 00 00       movl    $0, (%eax)
                        000000c1:  IMAGE_REL_I386_DIR32 _foo
      c5:       e8 00 00 00 00  calll   0 <_setfoo+0xC6>
                        000000c6:  IMAGE_REL_I386_REL32 _debugPointer
      ca:       b9 d5 00 00 00  movl    $213, %ecx
      cf:       8b 74 24 2c     movl    44(%esp), %esi
      d3:       89 44 24 10     movl    %eax, 16(%esp)
      d7:       89 f0   movl    %esi, %eax
      d9:       89 54 24 0c     movl    %edx, 12(%esp)
      dd:       89 ca   movl    %ecx, %edx
      df:       83 c4 40        addl    $64, %esp
      e2:       5e      popl    %esi
      e3:       c3      retl
      e4:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:setfoo:
      f0:       8b 44 24 04     movl    4(%esp), %eax
      f4:       83 f8 00        cmpl    $0, %eax
      f7:       0f 84 05 00 00 00       je      5 <_XEP:setfoo+0x12>
      fd:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x12>
                        000000fe:  IMAGE_REL_I386_REL32 _typeError
     102:       e8 00 00 00 00  calll   0 <_XEP:setfoo+0x17>
                        00000103:  IMAGE_REL_I386_REL32 _setfoo
     107:       c3      retl
     108:       0f 1f 84 00 00 00 00 00         nopl    (%eax,%eax)
     110:       00 00   addb    %al, (%eax)
                        00000110:  IMAGE_REL_I386_DIR32 _XEP:getfoo
     112:       00 00   addb    %al, (%eax)

_getfoo:
     114:       50      pushl   %eax
     115:       89 e0   movl    %esp, %eax
     117:       c7 00 00 00 00 00       movl    $0, (%eax)
                        00000119:  IMAGE_REL_I386_DIR32 _foo
     11d:       e8 00 00 00 00  calll   0 <_getfoo+0xE>
                        0000011e:  IMAGE_REL_I386_REL32 _getGlobal
     122:       59      popl    %ecx
     123:       c3      retl
     124:       66 66 66 2e 0f 1f 84 00 00 00 00 00     nopw    %cs:(%eax,%eax)

_XEP:getfoo:
     130:       8b 44 24 04     movl    4(%esp), %eax
     134:       83 f8 00        cmpl    $0, %eax
     137:       0f 84 05 00 00 00       je      5 <_XEP:getfoo+0x12>
     13d:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x12>
                        0000013e:  IMAGE_REL_I386_REL32 _typeError
     142:       e8 00 00 00 00  calll   0 <_XEP:getfoo+0x17>
                        00000143:  IMAGE_REL_I386_REL32 _getfoo
     147:       c3      retl


On Mon, Jun 5, 2017 at 10:57 PM, Nikodemus Siivola <[hidden email]> wrote:
Since the getelementptrs were implicitly generated by the CreateStore/Load I'm not sure how to get access to them.

So I hacked the assignment to be done thrice: once using a manual decomposition into two GEPs and stores, once using the "big" CreateStore, once via the setGlobal function, printing addresses and memory contents at each point to the degree that I have access to them.

It seems the following GEPs compute the same address?! I can buy myself not understanding how GEP works and doing it wrong, but builder.CreateStore() creates what look like identical GEPs implicitly...

i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4

The details.

This is the relevant part from my codegen:

            auto ty = val->getType();
            cout << "val type:" << endl;
            ty->dump();
            cout << "ptr type:" << endl;
            ptr->getType()->dump();
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Set class pointer
            auto c = ctx.bld.CreateExtractValue(val, 0, "class");
            auto cp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 0);
            auto cx = ctx.bld.CreatePtrToInt(cp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", cx);
            ctx.bld.CreateStore(c, cp);
            // Set datum
            auto d = ctx.bld.CreateExtractValue(val, 1, "datum");
            auto dp = ctx.bld.CreateConstGEP2_32(ty, ptr, 0, 1);
            auto dx = ctx.bld.CreatePtrToInt(dp, ctx.Int32Type());
            ctx.EmitCall1("debugInt", dx);
            ctx.bld.CreateStore(d, dp);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Do the same with a single store
            ctx.bld.CreateStore(val, ptr);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);
            // Call out
            ctx.EmitCall2("setGlobal", ptr, val);
            // Print memory
            ctx.EmitCall1("debugPointer", ptr);

Here is the compile-time output showing types of the value and the pointer:

val type:
{ i8*, i32 }
ptr type:
{ i8*, i32 }*

Here is the IR dump for the function (after a couple of passes), right before it's fed to the JIT:

define { i8*, i32 } @"__anonToplevel/0"() prefix { i8*, i32 } (i32)* @"XEP:__anonToplevel/0" {
entry:
  %0 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  %1 = call { i8*, i32 } @debugInt(i32 ptrtoint ({ i8*, i32 }* @foo to i32))
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  %2 = call { i8*, i32 } @debugInt(i32 ptrtoint (i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1) to i32))
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %3 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  store i8* @FixnumClass, i8** getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0), align 4
  store i32 123, i32* getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1), align 4
  %4 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  call void @setGlobal({ i8*, i32 }* nonnull @foo, { i8*, i32 } { i8* @FixnumClass, i32 123 })
  %5 = call { i8*, i32 } @debugPointer({ i8*, i32 }* nonnull @foo)
  ret { i8*, i32 } { i8* @FixnumClass, i32 123 }
}

Here is the runtime output from calling the JITed function, including memory addresses and contents, with my annotations:

# Before
p = 03C10000
  class: 00000000
  datum: 00000000
# Should be address of the class slot --> correct
x = 03C10000
# Should be address of the datum slot, i.e. address of class slot + 4 --> incorrect
x = 03C10000
# Yeah, both values went to the class slot, so the actual class pointer got clobbered
p = 03C10000
  class: 0000007B
  datum: 00000000
# Same result from the single CreateStore
p = 03C10000
  class: 0000007B
  datum: 00000000
# Calling out to setGlobal as in my first email works
p = 03C10000
  class: 039D2E98
  datum: 0000007B
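
For reference, the layout I expect can be sketched in plain C++, without LLVM. This is only an illustration under assumptions: `Any`, `kDatumOffset`, `datumAddress` and `storeBoth` are hypothetical names that mirror the `{ i8*, i32 }` type and the two stores above, not code from my compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C++ mirror of the IR type { i8*, i32 }.
struct Any {
    void*        cls;    // slot 0: class pointer (i8*)
    std::int32_t datum;  // slot 1: payload (i32)
};

// Byte distance from the start of the struct to the datum slot.
// This is 4 on a 32-bit target like the one in the dumps; on a typical
// 64-bit host it is 8, because the pointer slot is wider.
constexpr std::size_t kDatumOffset = offsetof(Any, datum);
static_assert(kDatumOffset == sizeof(void*),
              "datum slot should sit directly after the class pointer");

// Address that a GEP to field 1 (indices 0, 1) should produce for base p.
inline std::uintptr_t datumAddress(const Any* p) {
    return reinterpret_cast<std::uintptr_t>(&p->datum);
}

// Field-by-field store, analogous to the two CreateStore calls
// on the GEP'd slot pointers.
inline void storeBoth(Any* p, void* cls, std::int32_t datum) {
    assert(p != nullptr);
    p->cls = cls;      // goes to base + 0
    p->datum = datum;  // goes to base + kDatumOffset
}
```

With this layout, the second `debugInt` call should print the base address plus `kDatumOffset`, and a correct store should leave the class pointer in the first slot and 0000007B in the second, which is what only the `setGlobal` call achieves in the output above.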

Finally, I haven't managed to get a nice disassembly yet, so here is the last output from --print-after-all for the function. The bizarre thing is that even this looks correct: debugInt is called first with @foo, then with @foo+4, and the stores appear to go to the right addresses as well: @foo and @foo+4!

BB#0: derived from LLVM BB %entry
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        PUSHi32 <ga:@foo+4>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugInt>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        MOV32mi %noreg, 1, %noreg, <ga:@foo>, %noreg, <ga:@JazzFixnumClass>; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 0)]
        MOV32mi %noreg, 1, %noreg, <ga:@foo+4>, %noreg, 123; mem:ST4[getelementptr inbounds ({ i8*, i32 }, { i8*, i32 }* @foo, i32 0, i32 1)]
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSH32i8 123, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@JazzFixnumClass>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@setGlobal>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 12, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        PUSHi32 <ga:@foo>, %ESP<imp-def>, %ESP<imp-use>
        CFI_INSTRUCTION <call frame instruction>
        CALLpcrel32 <ga:@debugPointer>, <regmask %BH %BL %BP %BPL %BX %DI %DIL %EBP %EBX %EDI %ESI %SI %SIL>, %ESP<imp-use>, %ESP<imp-def>, %EAX<imp-def,dead>, %EDX<imp-def,dead>
        %ESP<def,tied1> = ADD32ri8 %ESP<tied0>, 4, %EFLAGS<imp-def,dead>
        CFI_INSTRUCTION <call frame instruction>
        %EAX<def> = MOV32ri <ga:@JazzFixnumClass>
        %EDX<def> = MOV32ri 123
        RETL %EAX<kill>, %EDX<kill>

Also, I have essentially identical code working perfectly fine when the memory being written to comes from an alloca.

I am completely clueless. Any suggestions most welcome.

Cheers,

 -- nikodemus









