Canonicalization of ptrtoint/inttoptr and getelementptr

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

Canonicalization of ptrtoint/inttoptr and getelementptr

David Majnemer
Consider the two functions bellow:

define i8* @f(i8* %A) {  %pti = ptrtoint i8* %A to i64  %add = add i64 %pti, 5  %itp = inttoptr i64 %add to i8*  ret i8* %itp}
define i8* @g(i8* %A) {
  %gep = getelementptr i8* %A, i64 5  ret i8* %gep}
What, if anything, prevents us from canonicalizing @f to @g?I've heard that this might be in violation of http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman-3
An object can be allocated at virtual address 5 through extra-VM means (eg. mmap), and then one can (creatively) interpret the return value of @f as being associated with whatever %A was associated with *and* 5. The return value of @g can only be associated with exactly the same set that %A was associated with. Consequently, it's not always safe to replace @f with @g.

It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.


On Sat, Aug 30, 2014 at 6:01 PM, David Majnemer <[hidden email]> wrote:
Consider the two functions bellow:

define i8* @f(i8* %A) {  %pti = ptrtoint i8* %A to i64  %add = add i64 %pti, 5  %itp = inttoptr i64 %add to i8*  ret i8* %itp}
define i8* @g(i8* %A) {
  %gep = getelementptr i8* %A, i64 5  ret i8* %gep}
What, if anything, prevents us from canonicalizing @f to @g?I've heard that this might be in violation of http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Reid Kleckner-2
On Mon, Sep 8, 2014 at 4:22 PM, Dan Gohman <[hidden email]> wrote:
It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

Could we say that constant integers have no objects associated with them? If so, we need a way to bless constant integers that *do* refer to real objects, such as ASan's shadow memory base.

Then you should be able to take something like add a phi of constant ints to an inttoptr and transform that to a GEP, without explicitly calling out constant integers.
 
In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.

I'm guessing the IR comes from C++ code that subtracts pointers, so it'd be good if we could figure this out.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman-3
On Mon, Sep 8, 2014 at 8:36 PM, Reid Kleckner <[hidden email]> wrote:
On Mon, Sep 8, 2014 at 4:22 PM, Dan Gohman <[hidden email]> wrote:
It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

Could we say that constant integers have no objects associated with them? If so, we need a way to bless constant integers that *do* refer to real objects, such as ASan's shadow memory base.

Then you should be able to take something like add a phi of constant ints to an inttoptr and transform that to a GEP, without explicitly calling out constant integers.

It's not pretty to have situations where dynamic values permit more optimization than constants. If there's an expression which can be folded to a constant int, should the constant folder avoid doing so, because it might pessimize subsequent alias analysis?

Also, if you follow it to its semantic conclusion, even this isn't enough, because 5 may be associated with *any* fixed address, in the same way that p - 1000000 (which is valid to compute with a gep as long as you don't use inbounds) is still associated with p's set, even though it's pointing somewhere quite different. Associated-with is necessary but not sufficient for a valid dereference.
 
 
In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.

I'm guessing the IR comes from C++ code that subtracts pointers, so it'd be good if we could figure this out.

A more complete testcase would be helpful here. This situation doesn't arise from simple pointer differences, so we should look at what other things it's interacting with to get here.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman-3


On Tue, Sep 9, 2014 at 7:26 AM, Dan Gohman <[hidden email]> wrote:
On Mon, Sep 8, 2014 at 8:36 PM, Reid Kleckner <[hidden email]> wrote:
On Mon, Sep 8, 2014 at 4:22 PM, Dan Gohman <[hidden email]> wrote:
It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

Could we say that constant integers have no objects associated with them? If so, we need a way to bless constant integers that *do* refer to real objects, such as ASan's shadow memory base.

Then you should be able to take something like add a phi of constant ints to an inttoptr and transform that to a GEP, without explicitly calling out constant integers.

It's not pretty to have situations where dynamic values permit more optimization than constants. If there's an expression which can be folded to a constant int, should the constant folder avoid doing so, because it might pessimize subsequent alias analysis?
 

Actually, I got this backwards; in fact, it has the opposite problem. Having different rules for constant ints than for other expressions means the constant folder can't produce constant ints unless it can prove that they aren't ever used in a way that would violate the new rules.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Philip Reames-4
In reply to this post by Dan Gohman-3
On 09/08/2014 04:22 PM, Dan Gohman wrote:
An object can be allocated at virtual address 5 through extra-VM means (eg. mmap), and then one can (creatively) interpret the return value of @f as being associated with whatever %A was associated with *and* 5. The return value of @g can only be associated with exactly the same set that %A was associated with. Consequently, it's not always safe to replace @f with @g.
Dan, I'm trying to follow your logic here and am not arriving at the same conclusion.  Can you point out the flaw in my reasoning here?

define i8* @f(i8* %A) { 
%pti = ptrtoint i8* %A to i64  <-- %pti is not a pointer and is thus not based on anything
%add = add i64 %pti, 5  <-- %add is not a pointer and is thus not based on anything, it is "associated with" the memory pointed to by %A
--- In particular, "5" is NOT a "an integer constant ... returned from a function not defined within LLVM".  It is not returned by a function.  As a result the pointer value of 5 is not associated with any address range.  
%itp = inttoptr i64 %add to i8*  %itp is based on %pti only
ret i8* %itp}

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?



It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.
It would be nice to be able to canoncalize ptrtoint+add+inttoptr to geps.  Having seemingly reasonable-looking legal IR that simply doesn't optimize is not the best introduction for new frontend authors.  :)


On Sat, Aug 30, 2014 at 6:01 PM, David Majnemer <[hidden email]> wrote:
Consider the two functions bellow:

define i8* @f(i8* %A) {  %pti = ptrtoint i8* %A to i64  %add = add i64 %pti, 5  %itp = inttoptr i64 %add to i8*  ret i8* %itp}
define i8* @g(i8* %A) {
  %gep = getelementptr i8* %A, i64 5  ret i8* %gep}
What, if anything, prevents us from canonicalizing @f to @g?I've heard that this might be in violation of http://llvm.org/docs/LangRef.html#pointeraliasing but I don't see how.



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Kevin Modzelewski

On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <[hidden email]> wrote:

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?


Can't speak for Dan, but in Pyston we certainly make use of these types of constructs to embed JIT-time constants (say, an interned string, or a reference to the current function object) into the function being compiled.  Heuristically, we can all see the different of intent between "ptr + 5" and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come up with reasonable rules that would separate them.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Philip Reames-4

On 09/10/2014 02:55 PM, Kevin Modzelewski wrote:

On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <[hidden email]> wrote:

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?


Can't speak for Dan, but in Pyston we certainly make use of these types of constructs to embed JIT-time constants (say, an interned string, or a reference to the current function object) into the function being compiled.  Heuristically, we can all see the different of intent between "ptr + 5" and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come up with reasonable rules that would separate them.

All of the cases I've seen in JITed code can be dealt with differently.  By emitting a global variable and then using the "link time" address resolution to map it to the right address, you get the same effect while remaining entirely within the well defined part of the IR.  I don't see this case as being worth restricting an otherwise reasonable optimization. 

One problem with Dan's interpretation of the current rules is that this otherwise legal transform becomes problematic:
%addr = inttoptr 0x2aaaaa0005 to %i32*
===>
%tmp = add i32 0x2aaaaa0000, i32 5
%addr = inttoptr %tmp to %i32*

We probably wouldn't do this at the IR level, but we definitely do perform this transform in the backends.  There's no reason it *shouldn't* be valid at the IR level either. 

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

David Majnemer
LLVM doesn't appear to respect this.

consider:
target datalayout = "e-i64:64-f80:128-n8:16:32:64-S128"

define i8* @h(i8* %x, i8* %y) {
  %pti = ptrtoint i8* %y to i64
  %sub = sub i64 0, %pti
  %gep = getelementptr i8* %x, i64 %sub
  ret i8* %gep
}

run it with -instcombine and we get:

define i8* @h(i8* %x, i8* %y) {
  %pti = ptrtoint i8* %y to i64
  %1 = ptrtoint i8* %x to i64
  %2 = sub i64 %1, %pti
  %gep = inttoptr i64 %2 to i8*
  ret i8* %gep
}


On Wed, Sep 10, 2014 at 6:16 PM, Philip Reames <[hidden email]> wrote:

On 09/10/2014 02:55 PM, Kevin Modzelewski wrote:

On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <[hidden email]> wrote:

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?


Can't speak for Dan, but in Pyston we certainly make use of these types of constructs to embed JIT-time constants (say, an interned string, or a reference to the current function object) into the function being compiled.  Heuristically, we can all see the different of intent between "ptr + 5" and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come up with reasonable rules that would separate them.

All of the cases I've seen in JITed code can be dealt with differently.  By emitting a global variable and then using the "link time" address resolution to map it to the right address, you get the same effect while remaining entirely within the well defined part of the IR.  I don't see this case as being worth restricting an otherwise reasonable optimization. 

One problem with Dan's interpretation of the current rules is that this otherwise legal transform becomes problematic:
%addr = inttoptr 0x2aaaaa0005 to %i32*
===>
%tmp = add i32 0x2aaaaa0000, i32 5
%addr = inttoptr %tmp to %i32*

We probably wouldn't do this at the IR level, but we definitely do perform this transform in the backends.  There's no reason it *shouldn't* be valid at the IR level either. 

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman-3
In reply to this post by Philip Reames-4


On Wed, Sep 10, 2014 at 3:16 PM, Philip Reames <[hidden email]> wrote:

On 09/10/2014 02:55 PM, Kevin Modzelewski wrote:

On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <[hidden email]> wrote:

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?


Can't speak for Dan, but in Pyston we certainly make use of these types of constructs to embed JIT-time constants (say, an interned string, or a reference to the current function object) into the function being compiled.  Heuristically, we can all see the different of intent between "ptr + 5" and "load (int*)0x2aaaaa0000", but it seems like it'd be difficult to come up with reasonable rules that would separate them.

All of the cases I've seen in JITed code can be dealt with differently.  By emitting a global variable and then using the "link time" address resolution to map it to the right address, you get the same effect while remaining entirely within the well defined part of the IR.  I don't see this case as being worth restricting an otherwise reasonable optimization. 

One problem with Dan's interpretation of the current rules is that this otherwise legal transform becomes problematic:
%addr = inttoptr 0x2aaaaa0005 to %i32*
===>
%tmp = add i32 0x2aaaaa0000, i32 5
%addr = inttoptr %tmp to %i32*

We probably wouldn't do this at the IR level, but we definitely do perform this transform in the backends.  There's no reason it *shouldn't* be valid at the IR level either. 

I don't quite follow your example here. However, there's a key difference between what happens in the backends and what happens at the mid-level IR: The mid-level IR does serious alias analysis. The backends do a much more limited form of alias analysis, and they supplement it by calling back into the middle-end for the hard stuff. There's no question that a properly formed ptrtoint+add+inttoptr computes the exact same bits as a corresponding getelementptr, on all platforms and in all circumstances. The difference is in the extra aliasing rules that are activated when the getelementptr instruction is used. The backends don't have a reason to preserve those rules, so they don't bother.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Canonicalization of ptrtoint/inttoptr and getelementptr

Dan Gohman-3
In reply to this post by Philip Reames-4


On Tue, Sep 9, 2014 at 9:27 PM, Philip Reames <[hidden email]> wrote:
On 09/08/2014 04:22 PM, Dan Gohman wrote:
An object can be allocated at virtual address 5 through extra-VM means (eg. mmap), and then one can (creatively) interpret the return value of @f as being associated with whatever %A was associated with *and* 5. The return value of @g can only be associated with exactly the same set that %A was associated with. Consequently, it's not always safe to replace @f with @g.
Dan, I'm trying to follow your logic here and am not arriving at the same conclusion.  Can you point out the flaw in my reasoning here?

define i8* @f(i8* %A) { 
%pti = ptrtoint i8* %A to i64  <-- %pti is not a pointer and is thus not based on anything
%add = add i64 %pti, 5  <-- %add is not a pointer and is thus not based on anything, it is "associated with" the memory pointed to by %A
--- In particular, "5" is NOT a "an integer constant ... returned from a function not defined within LLVM".  It is not returned by a function.  As a result the pointer value of 5 is not associated with any address range.  

I believe you misinterpreted the text here. 5 is "an integer constant other than zero", so it "may be associated with address ranges allocated through mechanisms other than those provided by LLVM".

%itp = inttoptr i64 %add to i8*  %itp is based on %pti only
ret i8* %itp}

I'm guessing the key difference in our reasoning is about the constant 5.  :)  I'm also guessing that you have an example in mind which motivates the need for 5 to be considered associated with the address range.  Could you expand on why?

LLVM is used in a wide variety of contexts. In some of them, objects are statically allocated at known fixed addresses. In others, the JIT runs after objects are allocated, so it knows the address of allocated objects. In others, mmap is used to dynamically allocate objects at fixed addresses. The current rules attempt to accommodate all of these use cases, and more.

To respond to your suggestion elsewhere about using symbolic addresses that are resolved at link time, that's indeed a great technique, but not one that LLVM can require all its front-ends to use, because the practice of using integer constants is very widespread. It's even common enough at the C/C++ level. Also, in a JIT context, using symbolic addresses could require expensive and otherwise unnecessary relocation processing.




It looks a little silly to say this in the case of the integer constant 5, and there are some semantic gray areas around extra-VM allocation, but the same thing happens if the add were adding a dynamic integer value, and then it's difficult to find a way to separate that case from the constant 5 case.

In any case, the general advice is that people should prefer to use getelementptr to begin with. LLVM's own optimizers were converted to use getelementptr instead of ptrtoint+add+inttoptr even when they have to do raw byte arithmetic.
It would be nice to be able to canoncalize ptrtoint+add+inttoptr to geps.  Having seemingly reasonable-looking legal IR that simply doesn't optimize is not the best introduction for new frontend authors.  :)

I don't know if bitcast+getelementptr+bitcast is really worse than ptrtoint+add+inttoptr here. It's also my own experience writing front-ends that one most often gets into array and struct field accesses pretty quickly, and raw byte offsets only after getting into it a ways, so getelementptr shouldn't that foreign.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev