Clang devirtualization proposal

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
36 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Clang devirtualization proposal

Piotr Padlewski
Hi folks,
this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

And modified LangRef

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

Regards
Piotr Padlewski

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Geoff Berry

Hi Piotr,

 

You may be interested in a recent patch I posted: http://reviews.llvm.org/D11043

This patch addresses a de-virtualization case that I’m not sure would be handled by your current proposal, namely that of a virtual call where the ‘this’ object is a global variable.

For example:

 

struct A {

  A();

  virtual void foo();

};

void g(A * a) {

  a->foo();

}

 

A a;

int main() {

  g(&a);

}

 

It might be worth considering if your approach could be extended to handle this case.

 

--

Geoff Berry

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Piotr Padlewski
Sent: Wednesday, July 22, 2015 5:56 PM
To: [hidden email] Developers; [hidden email]
Subject: [LLVMdev] Clang devirtualization proposal

 

Hi folks,

this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

 

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

 

And modified LangRef

 

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

 

Regards

Piotr Padlewski


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Piotr Padlewski
HI,
Yep, our proposal doesn't cover it, because this load ; icmp ; assume; will land global initilizer function, and main will not see it. 
At least if foo would be called multiple times, then we would only have one load from vtable, but unfortunatelly we will not be able to inline, or make direct call to it with this approach.
I think that this case is rare enough to solve it right now.

Piotr

On Thu, Jul 23, 2015 at 10:30 AM, Geoff Berry <[hidden email]> wrote:

Hi Piotr,

 

You may be interested in a recent patch I posted: http://reviews.llvm.org/D11043

This patch addresses a de-virtualization case that I’m not sure would be handled by your current proposal, namely that of a virtual call where the ‘this’ object is a global variable.

For example:

 

struct A {

  A();

  virtual void foo();

};

void g(A * a) {

  a->foo();

}

 

A a;

int main() {

  g(&a);

}

 

It might be worth considering if your approach could be extended to handle this case.

 

--

Geoff Berry

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Piotr Padlewski
Sent: Wednesday, July 22, 2015 5:56 PM
To: [hidden email] Developers; [hidden email]
Subject: [LLVMdev] Clang devirtualization proposal

 

Hi folks,

this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

 

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

 

And modified LangRef

 

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

 

Regards

Piotr Padlewski



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Richard Smith-33
On Thu, Jul 23, 2015 at 11:42 AM, Piotr Padlewski <[hidden email]> wrote:
HI,
Yep, our proposal doesn't cover it, because this load ; icmp ; assume; will land global initilizer function, and main will not see it. 
At least if foo would be called multiple times, then we would only have one load from vtable, but unfortunatelly we will not be able to inline, or make direct call to it with this approach.
I think that this case is rare enough to solve it right now.

Piotr

On Thu, Jul 23, 2015 at 10:30 AM, Geoff Berry <[hidden email]> wrote:

Hi Piotr,

 

You may be interested in a recent patch I posted: http://reviews.llvm.org/D11043

This patch addresses a de-virtualization case that I’m not sure would be handled by your current proposal, namely that of a virtual call where the ‘this’ object is a global variable.

For example:

 

struct A {

  A();

  virtual void foo();

};

void g(A * a) {

  a->foo();

}

 

A a;

int main() {

  g(&a);


We could in principle emit the load/icmp/assume for a's vptr here, but I'm concerned that this is not actually sound in general. Consider:

struct A;
extern A a;
void *p = &a; // emit load/icmp/assume here?
struct A { virtual void f(); };
A a;

If we emit an assume on the vptr at the point where the address of a is taken, we get:

@a = global %A zeroinitializer
void @global_init() {
  load @a
  %x = icmp @a, @A.vtable
  call @llvm.assume(%x)
  store @a, @p
  call @A::A(@a)
}

... where the assume call can be folded to "call @llvm.assume(i1 false)", which (presumably) can be reduced further to "unreachable" (oops). (In Geoff's example, this is sound only because we're guaranteed that 'a' is initialized before we enter 'main', but an optimization that only fires for code within 'main' doesn't sound very useful.)

We could support this with something weaker than the current load/icmp/assume pattern: we could imagine an instruction/intrinsic that says "any later load of pointer %p with !invariant.group !g will load the value %q", but that does not imply that the value %q is currently stored in %p. It's not clear to me whether that has sufficient value to be worthwhile.

}

 

It might be worth considering if your approach could be extended to handle this case.

 

--

Geoff Berry

Employee of Qualcomm Innovation Center, Inc.

Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project

 

From: [hidden email] [mailto:[hidden email]] On Behalf Of Piotr Padlewski
Sent: Wednesday, July 22, 2015 5:56 PM
To: [hidden email] Developers; [hidden email]
Subject: [LLVMdev] Clang devirtualization proposal

 

Hi folks,

this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

 

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

 

And modified LangRef

 

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

 

Regards

Piotr Padlewski



_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Hal Finkel
----- Original Message -----

> From: "Richard Smith" <[hidden email]>
> To: "Piotr Padlewski" <[hidden email]>
> Cc: "[hidden email] Mailing List" <[hidden email]>, "[hidden email] Developers" <[hidden email]>
> Sent: Thursday, July 23, 2015 4:21:42 PM
> Subject: Re: [cfe-dev] [LLVMdev] Clang devirtualization proposal
>
> On Thu, Jul 23, 2015 at 11:42 AM, Piotr Padlewski < [hidden email]
> > wrote:
>
>
>
> HI,
> Yep, our proposal doesn't cover it, because this load ; icmp ;
> assume; will land global initilizer function, and main will not see
> it.
> At least if foo would be called multiple times, then we would only
> have one load from vtable, but unfortunatelly we will not be able to
> inline, or make direct call to it with this approach.
> I think that this case is rare enough to solve it right now.
>
>
> Piotr
>
>
> On Thu, Jul 23, 2015 at 10:30 AM, Geoff Berry < [hidden email]
> > wrote:
>
>
>
>
>
>
> Hi Piotr,
>
>
>
> You may be interested in a recent patch I posted:
> http://reviews.llvm.org/D11043
>
>
>
> This patch addresses a de-virtualization case that I’m not sure would
> be handled by your current proposal, namely that of a virtual call
> where the ‘this’ object is a global variable.
>
> For example:
>
>
>
> struct A {
>
> A();
>
> virtual void foo();
>
> };
>
> void g(A * a) {
>
> a->foo();
>
> }
>
>
>
> A a;
>
> int main() {
>
> g(&a);
>
>
> We could in principle emit the load/icmp/assume for a's vptr here,
> but I'm concerned that this is not actually sound in general.
> Consider:
>
>
> struct A;
> extern A a;
> void *p = &a; // emit load/icmp/assume here?
> struct A { virtual void f(); };
> A a;
>
>
> If we emit an assume on the vptr at the point where the address of a
> is taken, we get:
>
>
> @a = global %A zeroinitializer
> void @global_init() {
> load @a
> %x = icmp @a, @A.vtable
> call @llvm.assume(%x)
> store @a, @p
> call @A::A(@a)
> }
>
>
> ... where the assume call can be folded to "call @llvm.assume(i1
> false)", which (presumably) can be reduced further to "unreachable"

To confirm, yes, @llvm.assume(false) -> unreachable by llvm::removeUnreachableBlocks.

 -Hal

> (oops). (In Geoff's example, this is sound only because we're
> guaranteed that 'a' is initialized before we enter 'main', but an
> optimization that only fires for code within 'main' doesn't sound
> very useful.)
>
>
> We could support this with something weaker than the current
> load/icmp/assume pattern: we could imagine an instruction/intrinsic
> that says "any later load of pointer %p with !invariant.group !g
> will load the value %q", but that does not imply that the value %q
> is currently stored in %p. It's not clear to me whether that has
> sufficient value to be worthwhile.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> }
>
>
>
> It might be worth considering if your approach could be extended to
> handle this case.
>
>
>
>
> --
>
> Geoff Berry
>
> Employee of Qualcomm Innovation Center, Inc.
>
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
> Linux Foundation Collaborative Project
>
>
>
> From: [hidden email] [mailto:
> [hidden email] ] On Behalf Of Piotr Padlewski
> Sent: Wednesday, July 22, 2015 5:56 PM
> To: [hidden email] Developers; [hidden email]
> Subject: [LLVMdev] Clang devirtualization proposal
>
>
>
>
>
> Hi folks,
>
>
> this summer I will work with Richard Smith on clang devirtualization.
> Check out our proposal:
>
>
>
>
> https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing
>
>
>
>
>
> And modified LangRef
>
>
> http://reviews.llvm.org/D11399
>
>
>
>
>
> You can also check out previous disscussion that was started before
> our proposal was ready -
> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html
>
>
>
>
>
> Regards
>
>
> Piotr Padlewski
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Hal Finkel
In reply to this post by Piotr Padlewski
Hi Piotr,

Thanks for posting this! First, a question. When you say, regarding i8* @llvm.invariant.group.barrier(i8*):

"Required to handle destructors, placement new and std::launder. Call of this function will be put on the end of each of this functions"

I completely understand placement new and std::launder. I don't understand destructors, could you explain?

Also, am I correct in saying that we could handle the case of 'final' classes I highlighted in initial e-mail by inserting these assumptions whenever a pointer/reference to a class of such a type came into scope?

struct A {
  virtual void foo() = 0;
};

struct B final : public A {
  void foo();
};

void entry(B *b) {
 // emit assumptions about vtbl of 'b' here?
}

Thanks again,
Hal

----- Original Message -----

> From: "Piotr Padlewski" <[hidden email]>
> To: "[hidden email] Developers" <[hidden email]>, [hidden email]
> Cc: "Richard Smith" <[hidden email]>
> Sent: Wednesday, July 22, 2015 4:55:43 PM
> Subject: [cfe-dev] Clang devirtualization proposal
>
>
>
>
> Hi folks,
> this summer I will work with Richard Smith on clang devirtualization.
> Check out our proposal:
>
> https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing
>
>
>
> And modified LangRef
> http://reviews.llvm.org/D11399
>
>
>
> You can also check out previous disscussion that was started before
> our proposal was ready -
> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html
>
>
> Regards
> Piotr Padlewski
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Richard Smith-33
On Sat, Jul 25, 2015 at 12:39 PM, Hal Finkel <[hidden email]> wrote:
Hi Piotr,

Thanks for posting this! First, a question. When you say, regarding i8* @llvm.invariant.group.barrier(i8*):

"Required to handle destructors, placement new and std::launder. Call of this function will be put on the end of each of this functions"

I completely understand placement new and std::launder. I don't understand destructors, could you explain?

When a derived class destructor invokes a base class destructor, the dynamic type of the object changes (as does the vptr), so we need an invariant barrier to prevent the derived class's vptr being used for virtual calls in an inlined base class destructor.
 
Also, am I correct in saying that we could handle the case of 'final' classes I highlighted in initial e-mail by inserting these assumptions whenever a pointer/reference to a class of such a type came into scope?

Yes, it would be correct to insert these assumptions anywhere where the language standard guarantees that there exists an object of a known (most-derived) dynamic type (and in particular, we can do this whenever we know there exists an object of a known final type).

CodeGen already invokes EmitTypeCheck in many of the places where it's guaranteed that an object of a known type exists; we could experiment with adding an assumption from it any time that type is final.

struct A {
  virtual void foo() = 0;
};

struct B final : public A {
  void foo();
};

void entry(B *b) {
 // emit assumptions about vtbl of 'b' here?

This case is tricky. We don't currently have a way of saying "assume that a load of %b would load %B.vtbl" without also saying "assume that %b is dereferenceable". We've seen other cases where that would be beneficial, so perhaps that's something we should consider adding.
 
}

Thanks again,
Hal

----- Original Message -----
> From: "Piotr Padlewski" <[hidden email]>
> To: "[hidden email] Developers" <[hidden email]>, [hidden email]
> Cc: "Richard Smith" <[hidden email]>
> Sent: Wednesday, July 22, 2015 4:55:43 PM
> Subject: [cfe-dev] Clang devirtualization proposal
>
>
>
>
> Hi folks,
> this summer I will work with Richard Smith on clang devirtualization.
> Check out our proposal:
>
> https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing
>
>
>
> And modified LangRef
> http://reviews.llvm.org/D11399
>
>
>
> You can also check out previous disscussion that was started before
> our proposal was ready -
> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html
>
>
> Regards
> Piotr Padlewski
> _______________________________________________
> cfe-dev mailing list
> [hidden email]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Hal Finkel
----- Original Message -----

> From: "Richard Smith" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Piotr Padlewski" <[hidden email]>, "[hidden email] Developers" <[hidden email]>,
> "[hidden email] Mailing List" <[hidden email]>
> Sent: Saturday, July 25, 2015 7:43:37 PM
> Subject: Re: [cfe-dev] Clang devirtualization proposal
>
>
>
>
> On Sat, Jul 25, 2015 at 12:39 PM, Hal Finkel < [hidden email] >
> wrote:
>
>
> Hi Piotr,
>
> Thanks for posting this! First, a question. When you say, regarding
> i8* @llvm.invariant.group.barrier(i8*):
>
> "Required to handle destructors, placement new and std::launder. Call
> of this function will be put on the end of each of this functions"
>
> I completely understand placement new and std::launder. I don't
> understand destructors, could you explain?
>
> When a derived class destructor invokes a base class destructor, the
> dynamic type of the object changes (as does the vptr), so we need an
> invariant barrier to prevent the derived class's vptr being used for
> virtual calls in an inlined base class destructor.

Interesting. So we'll launder the 'this' pointer through @llvm.invariant.group.barrier before changing the vptr and calling the base class destructor?

>
>
> Also, am I correct in saying that we could handle the case of 'final'
> classes I highlighted in initial e-mail by inserting these
> assumptions whenever a pointer/reference to a class of such a type
> came into scope?
>
>
>
> Yes, it would be correct to insert these assumptions anywhere where
> the language standard guarantees that there exists an object of a
> known (most-derived) dynamic type (and in particular, we can do this
> whenever we know there exists an object of a known final type).
>
>
> CodeGen already invokes EmitTypeCheck in many of the places where
> it's guaranteed that an object of a known type exists; we could
> experiment with adding an assumption from it any time that type is
> final.

Sounds good.
 

>
> struct A {
> virtual void foo() = 0;
> };
>
> struct B final : public A {
> void foo();
> };
>
> void entry(B *b) {
> // emit assumptions about vtbl of 'b' here?
>
>
> This case is tricky. We don't currently have a way of saying "assume
> that a load of %b would load %B.vtbl" without also saying "assume
> that %b is dereferenceable". We've seen other cases where that would
> be beneficial, so perhaps that's something we should consider
> adding.

Ah, good point. We could only do it for references currently. I think adding something in this direction makes sense, something that says, perhaps, "we don't know if %b is deferenceable, but if it is, and you were to load from it, you'd get the following."

Thanks again,
Hal

>
>
> }
>
> Thanks again,
> Hal
>
> ----- Original Message -----
> > From: "Piotr Padlewski" < [hidden email] >
> > To: " [hidden email] Developers" < [hidden email] >,
> > [hidden email]
> > Cc: "Richard Smith" < [hidden email] >
> > Sent: Wednesday, July 22, 2015 4:55:43 PM
> > Subject: [cfe-dev] Clang devirtualization proposal
> >
> >
> >
> >
> > Hi folks,
> > this summer I will work with Richard Smith on clang
> > devirtualization.
> > Check out our proposal:
> >
> > https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing
> >
> >
> >
> > And modified LangRef
> > http://reviews.llvm.org/D11399
> >
> >
> >
> > You can also check out previous disscussion that was started before
> > our proposal was ready -
> > http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html
> >
> >
> > Regards
> > Piotr Padlewski
> > _______________________________________________
> > cfe-dev mailing list
> > [hidden email]
> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
> >
>
>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Philip Reames-4
In reply to this post by Piotr Padlewski
Having read through the proposal, I feel like I missing some of the background to understand the problem you're trying to solve.

My mental model is that construction of an object creates a new abstract location in an infinite heap with each object infinitely far apart.  Destruction of the object destroys the abstract location.  As a result, destructing one object and constructing another produce unique incomparable abstract locations.  The fact the two abstract locations might happen to share a physical address is irrelevant. 

If I'm understanding the proposal correctly, this model works for most code.  The key optimization you appear to want to perform is to recognize the fact that these two abstract locations occupy the same memory.  In particular, you want to be able to return mustalias for alias(loc1, loc2).  Another way of saying this is that you want to reason about abstract locations as defined by allocation/deallocation events rather than construction/destruction events.  Is that a fair summary?

What I'm not clear on is *why* recognizing the two abstract locations share a physical address is important.  Given that the contents of the abstract location before construction or after destruction are undefined (right?), what optimization does recognizing the mustalias relation enable?

Philip


On 07/22/2015 02:55 PM, Piotr Padlewski wrote:
Hi folks,
this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

And modified LangRef

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

Regards
Piotr Padlewski


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Reid Kleckner-2
On Tue, Jul 28, 2015 at 10:58 AM, Philip Reames <[hidden email]> wrote:
Having read through the proposal, I feel like I missing some of the background to understand the problem you're trying to solve.

My mental model is that construction of an object creates a new abstract location in an infinite heap with each object infinitely far apart.  Destruction of the object destroys the abstract location.  As a result, destructing one object and constructing another produce unique incomparable abstract locations.  The fact the two abstract locations might happen to share a physical address is irrelevant. 

If I'm understanding the proposal correctly, this model works for most code.  The key optimization you appear to want to perform is to recognize the fact that these two abstract locations occupy the same memory.  In particular, you want to be able to return mustalias for alias(loc1, loc2).  Another way of saying this is that you want to reason about abstract locations as defined by allocation/deallocation events rather than construction/destruction events.  Is that a fair summary?

What I'm not clear on is *why* recognizing the two abstract locations share a physical address is important.  Given that the contents of the abstract location before construction or after destruction are undefined (right?), what optimization does recognizing the mustalias relation enable?

I think this is incorrect. LLVM's model is closer to the second model, and we need something like the first model to prevent erroneous devirtualization.

The corner case for C++ is when the optimizer observes that two abstract objects share the same physical memory location. In practice, this could happen if the memory allocator becomes visible to the optimizer through inlining. For illustration, do placement new into the stack memory of another object. This is illustrated in example 2 of the proposal:

struct MyClass {
  virtual void foo();
};
struct MyOtherClass : MyClass {
  virtual void foo();
};
int main() {
  MyClass c;
  c.foo();
  // Reuse the storage temporarily.  UB to access the object through ‘c’
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();
  c2->foo(); //fine, we have new pointer
  // c.foo() // UB, the type has changed
  
  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass(); // we have to get back to previous type because calling destructor using c would be UB
}

Without @llvm.invariant.group.barrier, LLVM will probably replace %c2 with %c here, since they are trivially the same.

With @llvm.invariant.group.barrier, the result of placement new will be a distinct SSA value that LLVM can't reason about, and we won't accidentally devirtualize c2->foo() to MyClass::foo.

There is, however, a small problem with this model. If the code happened to do this:

  ...
  auto c2 = new (&c) MyOtherClass();
  assert(c2 == &c);
  ...

LLVM might once again replace %c2 with %c, causing bad devirtualization.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Philip Reames-4


On 07/29/2015 09:32 AM, Reid Kleckner wrote:
On Tue, Jul 28, 2015 at 10:58 AM, Philip Reames <[hidden email]> wrote:
Having read through the proposal, I feel like I missing some of the background to understand the problem you're trying to solve.

My mental model is that construction of an object creates a new abstract location in an infinite heap with each object infinitely far apart.  Destruction of the object destroys the abstract location.  As a result, destructing one object and constructing another produce unique incomparable abstract locations.  The fact the two abstract locations might happen to share a physical address is irrelevant. 

If I'm understanding the proposal correctly, this model works for most code.  The key optimization you appear to want to perform is to recognize the fact that these two abstract locations occupy the same memory.  In particular, you want to be able to return mustalias for alias(loc1, loc2).  Another way of saying this is that you want to reason about abstract locations as defined by allocation/deallocation events rather than construction/destruction events.  Is that a fair summary?

What I'm not clear on is *why* recognizing the two abstract locations share a physical address is important.  Given that the contents of the abstract location before construction or after destruction are undefined (right?), what optimization does recognizing the mustalias relation enable?

I think this is incorrect. LLVM's model is closer to the second model, and we need something like the first model to prevent erroneous devirtualization.

The corner case for C++ is when the optimizer observes that two abstract objects share the same physical memory location. In practice, this could happen if the memory allocator becomes visible to the optimizer through inlining. For illustration, do placement new into the stack memory of another object. This is illustrated in example 2 of the proposal:

struct MyClass {
  virtual void foo();
};
struct MyOtherClass : MyClass {
  virtual void foo();
};
int main() {
  MyClass c;
  c.foo();
  // Reuse the storage temporarily.  UB to access the object through ‘c’
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();
  c2->foo(); //fine, we have new pointer
  // c.foo() // UB, the type has changed
  
  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass(); // we have to get back to previous type because calling destructor using c would be UB
}

Without @llvm.invariant.group.barrier, LLVM will probably replace %c2 with %c here, since they are trivially the same.

With @llvm.invariant.group.barrier, the result of placement new will be a distinct SSA value that LLVM can't reason about, and we won't accidentally devirtualize c2->foo() to MyClass::foo.
So, to phrase this differently, the @llvm.invariant.group.barrier is responsible for introducing a new abstract memory location and the optimizer is agreeing to never exploit the fact the new abstract memory location is in fact at the same address?

Where in the execution would this new abstract location be introduced?  Is it immediately before the placement new?  If so, that would seem consistent. 


There is, however, a small problem with this model. If the code happened to do this:

  ...
  auto c2 = new (&c) MyOtherClass();
  assert(c2 == &c);
  ...

LLVM might once again replace %c2 with %c, causing bad devirtualization.
Is this well defined C++?  My reading would be that it is.  If this was realloc, it clearly wouldn't be, but I'm not sure placement new has the same restrictions. 

Assuming that it is well defined C, this was exactly the counter example I was missing and the reason that reasoning about abstract locations per object doesn't work.  Thanks.

Given that, I can see why we're stuck with a single abstract location for the storage and need to add and remove the invariantness of a particular location.  I'll go take another read through the proposal with that in mind. 

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Philip Reames-4


On 07/31/2015 02:47 PM, Philip Reames wrote:


On 07/29/2015 09:32 AM, Reid Kleckner wrote:
On Tue, Jul 28, 2015 at 10:58 AM, Philip Reames <[hidden email]> wrote:
Having read through the proposal, I feel like I missing some of the background to understand the problem you're trying to solve.

My mental model is that construction of an object creates a new abstract location in an infinite heap with each object infinitely far apart.  Destruction of the object destroys the abstract location.  As a result, destructing one object and constructing another produce unique incomparable abstract locations.  The fact the two abstract locations might happen to share a physical address is irrelevant. 

If I'm understanding the proposal correctly, this model works for most code.  The key optimization you appear to want to perform is to recognize the fact that these two abstract locations occupy the same memory.  In particular, you want to be able to return mustalias for alias(loc1, loc2).  Another way of saying this is that you want to reason about abstract locations as defined by allocation/deallocation events rather than construction/destruction events.  Is that a fair summary?

What I'm not clear on is *why* recognizing the two abstract locations share a physical address is important.  Given that the contents of the abstract location before construction or after destruction are undefined (right?), what optimization does recognizing the mustalias relation enable?

I think this is incorrect. LLVM's model is closer to the second model, and we need something like the first model to prevent erroneous devirtualization.

The corner case for C++ is when the optimizer observes that two abstract objects share the same physical memory location. In practice, this could happen if the memory allocator becomes visible to the optimizer through inlining. For illustration, do placement new into the stack memory of another object. This is illustrated in example 2 of the proposal:

struct MyClass {
  virtual void foo();
};
struct MyOtherClass : MyClass {
  virtual void foo();
};
int main() {
  MyClass c;
  c.foo();
  // Reuse the storage temporarily.  UB to access the object through ‘c’
  c.~MyClass();
  auto c2 = new (&c) MyOtherClass();
  c2->foo(); //fine, we have new pointer
  // c.foo() // UB, the type has changed
  
  // The storage has to contain a ‘MyClass’ when it goes out of scope.
  c2->~MyOtherClass();
  new (&c) MyClass(); // we have to get back to previous type because calling destructor using c would be UB
}

Without @llvm.invariant.group.barrier, LLVM will probably replace %c2 with %c here, since they are trivially the same.

With @llvm.invariant.group.barrier, the result of placement new will be a distinct SSA value that LLVM can't reason about, and we won't accidentally devirtualize c2->foo() to MyClass::foo.
So, to phrase this differently, the @llvm.invariant.group.barrier is responsible for introducing a new abstract memory location and the optimizer is agreeing to never exploit the fact the new abstract memory location is in fact at the same address?

Where in the execution would this new abstract location be introduced?  Is it immediately before the placement new?  If so, that would seem consistent. 
Sorry, ignore the above.  I got to the part below and forgot to update this before sending.  I hadn't yet internalized the bit about not being able to reason about disjoint abstract locations when I wrote this. 


There is, however, a small problem with this model. If the code happened to do this:

  ...
  auto c2 = new (&c) MyOtherClass();
  assert(c2 == &c);
  ...

LLVM might once again replace %c2 with %c, causing bad devirtualization.
Is this well defined C++?  My reading would be that it is.  If this was realloc, it clearly wouldn't be, but I'm not sure placement new has the same restrictions. 

Assuming that it is well defined C, this was exactly the counter example I was missing and the reason that reasoning about abstract locations per object doesn't work.  Thanks.

Given that, I can see why we're stuck with a single abstract location for the storage and need to add and remove the invariantness of a particular location.  I'll go take another read through the proposal with that in mind. 

Philip



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Philip Reames-4
In reply to this post by Piotr Padlewski
Ok, replying anew now that I understand why reasoning about abstract locations for each object doesn't work.

The general idea of describing a set of load and stores which belong to a particular invariant group seems reasonable.  I've got some questions/comments on the specifics, but the overall direction seems entirely workable for the specific problem you're trying to solve. 


Quoting from the google doc: "If we don’t know definition of some function, we assume that it will not call @llvm.invariant.group.barrier()."
This part really really bugs me.  We generally try to assume minimal knowledge of external functions (i.e. they can do anything) and this assumption would invert that.  Is there a way we can rephrase the proposal which avoids the need for this?  I'm not quite clear what this assumption buys us.


Is there a particular reason why a load or store must belong to a single invariant group rather than be a member of several?  I don't have an immediate use case in mind, but it seems potentially useful. 


"i8* @llvm.invariant.group.barrier(i8*): Given a pointer, produces another pointer that aliases the first but which is considered different for the purposes of load !invariant.group metadata."
The definition here minorly bugs me.  This might just be a matter of wordsmithing, but it seems strange to me that this can't be defined in terms of the assumptions allowed about the memory through the two pointers.  If I'm looking at this instruction in memory dependence analysis, what am I allowed to assume, not assume?  The current definition doesn't make this obvious.  One option: "produces another pointer which aliases the first, but where any invariant assumptions introduced by invariant.group metadata have been striped away."


The notion of using the assume seems to make sense.  I could see an argument for extending the invariant.group metadata with a way to express the assumed value, but we could also make that extension at a later time if needed.


I'm wondering if there's a problematic interaction with CSE here.  Consider this example is pseudo LLVM IR:
v1 = load i64, %p, !invariant.group !Type1
; I called destructor/placement new for the same type, but that optimized entirely away
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p2, !invariant.group !Type1
v2 = load i64, %p2, !invariant.group !Type1
ret i64 v1 - v2

(Assume that !Type is used to describe a write once integer field within some class.  Not all instances have the same integer value.)

Having CSE turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p1, !invariant.group !Type1
v2 = load i64, %p1, !invariant.group !Type1
ret i64 v1 - v2

And then GVN turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
ret i64 v1 - v1 (-> 0)

This doesn't seem like the result I'd expect.  Is there something about my initial IR which is wrong/invalid in some way?  Is the invariant.group required to be specific to a single bitpattern across all usages within a function/module/context?  That would be reasonable, but I don't think is explicit said right now.  It also makes !invariant.group effectively useless for describing constant fields which are constant per instance rather than per-class. 


Philip



On 07/22/2015 02:55 PM, Piotr Padlewski wrote:
Hi folks,
this summer I will work with Richard Smith on clang devirtualization. Check out our proposal:

https://docs.google.com/document/d/1f2SGa4TIPuBGm6y6YO768GrQsA8awNfGEJSBFukLhYA/edit?usp=sharing

And modified LangRef

You can also check out previous disscussion that was started before our proposal was ready - http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-July/044052.html

Regards
Piotr Padlewski


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Piotr Padlewski


On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <[hidden email]> wrote:
Quoting from the google doc: "If we don’t know definition of some function, we assume that it will not call @llvm.invariant.group.barrier()."
This part really really bugs me.  We generally try to assume minimal knowledge of external functions (i.e. they can do anything) and this assumption would invert that.  Is there a way we can rephrase the proposal which avoids the need for this?  I'm not quite clear what this assumption buys us.

This is because without it the optimization will be useless. For example:
A* a = new A;
a->foo(); //outline virtual
a->foo();

If we will assume that foo calls @llvm.invariant.barrier, then we will not be able to optimize the second call. 


Piotr

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Philip Reames-4


On 07/31/2015 04:05 PM, Piotr Padlewski wrote:


On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <[hidden email]> wrote:
Quoting from the google doc: "If we don’t know definition of some function, we assume that it will not call @llvm.invariant.group.barrier()."
This part really really bugs me.  We generally try to assume minimal knowledge of external functions (i.e. they can do anything) and this assumption would invert that.  Is there a way we can rephrase the proposal which avoids the need for this?  I'm not quite clear what this assumption buys us.

This is because without it the optimization will be useless. For example:
A* a = new A;
a->foo(); //outline virtual
a->foo();

If we will assume that foo calls @llvm.invariant.barrier, then we will not be able to optimize the second call.
Why not?  If foo calls @llvm.invariant.group.barrier, then it would have to produce a new SSA value to accomplish anything which might effect the second call.  Given the call is on "a", not some return value from foo or a global variable, we know that any SSA value created inside foo isn't relevant.  We should end up a with two loads of the vtable using the same SSA value and the same invariant.group metadata.  The later can be forwarded from the former without issue right?

%a = ...;
%vtable1 = load %a + Y !invariant.group !0
%foo1 = load %vtable1 + X, !invariant.group !1
call %foo1(%a)
%vtable2 = load %a + Y !invariant.group !0 <-- Per state rules, this value forwards from previous vtable load
%foo2 = load %vtable2 + X, !invariant.group !1
call %foo2(%a)

Philip



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Clang devirtualization proposal

Sanjoy Das
In reply to this post by Piotr Padlewski
On Fri, Jul 31, 2015 at 4:05 PM, Piotr Padlewski <[hidden email]> wrote:

>
>
> On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <[hidden email]>
> wrote:
>>
>> Quoting from the google doc: "If we don’t know definition of some
>> function, we assume that it will not call @llvm.invariant.group.barrier()."
>> This part really really bugs me.  We generally try to assume minimal
>> knowledge of external functions (i.e. they can do anything) and this
>> assumption would invert that.  Is there a way we can rephrase the proposal
>> which avoids the need for this?  I'm not quite clear what this assumption
>> buys us.
>>
> This is because without it the optimization will be useless. For example:
> A* a = new A;
> a->foo(); //outline virtual
> a->foo();
>
> If we will assume that foo calls @llvm.invariant.barrier, then we will not
> be able to optimize the second call.

IIUC, given how @llvm.invariant.barrier is specified, you don't have
to actually assume that.  Since the second load from `a` will be
through the same SSA value, it could not have been piped through
@llvm.invariant.barrier.

OTOH, if you have

A *a = new A;
a->foo();
bar(&a);
a->foo();

Then you'll (rightly) have to forgo optimizing the second call to a->foo().

-- Sanjoy


>
>
> Piotr
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Reid Kleckner-2
In reply to this post by Philip Reames-4
On Fri, Jul 31, 2015 at 5:03 PM, Philip Reames <[hidden email]> wrote:
On 07/31/2015 04:05 PM, Piotr Padlewski wrote:
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <[hidden email]> wrote:
Quoting from the google doc: "If we don’t know definition of some function, we assume that it will not call @llvm.invariant.group.barrier()."
This part really really bugs me.  We generally try to assume minimal knowledge of external functions (i.e. they can do anything) and this assumption would invert that.  Is there a way we can rephrase the proposal which avoids the need for this?  I'm not quite clear what this assumption buys us.

This is because without it the optimization will be useless. For example:
A* a = new A;
a->foo(); //outline virtual
a->foo();

If we will assume that foo calls @llvm.invariant.barrier, then we will not be able to optimize the second call.
Why not?  If foo calls @llvm.invariant.group.barrier, then it would have to produce a new SSA value to accomplish anything which might effect the second call.  Given the call is on "a", not some return value from foo or a global variable, we know that any SSA value created inside foo isn't relevant.  We should end up a with two loads of the vtable using the same SSA value and the same invariant.group metadata.  The later can be forwarded from the former without issue right?

%a = ...;
%vtable1 = load %a + Y !invariant.group !0
%foo1 = load %vtable1 + X, !invariant.group !1
call %foo1(%a)
%vtable2 = load %a + Y !invariant.group !0 <-- Per state rules, this value forwards from previous vtable load
%foo2 = load %vtable2 + X, !invariant.group !1
call %foo2(%a)

Right, you got the intention of the design. The foo call can placement new into %a if it wants to, but it better put things back the way they were before we return, because %a isn't changing.

We probably missed that sentence when reviewing the proposal. :) 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Reid Kleckner-2
In reply to this post by Philip Reames-4
On Fri, Jul 31, 2015 at 3:53 PM, Philip Reames <[hidden email]> wrote:
I'm wondering if there's a problematic interaction with CSE here.  Consider this example is pseudo LLVM IR:
v1 = load i64, %p, !invariant.group !Type1
; I called destructor/placement new for the same type, but that optimized entirely away
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p2, !invariant.group !Type1
v2 = load i64, %p2, !invariant.group !Type1
ret i64 v1 - v2

(Assume that !Type is used to describe a write once integer field within some class.  Not all instances have the same integer value.)

Having CSE turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
store i64 0, %p1, !invariant.group !Type1
v2 = load i64, %p1, !invariant.group !Type1
ret i64 v1 - v2

And then GVN turn this into:
v1 = load i64, %p, !invariant.group !Type1
p2 = invariant.group.barrier(p1)
if (p1 != p2) return.
ret i64 v1 - v1 (-> 0)

This doesn't seem like the result I'd expect.  Is there something about my initial IR which is wrong/invalid in some way?  Is the invariant.group required to be specific to a single bitpattern across all usages within a function/module/context?  That would be reasonable, but I don't think is explicit said right now.  It also makes !invariant.group effectively useless for describing constant fields which are constant per instance rather than per-class.  

Yes, this family of examples scares me. :) It seems we've discovered a new device testing IR soundness. We used it to build a test case that shows that 'readonly' on arguments without 'nocapture' doesn't let you forward stores across such a call.

Consider this pseudo-IR and some possible transforms that I would expect to be semantics preserving:

void f(i32* readonly %a, i32* %b) {
  llvm.assume(%a == %b)
  store i32 42, i32* %b
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Propagate llvm.assume info
void f(i32* readonly %a, i32* %b) {
  store i32 42, i32* %a
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p, i32* %p)
  %r = load i32, i32* %p

; Delete dead args
void f(i32* readonly %a) {
  store i32 42
}
  ...
  %p = alloca i32
  store i32 13, i32* %p
  call f(i32* readonly %p)
  %r = load i32, i32* %p

; Forward store %p to load %p, since the only use of %p is readonly
void f(i32* readonly %a) {
  store i32 42
}
  ...
  %p = alloca i32
  call f(i32* readonly %p)
  %r = i32 13

Today LLVM will not do the final transform because it requires readonly on the entire function, or nocapture on the argument. nocapture cannot be inferred due to the assume comparison.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Sanjoy Das
On Fri, Jul 31, 2015 at 6:18 PM, Reid Kleckner <[hidden email]> wrote:

> Consider this pseudo-IR and some possible transforms that I would expect to
> be semantics preserving:
>
> void f(i32* readonly %a, i32* %b) {
>   llvm.assume(%a == %b)
>   store i32 42, i32* %b
> }
>   ...
>   %p = alloca i32
>   store i32 13, i32* %p
>   call f(i32* readonly %p, i32* %p)
>   %r = load i32, i32* %p
>
> ; Propagate llvm.assume info
> void f(i32* readonly %a, i32* %b) {
>   store i32 42, i32* %a
> }
>   ...
>   %p = alloca i32
>   store i32 13, i32* %p
>   call f(i32* readonly %p, i32* %p)
>   %r = load i32, i32* %p

I'd say this first transformation is incorrect.  `readonly` is
effectively part of `%a`'s "type" as it constrains and affects the
operations you can do on `%a`. Even if `%b` is bitwise equivalent to
`%a` at runtime, it is "type incompatible" to replace `%a` with `%b`.

This is similar to how you cannot replace `store i32 42, i32
addrspace(1)* %a` with `store i32 42, i32 addrspace(2)* %b`, even if
you can prove `ptrtoint %a` == `ptrtoint %b` -- the nature of `store`
is dependent on the type of the pointer you store through.

The glitch in LLVM IR right now is that the `readonly`ness of `%a` is
not modeled in the type system, when I think it should be. An `i32
readonly*` should be a different type from `i32*`.  In practice this
may be non-trivial to get right (for instance `phi`s and `selects`
will either have to do a type merge, or we'd have to have explicit
type operators at the IR level).

-- Sanjoy
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] Clang devirtualization proposal

Hal Finkel
----- Original Message -----

> From: "Sanjoy Das" <[hidden email]>
> To: "Reid Kleckner" <[hidden email]>
> Cc: "Piotr Padlewski" <[hidden email]>, "[hidden email] Developers" <[hidden email]>, "LLVM Developers
> Mailing List" <[hidden email]>
> Sent: Saturday, August 1, 2015 1:22:50 AM
> Subject: Re: [LLVMdev] [cfe-dev] Clang devirtualization proposal
>
> On Fri, Jul 31, 2015 at 6:18 PM, Reid Kleckner <[hidden email]>
> wrote:
> > Consider this pseudo-IR and some possible transforms that I would
> > expect to
> > be semantics preserving:
> >
> > void f(i32* readonly %a, i32* %b) {
> >   llvm.assume(%a == %b)
> >   store i32 42, i32* %b
> > }
> >   ...
> >   %p = alloca i32
> >   store i32 13, i32* %p
> >   call f(i32* readonly %p, i32* %p)
> >   %r = load i32, i32* %p
> >
> > ; Propagate llvm.assume info
> > void f(i32* readonly %a, i32* %b) {
> >   store i32 42, i32* %a
> > }
> >   ...
> >   %p = alloca i32
> >   store i32 13, i32* %p
> >   call f(i32* readonly %p, i32* %p)
> >   %r = load i32, i32* %p
>
> I'd say this first transformation is incorrect.  `readonly` is
> effectively part of `%a`'s "type" as it constrains and affects the
> operations you can do on `%a`. Even if `%b` is bitwise equivalent to
> `%a` at runtime, it is "type incompatible" to replace `%a` with `%b`.
>
> This is similar to how you cannot replace `store i32 42, i32
> addrspace(1)* %a` with `store i32 42, i32 addrspace(2)* %b`, even if
> you can prove `ptrtoint %a` == `ptrtoint %b` -- the nature of `store`
> is dependent on the type of the pointer you store through.
>
> The glitch in LLVM IR right now is that the `readonly`ness of `%a` is
> not modeled in the type system, when I think it should be. An `i32
> readonly*` should be a different type from `i32*`.  In practice this
> may be non-trivial to get right (for instance `phi`s and `selects`
> will either have to do a type merge, or we'd have to have explicit
> type operators at the IR level).

We could do this, but then we'd need to promote these things to first-class parts of the type system (and I'd need to put further thought about how this interacts with dynamically-true properties at callsites and inlining).

The alternative way of looking at it, which is true today, is that @llvm.assume is not removed even when its information is 'used'. It appears, given this example, that this is actually required for correctness, and that dead-argument elimination needs to specifically not ignore effectively-ephemeral values/arguments.

 -Hal

>
> -- Sanjoy
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
12