extractelement causes memory access violation - what to do?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

extractelement causes memory access violation - what to do?

Paweł Bylica
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

- PB

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

David Majnemer


On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <[hidden email]> wrote:
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

We could fix this by specifying that out of bounds access on an extractelement leads to full-on undefined behavior, no need to force everyone to eat the cost of a mask.
 

- PB

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Philip Reames-4
On 06/26/2015 08:42 AM, David Majnemer wrote:


On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <[hidden email]> wrote:
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

We could fix this by specifying that out of bounds access on an extractelement leads to full-on undefined behavior, no need to force everyone to eat the cost of a mask.
This seems like the appropriate decision to me.  It's closely in line with existing practice and assumptions. 

Philip

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

David Majnemer


On Fri, Jun 26, 2015 at 9:38 AM, Philip Reames <[hidden email]> wrote:
On 06/26/2015 08:42 AM, David Majnemer wrote:


On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <[hidden email]> wrote:
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

We could fix this by specifying that out of bounds access on an extractelement leads to full-on undefined behavior, no need to force everyone to eat the cost of a mask.
This seems like the appropriate decision to me.  It's closely in line with existing practice and assumptions. 

The only problem that I can see by specifying it this way is that they cannot be speculatively executed, isSafeToSpeculativelyExecute believes it is currently safe to do so.  I can see why speculating this instruction might be good. Perhaps we should emit a mask...
 

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Philip Reames-4


On 06/26/2015 11:07 AM, David Majnemer wrote:


On Fri, Jun 26, 2015 at 9:38 AM, Philip Reames <[hidden email]> wrote:
On 06/26/2015 08:42 AM, David Majnemer wrote:


On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <[hidden email]> wrote:
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

We could fix this by specifying that out of bounds access on an extractelement leads to full-on undefined behavior, no need to force everyone to eat the cost of a mask.
This seems like the appropriate decision to me.  It's closely in line with existing practice and assumptions. 

The only problem that I can see by specifying it this way is that they cannot be speculatively executed, isSafeToSpeculativelyExecute believes it is currently safe to do so.  I can see why speculating this instruction might be good. Perhaps we should emit a mask...
Hm, yuck.  Hadn't thought about that one. 

One option would to let extractelements with provably in bounds entries be speculated, but not others. 

Another option might be to have a mask emitted by the code that is speculating it.

I'm not sure how bad either scheme would actually be in practice.  Almost all of the extractelements I see in optimized IR have constant indices. 

Philip

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Paweł Bylica
In reply to this post by David Majnemer
On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <[hidden email]> wrote:
On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <[hidden email]> wrote:
Hi,

Let's have a simple program:
define i32 @main(i32 %n, i64 %idx) {
  %idxSafe = trunc i64 %idx to i5
  %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
  ret i32 %r
}

The assembly of that would be:
pcmpeqd %xmm0, %xmm0
movdqa %xmm0, -24(%rsp)
movl -24(%rsp,%rsi,4), %eax
retq

The language reference states that the extractelement instruction produces undefined value in case the index argument is invalid (our case). But the implementation simply dumps the vector to the stack memory, calculates the memory offset out of the index value and tries to access the memory. That causes the crash.

The workaround is to trunc the index value before extractelement (see %idxSafe). But what should be the ultimate solution?

We could fix this by specifying that out of bounds access on an extractelement leads to full-on undefined behavior, no need to force everyone to eat the cost of a mask.

I don't have preference for any of the solutions.

I have a side question. It is not stated explicitly in the reference but I would assume the index of extractelement is processed as an unsigned value. However, the DAG Builder extends the index with sext. Is it correct?

- PB

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
In reply to this post by Philip Reames-4
----- Original Message -----

> From: "Philip Reames" <[hidden email]>
> To: "David Majnemer" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>
> Sent: Friday, June 26, 2015 1:20:32 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
>
>
>
> On 06/26/2015 11:07 AM, David Majnemer wrote:
>
>
>
>
>
>
>
> On Fri, Jun 26, 2015 at 9:38 AM, Philip Reames <
> [hidden email] > wrote:
>
>
>
>
> On 06/26/2015 08:42 AM, David Majnemer wrote:
>
>
>
>
>
>
>
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
> wrote:
>
>
>
> Hi,
>
>
> Let's have a simple program:
>
> define i32 @main(i32 %n, i64 %idx) {
> %idxSafe = trunc i64 %idx to i5
> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
> ret i32 %r
> }
>
>
> The assembly of that would be:
>
> pcmpeqd %xmm0, %xmm0
> movdqa %xmm0, -24(%rsp)
> movl -24(%rsp,%rsi,4), %eax
> retq
>
>
> The language reference states that the extractelement instruction
> produces undefined value in case the index argument is invalid (our
> case). But the implementation simply dumps the vector to the stack
> memory, calculates the memory offset out of the index value and
> tries to access the memory. That causes the crash.
>
>
> The workaround is to trunc the index value before extractelement (see
> %idxSafe). But what should be the ultimate solution?
>
>
> We could fix this by specifying that out of bounds access on an
> extractelement leads to full-on undefined behavior, no need to force
> everyone to eat the cost of a mask.
> This seems like the appropriate decision to me. It's closely in line
> with existing practice and assumptions.
>
>
>
> The only problem that I can see by specifying it this way is that
> they cannot be speculatively executed, isSafeToSpeculativelyExecute
> believes it is currently safe to do so. I can see why speculating
> this instruction might be good. Perhaps we should emit a mask...
> Hm, yuck. Hadn't thought about that one.
>
> One option would to let extractelements with provably in bounds
> entries be speculated, but not others.
>
> Another option might be to have a mask emitted by the code that is
> speculating it.
>
> I'm not sure how bad either scheme would actually be in practice.
> Almost all of the extractelements I see in optimized IR have
> constant indices.

Same here. isSafeToSpeculativelyExecute will be able to handle the constant-index case (any anything else for which ComputeKnownBits is helpful for range reduction). For the variable-index case, the frontend can either emit a mask or leave it as UB. I think that's the best solution.

 -Hal

>
> Philip
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
In reply to this post by Paweł Bylica
----- Original Message -----

> From: "Paweł Bylica" <[hidden email]>
> To: "David Majnemer" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>
> Sent: Tuesday, June 30, 2015 5:42:24 AM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
>
>
>
> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> [hidden email] > wrote:
>
>
>
>
>
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
> wrote:
>
>
>
> Hi,
>
>
> Let's have a simple program:
>
> define i32 @main(i32 %n, i64 %idx) {
> %idxSafe = trunc i64 %idx to i5
> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
> ret i32 %r
> }
>
>
> The assembly of that would be:
>
> pcmpeqd %xmm0, %xmm0
> movdqa %xmm0, -24(%rsp)
> movl -24(%rsp,%rsi,4), %eax
> retq
>
>
> The language reference states that the extractelement instruction
> produces undefined value in case the index argument is invalid (our
> case). But the implementation simply dumps the vector to the stack
> memory, calculates the memory offset out of the index value and
> tries to access the memory. That causes the crash.
>
>
> The workaround is to trunc the index value before extractelement (see
> %idxSafe). But what should be the ultimate solution?
>
>
>
>
>
> We could fix this by specifying that out of bounds access on an
> extractelement leads to full-on undefined behavior, no need to force
> everyone to eat the cost of a mask.
>
>
> I don't have preference for any of the solutions.
>
>
> I have a side question. It is not stated explicitly in the reference
> but I would assume the index of extractelement is processed as an
> unsigned value. However, the DAG Builder extends the index with
> sext. Is it correct?

Hrmm. Given that only (small) positive numbers are valid, it shouldn't matter. Unless we can find a reason that it works better to be sext, it seems conceptually cleaner to make it zext.

 -Hal

>
>
> - PB
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Paweł Bylica
On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Paweł Bylica" <[hidden email]>
> To: "David Majnemer" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>
> Sent: Tuesday, June 30, 2015 5:42:24 AM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what   to do?
>
>
>
>
>
> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> [hidden email] > wrote:
>
>
>
>
>
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
> wrote:
>
>
>
> Hi,
>
>
> Let's have a simple program:
>
> define i32 @main(i32 %n, i64 %idx) {
> %idxSafe = trunc i64 %idx to i5
> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
> ret i32 %r
> }
>
>
> The assembly of that would be:
>
> pcmpeqd %xmm0, %xmm0
> movdqa %xmm0, -24(%rsp)
> movl -24(%rsp,%rsi,4), %eax
> retq
>
>
> The language reference states that the extractelement instruction
> produces undefined value in case the index argument is invalid (our
> case). But the implementation simply dumps the vector to the stack
> memory, calculates the memory offset out of the index value and
> tries to access the memory. That causes the crash.
>
>
> The workaround is to trunc the index value before extractelement (see
> %idxSafe). But what should be the ultimate solution?
>
>
>
>
>
> We could fix this by specifying that out of bounds access on an
> extractelement leads to full-on undefined behavior, no need to force
> everyone to eat the cost of a mask.
>
>
> I don't have preference for any of the solutions.
>
>
> I have a side question. It is not stated explicitly in the reference
> but I would assume the index of extractelement is processed as an
> unsigned value. However, the DAG Builder extends the index with
> sext. Is it correct?

Hrmm. Given that only (small) positive numbers are valid, it shouldn't matter. Unless we can find a reason that it works better to be sext, it seems conceptually cleaner to make it zext.

I have tried to change it to zext. 2 Mips test have failed. I haven't checked the details though.
sext looks pretty wrong to me because i5 -1 does not mean 31 any more.

- PB
 

 -Hal

>
>
> - PB
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Pete Cooper
Sorry for chiming in so late in this.

So I agree that negative indices are UB, I don’t think thats contentious.

However, I think the issue here is the DAG expansion.  That is the point at which we go from UB which would just be hidden in the instruction to something which can crash.  I think its at that point where we should mask to ensure that the in memory expansion doesn’t read out of bounds.  On architectures which do the variable extract in an instruction, they won’t be penalized by a mask, only the memory expansion will be which should be rarer,

The point about speculation at the IR level is interesting.  Personally i’m ok with constant indices being speculated and variable not.  If we later want to find a good way to ask TTI whether a variable extract is cheap then we can find a way to do so.

Anyway, just my 2c.

Cheers,
Pete
On Jun 30, 2015, at 2:07 PM, Paweł Bylica <[hidden email]> wrote:

On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Paweł Bylica" <[hidden email]>
> To: "David Majnemer" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>
> Sent: Tuesday, June 30, 2015 5:42:24 AM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what   to do?
>
>
>
>
>
> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> [hidden email] > wrote:
>
>
>
>
>
> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
> wrote:
>
>
>
> Hi,
>
>
> Let's have a simple program:
>
> define i32 @main(i32 %n, i64 %idx) {
> %idxSafe = trunc i64 %idx to i5
> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
> ret i32 %r
> }
>
>
> The assembly of that would be:
>
> pcmpeqd %xmm0, %xmm0
> movdqa %xmm0, -24(%rsp)
> movl -24(%rsp,%rsi,4), %eax
> retq
>
>
> The language reference states that the extractelement instruction
> produces undefined value in case the index argument is invalid (our
> case). But the implementation simply dumps the vector to the stack
> memory, calculates the memory offset out of the index value and
> tries to access the memory. That causes the crash.
>
>
> The workaround is to trunc the index value before extractelement (see
> %idxSafe). But what should be the ultimate solution?
>
>
>
>
>
> We could fix this by specifying that out of bounds access on an
> extractelement leads to full-on undefined behavior, no need to force
> everyone to eat the cost of a mask.
>
>
> I don't have preference for any of the solutions.
>
>
> I have a side question. It is not stated explicitly in the reference
> but I would assume the index of extractelement is processed as an
> unsigned value. However, the DAG Builder extends the index with
> sext. Is it correct?

Hrmm. Given that only (small) positive numbers are valid, it shouldn't matter. Unless we can find a reason that it works better to be sext, it seems conceptually cleaner to make it zext.

I have tried to change it to zext. 2 Mips test have failed. I haven't checked the details though.
sext looks pretty wrong to me because i5 -1 does not mean 31 any more.

- PB
 

 -Hal

>
>
> - PB
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
----- Original Message -----

> From: "Pete Cooper" <[hidden email]>
> To: "Paweł Bylica" <[hidden email]>
> Cc: "Hal Finkel" <[hidden email]>, "LLVMdev" <[hidden email]>
> Sent: Wednesday, July 1, 2015 12:08:37 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
> Sorry for chiming in so late in this.
>
> So I agree that negative indices are UB, I don’t think thats
> contentious.
>
> However, I think the issue here is the DAG expansion. That is the
> point at which we go from UB which would just be hidden in the
> instruction to something which can crash. I think its at that point
> where we should mask to ensure that the in memory expansion doesn’t
> read out of bounds. On architectures which do the variable extract
> in an instruction, they won’t be penalized by a mask,

Why do you feel that they won't be penalized by the mask? Or are you assuming will adjust the patterns to match the index plus the mask?

> only the
> memory expansion will be which should be rarer,

On some architectures expansion in memory is not particularly expensive because they have very good store-to-load forwarding. Adding additional masking instructions into the critical path of correct code will not be a welcome change.

>
> The point about speculation at the IR level is interesting.
> Personally i’m ok with constant indices being speculated and
> variable not. If we later want to find a good way to ask TTI whether
> a variable extract is cheap then we can find a way to do so.

It is not about expense, it is about not introducing UB when speculating the instruction.

 -Hal

>
> Anyway, just my 2c.
>
>
> Cheers,
> Pete
>
>
>
>
> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] > wrote:
>
>
>
>
> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
> wrote:
>
>
> ----- Original Message -----
> > From: "Paweł Bylica" < [hidden email] >
> > To: "David Majnemer" < [hidden email] >
> > Cc: "LLVMdev" < [hidden email] >
> > Sent: Tuesday, June 30, 2015 5:42:24 AM
> > Subject: Re: [LLVMdev] extractelement causes memory access
> > violation - what to do?
> >
> >
> >
> >
> >
> > On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> > [hidden email] > wrote:
> >
> >
> >
> >
> >
> > On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
> > wrote:
> >
> >
> >
> > Hi,
> >
> >
> > Let's have a simple program:
> >
> > define i32 @main(i32 %n, i64 %idx) {
> > %idxSafe = trunc i64 %idx to i5
> > %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> > %idx
> > ret i32 %r
> > }
> >
> >
> > The assembly of that would be:
> >
> > pcmpeqd %xmm0, %xmm0
> > movdqa %xmm0, -24(%rsp)
> > movl -24(%rsp,%rsi,4), %eax
> > retq
> >
> >
> > The language reference states that the extractelement instruction
> > produces undefined value in case the index argument is invalid (our
> > case). But the implementation simply dumps the vector to the stack
> > memory, calculates the memory offset out of the index value and
> > tries to access the memory. That causes the crash.
> >
> >
> > The workaround is to trunc the index value before extractelement
> > (see
> > %idxSafe). But what should be the ultimate solution?
> >
> >
> >
> >
> >
> > We could fix this by specifying that out of bounds access on an
> > extractelement leads to full-on undefined behavior, no need to
> > force
> > everyone to eat the cost of a mask.
> >
> >
> > I don't have preference for any of the solutions.
> >
> >
> > I have a side question. It is not stated explicitly in the
> > reference
> > but I would assume the index of extractelement is processed as an
> > unsigned value. However, the DAG Builder extends the index with
> > sext. Is it correct?
>
> Hrmm. Given that only (small) positive numbers are valid, it
> shouldn't matter. Unless we can find a reason that it works better
> to be sext, it seems conceptually cleaner to make it zext.
>
>
>
> I have tried to change it to zext. 2 Mips test have failed. I haven't
> checked the details though.
> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> more.
>
>
> - PB
>
>
>
> -Hal
>
> >
> >
> > - PB
> > _______________________________________________
> > LLVM Developers mailing list
> > [hidden email] http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
> _______________________________________________
> LLVM Developers mailing list
> [hidden email] http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Pete Cooper

> On Jul 1, 2015, at 3:45 PM, Hal Finkel <[hidden email]> wrote:
>
> ----- Original Message -----
>> From: "Pete Cooper" <[hidden email]>
>> To: "Paweł Bylica" <[hidden email]>
>> Cc: "Hal Finkel" <[hidden email]>, "LLVMdev" <[hidden email]>
>> Sent: Wednesday, July 1, 2015 12:08:37 PM
>> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>>
>> Sorry for chiming in so late in this.
>>
>> So I agree that negative indices are UB, I don’t think thats
>> contentious.
>>
>> However, I think the issue here is the DAG expansion. That is the
>> point at which we go from UB which would just be hidden in the
>> instruction to something which can crash. I think its at that point
>> where we should mask to ensure that the in memory expansion doesn’t
>> read out of bounds. On architectures which do the variable extract
>> in an instruction, they won’t be penalized by a mask,
>
> Why do you feel that they won't be penalized by the mask? Or are you assuming will adjust the patterns to match the index plus the mask?
Ah, should have explained better.  What I meant was that if i can do the variable extract in a register without going to memory at all (so have suitable legal instructions to do so), then we won’t generate a mask at all.  The mask would only be generated when the legalizer moves the data to memory which we don’t do if its legal.
>
>> only the
>> memory expansion will be which should be rarer,
>
> On some architectures expansion in memory is not particularly expensive because they have very good store-to-load forwarding. Adding additional masking instructions into the critical path of correct code will not be a welcome change.
Thats true, so i guess it depends how many architectures need to do variable extracts in memory.  I have no idea if any architectures we support are able to do a variable extract in a register, or if all use memory.  If most use a register, then penalizing the few who do use memory by inserting a mask seems reasonable.
>
>>
>> The point about speculation at the IR level is interesting.
>> Personally i’m ok with constant indices being speculated and
>> variable not. If we later want to find a good way to ask TTI whether
>> a variable extract is cheap then we can find a way to do so.
>
> It is not about expense, it is about not introducing UB when speculating the instruction.
Yeah, I see what you mean here.  So the user could have written

if (i >= 0) x = extract v[i]

but if we speculate then we aren’t guarded and have UB.  Having the backend insert the mask would fix this, but I agree that someone, somewhere needs to put in the mask if we want to allow speculation here, and the target can’t do the variable extract in a register.

Pete

>
> -Hal
>
>>
>> Anyway, just my 2c.
>>
>>
>> Cheers,
>> Pete
>>
>>
>>
>>
>> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] > wrote:
>>
>>
>>
>>
>> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
>> wrote:
>>
>>
>> ----- Original Message -----
>>> From: "Paweł Bylica" < [hidden email] >
>>> To: "David Majnemer" < [hidden email] >
>>> Cc: "LLVMdev" < [hidden email] >
>>> Sent: Tuesday, June 30, 2015 5:42:24 AM
>>> Subject: Re: [LLVMdev] extractelement causes memory access
>>> violation - what to do?
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
>>> [hidden email] > wrote:
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email] >
>>> wrote:
>>>
>>>
>>>
>>> Hi,
>>>
>>>
>>> Let's have a simple program:
>>>
>>> define i32 @main(i32 %n, i64 %idx) {
>>> %idxSafe = trunc i64 %idx to i5
>>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
>>> %idx
>>> ret i32 %r
>>> }
>>>
>>>
>>> The assembly of that would be:
>>>
>>> pcmpeqd %xmm0, %xmm0
>>> movdqa %xmm0, -24(%rsp)
>>> movl -24(%rsp,%rsi,4), %eax
>>> retq
>>>
>>>
>>> The language reference states that the extractelement instruction
>>> produces undefined value in case the index argument is invalid (our
>>> case). But the implementation simply dumps the vector to the stack
>>> memory, calculates the memory offset out of the index value and
>>> tries to access the memory. That causes the crash.
>>>
>>>
>>> The workaround is to trunc the index value before extractelement
>>> (see
>>> %idxSafe). But what should be the ultimate solution?
>>>
>>>
>>>
>>>
>>>
>>> We could fix this by specifying that out of bounds access on an
>>> extractelement leads to full-on undefined behavior, no need to
>>> force
>>> everyone to eat the cost of a mask.
>>>
>>>
>>> I don't have preference for any of the solutions.
>>>
>>>
>>> I have a side question. It is not stated explicitly in the
>>> reference
>>> but I would assume the index of extractelement is processed as an
>>> unsigned value. However, the DAG Builder extends the index with
>>> sext. Is it correct?
>>
>> Hrmm. Given that only (small) positive numbers are valid, it
>> shouldn't matter. Unless we can find a reason that it works better
>> to be sext, it seems conceptually cleaner to make it zext.
>>
>>
>>
>> I have tried to change it to zext. 2 Mips test have failed. I haven't
>> checked the details though.
>> sext looks pretty wrong to me because i5 -1 does not mean 31 any
>> more.
>>
>>
>> - PB
>>
>>
>>
>> -Hal
>>
>>>
>>>
>>> - PB
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> [hidden email] http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email] http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
----- Original Message -----

> From: "Pete Cooper" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>, "Paweł Bylica" <[hidden email]>
> Sent: Wednesday, July 1, 2015 6:42:41 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
> > On Jul 1, 2015, at 3:45 PM, Hal Finkel <[hidden email]> wrote:
> >
> > ----- Original Message -----
> >> From: "Pete Cooper" <[hidden email]>
> >> To: "Paweł Bylica" <[hidden email]>
> >> Cc: "Hal Finkel" <[hidden email]>, "LLVMdev"
> >> <[hidden email]>
> >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> >> Subject: Re: [LLVMdev] extractelement causes memory access
> >> violation - what to do?
> >>
> >> Sorry for chiming in so late in this.
> >>
> >> So I agree that negative indices are UB, I don’t think thats
> >> contentious.
> >>
> >> However, I think the issue here is the DAG expansion. That is the
> >> point at which we go from UB which would just be hidden in the
> >> instruction to something which can crash. I think its at that
> >> point
> >> where we should mask to ensure that the in memory expansion
> >> doesn’t
> >> read out of bounds. On architectures which do the variable extract
> >> in an instruction, they won’t be penalized by a mask,
> >
> > Why do you feel that they won't be penalized by the mask? Or are
> > you assuming will adjust the patterns to match the index plus the
> > mask?
> Ah, should have explained better.  What I meant was that if i can do
> the variable extract in a register without going to memory at all
> (so have suitable legal instructions to do so), then we won’t
> generate a mask at all.  The mask would only be generated when the
> legalizer moves the data to memory which we don’t do if its legal.

Ah, alright.

> >
> >> only the
> >> memory expansion will be which should be rarer,
> >
> > On some architectures expansion in memory is not particularly
> > expensive because they have very good store-to-load forwarding.
> > Adding additional masking instructions into the critical path of
> > correct code will not be a welcome change.
> Thats true, so i guess it depends how many architectures need to do
> variable extracts in memory.  I have no idea if any architectures we
> support are able to do a variable extract in a register, or if all
> use memory.

At least on PowerPC, when using QPX, we can do this using instructions.

>  If most use a register, then penalizing the few who do
> use memory by inserting a mask seems reasonable.
> >
> >>
> >> The point about speculation at the IR level is interesting.
> >> Personally i’m ok with constant indices being speculated and
> >> variable not. If we later want to find a good way to ask TTI
> >> whether
> >> a variable extract is cheap then we can find a way to do so.
> >
> > It is not about expense, it is about not introducing UB when
> > speculating the instruction.
> Yeah, I see what you mean here.  So the user could have written
>
> if (i >= 0) x = extract v[i]
>
> but if we speculate then we aren’t guarded and have UB.  Having the
> backend insert the mask would fix this, but I agree that someone,
> somewhere needs to put in the mask if we want to allow speculation
> here, and the target can’t do the variable extract in a register.

I'd rather the frontend do this if the language wants it. We can use ComputeKnownBits when we do the speculation check, and so if there is a mask, we'll be fine.

>
> Pete
> >
> > -Hal
> >
> >>
> >> Anyway, just my 2c.
> >>
> >>
> >> Cheers,
> >> Pete
> >>
> >>
> >>
> >>
> >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
> >> wrote:
> >>
> >>
> >>
> >>
> >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
> >> wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Paweł Bylica" < [hidden email] >
> >>> To: "David Majnemer" < [hidden email] >
> >>> Cc: "LLVMdev" < [hidden email] >
> >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> >>> Subject: Re: [LLVMdev] extractelement causes memory access
> >>> violation - what to do?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> >>> [hidden email] > wrote:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email]
> >>> >
> >>> wrote:
> >>>
> >>>
> >>>
> >>> Hi,
> >>>
> >>>
> >>> Let's have a simple program:
> >>>
> >>> define i32 @main(i32 %n, i64 %idx) {
> >>> %idxSafe = trunc i64 %idx to i5
> >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
> >>> i64
> >>> %idx
> >>> ret i32 %r
> >>> }
> >>>
> >>>
> >>> The assembly of that would be:
> >>>
> >>> pcmpeqd %xmm0, %xmm0
> >>> movdqa %xmm0, -24(%rsp)
> >>> movl -24(%rsp,%rsi,4), %eax
> >>> retq
> >>>
> >>>
> >>> The language reference states that the extractelement instruction
> >>> produces undefined value in case the index argument is invalid
> >>> (our
> >>> case). But the implementation simply dumps the vector to the
> >>> stack
> >>> memory, calculates the memory offset out of the index value and
> >>> tries to access the memory. That causes the crash.
> >>>
> >>>
> >>> The workaround is to trunc the index value before extractelement
> >>> (see
> >>> %idxSafe). But what should be the ultimate solution?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> We could fix this by specifying that out of bounds access on an
> >>> extractelement leads to full-on undefined behavior, no need to
> >>> force
> >>> everyone to eat the cost of a mask.
> >>>
> >>>
> >>> I don't have preference for any of the solutions.
> >>>
> >>>
> >>> I have a side question. It is not stated explicitly in the
> >>> reference
> >>> but I would assume the index of extractelement is processed as an
> >>> unsigned value. However, the DAG Builder extends the index with
> >>> sext. Is it correct?
> >>
> >> Hrmm. Given that only (small) positive numbers are valid, it
> >> shouldn't matter. Unless we can find a reason that it works better
> >> to be sext, it seems conceptually cleaner to make it zext.
> >>
> >>
> >>
> >> I have tried to change it to zext. 2 Mips test have failed. I
> >> haven't
> >> checked the details though.
> >> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> >> more.
> >>
> >>
> >> - PB
> >>
> >>
> >>
> >> -Hal
> >>
> >>>
> >>>
> >>> - PB
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> [hidden email] http://llvm.cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> [hidden email] http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Pete Cooper

> On Jul 1, 2015, at 4:48 PM, Hal Finkel <[hidden email]> wrote:
>
> ----- Original Message -----
>> From: "Pete Cooper" <[hidden email]>
>> To: "Hal Finkel" <[hidden email]>
>> Cc: "LLVMdev" <[hidden email]>, "Paweł Bylica" <[hidden email]>
>> Sent: Wednesday, July 1, 2015 6:42:41 PM
>> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>>
>>
>>> On Jul 1, 2015, at 3:45 PM, Hal Finkel <[hidden email]> wrote:
>>>
>>> ----- Original Message -----
>>>> From: "Pete Cooper" <[hidden email]>
>>>> To: "Paweł Bylica" <[hidden email]>
>>>> Cc: "Hal Finkel" <[hidden email]>, "LLVMdev"
>>>> <[hidden email]>
>>>> Sent: Wednesday, July 1, 2015 12:08:37 PM
>>>> Subject: Re: [LLVMdev] extractelement causes memory access
>>>> violation - what to do?
>>>>
>>>> Sorry for chiming in so late in this.
>>>>
>>>> So I agree that negative indices are UB, I don’t think thats
>>>> contentious.
>>>>
>>>> However, I think the issue here is the DAG expansion. That is the
>>>> point at which we go from UB which would just be hidden in the
>>>> instruction to something which can crash. I think its at that
>>>> point
>>>> where we should mask to ensure that the in memory expansion
>>>> doesn’t
>>>> read out of bounds. On architectures which do the variable extract
>>>> in an instruction, they won’t be penalized by a mask,
>>>
>>> Why do you feel that they won't be penalized by the mask? Or are
>>> you assuming will adjust the patterns to match the index plus the
>>> mask?
>> Ah, should have explained better.  What I meant was that if i can do
>> the variable extract in a register without going to memory at all
>> (so have suitable legal instructions to do so), then we won’t
>> generate a mask at all.  The mask would only be generated when the
>> legalizer moves the data to memory which we don’t do if its legal.
>
> Ah, alright.
>
>>>
>>>> only the
>>>> memory expansion will be which should be rarer,
>>>
>>> On some architectures expansion in memory is not particularly
>>> expensive because they have very good store-to-load forwarding.
>>> Adding additional masking instructions into the critical path of
>>> correct code will not be a welcome change.
>> Thats true, so i guess it depends how many architectures need to do
>> variable extracts in memory.  I have no idea if any architectures we
>> support are able to do a variable extract in a register, or if all
>> use memory.
>
> At least on PowerPC, when using QPX, we can do this using instructions.
>
>> If most use a register, then penalizing the few who do
>> use memory by inserting a mask seems reasonable.
>>>
>>>>
>>>> The point about speculation at the IR level is interesting.
>>>> Personally i’m ok with constant indices being speculated and
>>>> variable not. If we later want to find a good way to ask TTI
>>>> whether
>>>> a variable extract is cheap then we can find a way to do so.
>>>
>>> It is not about expense, it is about not introducing UB when
>>> speculating the instruction.
>> Yeah, I see what you mean here.  So the user could have written
>>
>> if (i >= 0) x = extract v[i]
>>
>> but if we speculate then we aren’t guarded and have UB.  Having the
>> backend insert the mask would fix this, but I agree that someone,
>> somewhere needs to put in the mask if we want to allow speculation
>> here, and the target can’t do the variable extract in a register.
>
> I'd rather the frontend do this if the language wants it. We can use ComputeKnownBits when we do the speculation check, and so if there is a mask, we'll be fine.
Sounds good.

I assume that if we taught SROA to promote vectors from the stack to registers, and if the vector has a variable extract index, then we’d have to insert the mask there too?

Of course, for SROA to do this, it would have to assume that the vector being in a register is cheaper than memory, which may not be true if the backend has to push/pop the vector to memory multiple times for multiple variable extracts.

Pete

>
>>
>> Pete
>>>
>>> -Hal
>>>
>>>>
>>>> Anyway, just my 2c.
>>>>
>>>>
>>>> Cheers,
>>>> Pete
>>>>
>>>>
>>>>
>>>>
>>>> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
>>>> wrote:
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
>>>> wrote:
>>>>
>>>>
>>>> ----- Original Message -----
>>>>> From: "Paweł Bylica" < [hidden email] >
>>>>> To: "David Majnemer" < [hidden email] >
>>>>> Cc: "LLVMdev" < [hidden email] >
>>>>> Sent: Tuesday, June 30, 2015 5:42:24 AM
>>>>> Subject: Re: [LLVMdev] extractelement causes memory access
>>>>> violation - what to do?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
>>>>> [hidden email] > wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email]
>>>>>>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>> Let's have a simple program:
>>>>>
>>>>> define i32 @main(i32 %n, i64 %idx) {
>>>>> %idxSafe = trunc i64 %idx to i5
>>>>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
>>>>> i64
>>>>> %idx
>>>>> ret i32 %r
>>>>> }
>>>>>
>>>>>
>>>>> The assembly of that would be:
>>>>>
>>>>> pcmpeqd %xmm0, %xmm0
>>>>> movdqa %xmm0, -24(%rsp)
>>>>> movl -24(%rsp,%rsi,4), %eax
>>>>> retq
>>>>>
>>>>>
>>>>> The language reference states that the extractelement instruction
>>>>> produces undefined value in case the index argument is invalid
>>>>> (our
>>>>> case). But the implementation simply dumps the vector to the
>>>>> stack
>>>>> memory, calculates the memory offset out of the index value and
>>>>> tries to access the memory. That causes the crash.
>>>>>
>>>>>
>>>>> The workaround is to trunc the index value before extractelement
>>>>> (see
>>>>> %idxSafe). But what should be the ultimate solution?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> We could fix this by specifying that out of bounds access on an
>>>>> extractelement leads to full-on undefined behavior, no need to
>>>>> force
>>>>> everyone to eat the cost of a mask.
>>>>>
>>>>>
>>>>> I don't have preference for any of the solutions.
>>>>>
>>>>>
>>>>> I have a side question. It is not stated explicitly in the
>>>>> reference
>>>>> but I would assume the index of extractelement is processed as an
>>>>> unsigned value. However, the DAG Builder extends the index with
>>>>> sext. Is it correct?
>>>>
>>>> Hrmm. Given that only (small) positive numbers are valid, it
>>>> shouldn't matter. Unless we can find a reason that it works better
>>>> to be sext, it seems conceptually cleaner to make it zext.
>>>>
>>>>
>>>>
>>>> I have tried to change it to zext. 2 Mips test have failed. I
>>>> haven't
>>>> checked the details though.
>>>> sext looks pretty wrong to me because i5 -1 does not mean 31 any
>>>> more.
>>>>
>>>>
>>>> - PB
>>>>
>>>>
>>>>
>>>> -Hal
>>>>
>>>>>
>>>>>
>>>>> - PB
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> [hidden email] http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>
>>>>
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> [hidden email] http://llvm.cs.uiuc.edu
>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>
>>
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

David Majnemer
In reply to this post by Hal Finkel


On Wed, Jul 1, 2015 at 4:48 PM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Pete Cooper" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "LLVMdev" <[hidden email]>, "Paweł Bylica" <[hidden email]>
> Sent: Wednesday, July 1, 2015 6:42:41 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what   to do?
>
>
> > On Jul 1, 2015, at 3:45 PM, Hal Finkel <[hidden email]> wrote:
> >
> > ----- Original Message -----
> >> From: "Pete Cooper" <[hidden email]>
> >> To: "Paweł Bylica" <[hidden email]>
> >> Cc: "Hal Finkel" <[hidden email]>, "LLVMdev"
> >> <[hidden email]>
> >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> >> Subject: Re: [LLVMdev] extractelement causes memory access
> >> violation - what   to do?
> >>
> >> Sorry for chiming in so late in this.
> >>
> >> So I agree that negative indices are UB, I don’t think thats
> >> contentious.
> >>
> >> However, I think the issue here is the DAG expansion. That is the
> >> point at which we go from UB which would just be hidden in the
> >> instruction to something which can crash. I think its at that
> >> point
> >> where we should mask to ensure that the in memory expansion
> >> doesn’t
> >> read out of bounds. On architectures which do the variable extract
> >> in an instruction, they won’t be penalized by a mask,
> >
> > Why do you feel that they won't be penalized by the mask? Or are
> > you assuming will adjust the patterns to match the index plus the
> > mask?
> Ah, should have explained better.  What I meant was that if i can do
> the variable extract in a register without going to memory at all
> (so have suitable legal instructions to do so), then we won’t
> generate a mask at all.  The mask would only be generated when the
> legalizer moves the data to memory which we don’t do if its legal.

Ah, alright.

> >
> >> only the
> >> memory expansion will be which should be rarer,
> >
> > On some architectures expansion in memory is not particularly
> > expensive because they have very good store-to-load forwarding.
> > Adding additional masking instructions into the critical path of
> > correct code will not be a welcome change.
> Thats true, so i guess it depends how many architectures need to do
> variable extracts in memory.  I have no idea if any architectures we
> support are able to do a variable extract in a register, or if all
> use memory.

At least on PowerPC, when using QPX, we can do this using instructions.

>  If most use a register, then penalizing the few who do
> use memory by inserting a mask seems reasonable.
> >
> >>
> >> The point about speculation at the IR level is interesting.
> >> Personally i’m ok with constant indices being speculated and
> >> variable not. If we later want to find a good way to ask TTI
> >> whether
> >> a variable extract is cheap then we can find a way to do so.
> >
> > It is not about expense, it is about not introducing UB when
> > speculating the instruction.
> Yeah, I see what you mean here.  So the user could have written
>
> if (i >= 0) x = extract v[i]
>
> but if we speculate then we aren’t guarded and have UB.  Having the
> backend insert the mask would fix this, but I agree that someone,
> somewhere needs to put in the mask if we want to allow speculation
> here, and the target can’t do the variable extract in a register.

I'd rather the frontend do this if the language wants it. We can use ComputeKnownBits when we do the speculation check, and so if there is a mask, we'll be fine.

I don't think we can rely on ComputeKnownBits for this sort of thing.
Consider:
  %and = and i64 %x, 1
  %idx = lshr exact i64 %and, 1
  %z = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64 %idx
 
ComputeKnownBits doesn't take 'exact' into account and will report that %idx must be all zeros.  However, %idx might turn into undef if %x has the bottom bit set.  In fact, it doesn't appear to be conservative at all in the face of flags.  If anything, it takes advantage of them: http://llvm.org/docs/doxygen/html/ValueTracking_8cpp_source.html#l00275

This tells me that we have three options if we want the instruction to stay speculatable:
1. Make a variant of ComputeKnownBits (or something along those lines) which is explicitly pessimistic in the face of flags.
2. Kick the can to the backend.
3. Make ComputeKnowBits pessimistic in the face of flags.


>
> Pete
> >
> > -Hal
> >
> >>
> >> Anyway, just my 2c.
> >>
> >>
> >> Cheers,
> >> Pete
> >>
> >>
> >>
> >>
> >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
> >> wrote:
> >>
> >>
> >>
> >>
> >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
> >> wrote:
> >>
> >>
> >> ----- Original Message -----
> >>> From: "Paweł Bylica" < [hidden email] >
> >>> To: "David Majnemer" < [hidden email] >
> >>> Cc: "LLVMdev" < [hidden email] >
> >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> >>> Subject: Re: [LLVMdev] extractelement causes memory access
> >>> violation - what to do?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> >>> [hidden email] > wrote:
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < [hidden email]
> >>> >
> >>> wrote:
> >>>
> >>>
> >>>
> >>> Hi,
> >>>
> >>>
> >>> Let's have a simple program:
> >>>
> >>> define i32 @main(i32 %n, i64 %idx) {
> >>> %idxSafe = trunc i64 %idx to i5
> >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
> >>> i64
> >>> %idx
> >>> ret i32 %r
> >>> }
> >>>
> >>>
> >>> The assembly of that would be:
> >>>
> >>> pcmpeqd %xmm0, %xmm0
> >>> movdqa %xmm0, -24(%rsp)
> >>> movl -24(%rsp,%rsi,4), %eax
> >>> retq
> >>>
> >>>
> >>> The language reference states that the extractelement instruction
> >>> produces undefined value in case the index argument is invalid
> >>> (our
> >>> case). But the implementation simply dumps the vector to the
> >>> stack
> >>> memory, calculates the memory offset out of the index value and
> >>> tries to access the memory. That causes the crash.
> >>>
> >>>
> >>> The workaround is to trunc the index value before extractelement
> >>> (see
> >>> %idxSafe). But what should be the ultimate solution?
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> We could fix this by specifying that out of bounds access on an
> >>> extractelement leads to full-on undefined behavior, no need to
> >>> force
> >>> everyone to eat the cost of a mask.
> >>>
> >>>
> >>> I don't have preference for any of the solutions.
> >>>
> >>>
> >>> I have a side question. It is not stated explicitly in the
> >>> reference
> >>> but I would assume the index of extractelement is processed as an
> >>> unsigned value. However, the DAG Builder extends the index with
> >>> sext. Is it correct?
> >>
> >> Hrmm. Given that only (small) positive numbers are valid, it
> >> shouldn't matter. Unless we can find a reason that it works better
> >> to be sext, it seems conceptually cleaner to make it zext.
> >>
> >>
> >>
> >> I have tried to change it to zext. 2 Mips test have failed. I
> >> haven't
> >> checked the details though.
> >> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> >> more.
> >>
> >>
> >> - PB
> >>
> >>
> >>
> >> -Hal
> >>
> >>>
> >>>
> >>> - PB
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> [hidden email] http://llvm.cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>>
> >>
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> [hidden email] http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>
> >>
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
----- Original Message -----

> From: "David Majnemer" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Pete Cooper" <[hidden email]>, "LLVMdev" <[hidden email]>
> Sent: Wednesday, July 1, 2015 7:17:19 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
> On Wed, Jul 1, 2015 at 4:48 PM, Hal Finkel < [hidden email] > wrote:
>
>
> ----- Original Message -----
> > From: "Pete Cooper" < [hidden email] >
>
>
> > To: "Hal Finkel" < [hidden email] >
> > Cc: "LLVMdev" < [hidden email] >, "Paweł Bylica" <
> > [hidden email] >
> > Sent: Wednesday, July 1, 2015 6:42:41 PM
> > Subject: Re: [LLVMdev] extractelement causes memory access
> > violation - what to do?
> >
> >
> > > On Jul 1, 2015, at 3:45 PM, Hal Finkel < [hidden email] > wrote:
> > >
> > > ----- Original Message -----
> > >> From: "Pete Cooper" < [hidden email] >
> > >> To: "Paweł Bylica" < [hidden email] >
> > >> Cc: "Hal Finkel" < [hidden email] >, "LLVMdev"
> > >> < [hidden email] >
> > >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> > >> Subject: Re: [LLVMdev] extractelement causes memory access
> > >> violation - what to do?
> > >>
> > >> Sorry for chiming in so late in this.
> > >>
> > >> So I agree that negative indices are UB, I don’t think thats
> > >> contentious.
> > >>
> > >> However, I think the issue here is the DAG expansion. That is
> > >> the
> > >> point at which we go from UB which would just be hidden in the
> > >> instruction to something which can crash. I think its at that
> > >> point
> > >> where we should mask to ensure that the in memory expansion
> > >> doesn’t
> > >> read out of bounds. On architectures which do the variable
> > >> extract
> > >> in an instruction, they won’t be penalized by a mask,
> > >
> > > Why do you feel that they won't be penalized by the mask? Or are
> > > you assuming will adjust the patterns to match the index plus the
> > > mask?
> > Ah, should have explained better. What I meant was that if i can do
> > the variable extract in a register without going to memory at all
> > (so have suitable legal instructions to do so), then we won’t
> > generate a mask at all. The mask would only be generated when the
> > legalizer moves the data to memory which we don’t do if its legal.
>
> Ah, alright.
>
> > >
> > >> only the
> > >> memory expansion will be which should be rarer,
> > >
> > > On some architectures expansion in memory is not particularly
> > > expensive because they have very good store-to-load forwarding.
> > > Adding additional masking instructions into the critical path of
> > > correct code will not be a welcome change.
> > Thats true, so i guess it depends how many architectures need to do
> > variable extracts in memory. I have no idea if any architectures we
> > support are able to do a variable extract in a register, or if all
> > use memory.
>
> At least on PowerPC, when using QPX, we can do this using
> instructions.
>
> > If most use a register, then penalizing the few who do
> > use memory by inserting a mask seems reasonable.
> > >
> > >>
> > >> The point about speculation at the IR level is interesting.
> > >> Personally i’m ok with constant indices being speculated and
> > >> variable not. If we later want to find a good way to ask TTI
> > >> whether
> > >> a variable extract is cheap then we can find a way to do so.
> > >
> > > It is not about expense, it is about not introducing UB when
> > > speculating the instruction.
> > Yeah, I see what you mean here. So the user could have written
> >
> > if (i >= 0) x = extract v[i]
> >
> > but if we speculate then we aren’t guarded and have UB. Having the
> > backend insert the mask would fix this, but I agree that someone,
> > somewhere needs to put in the mask if we want to allow speculation
> > here, and the target can’t do the variable extract in a register.
>
> I'd rather the frontend do this if the language wants it. We can use
> ComputeKnownBits when we do the speculation check, and so if there
> is a mask, we'll be fine.
>
>
>
> I don't think we can rely on ComputeKnownBits for this sort of thing.
> Consider:
> %and = and i64 %x, 1
> %idx = lshr exact i64 %and, 1
> %z = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
>
> ComputeKnownBits doesn't take 'exact' into account and will report
> that %idx must be all zeros. However, %idx might turn into undef if
> %x has the bottom bit set. In fact, it doesn't appear to be
> conservative at all in the face of flags. If anything, it takes
> advantage of them:
> http://llvm.org/docs/doxygen/html/ValueTracking_8cpp_source.html#l00275
>
>
> This tells me that we have three options if we want the instruction
> to stay speculatable:
> 1. Make a variant of ComputeKnownBits (or something along those
> lines) which is explicitly pessimistic in the face of flags.
>
> 2. Kick the can to the backend.
> 3. Make ComputeKnowBits pessimistic in the face of flags.
>

If we do the speculation correctly, then this is not a problem. Because the 'exact' flag might have control dependencies we cannot speculate the lshr without dropping the flag. Only at that point can we check (or confirm) that we can still speculate the extractelement.

 -Hal

>
> >
> > Pete
> > >
> > > -Hal
> > >
> > >>
> > >> Anyway, just my 2c.
> > >>
> > >>
> > >> Cheers,
> > >> Pete
> > >>
> > >>
> > >>
> > >>
> > >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
> > >> wrote:
> > >>
> > >>
> > >>
> > >>
> > >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
> > >> wrote:
> > >>
> > >>
> > >> ----- Original Message -----
> > >>> From: "Paweł Bylica" < [hidden email] >
> > >>> To: "David Majnemer" < [hidden email] >
> > >>> Cc: "LLVMdev" < [hidden email] >
> > >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> > >>> Subject: Re: [LLVMdev] extractelement causes memory access
> > >>> violation - what to do?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> > >>> [hidden email] > wrote:
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <
> > >>> [hidden email]
> > >>> >
> > >>> wrote:
> > >>>
> > >>>
> > >>>
> > >>> Hi,
> > >>>
> > >>>
> > >>> Let's have a simple program:
> > >>>
> > >>> define i32 @main(i32 %n, i64 %idx) {
> > >>> %idxSafe = trunc i64 %idx to i5
> > >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
> > >>> i64
> > >>> %idx
> > >>> ret i32 %r
> > >>> }
> > >>>
> > >>>
> > >>> The assembly of that would be:
> > >>>
> > >>> pcmpeqd %xmm0, %xmm0
> > >>> movdqa %xmm0, -24(%rsp)
> > >>> movl -24(%rsp,%rsi,4), %eax
> > >>> retq
> > >>>
> > >>>
> > >>> The language reference states that the extractelement
> > >>> instruction
> > >>> produces undefined value in case the index argument is invalid
> > >>> (our
> > >>> case). But the implementation simply dumps the vector to the
> > >>> stack
> > >>> memory, calculates the memory offset out of the index value and
> > >>> tries to access the memory. That causes the crash.
> > >>>
> > >>>
> > >>> The workaround is to trunc the index value before
> > >>> extractelement
> > >>> (see
> > >>> %idxSafe). But what should be the ultimate solution?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> We could fix this by specifying that out of bounds access on an
> > >>> extractelement leads to full-on undefined behavior, no need to
> > >>> force
> > >>> everyone to eat the cost of a mask.
> > >>>
> > >>>
> > >>> I don't have preference for any of the solutions.
> > >>>
> > >>>
> > >>> I have a side question. It is not stated explicitly in the
> > >>> reference
> > >>> but I would assume the index of extractelement is processed as
> > >>> an
> > >>> unsigned value. However, the DAG Builder extends the index with
> > >>> sext. Is it correct?
> > >>
> > >> Hrmm. Given that only (small) positive numbers are valid, it
> > >> shouldn't matter. Unless we can find a reason that it works
> > >> better
> > >> to be sext, it seems conceptually cleaner to make it zext.
> > >>
> > >>
> > >>
> > >> I have tried to change it to zext. 2 Mips test have failed. I
> > >> haven't
> > >> checked the details though.
> > >> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> > >> more.
> > >>
> > >>
> > >> - PB
> > >>
> > >>
> > >>
> > >> -Hal
> > >>
> > >>>
> > >>>
> > >>> - PB
> > >>> _______________________________________________
> > >>> LLVM Developers mailing list
> > >>> [hidden email] http://llvm.cs.uiuc.edu
> > >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > >>>
> > >>
> > >> --
> > >> Hal Finkel
> > >> Assistant Computational Scientist
> > >> Leadership Computing Facility
> > >> Argonne National Laboratory
> > >> _______________________________________________
> > >> LLVM Developers mailing list
> > >> [hidden email] http://llvm.cs.uiuc.edu
> > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > >>
> > >>
> > >
> > > --
> > > Hal Finkel
> > > Assistant Computational Scientist
> > > Leadership Computing Facility
> > > Argonne National Laboratory
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email] http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

David Majnemer


On Thu, Jul 2, 2015 at 3:57 AM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "David Majnemer" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Pete Cooper" <[hidden email]>, "LLVMdev" <[hidden email]>
> Sent: Wednesday, July 1, 2015 7:17:19 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
> On Wed, Jul 1, 2015 at 4:48 PM, Hal Finkel < [hidden email] > wrote:
>
>
> ----- Original Message -----
> > From: "Pete Cooper" < [hidden email] >
>
>
> > To: "Hal Finkel" < [hidden email] >
> > Cc: "LLVMdev" < [hidden email] >, "Paweł Bylica" <
> > [hidden email] >
> > Sent: Wednesday, July 1, 2015 6:42:41 PM
> > Subject: Re: [LLVMdev] extractelement causes memory access
> > violation - what to do?
> >
> >
> > > On Jul 1, 2015, at 3:45 PM, Hal Finkel < [hidden email] > wrote:
> > >
> > > ----- Original Message -----
> > >> From: "Pete Cooper" < [hidden email] >
> > >> To: "Paweł Bylica" < [hidden email] >
> > >> Cc: "Hal Finkel" < [hidden email] >, "LLVMdev"
> > >> < [hidden email] >
> > >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> > >> Subject: Re: [LLVMdev] extractelement causes memory access
> > >> violation - what to do?
> > >>
> > >> Sorry for chiming in so late in this.
> > >>
> > >> So I agree that negative indices are UB, I don’t think thats
> > >> contentious.
> > >>
> > >> However, I think the issue here is the DAG expansion. That is
> > >> the
> > >> point at which we go from UB which would just be hidden in the
> > >> instruction to something which can crash. I think its at that
> > >> point
> > >> where we should mask to ensure that the in memory expansion
> > >> doesn’t
> > >> read out of bounds. On architectures which do the variable
> > >> extract
> > >> in an instruction, they won’t be penalized by a mask,
> > >
> > > Why do you feel that they won't be penalized by the mask? Or are
> > > you assuming will adjust the patterns to match the index plus the
> > > mask?
> > Ah, should have explained better. What I meant was that if i can do
> > the variable extract in a register without going to memory at all
> > (so have suitable legal instructions to do so), then we won’t
> > generate a mask at all. The mask would only be generated when the
> > legalizer moves the data to memory which we don’t do if its legal.
>
> Ah, alright.
>
> > >
> > >> only the
> > >> memory expansion will be which should be rarer,
> > >
> > > On some architectures expansion in memory is not particularly
> > > expensive because they have very good store-to-load forwarding.
> > > Adding additional masking instructions into the critical path of
> > > correct code will not be a welcome change.
> > Thats true, so i guess it depends how many architectures need to do
> > variable extracts in memory. I have no idea if any architectures we
> > support are able to do a variable extract in a register, or if all
> > use memory.
>
> At least on PowerPC, when using QPX, we can do this using
> instructions.
>
> > If most use a register, then penalizing the few who do
> > use memory by inserting a mask seems reasonable.
> > >
> > >>
> > >> The point about speculation at the IR level is interesting.
> > >> Personally i’m ok with constant indices being speculated and
> > >> variable not. If we later want to find a good way to ask TTI
> > >> whether
> > >> a variable extract is cheap then we can find a way to do so.
> > >
> > > It is not about expense, it is about not introducing UB when
> > > speculating the instruction.
> > Yeah, I see what you mean here. So the user could have written
> >
> > if (i >= 0) x = extract v[i]
> >
> > but if we speculate then we aren’t guarded and have UB. Having the
> > backend insert the mask would fix this, but I agree that someone,
> > somewhere needs to put in the mask if we want to allow speculation
> > here, and the target can’t do the variable extract in a register.
>
> I'd rather the frontend do this if the language wants it. We can use
> ComputeKnownBits when we do the speculation check, and so if there
> is a mask, we'll be fine.
>
>
>
> I don't think we can rely on ComputeKnownBits for this sort of thing.
> Consider:
> %and = and i64 %x, 1
> %idx = lshr exact i64 %and, 1
> %z = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> %idx
>
> ComputeKnownBits doesn't take 'exact' into account and will report
> that %idx must be all zeros. However, %idx might turn into undef if
> %x has the bottom bit set. In fact, it doesn't appear to be
> conservative at all in the face of flags. If anything, it takes
> advantage of them:
> http://llvm.org/docs/doxygen/html/ValueTracking_8cpp_source.html#l00275
>
>
> This tells me that we have three options if we want the instruction
> to stay speculatable:
> 1. Make a variant of ComputeKnownBits (or something along those
> lines) which is explicitly pessimistic in the face of flags.
>
> 2. Kick the can to the backend.
> 3. Make ComputeKnowBits pessimistic in the face of flags.
>

If we do the speculation correctly, then this is not a problem. Because the 'exact' flag might have control dependencies we cannot speculate the lshr without dropping the flag. Only at that point can we check (or confirm) that we can still speculate the extractelement.

I don't think that simplifycg, loop unrolling or any of the other speculation transforms actually drop flags.  Wouldn't such a regime make it impossible to speculate call instructions (with 'pure' targets) which don't have side effects (because we couldn't necessarily strip their flags)?
 

 -Hal

>
> >
> > Pete
> > >
> > > -Hal
> > >
> > >>
> > >> Anyway, just my 2c.
> > >>
> > >>
> > >> Cheers,
> > >> Pete
> > >>
> > >>
> > >>
> > >>
> > >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
> > >> wrote:
> > >>
> > >>
> > >>
> > >>
> > >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email] >
> > >> wrote:
> > >>
> > >>
> > >> ----- Original Message -----
> > >>> From: "Paweł Bylica" < [hidden email] >
> > >>> To: "David Majnemer" < [hidden email] >
> > >>> Cc: "LLVMdev" < [hidden email] >
> > >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> > >>> Subject: Re: [LLVMdev] extractelement causes memory access
> > >>> violation - what to do?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> > >>> [hidden email] > wrote:
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <
> > >>> [hidden email]
> > >>> >
> > >>> wrote:
> > >>>
> > >>>
> > >>>
> > >>> Hi,
> > >>>
> > >>>
> > >>> Let's have a simple program:
> > >>>
> > >>> define i32 @main(i32 %n, i64 %idx) {
> > >>> %idxSafe = trunc i64 %idx to i5
> > >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
> > >>> i64
> > >>> %idx
> > >>> ret i32 %r
> > >>> }
> > >>>
> > >>>
> > >>> The assembly of that would be:
> > >>>
> > >>> pcmpeqd %xmm0, %xmm0
> > >>> movdqa %xmm0, -24(%rsp)
> > >>> movl -24(%rsp,%rsi,4), %eax
> > >>> retq
> > >>>
> > >>>
> > >>> The language reference states that the extractelement
> > >>> instruction
> > >>> produces undefined value in case the index argument is invalid
> > >>> (our
> > >>> case). But the implementation simply dumps the vector to the
> > >>> stack
> > >>> memory, calculates the memory offset out of the index value and
> > >>> tries to access the memory. That causes the crash.
> > >>>
> > >>>
> > >>> The workaround is to trunc the index value before
> > >>> extractelement
> > >>> (see
> > >>> %idxSafe). But what should be the ultimate solution?
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> We could fix this by specifying that out of bounds access on an
> > >>> extractelement leads to full-on undefined behavior, no need to
> > >>> force
> > >>> everyone to eat the cost of a mask.
> > >>>
> > >>>
> > >>> I don't have preference for any of the solutions.
> > >>>
> > >>>
> > >>> I have a side question. It is not stated explicitly in the
> > >>> reference
> > >>> but I would assume the index of extractelement is processed as
> > >>> an
> > >>> unsigned value. However, the DAG Builder extends the index with
> > >>> sext. Is it correct?
> > >>
> > >> Hrmm. Given that only (small) positive numbers are valid, it
> > >> shouldn't matter. Unless we can find a reason that it works
> > >> better
> > >> to be sext, it seems conceptually cleaner to make it zext.
> > >>
> > >>
> > >>
> > >> I have tried to change it to zext. 2 Mips test have failed. I
> > >> haven't
> > >> checked the details though.
> > >> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> > >> more.
> > >>
> > >>
> > >> - PB
> > >>
> > >>
> > >>
> > >> -Hal
> > >>
> > >>>
> > >>>
> > >>> - PB
> > >>> _______________________________________________
> > >>> LLVM Developers mailing list
> > >>> [hidden email] http://llvm.cs.uiuc.edu
> > >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > >>>
> > >>
> > >> --
> > >> Hal Finkel
> > >> Assistant Computational Scientist
> > >> Leadership Computing Facility
> > >> Argonne National Laboratory
> > >> _______________________________________________
> > >> LLVM Developers mailing list
> > >> [hidden email] http://llvm.cs.uiuc.edu
> > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > >>
> > >>
> > >
> > > --
> > > Hal Finkel
> > > Assistant Computational Scientist
> > > Leadership Computing Facility
> > > Argonne National Laboratory
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email] http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: extractelement causes memory access violation - what to do?

Hal Finkel
----- Original Message -----

> From: "David Majnemer" <[hidden email]>
> To: "Hal Finkel" <[hidden email]>
> Cc: "Pete Cooper" <[hidden email]>, "LLVMdev" <[hidden email]>
> Sent: Saturday, July 4, 2015 5:21:23 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what to do?
>
>
> On Thu, Jul 2, 2015 at 3:57 AM, Hal Finkel < [hidden email] > wrote:
>
>
> ----- Original Message -----
> > From: "David Majnemer" < [hidden email] >
> > To: "Hal Finkel" < [hidden email] >
>
>
> > Cc: "Pete Cooper" < [hidden email] >, "LLVMdev" <
> > [hidden email] >
> > Sent: Wednesday, July 1, 2015 7:17:19 PM
> > Subject: Re: [LLVMdev] extractelement causes memory access
> > violation - what to do?
> >
> >
> > On Wed, Jul 1, 2015 at 4:48 PM, Hal Finkel < [hidden email] >
> > wrote:
> >
> >
> > ----- Original Message -----
> > > From: "Pete Cooper" < [hidden email] >
> >
> >
> > > To: "Hal Finkel" < [hidden email] >
> > > Cc: "LLVMdev" < [hidden email] >, "Paweł Bylica" <
> > > [hidden email] >
> > > Sent: Wednesday, July 1, 2015 6:42:41 PM
> > > Subject: Re: [LLVMdev] extractelement causes memory access
> > > violation - what to do?
> > >
> > >
> > > > On Jul 1, 2015, at 3:45 PM, Hal Finkel < [hidden email] >
> > > > wrote:
> > > >
> > > > ----- Original Message -----
> > > >> From: "Pete Cooper" < [hidden email] >
> > > >> To: "Paweł Bylica" < [hidden email] >
> > > >> Cc: "Hal Finkel" < [hidden email] >, "LLVMdev"
> > > >> < [hidden email] >
> > > >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> > > >> Subject: Re: [LLVMdev] extractelement causes memory access
> > > >> violation - what to do?
> > > >>
> > > >> Sorry for chiming in so late in this.
> > > >>
> > > >> So I agree that negative indices are UB, I don’t think thats
> > > >> contentious.
> > > >>
> > > >> However, I think the issue here is the DAG expansion. That is
> > > >> the
> > > >> point at which we go from UB which would just be hidden in the
> > > >> instruction to something which can crash. I think its at that
> > > >> point
> > > >> where we should mask to ensure that the in memory expansion
> > > >> doesn’t
> > > >> read out of bounds. On architectures which do the variable
> > > >> extract
> > > >> in an instruction, they won’t be penalized by a mask,
> > > >
> > > > Why do you feel that they won't be penalized by the mask? Or
> > > > are
> > > > you assuming will adjust the patterns to match the index plus
> > > > the
> > > > mask?
> > > Ah, should have explained better. What I meant was that if i can
> > > do
> > > the variable extract in a register without going to memory at all
> > > (so have suitable legal instructions to do so), then we won’t
> > > generate a mask at all. The mask would only be generated when the
> > > legalizer moves the data to memory which we don’t do if its
> > > legal.
> >
> > Ah, alright.
> >
> > > >
> > > >> only the
> > > >> memory expansion will be which should be rarer,
> > > >
> > > > On some architectures expansion in memory is not particularly
> > > > expensive because they have very good store-to-load forwarding.
> > > > Adding additional masking instructions into the critical path
> > > > of
> > > > correct code will not be a welcome change.
> > > Thats true, so i guess it depends how many architectures need to
> > > do
> > > variable extracts in memory. I have no idea if any architectures
> > > we
> > > support are able to do a variable extract in a register, or if
> > > all
> > > use memory.
> >
> > At least on PowerPC, when using QPX, we can do this using
> > instructions.
> >
> > > If most use a register, then penalizing the few who do
> > > use memory by inserting a mask seems reasonable.
> > > >
> > > >>
> > > >> The point about speculation at the IR level is interesting.
> > > >> Personally i’m ok with constant indices being speculated and
> > > >> variable not. If we later want to find a good way to ask TTI
> > > >> whether
> > > >> a variable extract is cheap then we can find a way to do so.
> > > >
> > > > It is not about expense, it is about not introducing UB when
> > > > speculating the instruction.
> > > Yeah, I see what you mean here. So the user could have written
> > >
> > > if (i >= 0) x = extract v[i]
> > >
> > > but if we speculate then we aren’t guarded and have UB. Having
> > > the
> > > backend insert the mask would fix this, but I agree that someone,
> > > somewhere needs to put in the mask if we want to allow
> > > speculation
> > > here, and the target can’t do the variable extract in a register.
> >
> > I'd rather the frontend do this if the language wants it. We can
> > use
> > ComputeKnownBits when we do the speculation check, and so if there
> > is a mask, we'll be fine.
> >
> >
> >
> > I don't think we can rely on ComputeKnownBits for this sort of
> > thing.
> > Consider:
> > %and = and i64 %x, 1
> > %idx = lshr exact i64 %and, 1
> > %z = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, i64
> > %idx
> >
> > ComputeKnownBits doesn't take 'exact' into account and will report
> > that %idx must be all zeros. However, %idx might turn into undef if
> > %x has the bottom bit set. In fact, it doesn't appear to be
> > conservative at all in the face of flags. If anything, it takes
> > advantage of them:
> > http://llvm.org/docs/doxygen/html/ValueTracking_8cpp_source.html#l00275
> >
> >
> > This tells me that we have three options if we want the instruction
> > to stay speculatable:
> > 1. Make a variant of ComputeKnownBits (or something along those
> > lines) which is explicitly pessimistic in the face of flags.
> >
> > 2. Kick the can to the backend.
> > 3. Make ComputeKnowBits pessimistic in the face of flags.
> >
>
> If we do the speculation correctly, then this is not a problem.
> Because the 'exact' flag might have control dependencies we cannot
> speculate the lshr without dropping the flag. Only at that point can
> we check (or confirm) that we can still speculate the
> extractelement.
>
>
>
> I don't think that simplifycg, loop unrolling or any of the other
> speculation transforms actually drop flags.

If so, this is, unfortunately, a bug.

The reality is, of course, that most of these flags come from type properties of a high-level language and are not actually subject to control dependencies. As a result, it is often hard to notice these problems when they occur. However, once you combine casting with GVN/CSE, etc. it is possible to expose problems.

FWIW, I'd really like to have a different scheme for these kinds of properties that would allow us not to have to make the most conservative possible assumptions about control dependencies. I'm not entirely sure what such a scheme should look like, however.

> Wouldn't such a regime
> make it impossible to speculate call instructions (with 'pure'
> targets) which don't have side effects (because we couldn't
> necessarily strip their flags)?
>

We generically assume that the 'pureness' of a function is a property of the function, and not subject to control dependencies at the call site. OTOH, this is generally because we have the attribute on the function declaration. If we had only the attribute at the call site, and not on the declaration, then we'd need to strip the flags there too.

 -Hal

>
>
> -Hal
>
>
>
> >
> > >
> > > Pete
> > > >
> > > > -Hal
> > > >
> > > >>
> > > >> Anyway, just my 2c.
> > > >>
> > > >>
> > > >> Cheers,
> > > >> Pete
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < [hidden email] >
> > > >> wrote:
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < [hidden email]
> > > >> >
> > > >> wrote:
> > > >>
> > > >>
> > > >> ----- Original Message -----
> > > >>> From: "Paweł Bylica" < [hidden email] >
> > > >>> To: "David Majnemer" < [hidden email] >
> > > >>> Cc: "LLVMdev" < [hidden email] >
> > > >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> > > >>> Subject: Re: [LLVMdev] extractelement causes memory access
> > > >>> violation - what to do?
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> > > >>> [hidden email] > wrote:
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica <
> > > >>> [hidden email]
> > > >>> >
> > > >>> wrote:
> > > >>>
> > > >>>
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>>
> > > >>> Let's have a simple program:
> > > >>>
> > > >>> define i32 @main(i32 %n, i64 %idx) {
> > > >>> %idxSafe = trunc i64 %idx to i5
> > > >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32
> > > >>> -1>,
> > > >>> i64
> > > >>> %idx
> > > >>> ret i32 %r
> > > >>> }
> > > >>>
> > > >>>
> > > >>> The assembly of that would be:
> > > >>>
> > > >>> pcmpeqd %xmm0, %xmm0
> > > >>> movdqa %xmm0, -24(%rsp)
> > > >>> movl -24(%rsp,%rsi,4), %eax
> > > >>> retq
> > > >>>
> > > >>>
> > > >>> The language reference states that the extractelement
> > > >>> instruction
> > > >>> produces undefined value in case the index argument is
> > > >>> invalid
> > > >>> (our
> > > >>> case). But the implementation simply dumps the vector to the
> > > >>> stack
> > > >>> memory, calculates the memory offset out of the index value
> > > >>> and
> > > >>> tries to access the memory. That causes the crash.
> > > >>>
> > > >>>
> > > >>> The workaround is to trunc the index value before
> > > >>> extractelement
> > > >>> (see
> > > >>> %idxSafe). But what should be the ultimate solution?
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> We could fix this by specifying that out of bounds access on
> > > >>> an
> > > >>> extractelement leads to full-on undefined behavior, no need
> > > >>> to
> > > >>> force
> > > >>> everyone to eat the cost of a mask.
> > > >>>
> > > >>>
> > > >>> I don't have preference for any of the solutions.
> > > >>>
> > > >>>
> > > >>> I have a side question. It is not stated explicitly in the
> > > >>> reference
> > > >>> but I would assume the index of extractelement is processed
> > > >>> as
> > > >>> an
> > > >>> unsigned value. However, the DAG Builder extends the index
> > > >>> with
> > > >>> sext. Is it correct?
> > > >>
> > > >> Hrmm. Given that only (small) positive numbers are valid, it
> > > >> shouldn't matter. Unless we can find a reason that it works
> > > >> better
> > > >> to be sext, it seems conceptually cleaner to make it zext.
> > > >>
> > > >>
> > > >>
> > > >> I have tried to change it to zext. 2 Mips test have failed. I
> > > >> haven't
> > > >> checked the details though.
> > > >> sext looks pretty wrong to me because i5 -1 does not mean 31
> > > >> any
> > > >> more.
> > > >>
> > > >>
> > > >> - PB
> > > >>
> > > >>
> > > >>
> > > >> -Hal
> > > >>
> > > >>>
> > > >>>
> > > >>> - PB
> > > >>> _______________________________________________
> > > >>> LLVM Developers mailing list
> > > >>> [hidden email] http://llvm.cs.uiuc.edu
> > > >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > > >>>
> > > >>
> > > >> --
> > > >> Hal Finkel
> > > >> Assistant Computational Scientist
> > > >> Leadership Computing Facility
> > > >> Argonne National Laboratory
> > > >> _______________________________________________
> > > >> LLVM Developers mailing list
> > > >> [hidden email] http://llvm.cs.uiuc.edu
> > > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> > > >>
> > > >>
> > > >
> > > > --
> > > > Hal Finkel
> > > > Assistant Computational Scientist
> > > > Leadership Computing Facility
> > > > Argonne National Laboratory
> > >
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > [hidden email] http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev