struct with signed bitfield (PR17827)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

struct with signed bitfield (PR17827)

Kay Tiong Khoo
I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827

Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?

$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;

int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}

int main () {
  return foo (4);
}




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Mark Lacey-4

On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <[hidden email]> wrote:

I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827

Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?

I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.


$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;

int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}

int main () {
  return foo (4);
}



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Henrique Santos-2
I actually think it is a problem with the optimizer like Kay first thought. -instcombine seems turning "((x and 6) shl 5) slt 32" into "(x and 6) slt 1". If the comparison were unsigned or the shl had a nsw flag, I think this would be okay. Since none of these is true, I don't think this transformation is correct.

H.



On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <[hidden email]> wrote:

On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <[hidden email]> wrote:

I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827

Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?

I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.


$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;

int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}

int main () {
  return foo (4);
}



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Kay Tiong Khoo
In reply to this post by Mark Lacey-4
Invalidating this bug with a language technicality would be a great way out. :)

But I don't see anything in the standard to suggest that a plain 'int' bitfield is any different than a 'signed int' bitfield, and even if that was true, I don't see any difference in codegen whether I specify 'signed' explicitly or not. So at the least, clang or llvm still has a bug for the explicit 'signed' case from what I can tell.


On Fri, Nov 15, 2013 at 8:41 PM, Mark Lacey <[hidden email]> wrote:

On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <[hidden email]> wrote:

I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827

Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?

I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.


$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;

int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}

int main () {
  return foo (4);
}



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Kay Tiong Khoo
In reply to this post by Henrique Santos-2
I need to read up on how nsw would make this different, but I see your point about the shift:

  %bf.result.shl = shl i8 %bf.value, 5
  %bf.result.ashr = ashr i8 %bf.result.shl, 5

This should have splatted the sign bit across the upper 5 bits of the char, so the subsequent compare:
 %cmp = icmp slt i32 %bf.cast, 1

Can't be transformed to a check for 'equal to 0'.

Thanks!



On Fri, Nov 15, 2013 at 9:18 PM, Henrique Santos <[hidden email]> wrote:
I actually think it is a problem with the optimizer like Kay first thought. -instcombine seems turning "((x and 6) shl 5) slt 32" into "(x and 6) slt 1". If the comparison were unsigned or the shl had a nsw flag, I think this would be okay. Since none of these is true, I don't think this transformation is correct.

H.



On Sat, Nov 16, 2013 at 1:41 AM, Mark Lacey <[hidden email]> wrote:

On Nov 15, 2013, at 3:42 PM, Kay Tiong Khoo <[hidden email]> wrote:

I've been diagnosing this bug:
http://llvm.org/bugs/show_bug.cgi?id=17827

Summary: I think the following program miscompiles at -O1 because the fact that 'f0' is a signed 3-bit value is lost in the unoptimized LLVM IR. How do we fix this?

I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.


$ cat bitfield.c
/* %struct.S = type { i8, [3 x i8] } ??? */
struct S {
  int f0:3;
} a;

int foo (int p) {
  struct S c = a;
  c.f0 = p & 6;
  return c.f0 < 1;
}

int main () {
  return foo (4);
}



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Duncan P. N. Exon Smith
In reply to this post by Mark Lacey-4
On 2013 Nov 15, at 19:41, Mark Lacey <[hidden email]> wrote:

> I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.

Section 3.9.1 of the C++11 standard [1] defines short/int/long/long long as signed.  Bit-fields are discussed in 9.6 and have lots of implementation-defined behavior, but I don’t see anything about signedness.  The ABI (e.g., [2]) defines whether char is signed or unsigned.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3485.pdf
[2]: http://www.cs.tufts.edu/comp/40/readings/amd64-abi.pdf


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Henrique Santos-2
In reply to this post by Kay Tiong Khoo
On Sat, Nov 16, 2013 at 3:39 PM, Kay Tiong Khoo <[hidden email]> wrote:
I need to read up on how nsw would make this different, but I see your point about the shift:
 
If the nsw flag is set, we can assume the sign bit isn't affected and apply the simplification.
If the nuw flag is set, we can assume no bits are shifted out and apply the simplification if the comparison is unsigned.
If no flag is set (and no flag is settable), then I don't think there's anything to be done.


  %bf.result.shl = shl i8 %bf.value, 5
  %bf.result.ashr = ashr i8 %bf.result.shl, 5

This should have splatted the sign bit across the upper 5 bits of the char, so the subsequent compare:
 %cmp = icmp slt i32 %bf.cast, 1

Can't be transformed to a check for 'equal to 0'.

Thanks!

H.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Richard Smith-33
In reply to this post by Duncan P. N. Exon Smith
C says that plain bit-fields could be either signed or unsigned. C++ removed this allowance in core issue 739:

http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#739


On Sat, Nov 16, 2013 at 1:55 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
On 2013 Nov 15, at 19:41, Mark Lacey <[hidden email]> wrote:

> I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.

Section 3.9.1 of the C++11 standard [1] defines short/int/long/long long as signed.  Bit-fields are discussed in 9.6 and have lots of implementation-defined behavior, but I don’t see anything about signedness.  The ABI (e.g., [2]) defines whether char is signed or unsigned.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3485.pdf
[2]: http://www.cs.tufts.edu/comp/40/readings/amd64-abi.pdf


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: struct with signed bitfield (PR17827)

Kay Tiong Khoo
Thanks, Richard (and also Mark Lacey, who did remember correctly!). I've noted this thread and linked to the standard in the bug report.

I'll try to pinpoint the problem in InstCombine - thanks, Henrique.

Does anyone know why clang always rounds bit fields up to byte widths rather than say just using 'i3' to represent a 3-bit field? The bit-shifting gymnastics created by the byte-size ops appear to lead to other problems like:
http://llvm.org/bugs/show_bug.cgi?id=17956



On Sat, Nov 16, 2013 at 4:09 PM, Richard Smith <[hidden email]> wrote:
C says that plain bit-fields could be either signed or unsigned. C++ removed this allowance in core issue 739:

http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#739


On Sat, Nov 16, 2013 at 1:55 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
On 2013 Nov 15, at 19:41, Mark Lacey <[hidden email]> wrote:

> I don’t have the C/C++ standards in front of me but IIRC whether a char/short/int/long/long long bitfield is signed or unsigned is implementation defined. You need to explicitly specify signed or unsigned in order to have any guarantee of the signedness, e.g. signed int.

Section 3.9.1 of the C++11 standard [1] defines short/int/long/long long as signed.  Bit-fields are discussed in 9.6 and have lots of implementation-defined behavior, but I don’t see anything about signedness.  The ABI (e.g., [2]) defines whether char is signed or unsigned.

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3485.pdf
[2]: http://www.cs.tufts.edu/comp/40/readings/amd64-abi.pdf


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev