ubsan - active member check for unions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

ubsan - active member check for unions

Ismail Pazarbasi
Hi,

I decided to implement a sanitizer that tracks the active member, and
reports reads from inactive members.

e.g.:
  struct S {
    union {
      long l;
      char s[42];
      double d;
    };
  };
  void f() {
    S s;
    s.d = 42.0;
    if (s.l > 100) // fire here
      ;
  }

My current implementation is naïve (also possibly wrong), and doesn't
care about "special guarantee" that allows inspecting common initial
sequence. Briefly;
1. in EmitBinaryOperatorLValue, if we are storing value of a union
member, we call a ubsan runtime function (not a handler) that maps
pointer value (of the union) to a tuple/struct (of TypeDescr, active
field index, expression location that has changed the active member,
and array of field names)
2. When a union member is being loaded, a call is emitted to a ubsan
runtime function that both checks for, and reports access to inactive
members

For above example, sanitizer says:
  union-track-active-t1.cpp:11:9: runtime error: active field of union
'S::(anonymous union at union-track-active-t1.cpp:2:3)' is 'd'; access
to field 'l' is undefined
  union-track-active-t1.cpp:10:7: note: active index set to 'd' here

Runtime functions I've added:
  struct StringVec {
    unsigned NumElts;
    const char *Elts[];
  };
  struct UnionStaticData {
    TypeDescriptor *Type;
    StringVec* FieldNames;
  };

extern "C" SANITIZER_INTERFACE_ATTRIBUTE
void __ubsan_union_set_active_field_index(uptr Addr, const
UnionStaticData *Data,
                                            const SourceLocation *ActivationLoc,
                                            unsigned ActiveFieldIndex);

extern "C" SANITIZER_INTERFACE_ATTRIBUTE
void __ubsan_union_check_access(uptr Addr, const SourceLocation *Loc,
                                  unsigned FieldIndex);

For above code, IR looks like:
  // s.d = 42.0;
  store double 4.200000e+01, double* %d, align 8
  // injected by sanitizer
  // __ubsan_union_set_active_field_index(/*union addr*/&s,
  // /*type descr=*/'S::anonymous union',
  // /*field names*/{"l", "s", "d"},
  // /*expr loc=*/"file.cpp:10:7",
  // /*field index*/2);
  %1 = bitcast %union.anon* %0 to i8*
  call void @__ubsan_union_set_active_field_index(i8* %1, i8* bitcast
({ { i16, i16, [56 x i8] }*, { i32, [3 x i8*] }* }* @6 to i8*), i8*
bitcast ({ [26 x i8]*, i32, i32 }* @5 to i8*), i32 2)

  // if (s.l > 100L) ;
  %2 = getelementptr inbounds %struct.S* %s, i32 0, i32 0
  %l = bitcast %union.anon* %2 to i64*
  %3 = bitcast i64* %l to i8*
  call void @__ubsan_union_check_access(i8* %3, i8* bitcast ({ [26 x
i8]*, i32, i32 }* @7 to i8*), i32 0)
  %4 = load i64* %l, align 8
  %cmp = icmp sgt i64 %4, 100

I have a few questions regarding the overall design:
1. Do you think this is a useful check?
2. Where can I store type and field info about the union; some form of
a shadow memory or a simple array/map?
3. Should sanitizer abort immediately or continue upon detection?
4. When/how can I remove entries from ubsan shadow memory when union's
lifetime ends; perhaps in a module pass or at the end of each
function?

It's far from submission quality at the moment, that's why I am
holding back the patch. Before proceeding further, I'd like to get
some feedback as to whether this is a valuable feature, and whether I
am on the right track.

Ismail

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: ubsan - active member check for unions

Yury Gribov
On 12/15/2014 10:24 PM, Ismail Pazarbasi wrote:
>      s.d = 42.0;
>      if (s.l > 100) // fire here

Note that code like this is frequently used to convert integers to
floats so you'll get tons of false positives.  Emitting error for
accessing differently sized elements of enum may work (but should
already be handled by MSan?).

> I have a few questions regarding the overall design:
> 1. Do you think this is a useful check?

That's actually an interesting questions. It could be useful for tagged
unions although I believe programmers usually surround them with
checking asserts anyway.

> 2. Where can I store type and field info about the union; some form of
> a shadow memory or a simple array/map?

Without shadow it may be unacceptably slow in union-intensive
applications. But with shadow, it'll greatly complicate UBSan.

> 3. Should sanitizer abort immediately or continue upon detection?

AFAIK normally UBSan checks continue after error (but there's a flag to
alter this).

> 4. When/how can I remove entries from ubsan shadow memory when union's
> lifetime ends; perhaps in a module pass or at the end of each
> function?

Take a look at how ASan does this (it's not easy).

-Y

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: ubsan - active member check for unions

Nick Lewycky
On 12/16/2014 02:43 AM, Yury Gribov wrote:
> On 12/15/2014 10:24 PM, Ismail Pazarbasi wrote:
>>      s.d = 42.0;
>>      if (s.l > 100) // fire here
>
> Note that code like this is frequently used to convert integers to
> floats so you'll get tons of false positives.

True positives. The fix is to use memcpy instead.

   Emitting error for
> accessing differently sized elements of enum may work (but should
> already be handled by MSan?).
>
>> I have a few questions regarding the overall design:
>> 1. Do you think this is a useful check?
>
> That's actually an interesting questions. It could be useful for tagged
> unions although I believe programmers usually surround them with
> checking asserts anyway.

Useful, yes. It will find bugs. I haven't heard anyone clamoring for it
though.

>> 2. Where can I store type and field info about the union; some form of
>> a shadow memory or a simple array/map?
>
> Without shadow it may be unacceptably slow in union-intensive
> applications. But with shadow, it'll greatly complicate UBSan.

None of the other checks in UBSan change the ABI. You can freely link
ubsan .o files with non-ubsan .o files and the program will still work
and ubsan will generate no false positives.

With this check, that is not so. Uninstrumented code changing the active
member of a union will cause a false positive when we next read it in
instrumented code.

>> 3. Should sanitizer abort immediately or continue upon detection?
>
> AFAIK normally UBSan checks continue after error (but there's a flag to
> alter this).
>
>> 4. When/how can I remove entries from ubsan shadow memory when union's
>> lifetime ends; perhaps in a module pass or at the end of each
>> function?
>
> Take a look at how ASan does this (it's not easy).
>
> -Y
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Richard Smith-33
In reply to this post by Ismail Pazarbasi
On Mon, Dec 15, 2014 at 11:24 AM, Ismail Pazarbasi <[hidden email]> wrote:
Hi,

I decided to implement a sanitizer that tracks the active member, and
reports reads from inactive members.

e.g.:
  struct S {
    union {
      long l;
      char s[42];
      double d;
    };
  };
  void f() {
    S s;
    s.d = 42.0;
    if (s.l > 100) // fire here
      ;
  }

My current implementation is naïve (also possibly wrong), and doesn't
care about "special guarantee" that allows inspecting common initial
sequence. Briefly;
1. in EmitBinaryOperatorLValue, if we are storing value of a union
member, we call a ubsan runtime function (not a handler) that maps
pointer value (of the union) to a tuple/struct (of TypeDescr, active
field index, expression location that has changed the active member,
and array of field names)
2. When a union member is being loaded, a call is emitted to a ubsan
runtime function that both checks for, and reports access to inactive
members

I do not believe this is correct, and I do not believe it can be made to be correct. Here's why:

* In C, there is no restriction on accessing an inactive union member. The only relevant rule is the effective type rule, in C11 6.5/6, that says that you can't store to memory with one type and then load from that same memory with an incompatible type. But beware! C is deeply confused, and C11 footnote 95 says that reading a union member with the wrong effective type will produce "the appropriate part of the object representation of the value [...] reinterpreted as an object representation in the new type", and then refers the reader to another part of the standard which says no such thing.

* In C++, the only standard way to change the active member of a union object is by applying placement new to the member (and even that doesn't work in some cases). No-one actually writes code that way, so every implementation in practice allows the active union member to change in various undocumented and non-standard ways.

* It is an explicit goal of UBSan to allow some TUs to be built with it enabled and some to be built with it disabled. This check does not and cannot support that, because the active union member / effective type can change in uninstrumented code. So this does not belong in UBSan.

However...

Implementing a sanitizer for C's effective type rule seems like a good idea. Such a sanitizer would also be correct in C++ (whose rules are strictly less permissive), but not complete. The best way to approach this would be to provide shadow memory indicating the effective type of real memory. In order to keep things simple, you'd probably want to ignore aggregate types and simply track the primitive effective type of memory.

Up to compatibility, there are not many such types (bool, char, short, int, long, long long, float, double, long double, and some language extensions), so you can probably get away with a 4 bit representation of the type. Further, if a byte has effective type of (signed/unsigned/plain) char, the adjacent byte must have that type too (due to alignment restrictions), so you can use the same type representation for each pair of bytes. There are likely to be enough spare values that you can separately encode [4 char], [2 char][short], [short][2 char], and [short][short], which would let you use a single macrotype for each 4 byte chunk of memory. That gives only a 12.5% memory overhead, which seems very respectable.

For above example, sanitizer says:
  union-track-active-t1.cpp:11:9: runtime error: active field of union
'S::(anonymous union at union-track-active-t1.cpp:2:3)' is 'd'; access
to field 'l' is undefined
  union-track-active-t1.cpp:10:7: note: active index set to 'd' here

Runtime functions I've added:
  struct StringVec {
    unsigned NumElts;
    const char *Elts[];
  };
  struct UnionStaticData {
    TypeDescriptor *Type;
    StringVec* FieldNames;
  };

extern "C" SANITIZER_INTERFACE_ATTRIBUTE
void __ubsan_union_set_active_field_index(uptr Addr, const
UnionStaticData *Data,
                                            const SourceLocation *ActivationLoc,
                                            unsigned ActiveFieldIndex);

extern "C" SANITIZER_INTERFACE_ATTRIBUTE
void __ubsan_union_check_access(uptr Addr, const SourceLocation *Loc,
                                  unsigned FieldIndex);

For above code, IR looks like:
  // s.d = 42.0;
  store double 4.200000e+01, double* %d, align 8
  // injected by sanitizer
  // __ubsan_union_set_active_field_index(/*union addr*/&s,
  // /*type descr=*/'S::anonymous union',
  // /*field names*/{"l", "s", "d"},
  // /*expr loc=*/"file.cpp:10:7",
  // /*field index*/2);
  %1 = bitcast %union.anon* %0 to i8*
  call void @__ubsan_union_set_active_field_index(i8* %1, i8* bitcast
({ { i16, i16, [56 x i8] }*, { i32, [3 x i8*] }* }* @6 to i8*), i8*
bitcast ({ [26 x i8]*, i32, i32 }* @5 to i8*), i32 2)

  // if (s.l > 100L) ;
  %2 = getelementptr inbounds %struct.S* %s, i32 0, i32 0
  %l = bitcast %union.anon* %2 to i64*
  %3 = bitcast i64* %l to i8*
  call void @__ubsan_union_check_access(i8* %3, i8* bitcast ({ [26 x
i8]*, i32, i32 }* @7 to i8*), i32 0)
  %4 = load i64* %l, align 8
  %cmp = icmp sgt i64 %4, 100

I have a few questions regarding the overall design:
1. Do you think this is a useful check?
2. Where can I store type and field info about the union; some form of
a shadow memory or a simple array/map?
3. Should sanitizer abort immediately or continue upon detection?
4. When/how can I remove entries from ubsan shadow memory when union's
lifetime ends; perhaps in a module pass or at the end of each
function?

It's far from submission quality at the moment, that's why I am
holding back the patch. Before proceeding further, I'd like to get
some feedback as to whether this is a valuable feature, and whether I
am on the right track.

Ismail

_______________________________________________
cfe-dev mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: ubsan - active member check for unions

Yury Gribov
In reply to this post by Nick Lewycky
On 12/19/2014 01:14 AM, Nick Lewycky wrote:
> On 12/16/2014 02:43 AM, Yury Gribov wrote:
>> On 12/15/2014 10:24 PM, Ismail Pazarbasi wrote:
>>>      s.d = 42.0;
>>>      if (s.l > 100) // fire here
>>
>> Note that code like this is frequently used to convert integers to
>> floats so you'll get tons of false positives.
>
> True positives. The fix is to use memcpy instead.

Hm, I thought C aliasing rules explicitly allow changing types through
unions. Anyway, the pattern is so widespread that IMHO most maintainers
will find such errors useless.

-Y

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Richard Smith-33
On Thu, Dec 18, 2014 at 11:05 PM, Yury Gribov <[hidden email]> wrote:
On 12/19/2014 01:14 AM, Nick Lewycky wrote:
On 12/16/2014 02:43 AM, Yury Gribov wrote:
On 12/15/2014 10:24 PM, Ismail Pazarbasi wrote:
     s.d = 42.0;
     if (s.l > 100) // fire here

Note that code like this is frequently used to convert integers to
floats so you'll get tons of false positives.

True positives. The fix is to use memcpy instead.

Hm, I thought C aliasing rules explicitly allow changing types through unions.

See my previous email; the kindest thing I can say about how C treats aliasing through unions is that it is confused.
 
Anyway, the pattern is so widespread that IMHO most maintainers will find such errors useless.

Well, given that this is the bug that the sanitizer is built to detect, such maintainers should not turn it on.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Joerg Sonnenberger
On Fri, Dec 19, 2014 at 06:59:48PM -0800, Richard Smith wrote:
> On Thu, Dec 18, 2014 at 11:05 PM, Yury Gribov <[hidden email]> wrote:
> > Anyway, the pattern is so widespread that IMHO most maintainers will find
> > such errors useless.
>
>
> Well, given that this is the bug that the sanitizer is built to detect,
> such maintainers should not turn it on.

I agree with Yury that there are a few cases where it is the standard
approach and flagging it as error doesn't make sense. Question is
whether we can enumerate those cases and white list them. I know unions
are used in mathematical code for accesing floats as int and vice versa,
not sure about any thing else.

Joerg
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Caldarale, Charles R
> From: [hidden email] [mailto:[hidden email]]
> On Behalf Of Joerg Sonnenberger
> Subject: Re: [LLVMdev] [cfe-dev] ubsan - active member check for unions

> On Fri, Dec 19, 2014 at 06:59:48PM -0800, Richard Smith wrote:
> > On Thu, Dec 18, 2014 at 11:05 PM, Yury Gribov <[hidden email]> wrote:
> > > Anyway, the pattern is so widespread that IMHO most maintainers will find
> > > such errors useless.

> > Well, given that this is the bug that the sanitizer is built to detect,
> > such maintainers should not turn it on.

> I agree with Yury that there are a few cases where it is the standard
> approach and flagging it as error doesn't make sense. Question is
> whether we can enumerate those cases and white list them. I know unions
> are used in mathematical code for accesing floats as int and vice versa,
> not sure about any thing else.

Often used when dealing with bare hardware and the types of fields are dependent on other information, so several are folded into one union.

 - Chuck


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Alex Rosenberg
And SIMD vector types.

Every compiler I've worked on exempted union trickery from alias analysis assumptions for compatibility reasons.

That said, more diagnostics, including optional sanitizer checks is welcome here.

Alex

On Dec 20, 2014, at 5:39 AM, Caldarale, Charles R <[hidden email]> wrote:

>> From: [hidden email] [mailto:[hidden email]]
>> On Behalf Of Joerg Sonnenberger
>> Subject: Re: [LLVMdev] [cfe-dev] ubsan - active member check for unions
>
>>> On Fri, Dec 19, 2014 at 06:59:48PM -0800, Richard Smith wrote:
>>>> On Thu, Dec 18, 2014 at 11:05 PM, Yury Gribov <[hidden email]> wrote:
>>>> Anyway, the pattern is so widespread that IMHO most maintainers will find
>>>> such errors useless.
>
>>> Well, given that this is the bug that the sanitizer is built to detect,
>>> such maintainers should not turn it on.
>
>> I agree with Yury that there are a few cases where it is the standard
>> approach and flagging it as error doesn't make sense. Question is
>> whether we can enumerate those cases and white list them. I know unions
>> are used in mathematical code for accesing floats as int and vice versa,
>> not sure about any thing else.
>
> Often used when dealing with bare hardware and the types of fields are dependent on other information, so several are folded into one union.
>
> - Chuck
>
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] ubsan - active member check for unions

Ismail Pazarbasi
In reply to this post by Richard Smith-33
On Fri, Dec 19, 2014 at 2:26 AM, Richard Smith <[hidden email]> wrote:

> On Mon, Dec 15, 2014 at 11:24 AM, Ismail Pazarbasi
> <[hidden email]> wrote:
>>
>> Hi,
>>
>> I decided to implement a sanitizer that tracks the active member, and
>> reports reads from inactive members.
>>
>> e.g.:
>>   struct S {
>>     union {
>>       long l;
>>       char s[42];
>>       double d;
>>     };
>>   };
>>   void f() {
>>     S s;
>>     s.d = 42.0;
>>     if (s.l > 100) // fire here
>>       ;
>>   }
>>
>> My current implementation is naïve (also possibly wrong), and doesn't
>> care about "special guarantee" that allows inspecting common initial
>> sequence. Briefly;
>> 1. in EmitBinaryOperatorLValue, if we are storing value of a union
>> member, we call a ubsan runtime function (not a handler) that maps
>> pointer value (of the union) to a tuple/struct (of TypeDescr, active
>> field index, expression location that has changed the active member,
>> and array of field names)
>> 2. When a union member is being loaded, a call is emitted to a ubsan
>> runtime function that both checks for, and reports access to inactive
>> members
>
>
> I do not believe this is correct, and I do not believe it can be made to be
> correct. Here's why:
>
> * In C, there is no restriction on accessing an inactive union member. The
> only relevant rule is the effective type rule, in C11 6.5/6, that says that
> you can't store to memory with one type and then load from that same memory
> with an incompatible type. But beware! C is deeply confused, and C11
> footnote 95 says that reading a union member with the wrong effective type
> will produce "the appropriate part of the object representation of the value
> [...] reinterpreted as an object representation in the new type", and then
> refers the reader to another part of the standard which says no such thing.
>
> * In C++, the only standard way to change the active member of a union
> object is by applying placement new to the member (and even that doesn't
> work in some cases). No-one actually writes code that way, so every
> implementation in practice allows the active union member to change in
> various undocumented and non-standard ways.
>
> * It is an explicit goal of UBSan to allow some TUs to be built with it
> enabled and some to be built with it disabled. This check does not and
> cannot support that, because the active union member / effective type can
> change in uninstrumented code. So this does not belong in UBSan.
The primary reason I thought ubsan would be the right place is because
type mismatch check is implemented in ubsan. I also thought about
asking whether ubsan is the right place for this check. According to
my interpretation of the discussions in -core reflector, C DR236, and
in UB list, reasons of undefined-ness of this case are:
a) lifetime of inactive members seems like to end when a member
becomes active (C++-only)
b) multiple subobjects with different type point to the same storage
(C, and C++)

For reference (C++ N3797):
9.5/p1: In a union, at most one of the non-static data members can be
active at any time, that is, the value of at most one of the
non-static data members can be stored in a union at any time. ... The
size of a union is sufficient to contain the largest of its non-static
data members. Each non-static data member is allocated as if it were
the sole member of a struct.

Per your message 25690 to -core reflector, given a union:
  U { int i; float f; } u;
both members' lifetimes will begin when storage for `u` is allocated,
the union has no active member.

 u.i = 0;
means lifetime of `u.f` will end immediately, per 3.8/p1 [...]the
storage which the object occupies is reused[...] So:
 u.f = 0;
is undefined behavior. But:
 new (&u.f) float(0);
is well-defined.

>
> However...
>
> Implementing a sanitizer for C's effective type rule seems like a good idea.
> Such a sanitizer would also be correct in C++ (whose rules are strictly less
> permissive), but not complete. The best way to approach this would be to
> provide shadow memory indicating the effective type of real memory. In order
> to keep things simple, you'd probably want to ignore aggregate types and
> simply track the primitive effective type of memory.
>
> Up to compatibility, there are not many such types (bool, char, short, int,
> long, long long, float, double, long double, and some language extensions),
> so you can probably get away with a 4 bit representation of the type.
> Further, if a byte has effective type of (signed/unsigned/plain) char, the
> adjacent byte must have that type too (due to alignment restrictions), so
> you can use the same type representation for each pair of bytes. There are
> likely to be enough spare values that you can separately encode [4 char], [2
> char][short], [short][2 char], and [short][short], which would let you use a
> single macrotype for each 4 byte chunk of memory. That gives only a 12.5%
> memory overhead, which seems very respectable.
Thanks for the idea!

Which sanitizer do you think fits best for this check? :)

>
>> For above example, sanitizer says:
>>   union-track-active-t1.cpp:11:9: runtime error: active field of union
>> 'S::(anonymous union at union-track-active-t1.cpp:2:3)' is 'd'; access
>> to field 'l' is undefined
>>   union-track-active-t1.cpp:10:7: note: active index set to 'd' here
>>
>> Runtime functions I've added:
>>   struct StringVec {
>>     unsigned NumElts;
>>     const char *Elts[];
>>   };
>>   struct UnionStaticData {
>>     TypeDescriptor *Type;
>>     StringVec* FieldNames;
>>   };
>>
>> extern "C" SANITIZER_INTERFACE_ATTRIBUTE
>> void __ubsan_union_set_active_field_index(uptr Addr, const
>> UnionStaticData *Data,
>>                                             const SourceLocation
>> *ActivationLoc,
>>                                             unsigned ActiveFieldIndex);
>>
>> extern "C" SANITIZER_INTERFACE_ATTRIBUTE
>> void __ubsan_union_check_access(uptr Addr, const SourceLocation *Loc,
>>                                   unsigned FieldIndex);
>>
>> For above code, IR looks like:
>>   // s.d = 42.0;
>>   store double 4.200000e+01, double* %d, align 8
>>   // injected by sanitizer
>>   // __ubsan_union_set_active_field_index(/*union addr*/&s,
>>   // /*type descr=*/'S::anonymous union',
>>   // /*field names*/{"l", "s", "d"},
>>   // /*expr loc=*/"file.cpp:10:7",
>>   // /*field index*/2);
>>   %1 = bitcast %union.anon* %0 to i8*
>>   call void @__ubsan_union_set_active_field_index(i8* %1, i8* bitcast
>> ({ { i16, i16, [56 x i8] }*, { i32, [3 x i8*] }* }* @6 to i8*), i8*
>> bitcast ({ [26 x i8]*, i32, i32 }* @5 to i8*), i32 2)
>>
>>   // if (s.l > 100L) ;
>>   %2 = getelementptr inbounds %struct.S* %s, i32 0, i32 0
>>   %l = bitcast %union.anon* %2 to i64*
>>   %3 = bitcast i64* %l to i8*
>>   call void @__ubsan_union_check_access(i8* %3, i8* bitcast ({ [26 x
>> i8]*, i32, i32 }* @7 to i8*), i32 0)
>>   %4 = load i64* %l, align 8
>>   %cmp = icmp sgt i64 %4, 100
>>
>> I have a few questions regarding the overall design:
>> 1. Do you think this is a useful check?
>> 2. Where can I store type and field info about the union; some form of
>> a shadow memory or a simple array/map?
>> 3. Should sanitizer abort immediately or continue upon detection?
>> 4. When/how can I remove entries from ubsan shadow memory when union's
>> lifetime ends; perhaps in a module pass or at the end of each
>> function?
>>
>> It's far from submission quality at the moment, that's why I am
>> holding back the patch. Before proceeding further, I'd like to get
>> some feedback as to whether this is a valuable feature, and whether I
>> am on the right track.
>>
>> Ismail
>>
>> _______________________________________________
>> cfe-dev mailing list
>> [hidden email]
>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev