Re: [cfe-dev] UB in TypeLoc casting

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

David Blaikie
Moving to LLVM dev to discuss the possibility of extending the cast
infrastructure to handle this.

On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:

> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>> TypeLoc casting looks bogus.
>>
>> TypeLoc derived types return true from classof when the dynamic type
>> is not actually an instance of the derived type. The TypeLoc is then
>> (erroneously) cast to the derived type and the derived type's implicit
>> copy ctor is executed (so long as you remember to assign the result of
>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>> member functions are actually invoked on a TypeLoc object - bogas, but
>> it works (because there's no other members in the SpecificTypeLoc
>> types).
>
> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common and
> essentially harmless, but if you want to stamp it out, feel free.
>
>> Richard / Kostya: what's the word on catching this kind of UB
>> (essentially: calling member functions of one type on an instance not
>> of that type)? (in the specific case mentioned below, as discussed in
>> the original thread, ASan or MSan might be able to help)
>>
>> Clang Dev: what should we do to fix this? I don't think it's helpful
>> to add machinery to llvm::cast to handle concrete type conversion, so
>> I think we should consider a non-llvm::cast solution for this
>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>> ctor for every specific TypeLoc that has an appropriate assert & calls
>> up through to the base copy ctor. It'll be a fair bit of code to add
>> to TypeLoc, but nothing complicated.
>>
>> Any other ideas?
>
> I don't see why isa<> and cast<> aren't the right code interface for this,
> rather than reinventing something else.  You just need to teach cast<>
> to construct and return a proper temporary.

LLVM-dev folks: what do you reckon? Should we extend the cast
infrastructure to be able to handle value type conversions?

It doesn't really feel like the right fit to me, but I'll defer to the
community's wisdom if it's overwhelming. I'd just like to see a
solution wherein we can't write this particular kind of bug again,
ideally.

> You should construct that temporary from a QualType and void*.
> Please do not add a constructor which basically breaks the static
> type safety of the entire hierarchy.

[fair point - bad suggestion on my part, but the key point was: some
solution that doesn't use the cast infrastructure because these are
value type conversions not pointer/reference conversions for objects
that are already of the intended destination type]
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

Eli Friedman-2
On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:

> Moving to LLVM dev to discuss the possibility of extending the cast
> infrastructure to handle this.
>
> On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>>> TypeLoc casting looks bogus.
>>>
>>> TypeLoc derived types return true from classof when the dynamic type
>>> is not actually an instance of the derived type. The TypeLoc is then
>>> (erroneously) cast to the derived type and the derived type's implicit
>>> copy ctor is executed (so long as you remember to assign the result of
>>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>>> member functions are actually invoked on a TypeLoc object - bogas, but
>>> it works (because there's no other members in the SpecificTypeLoc
>>> types).
>>
>> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common and
>> essentially harmless, but if you want to stamp it out, feel free.
>>
>>> Richard / Kostya: what's the word on catching this kind of UB
>>> (essentially: calling member functions of one type on an instance not
>>> of that type)? (in the specific case mentioned below, as discussed in
>>> the original thread, ASan or MSan might be able to help)
>>>
>>> Clang Dev: what should we do to fix this? I don't think it's helpful
>>> to add machinery to llvm::cast to handle concrete type conversion, so
>>> I think we should consider a non-llvm::cast solution for this
>>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>>> ctor for every specific TypeLoc that has an appropriate assert & calls
>>> up through to the base copy ctor. It'll be a fair bit of code to add
>>> to TypeLoc, but nothing complicated.
>>>
>>> Any other ideas?
>>
>> I don't see why isa<> and cast<> aren't the right code interface for this,
>> rather than reinventing something else.  You just need to teach cast<>
>> to construct and return a proper temporary.
>
> LLVM-dev folks: what do you reckon? Should we extend the cast
> infrastructure to be able to handle value type conversions?
No, I don't think that makes sense; the casting infrastructure is
complicated enough without having to deal with this.  It would be easy
enough to implement a method on TypeLoc to handle this,

> It doesn't really feel like the right fit to me, but I'll defer to the
> community's wisdom if it's overwhelming. I'd just like to see a
> solution wherein we can't write this particular kind of bug again,
> ideally.

Attaching proof-of-concept solution which prevents this kind of bug.
(Ugly, and requires C++11 support as written, but potentially
interesting anyway.)

-Eli

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

casthack.txt (10K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

Richard Smith-33
On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <[hidden email]> wrote:
On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:
> Moving to LLVM dev to discuss the possibility of extending the cast
> infrastructure to handle this.
>
> On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>>> TypeLoc casting looks bogus.
>>>
>>> TypeLoc derived types return true from classof when the dynamic type
>>> is not actually an instance of the derived type. The TypeLoc is then
>>> (erroneously) cast to the derived type and the derived type's implicit
>>> copy ctor is executed (so long as you remember to assign the result of
>>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>>> member functions are actually invoked on a TypeLoc object - bogas, but
>>> it works (because there's no other members in the SpecificTypeLoc
>>> types).
>>
>> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common and
>> essentially harmless, but if you want to stamp it out, feel free.
>>
>>> Richard / Kostya: what's the word on catching this kind of UB
>>> (essentially: calling member functions of one type on an instance not
>>> of that type)? (in the specific case mentioned below, as discussed in
>>> the original thread, ASan or MSan might be able to help)
>>>
>>> Clang Dev: what should we do to fix this? I don't think it's helpful
>>> to add machinery to llvm::cast to handle concrete type conversion, so
>>> I think we should consider a non-llvm::cast solution for this
>>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>>> ctor for every specific TypeLoc that has an appropriate assert & calls
>>> up through to the base copy ctor. It'll be a fair bit of code to add
>>> to TypeLoc, but nothing complicated.
>>>
>>> Any other ideas?
>>
>> I don't see why isa<> and cast<> aren't the right code interface for this,
>> rather than reinventing something else.  You just need to teach cast<>
>> to construct and return a proper temporary.
>
> LLVM-dev folks: what do you reckon? Should we extend the cast
> infrastructure to be able to handle value type conversions?

No, I don't think that makes sense; the casting infrastructure is
complicated enough without having to deal with this.  It would be easy
enough to implement a method on TypeLoc to handle this,

> It doesn't really feel like the right fit to me, but I'll defer to the
> community's wisdom if it's overwhelming. I'd just like to see a
> solution wherein we can't write this particular kind of bug again,
> ideally.

Attaching proof-of-concept solution which prevents this kind of bug.
(Ugly, and requires C++11 support as written, but potentially
interesting anyway.)

The C++11 should be easy to remove (move the enable_if to the return type). This approach seems reasonable to me.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

David Blaikie
On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <[hidden email]> wrote:

> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <[hidden email]>
> wrote:
>>
>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:
>> > Moving to LLVM dev to discuss the possibility of extending the cast
>> > infrastructure to handle this.
>> >
>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>> >>> TypeLoc casting looks bogus.
>> >>>
>> >>> TypeLoc derived types return true from classof when the dynamic type
>> >>> is not actually an instance of the derived type. The TypeLoc is then
>> >>> (erroneously) cast to the derived type and the derived type's implicit
>> >>> copy ctor is executed (so long as you remember to assign the result of
>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>> >>> member functions are actually invoked on a TypeLoc object - bogas, but
>> >>> it works (because there's no other members in the SpecificTypeLoc
>> >>> types).
>> >>
>> >> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common
>> >> and
>> >> essentially harmless, but if you want to stamp it out, feel free.
>> >>
>> >>> Richard / Kostya: what's the word on catching this kind of UB
>> >>> (essentially: calling member functions of one type on an instance not
>> >>> of that type)? (in the specific case mentioned below, as discussed in
>> >>> the original thread, ASan or MSan might be able to help)
>> >>>
>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful
>> >>> to add machinery to llvm::cast to handle concrete type conversion, so
>> >>> I think we should consider a non-llvm::cast solution for this
>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls
>> >>> up through to the base copy ctor. It'll be a fair bit of code to add
>> >>> to TypeLoc, but nothing complicated.
>> >>>
>> >>> Any other ideas?
>> >>
>> >> I don't see why isa<> and cast<> aren't the right code interface for
>> >> this,
>> >> rather than reinventing something else.  You just need to teach cast<>
>> >> to construct and return a proper temporary.
>> >
>> > LLVM-dev folks: what do you reckon? Should we extend the cast
>> > infrastructure to be able to handle value type conversions?
>>
>> No, I don't think that makes sense; the casting infrastructure is
>> complicated enough without having to deal with this.  It would be easy
>> enough to implement a method on TypeLoc to handle this,
>>
>> > It doesn't really feel like the right fit to me, but I'll defer to the
>> > community's wisdom if it's overwhelming. I'd just like to see a
>> > solution wherein we can't write this particular kind of bug again,
>> > ideally.
>>
>> Attaching proof-of-concept solution which prevents this kind of bug.
>> (Ugly, and requires C++11 support as written, but potentially
>> interesting anyway.)
>
>
> The C++11 should be easy to remove (move the enable_if to the return type).
> This approach seems reasonable to me.

Thanks Eli - that does seem to do the trick (took me a little while to
wrap my head around it, though). I've modified it, as Richard
suggested, to use enable_if in the return type & it builds in C++98
and correctly catches the TypeLoc usage.

I'll have to find some time to actually go & fix TypeLoc before we can
submit this, though.

(& technically we could allow casting where the parameter type is a
temporary, so long as you only cast to the same type - but that's not
much use. I wonder if we should trivially disallow casting when the
source/dest type are the same anyway? But I suppose there's no harm in
allowing that either, except that it might confuse the reader a bit
when they see a no-op cast like that)

- David
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

Richard Smith-33
On Wed, Dec 5, 2012 at 1:06 PM, David Blaikie <[hidden email]> wrote:
On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <[hidden email]> wrote:
> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <[hidden email]>
> wrote:
>>
>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:
>> > Moving to LLVM dev to discuss the possibility of extending the cast
>> > infrastructure to handle this.
>> >
>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>> >>> TypeLoc casting looks bogus.
>> >>>
>> >>> TypeLoc derived types return true from classof when the dynamic type
>> >>> is not actually an instance of the derived type. The TypeLoc is then
>> >>> (erroneously) cast to the derived type and the derived type's implicit
>> >>> copy ctor is executed (so long as you remember to assign the result of
>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>> >>> member functions are actually invoked on a TypeLoc object - bogas, but
>> >>> it works (because there's no other members in the SpecificTypeLoc
>> >>> types).
>> >>
>> >> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common
>> >> and
>> >> essentially harmless, but if you want to stamp it out, feel free.
>> >>
>> >>> Richard / Kostya: what's the word on catching this kind of UB
>> >>> (essentially: calling member functions of one type on an instance not
>> >>> of that type)? (in the specific case mentioned below, as discussed in
>> >>> the original thread, ASan or MSan might be able to help)
>> >>>
>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful
>> >>> to add machinery to llvm::cast to handle concrete type conversion, so
>> >>> I think we should consider a non-llvm::cast solution for this
>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls
>> >>> up through to the base copy ctor. It'll be a fair bit of code to add
>> >>> to TypeLoc, but nothing complicated.
>> >>>
>> >>> Any other ideas?
>> >>
>> >> I don't see why isa<> and cast<> aren't the right code interface for
>> >> this,
>> >> rather than reinventing something else.  You just need to teach cast<>
>> >> to construct and return a proper temporary.
>> >
>> > LLVM-dev folks: what do you reckon? Should we extend the cast
>> > infrastructure to be able to handle value type conversions?
>>
>> No, I don't think that makes sense; the casting infrastructure is
>> complicated enough without having to deal with this.  It would be easy
>> enough to implement a method on TypeLoc to handle this,
>>
>> > It doesn't really feel like the right fit to me, but I'll defer to the
>> > community's wisdom if it's overwhelming. I'd just like to see a
>> > solution wherein we can't write this particular kind of bug again,
>> > ideally.
>>
>> Attaching proof-of-concept solution which prevents this kind of bug.
>> (Ugly, and requires C++11 support as written, but potentially
>> interesting anyway.)
>
>
> The C++11 should be easy to remove (move the enable_if to the return type).
> This approach seems reasonable to me.

Thanks Eli - that does seem to do the trick (took me a little while to
wrap my head around it, though). I've modified it, as Richard
suggested, to use enable_if in the return type & it builds in C++98
and correctly catches the TypeLoc usage.

I'll have to find some time to actually go & fix TypeLoc before we can
submit this, though.

(& technically we could allow casting where the parameter type is a
temporary, so long as you only cast to the same type - but that's not
much use. I wonder if we should trivially disallow casting when the
source/dest type are the same anyway? But I suppose there's no harm in
allowing that either, except that it might confuse the reader a bit
when they see a no-op cast like that) 

ISTR we have cases of no-op casts in templates and in code generated by x macros. These seem reasonably harmless.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

David Blaikie
In reply to this post by David Blaikie
On Wed, Dec 5, 2012 at 1:06 PM, David Blaikie <[hidden email]> wrote:

> On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <[hidden email]> wrote:
>> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <[hidden email]>
>> wrote:
>>>
>>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:
>>> > Moving to LLVM dev to discuss the possibility of extending the cast
>>> > infrastructure to handle this.
>>> >
>>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>>> >>> TypeLoc casting looks bogus.
>>> >>>
>>> >>> TypeLoc derived types return true from classof when the dynamic type
>>> >>> is not actually an instance of the derived type. The TypeLoc is then
>>> >>> (erroneously) cast to the derived type and the derived type's implicit
>>> >>> copy ctor is executed (so long as you remember to assign the result of
>>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>>> >>> member functions are actually invoked on a TypeLoc object - bogas, but
>>> >>> it works (because there's no other members in the SpecificTypeLoc
>>> >>> types).
>>> >>
>>> >> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common
>>> >> and
>>> >> essentially harmless, but if you want to stamp it out, feel free.
>>> >>
>>> >>> Richard / Kostya: what's the word on catching this kind of UB
>>> >>> (essentially: calling member functions of one type on an instance not
>>> >>> of that type)? (in the specific case mentioned below, as discussed in
>>> >>> the original thread, ASan or MSan might be able to help)
>>> >>>
>>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful
>>> >>> to add machinery to llvm::cast to handle concrete type conversion, so
>>> >>> I think we should consider a non-llvm::cast solution for this
>>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls
>>> >>> up through to the base copy ctor. It'll be a fair bit of code to add
>>> >>> to TypeLoc, but nothing complicated.
>>> >>>
>>> >>> Any other ideas?
>>> >>
>>> >> I don't see why isa<> and cast<> aren't the right code interface for
>>> >> this,
>>> >> rather than reinventing something else.  You just need to teach cast<>
>>> >> to construct and return a proper temporary.
>>> >
>>> > LLVM-dev folks: what do you reckon? Should we extend the cast
>>> > infrastructure to be able to handle value type conversions?
>>>
>>> No, I don't think that makes sense; the casting infrastructure is
>>> complicated enough without having to deal with this.  It would be easy
>>> enough to implement a method on TypeLoc to handle this,
>>>
>>> > It doesn't really feel like the right fit to me, but I'll defer to the
>>> > community's wisdom if it's overwhelming. I'd just like to see a
>>> > solution wherein we can't write this particular kind of bug again,
>>> > ideally.
>>>
>>> Attaching proof-of-concept solution which prevents this kind of bug.
>>> (Ugly, and requires C++11 support as written, but potentially
>>> interesting anyway.)
>>
>>
>> The C++11 should be easy to remove (move the enable_if to the return type).
>> This approach seems reasonable to me.
>
> Thanks Eli - that does seem to do the trick (took me a little while to
> wrap my head around it, though). I've modified it, as Richard
> suggested, to use enable_if in the return type & it builds in C++98
> and correctly catches the TypeLoc usage.
>
> I'll have to find some time to actually go & fix TypeLoc before we can
> submit this, though.
>
> (& technically we could allow casting where the parameter type is a
> temporary, so long as you only cast to the same type - but that's not
> much use. I wonder if we should trivially disallow casting when the
> source/dest type are the same anyway? But I suppose there's no harm in
> allowing that either, except that it might confuse the reader a bit
> when they see a no-op cast like that)

Looking into this I've fixed a few other things that were a little
broken by the looseness of llvm::cast & friends (see r174853). I've
been prototyping a fix to TypeLoc specifically, hit some hiccups in
ASTMatchers (probably surmountable - Manuel helped me with r174862
(might need one other change there)) but otherwise seems achievable.

Beyond that, though, I've hit one hierarchy in the Static Analyzer
that does this as well: ProgramPoint. On IRC Jordan Rose mentioned
that there's another more pervasive use of this pattern in the Static
Analyzer, the SVal hierarchy.

So, Ted, how objectionable would it be for me to introduce something
like (names subject to adjustment):

template<typename T>
T SVal::castAs();

template<typename T>
llvm::Optional<T> SVal::getAs();

(the implementations of these functions might involve invoking private
FooSVal(SVal) ctors in each SVal derived type - so adding a bit of
boilerplate ctors to all those classes)

and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and
dyn_cast with getAs similarly? & similarly for the ProgramPoint (&
TypeLoc - which I'm already working on) hierarchies?


Full disclosure:
Obviously the more important part of this is fixing the easily-written
bugs such as r174862 & that doesn't necessarily require us to fix the
UB, nor to migrate these APIs away from the LLVM RTTI infrastructure.

We could, for example, add value-type-conversion support to the LLVM
RTTI infrastructure that checks that if you pass a temporary it would
explicitly do a value conversion and return a value not a
reference/pointer. (this wouldn't scale to dyn_cast, though - that'd
have to return Optional<T> not T* since there's no T object for a T*
to point to - we could still, technically, wedge this in to dyn_cast
but that might be a bit much) And we could do this without fixing the
UB type punning here (still reinterpret_casting the base to the
derived type & using the implicit copy ctor) or while fixing the UB
(require the derived types to have derived(base) ctors) but keeping
the cast API usage the same.

It's my personal opinion (& see Eli's agreement above) that it's best
just to leave this out of the LLVM RTTI infrastructure entirely & have
these hierarchies handle it themselves.

Thanks,
- David
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

David Blaikie
On Mon, Feb 11, 2013 at 12:02 PM, David Blaikie <[hidden email]> wrote:

> On Wed, Dec 5, 2012 at 1:06 PM, David Blaikie <[hidden email]> wrote:
>> On Thu, Nov 29, 2012 at 8:16 PM, Richard Smith <[hidden email]> wrote:
>>> On Thu, Nov 29, 2012 at 7:43 PM, Eli Friedman <[hidden email]>
>>> wrote:
>>>>
>>>> On Thu, Nov 29, 2012 at 3:49 PM, David Blaikie <[hidden email]> wrote:
>>>> > Moving to LLVM dev to discuss the possibility of extending the cast
>>>> > infrastructure to handle this.
>>>> >
>>>> > On Tue, Nov 20, 2012 at 5:51 PM, John McCall <[hidden email]> wrote:
>>>> >> On Nov 18, 2012, at 5:05 PM, David Blaikie <[hidden email]> wrote:
>>>> >>> TypeLoc casting looks bogus.
>>>> >>>
>>>> >>> TypeLoc derived types return true from classof when the dynamic type
>>>> >>> is not actually an instance of the derived type. The TypeLoc is then
>>>> >>> (erroneously) cast to the derived type and the derived type's implicit
>>>> >>> copy ctor is executed (so long as you remember to assign the result of
>>>> >>> cast<SpecificTypeLoc>), if you copy, otherwise the SpecificTypeLoc's
>>>> >>> member functions are actually invoked on a TypeLoc object - bogas, but
>>>> >>> it works (because there's no other members in the SpecificTypeLoc
>>>> >>> types).
>>>> >>
>>>> >> Yep.  See also LLVM's IntrinsicInst.  This kind of UB is very common
>>>> >> and
>>>> >> essentially harmless, but if you want to stamp it out, feel free.
>>>> >>
>>>> >>> Richard / Kostya: what's the word on catching this kind of UB
>>>> >>> (essentially: calling member functions of one type on an instance not
>>>> >>> of that type)? (in the specific case mentioned below, as discussed in
>>>> >>> the original thread, ASan or MSan might be able to help)
>>>> >>>
>>>> >>> Clang Dev: what should we do to fix this? I don't think it's helpful
>>>> >>> to add machinery to llvm::cast to handle concrete type conversion, so
>>>> >>> I think we should consider a non-llvm::cast solution for this
>>>> >>> hierarchy. Replace the llvm::isa calls with getTypeLocClass calls (or
>>>> >>> a separate wrapper for that) & add a SpecificTypeLoc(const TypeLoc&)
>>>> >>> ctor for every specific TypeLoc that has an appropriate assert & calls
>>>> >>> up through to the base copy ctor. It'll be a fair bit of code to add
>>>> >>> to TypeLoc, but nothing complicated.
>>>> >>>
>>>> >>> Any other ideas?
>>>> >>
>>>> >> I don't see why isa<> and cast<> aren't the right code interface for
>>>> >> this,
>>>> >> rather than reinventing something else.  You just need to teach cast<>
>>>> >> to construct and return a proper temporary.
>>>> >
>>>> > LLVM-dev folks: what do you reckon? Should we extend the cast
>>>> > infrastructure to be able to handle value type conversions?
>>>>
>>>> No, I don't think that makes sense; the casting infrastructure is
>>>> complicated enough without having to deal with this.  It would be easy
>>>> enough to implement a method on TypeLoc to handle this,
>>>>
>>>> > It doesn't really feel like the right fit to me, but I'll defer to the
>>>> > community's wisdom if it's overwhelming. I'd just like to see a
>>>> > solution wherein we can't write this particular kind of bug again,
>>>> > ideally.
>>>>
>>>> Attaching proof-of-concept solution which prevents this kind of bug.
>>>> (Ugly, and requires C++11 support as written, but potentially
>>>> interesting anyway.)
>>>
>>>
>>> The C++11 should be easy to remove (move the enable_if to the return type).
>>> This approach seems reasonable to me.
>>
>> Thanks Eli - that does seem to do the trick (took me a little while to
>> wrap my head around it, though). I've modified it, as Richard
>> suggested, to use enable_if in the return type & it builds in C++98
>> and correctly catches the TypeLoc usage.
>>
>> I'll have to find some time to actually go & fix TypeLoc before we can
>> submit this, though.
>>
>> (& technically we could allow casting where the parameter type is a
>> temporary, so long as you only cast to the same type - but that's not
>> much use. I wonder if we should trivially disallow casting when the
>> source/dest type are the same anyway? But I suppose there's no harm in
>> allowing that either, except that it might confuse the reader a bit
>> when they see a no-op cast like that)
>
> Looking into this I've fixed a few other things that were a little
> broken by the looseness of llvm::cast & friends (see r174853). I've
> been prototyping a fix to TypeLoc specifically, hit some hiccups in
> ASTMatchers (probably surmountable - Manuel helped me with r174862
> (might need one other change there)) but otherwise seems achievable.
>
> Beyond that, though, I've hit one hierarchy in the Static Analyzer
> that does this as well: ProgramPoint. On IRC Jordan Rose mentioned
> that there's another more pervasive use of this pattern in the Static
> Analyzer, the SVal hierarchy.
>
> So, Ted, how objectionable would it be for me to introduce something
> like (names subject to adjustment):
>
> template<typename T>
> T SVal::castAs();
>
> template<typename T>
> llvm::Optional<T> SVal::getAs();
>
> (the implementations of these functions might involve invoking private
> FooSVal(SVal) ctors in each SVal derived type - so adding a bit of
> boilerplate ctors to all those classes)

Richard Smith pointed out that this could be done without the
boilerplate with something like:

Derived d;
d.Base::operator=(base);

a bit nasty, but valid/well-defined & would avoid writing the
boilerplate delegating Derived(Base) ctors in every derived type

>
> and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and
> dyn_cast with getAs similarly? & similarly for the ProgramPoint (&
> TypeLoc - which I'm already working on) hierarchies?
>
>
> Full disclosure:
> Obviously the more important part of this is fixing the easily-written
> bugs such as r174862 & that doesn't necessarily require us to fix the
> UB, nor to migrate these APIs away from the LLVM RTTI infrastructure.
>
> We could, for example, add value-type-conversion support to the LLVM
> RTTI infrastructure that checks that if you pass a temporary it would
> explicitly do a value conversion and return a value not a
> reference/pointer. (this wouldn't scale to dyn_cast, though - that'd
> have to return Optional<T> not T* since there's no T object for a T*
> to point to - we could still, technically, wedge this in to dyn_cast
> but that might be a bit much) And we could do this without fixing the
> UB type punning here (still reinterpret_casting the base to the
> derived type & using the implicit copy ctor) or while fixing the UB
> (require the derived types to have derived(base) ctors) but keeping
> the cast API usage the same.
>
> It's my personal opinion (& see Eli's agreement above) that it's best
> just to leave this out of the LLVM RTTI infrastructure entirely & have
> these hierarchies handle it themselves.
>
> Thanks,
> - David
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

Ted Kremenek-2
In reply to this post by David Blaikie
On Feb 11, 2013, at 12:02 PM, David Blaikie <[hidden email]> wrote:

Beyond that, though, I've hit one hierarchy in the Static Analyzer
that does this as well: ProgramPoint. On IRC Jordan Rose mentioned
that there's another more pervasive use of this pattern in the Static
Analyzer, the SVal hierarchy.

So, Ted, how objectionable would it be for me to introduce something
like (names subject to adjustment):

template<typename T>
T SVal::castAs();

template<typename T>
llvm::Optional<T> SVal::getAs();

(the implementations of these functions might involve invoking private
FooSVal(SVal) ctors in each SVal derived type - so adding a bit of
boilerplate ctors to all those classes)

and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and
dyn_cast with getAs similarly? & similarly for the ProgramPoint (&
TypeLoc - which I'm already working on) hierarchies?


Sorry everyone for not seeing this.  Jordan was nice enough to ping me.  I would be perfectly fine with this change, and I think it looks cleaner anyways.  I think it is a great idea.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [cfe-dev] UB in TypeLoc casting

David Blaikie



On Fri, Feb 15, 2013 at 4:55 PM, Ted Kremenek <[hidden email]> wrote:
On Feb 11, 2013, at 12:02 PM, David Blaikie <[hidden email]> wrote:

Beyond that, though, I've hit one hierarchy in the Static Analyzer
that does this as well: ProgramPoint. On IRC Jordan Rose mentioned
that there's another more pervasive use of this pattern in the Static
Analyzer, the SVal hierarchy.

So, Ted, how objectionable would it be for me to introduce something
like (names subject to adjustment):

template<typename T>
T SVal::castAs();

template<typename T>
llvm::Optional<T> SVal::getAs();

(the implementations of these functions might involve invoking private
FooSVal(SVal) ctors in each SVal derived type - so adding a bit of
boilerplate ctors to all those classes)

and replace "cast<FooSVal>(bar)" with "bar.castAs<FooSVal>()" and
dyn_cast with getAs similarly? & similarly for the ProgramPoint (&
TypeLoc - which I'm already working on) hierarchies?


Sorry everyone for not seeing this.  Jordan was nice enough to ping me.  I would be perfectly fine with this change, and I think it looks cleaner anyways.  I think it is a great idea.

Thanks for your patience - the final change to llvm::cast & friends to disallow rvalues (based on Eli's suggestion, tweaked to work with LLVM type traits rather than C++11 traits) has been committed in r175462.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev