[llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete


Bruce Hoult via llvm-dev
Hi, everyone,

I'd like to propose adding a nofree function attribute to indicate that
a function does not, directly or indirectly, call a memory-deallocation
function (e.g., free, C++'s operator delete). Clang/LLVM can currently
misoptimize functions that:

 1. Have a reference argument.

 2. Free the memory backing the object to which the reference is bound
during the function's execution.

Because we tag, in Clang, all reference arguments using the
dereferenceable attribute, LLVM assumes that the pointer is
unconditionally dereferenceable throughout the course of the entire
function. This isn't true, however, if the memory is freed during the
execution of the function. For more information, please see the
discussion in https://reviews.llvm.org/D48239.
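
As a minimal sketch of the pattern described above (the type and function names are hypothetical and are not taken from D48239), the following function takes a reference and frees the object it is bound to before returning. The code is well defined as written, because the only read happens before the delete. The concern is that the reference parameter is lowered to a dereferenceable pointer, which LLVM currently treats as valid for the whole function body.

```cpp
// Hypothetical example: a reference parameter whose referent is freed
// during the call. Names are illustrative only.
struct Counter {
  int value;
};

// Reads the value, then destroys the object the reference is bound to.
// After the delete, `c` must not be accessed, even though Clang marks the
// underlying pointer parameter as dereferenceable.
int take_and_destroy(Counter &c) {
  int result = c.value;  // valid: the object is still alive here
  delete &c;             // the referent is gone from this point on
  return result;         // no access to `c` after the delete
}
```

Any transformation that introduces a new load from `c` after the `delete` (for example, by speculating a later conditional read) would turn this valid program into one that reads freed memory.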

To solve this problem, we need to give LLVM more information in order to
help it determine when a pointer, which is dereferenceable when the
function begins to execute, will still be dereferenceable later on in
the function's execution. This nofree attribute can be part of that
solution. If we know that free (and friends) are not called by the
function (nor by any function called by the function, and so on), then
we know that pointers that started out dereferenceable will stay that
way (except as explained below).

I'm initially proposing this to be only a function attribute, although
one could easily imagine a parameter attribute as well (that indicates
that a particular pointer argument is not freed by the function). This
might be useful, but for the use case of helping dereferenceable, it
would be subtle to use, unless the parameter was also marked as noalias,
because you'd need to know that the parameter was not also aliased with
another argument (or had not been captured). Another analysis would need
to provide this kind of information.

Also, just because a function does not, directly or indirectly, call
free does not mean that it cannot cause memory to be deallocated. The
function might communicate (synchronize) with another thread causing
that other thread to delete the memory. For this reason, to use
dereferenceable as we currently do, we also need to know that the
function does not synchronize with any other threads. To solve this
problem, I propose to add, alongside nofree, a nosynch attribute to
indicate that a function does not use (non-relaxed) atomics or otherwise
synchronize with any other threads (e.g., by performing I/O or, as a
practical matter, by using volatile accesses).

I've posted a patch for the nofree attribute
(https://reviews.llvm.org/D49165). nosynch's implementation would be
very similar (except instead of looking for calls to free, it would look
for uses of non-relaxed atomics, volatile ops, and known functions that
are not I/O functions).
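
To make the "directly or indirectly" part concrete, here is a rough sketch of how such an inference could work on a toy call graph. This is not the code from D49165 and does not use the LLVM API; the `CallGraph` type, the `inferNoFree` function, and the idea of passing deallocators as a set of names are all assumptions made for illustration. Functions that are called but not defined in the toy module are treated as possibly freeing memory.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model of a module: each defined function lists the names it calls.
// Functions that are called but not defined here are "unknown".
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns the set of defined functions that can be shown never to reach a
// deallocation function. Starts optimistic (every defined non-deallocator
// is assumed nofree) and removes functions until a fixed point is reached.
std::set<std::string> inferNoFree(const CallGraph &module,
                                  const std::set<std::string> &deallocators) {
  std::set<std::string> noFree;
  for (const auto &entry : module)
    if (!deallocators.count(entry.first))
      noFree.insert(entry.first);

  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &entry : module) {
      if (!noFree.count(entry.first))
        continue;
      for (const std::string &callee : entry.second) {
        // A callee disqualifies its caller if it is a known deallocator or
        // is not (or is no longer) known to be nofree. Unknown external
        // functions fall into the second case.
        if (deallocators.count(callee) || !noFree.count(callee)) {
          noFree.erase(entry.first);
          changed = true;
          break;
        }
      }
    }
  }
  return noFree;
}
```

The optimistic starting point lets self-recursive functions keep the property when nothing they call can free. A nosynch inference would presumably have the same shape, with a different per-function disqualifying condition.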

With both of these attributes (nofree and nosynch), a function argument
with the dereferenceable attribute will be known to be dereferenceable
throughout the execution of the attributed function. We can update
isDereferenceableAndAlignedPointer to include these additional checks on
the current function.
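
The additional check could look roughly like the following. This is a toy model under stated assumptions, not the real isDereferenceableAndAlignedPointer: the `FunctionAttrs` and `ArgumentAttrs` structs, the `isDereferenceableAt` function, and the `atEntry` flag are all hypothetical names introduced only to show the logic.

```cpp
// Toy model, not the LLVM API: the attributes relevant to the proposal.
struct FunctionAttrs {
  bool noFree;   // never calls a deallocation function, even indirectly
  bool noSynch;  // never synchronizes with another thread
};

struct ArgumentAttrs {
  unsigned long dereferenceableBytes;  // 0 if the attribute is absent
};

// Under the proposal, an argument's dereferenceable bytes can be relied on
// at an arbitrary point in the function only if the function is both nofree
// and nosynch. At function entry the attribute alone suffices.
bool isDereferenceableAt(const FunctionAttrs &fn, const ArgumentAttrs &arg,
                         unsigned long bytesNeeded, bool atEntry) {
  if (arg.dereferenceableBytes < bytesNeeded)
    return false;
  if (atEntry)
    return true;
  return fn.noFree && fn.noSynch;
}
```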

One more choice we have: We can, as I proposed above, essentially weaken
the current semantics of dereferenceable to not exclude
mid-function-execution deallocation, and then add a second attribute
with the current, stronger, semantics. Alternatively, we can keep the
current attribute as-is, and add a second attribute with the weaker
semantics (and switch Clang to use that).

Please let me know what you think.

Thanks again,

Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
Hi Hal,

I'm interested in this functionality and the overall idea of inferring
things from the function body to turn into attributes. I'm looking at
this from the XRay instrumentation angle.

Overall, this is a +1 from me. Some questions below though:

On Wed, Jul 11, 2018 at 12:01 PM Hal Finkel via llvm-dev
<[hidden email]> wrote:

>
> Hi, everyone,
>
> I'd like to propose adding a nofree function attribute to indicate that
> a function does not, directly or indirectly, call a memory-deallocation
> function (e.g., free, C++'s operator delete). Clang/LLVM can currently
> misoptimize functions that:
>
>  1. Have a reference argument.
>
>  2. Free the memory backing the object to which the reference is bound
> during the function's execution.
>
> Because we tag, in Clang, all reference arguments using the
> dereferenceable attribute, LLVM assumes that the pointer is
> unconditionally dereferenceable throughout the course of the entire
> function. This isn't true, however, if the memory is freed during the
> execution of the function. For more information, please see the
> discussion in https://reviews.llvm.org/D48239.
>
> To solve this problem, we need to give LLVM more information in order to
> help it determine when a pointer, which is dereferenceable when the
> functions begins to execute, will still be dereferenceable later on in
> the function's execution. This nofree attribute can be part of that
> solution. If we know that free (and friends) are not called by the
> function (nor by any function called by the function, and so on), then
> we know that pointers that started out dereferenceable will stay that
> way (except as explained below).
>
> I'm initially proposing this to be only a function attribute, although
> one could easily imagine a parameter attribute as well (that indicates
> that a particular pointer argument is not freed by the function). This
> might be useful, but for the use case of helping dereferenceable, it
> would be subtle to use, unless the parameter was also marked as noalias,
> because you'd need to know that the parameter was not also aliased with
> another argument (or had not been captured). Another analysis would need
> to provide this kind of information.
>
> Also, just because a function does not, directly or indirectly, call
> free does not mean that it cannot cause memory to be deallocated. The
> function might communicate (synchronize) with another thread causing
> that other thread to delete the memory. For this reason, to use
> dereferenceable as we currently do, we also need to know that the
> function does not synchronize with any other threads. To solve this
> problem, like nofree, I propose to add a nosynch attribute (to indicate
> that a function does not use (non-relaxed) atomics or otherwise
> synchronize with any other threads (e.g., perform I/O or, as a practical
> matter, use volatile accesses).
>

How far does the attribute go? For example, does it propagate up the
caller stack?

This might be a basic IR question, but I suppose this only works for
definitions in the same module. I wonder whether the attribute can be
asserted/added on declarations, with something at link time ensuring
that the attribute actually holds. For example, a function declaration
might say `nofree` today, but the implementation might later change to
do something else; how might we guard against this?

Will this also extend/change the default attributes that are defined
for the intrinsics? XRay has a couple of intrinsics that have a number
of attributes, and I imagine some other intrinsics for the sanitizers
would need to learn about the attribute as well.

How extensively do we expect attributes like this to need handling when
doing things like inlining, outlining, partial inlining, etc.?

Is the default assumption going to be that a function that isn't
marked `nofree` *will* free and pessimize that way? Does it make more
sense then to make an attribute that's positive, say 'frees' and relax
the default assumption to "does not free"?

> I've posted a patch for the nofree attribute
> (https://reviews.llvm.org/D49165). nosynch's implementation would be
> very similar (except instead of looking for calls to free, it would look
> for uses of non-relaxed atomics, volatile ops, and known functions that
> are not I/O functions).
>
> With both of these attributes (nofree and nosynch), a function argument
> with the dereferenceable attribute will be known to be dereferenceable
> throughout the execution of the attributed function. We can update
> isDereferenceableAndAlignedPointer to include these additional checks on
> the current function.
>
> One more choice we have: We can, as I proposed above, essentially weaken
> the current semantics of dereferenceable to not exclude
> mid-function-execution deallocation. We can also add a second attribute
> with the current, stronger, semantics. We can keep the current attribute
> as-is, and add a second attribute with the weaker semantics (and switch
> Clang to use that).
>
> Please let me know what you think.
>

I've not worked out the full matrix of possibilities here in my head
yet, but what are the risks with relaxing the default semantics then
introducing the stronger attributes? Maybe you or someone has thought
that through before, and it would be great to have a summary or an
idea what the pros/cons are of doing that instead of attempting to
infer non-freeing behaviour.

Cheers

--
Dean

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev

On 07/10/2018 09:12 PM, Dean Michael Berris wrote:

> Hi Hal,
>
> I'm interested in this functionality and the overall idea of inferring
> things from the function body to turn into attributes. I'm looking at
> this from the XRay instrumentation angle.
>
> Overall, this is a +1 from me. Some questions below though:
>
> On Wed, Jul 11, 2018 at 12:01 PM Hal Finkel via llvm-dev
> <[hidden email]> wrote:
>> Hi, everyone,
>>
>> I'd like to propose adding a nofree function attribute to indicate that
>> a function does not, directly or indirectly, call a memory-deallocation
>> function (e.g., free, C++'s operator delete). Clang/LLVM can currently
>> misoptimize functions that:
>>
>>  1. Have a reference argument.
>>
>>  2. Free the memory backing the object to which the reference is bound
>> during the function's execution.
>>
>> Because we tag, in Clang, all reference arguments using the
>> dereferenceable attribute, LLVM assumes that the pointer is
>> unconditionally dereferenceable throughout the course of the entire
>> function. This isn't true, however, if the memory is freed during the
>> execution of the function. For more information, please see the
>> discussion in https://reviews.llvm.org/D48239.
>>
>> To solve this problem, we need to give LLVM more information in order to
>> help it determine when a pointer, which is dereferenceable when the
>> functions begins to execute, will still be dereferenceable later on in
>> the function's execution. This nofree attribute can be part of that
>> solution. If we know that free (and friends) are not called by the
>> function (nor by any function called by the function, and so on), then
>> we know that pointers that started out dereferenceable will stay that
>> way (except as explained below).
>>
>> I'm initially proposing this to be only a function attribute, although
>> one could easily imagine a parameter attribute as well (that indicates
>> that a particular pointer argument is not freed by the function). This
>> might be useful, but for the use case of helping dereferenceable, it
>> would be subtle to use, unless the parameter was also marked as noalias,
>> because you'd need to know that the parameter was not also aliased with
>> another argument (or had not been captured). Another analysis would need
>> to provide this kind of information.
>>
>> Also, just because a function does not, directly or indirectly, call
>> free does not mean that it cannot cause memory to be deallocated. The
>> function might communicate (synchronize) with another thread causing
>> that other thread to delete the memory. For this reason, to use
>> dereferenceable as we currently do, we also need to know that the
>> function does not synchronize with any other threads. To solve this
>> problem, like nofree, I propose to add a nosynch attribute (to indicate
>> that a function does not use (non-relaxed) atomics or otherwise
>> synchronize with any other threads (e.g., perform I/O or, as a practical
>> matter, use volatile accesses).
>>
> How far does the attribute go? For example, does it propagate up the
> caller stack?
>
> This might be a basic IR question but I suppose this only works for
> definitions in the same module -- I wonder whether the attribute can
> be asserted/added in the declarations, and ensured that somehow at
> link-time the attribute holds. For example, while we might assume that
> a function declaration says `nofree` today but the implementation
> might actually change to do something else, how we might be able to
> guard against this.

Currently, we can only infer in the same module, and only when we have a
definitive implementation. Inline linkage, and similar, doesn't count
(for all the same reasons why we generally can't do IPA over
inline-linkage functions). If we add a user-level attribute in Clang
(and I do generally like exposing these kinds of things to the user
too), then it's the user's responsibility to make sure that the
attributes remain semantically correct.

>
> Will this also extend/change the default attributes that are defined
> for the intrinsics? XRay has a couple of intrinsics that have a number
> of attributes, and I imagine some other intrinsics for the sanitizers
> would need to learn about the attribute as well.

We can certainly add these for intrinsics. Nearly every intrinsic that
writes to memory could be usefully marked.

>
> How extensive do we expect changes like this to be handled when doing
> things like inlining, outlining, partial-inlining, etc.?

I don't envision this being any different from most other attributes.
They're lost when inlining, and we can infer them. We don't currently
infer late to handle late outlining, etc., but could change that, as an
orthogonal matter, if we'd like (maybe functions created as a result of
partial inlining could benefit from this today).

>
> Is the default assumption going to be that a function that isn't
> marked `nofree` *will* free and pessimize that way? Does it make more
> sense then to make an attribute that's positive, say 'frees' and relax
> the default assumption to "does not free"?

The default for unknown functions needs to be that they might free
memory. Otherwise, it's not conservatively correct.

>
>> I've posted a patch for the nofree attribute
>> (https://reviews.llvm.org/D49165). nosynch's implementation would be
>> very similar (except instead of looking for calls to free, it would look
>> for uses of non-relaxed atomics, volatile ops, and known functions that
>> are not I/O functions).
>>
>> With both of these attributes (nofree and nosynch), a function argument
>> with the dereferenceable attribute will be known to be dereferenceable
>> throughout the execution of the attributed function. We can update
>> isDereferenceableAndAlignedPointer to include these additional checks on
>> the current function.
>>
>> One more choice we have: We can, as I proposed above, essentially weaken
>> the current semantics of dereferenceable to not exclude
>> mid-function-execution deallocation. We can also add a second attribute
>> with the current, stronger, semantics. We can keep the current attribute
>> as-is, and add a second attribute with the weaker semantics (and switch
>> Clang to use that).
>>
>> Please let me know what you think.
>>
> I've not worked out the full matrix of possibilities here in my head
> yet, but what are the risks with relaxing the default semantics then
> introducing the stronger attributes?

Benefits: Current IR produced by Clang becomes conservatively correct.
No changes to Clang's codegen are necessary (minor).
Downsides: Other frontends producing the attribute in a
currently-correct way now need to change to a new attribute to retain
current behavior.

If we introduce a new attribute with the weaker semantics, then:

Benefits: Currently-correct IR remains unchanged and will continue to be
optimized strongly.
Downsides: Older IR produced by Clang will remain incorrect. Clang's
codegen will need to be updated (minor).

I don't have sufficient knowledge of non-Clang usage of the attribute to
have a strong opinion. I'm happy to add a new weaker attribute; we just
need to figure out what to name it. dereferenceable_entry, perhaps?

>  Maybe you or someone has thought
> that through before, and it would be great to have a summary or an
> idea what the pros/cons are of doing that instead of attempting to
> infer non-freeing behaviour.

I think that we need to infer regardless to get optimizations going forward.

Thanks again,
Hal

>
> Cheers
>

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
On Wed, Jul 11, 2018 at 12:37 PM Hal Finkel <[hidden email]> wrote:

>
>
> On 07/10/2018 09:12 PM, Dean Michael Berris wrote:
> >
> > Is the default assumption going to be that a function that isn't
> > marked `nofree` *will* free and pessimize that way? Does it make more
> > sense then to make an attribute that's positive, say 'frees' and relax
> > the default assumption to "does not free"?
>
> The default for unknown functions needs to be that they might free
> memory. Otherwise, it's not conservatively correct.
>

Yes, for unknown functions we can assume the worst and be
conservatively correct.

For fully-defined in-module functions (or even special functions
defined by the front-end), marking the functions that definitely free
might be a stronger signal to inhibit code motion or aggressive
reordering.

I'm essentially trying to see whether there's a way to achieve the
goal without having to sprinkle `nofree` on the majority of functions
that take pointers but don't free them, just to allow correct but more
aggressive optimisation.


> >>
> > I've not worked out the full matrix of possibilities here in my head
> > yet, but what are the risks with relaxing the default semantics then
> > introducing the stronger attributes?
>
> Benefits: Current IR produced by Clang becomes conservatively correct.
> No changes to Clang's codegen are necessary (minor).
> Downsides: Other frontends producing the attribute in a
> currently-correct way now need to change to a new attribute to retain
> current behavior.
>

Yeah that sounds like it's not ideal and may cost more than
introducing the attribute with weaker semantics.

> If we introduce a new attribute with the weaker semantics, then:
>
> Benefits: Currently-correct IR remains unchanged and will continue to be
> optimized strongly.
> Downsides: Older IR produced by Clang will remain incorrect. Clang's
> codegen will need to be updated (minor).
>
> I don't have sufficient knowledge of non-Clang usage of the attribute to
> have a strong opinion. I'm happy to add a new weaker attribute; we just
> need to figure out what to name it. dereferenceable_entry, perhaps?
>

I suspect the attribute really ought to communicate a post-condition,
so that optimisations can check for legality of a transformation (or
avoid making transformations that are provably incorrect). There's a
number of these, which might be useful:

- maybe_invalidated : says that the pointer may be invalidated after a
call to the function (because there are branches that lead to
invalidation) (can be the default)
- is_preserved : says strongly that the pointer is live after the
function is called
- invalidated : says strongly that the pointer is definitely, without a
shadow of a doubt, invalidated after a call to the function (you can
mark all calls to 'free' and similar to have this attribute on the
pointer)

Whether these map to existing attributes is something I'll need to
learn myself, but I would think they cover the use cases for
code-motion optimisations on pointers.
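
As a rough sketch of how an optimisation might consume such a post-condition, consider the following. The enum and the `canSinkLoadPastCall` function are hypothetical and only mirror the three states listed above; nothing here corresponds to an existing LLVM attribute or API.

```cpp
// Hypothetical post-condition states, named after the list above.
enum class PointerFate {
  MaybeInvalidated,  // default: some path through the callee may free it
  Preserved,         // the pointer is still live after the call
  Invalidated,       // the pointer is definitely dead after the call
};

// Considering deallocation only (aliasing writes are ignored here), a load
// through the pointer may be moved from before a call to after it only if
// the callee is known to preserve the pointer.
bool canSinkLoadPastCall(PointerFate fate) {
  return fate == PointerFate::Preserved;
}
```

Treating MaybeInvalidated the same as Invalidated for legality keeps the default conservative, and the stronger Invalidated state would mostly be useful for diagnostics or for dropping later accesses.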

The more I think about it, the more I'm wondering whether the
attribute has to be on the function or whether it's on the pointer.
Or maybe we want both: an attribute on the function that marks which
pointers (not just the ones it takes as parameters) are invalidated or
maybe invalidated (globals, for example).

> >  Maybe you or someone has thought
> > that through before, and it would be great to have a summary or an
> > idea what the pros/cons are of doing that instead of attempting to
> > infer non-freeing behaviour.
>
> I think that we need to infer regardless to get optimizations going forward.
>

Indeed.

Thanks, Hal!

/me subscribes to the patch.

Cheers

--
Dean

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
It looks like the current proposal doesn't allow expressing the semantics
of GC-managed pointers. In this scenario functions don't deallocate. We
might be able to mark every single function as nofree, but the nosynch part
is problematic. We do have functions which synchronize with other threads,
but that doesn't change the property that no function call can invalidate a
reference.

With that in mind, the alternative with a new attribute looks like a better
option to me. But I haven't given it much thought.

Artur


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
I'm not sure if nosynch is sufficient.  What if we had:

void f(int& x) {
  if (false) {
    int r0 = x;
  }
}

// other thread
free(<pointer to x>);

The source program is race free, but LLVM may speculate the read from
x (seeing that it is dereferenceable) creating a race.
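
To spell out what "speculate the read" means here, a minimal sketch follows. It uses a runtime condition instead of `false` and returns the value so that the effect is observable; both function names and the `cond` parameter are hypothetical additions, not part of the example above.

```cpp
// Hypothetical variant of f with a runtime condition instead of `false`.
int f_as_written(int &x, bool cond) {
  int r0 = 0;
  if (cond) {
    r0 = x;  // x is read only when cond is true
  }
  return r0;
}

// What the function could look like after the load is speculated: x is
// read unconditionally, which is harmless only if x is still alive.
int f_after_speculation(int &x, bool cond) {
  int loaded = x;  // hoisted load, executed even when cond is false
  return cond ? loaded : 0;
}
```

The two versions return the same result whenever `x` is alive. With `cond` false, only the second one touches `x`, so only the second one can race with another thread freeing the object.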

-- Sanjoy

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
[+Richard]


On 07/11/2018 08:29 AM, Sanjoy Das wrote:

> I'm not sure if nosynch is sufficient.  What if we had:
>
> void f(int& x) {
>   if (false) {
>     int r0 = x;
>   }
> }
>
> // other thread
> free(<pointer to x>);
>
> The source program is race free, but LLVM may speculate the read from
> x (seeing that it is dereferenceable) creating a race.

Interestingly, I'm not sure. I trust that Richard can answer this
question. :-)

So, if we had:

int y = ...;
...
f(y);

then I think that Clang's use of dereferenceable is almost certainly
okay (because the standard explicitly says, 9.2.3.2p5, "A reference
shall be initialized to refer to a valid object or function."). Because
the reference must have been valid when f(y) began executing, unless it
synchronizes somehow with the other thread, any asynchronous deletion of
y must be a race.

On the other hand, if we have:

int &y = ...;
...
f(y);

do we know that, when f(y) begins executing, the reference points to a
valid object? My reading of 9.3.3p2, which says, "Argument passing
(7.6.1.2) and function value return (8.6.3) are initializations.",
combined with the statement above, implies that, perhaps surprisingly,
the same holds here. When the argument to f is initialized, it must
refer to a valid object (even if the initializer is another reference).
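
A small sketch of the two call shapes above, assuming a hypothetical helper bound_address that reports which object its reference parameter is bound to. It only illustrates that the parameter is initialized (bound) at the call in both cases and ends up referring to the same object; it says nothing about what the optimizer may assume.

```cpp
#include <cassert>

// Hypothetical helper: returns the address of the object that the
// reference parameter was bound to when the call was made.
const int *bound_address(const int &x) { return &x; }

// Calls the helper once with a named object and once with another
// reference to that object, and checks that both bind to the same object.
bool same_object_either_way() {
  int y = 7;
  int &alias = y;                             // reference initialized from an object
  const int *direct = bound_address(y);       // parameter initialized from y
  const int *through = bound_address(alias);  // parameter initialized from a reference
  assert(direct == &y);
  return direct == through;
}
```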

Richard, what do you think?

Thanks again,
Hal

P.S. If I'm right, then I might be happy, but it's also somewhat scary
(although we've been doing this optimization for multiple releases and I
don't think we have a bug along these lines), and I'd at least smell the
need for a sanitizer.

>
> -- Sanjoy

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev

On 07/11/2018 04:07 AM, Artur Pilipenko wrote:
> It looks like the current proposal doesn’t allow expressing the semantics
> of GC-managed pointers. In this scenario, functions don’t deallocate. We
> might be able to mark every single function as nofree, but the nosynch part is
> problematic. We do have functions which synchronize with other threads,
> but that doesn’t change the property that no function call can invalidate a
> reference.
>
> With that in mind, the alternative with a new attribute looks like a better option
> to me.

Makes sense to me.

 -Hal

>  But I haven't given it much thought.
>
> Artur

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
On Wed, Jul 11, 2018 at 4:13 PM Hal Finkel <[hidden email]> wrote:

> Interestingly, I'm not sure. I trust that Richard can answer this
> question. :-)
>
> So, if we had:
>
> int y = ...;
> ...
> f(y);
>
> then I think that Clang's use of dereferenceable is almost certainly
> okay (because the standard explicitly says, 9.2.3.2p5, "A reference
> shall be initialized to refer to a valid object or function."). Because
> the reference must have been valid when f(y) began executing, unless it
> synchronizes somehow with the other thread, any asynchronous deletion of
> y must be a race.
>
> On the other hand, if we have:
>
> int &y = ...;
> ...
> f(y);
>
> do we know that, when f(y) begins executing, the reference points to a
> valid object? My reading of 9.3.3p2, which says, "Argument passing
> (7.6.1.2) and function value return (8.6.3) are initializations.",
> combined with the statement above, implies that, perhaps surprisingly,
> the same holds here. When the argument to f is initialized, it must
> refer to a valid object (even if the initializer is another reference).

Ok, I didn't know this.  If this is true then nosynch + nofree seems
sufficient to me.  And I realized my example is needlessly complex; if
arg passing isn't initialization then this is a problem too:

int &y = *ptr;
free(ptr);
f(y);
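
For contrast, a sketch of the well-defined ordering of the same three operations, with the call sequenced before the deallocation. The helper names read_through and use_then_free are made up for illustration. Swapping the last two steps gives the problematic snippet above, where the parameter of f is initialized from a reference to storage that has already been freed.

```cpp
#include <cstdlib>

// Hypothetical stand-in for f: reads the int through its reference parameter.
int read_through(const int &x) { return x; }

// Binds a reference to heap storage, passes it to the callee, and frees
// the storage only after the call has returned.
int use_then_free() {
  int *ptr = static_cast<int *>(std::malloc(sizeof(int)));
  if (ptr == nullptr) {
    return -1;  // allocation failed; nothing to demonstrate
  }
  *ptr = 42;
  int &y = *ptr;                // y refers to live storage
  int value = read_through(y);  // the parameter is bound while the storage is live
  std::free(ptr);               // deallocation happens after the call
  return value;
}
```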

-- Sanjoy

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev <[hidden email]> wrote:
> [+Richard]
>
> On 07/11/2018 08:29 AM, Sanjoy Das wrote:
>> I'm not sure if nosynch is sufficient.  What if we had:
>>
>> void f(int& x) {
>>   if (false) {
>>     int r0 = x;
>>   }
>> }
>>
>> // other thread
>> free(<pointer to x>);
>>
>> The source program is race free, but LLVM may speculate the read from
>> x (seeing that it is dereferenceable) creating a race.
>
> Interestingly, I'm not sure. I trust that Richard can answer this
> question. :-)
>
> So, if we had:
>
> int y = ...;
> ...
> f(y);
>
> then I think that Clang's use of dereferenceable is almost certainly
> okay (because the standard explicitly says, 9.2.3.2p5, "A reference
> shall be initialized to refer to a valid object or function."). Because
> the reference must have been valid when f(y) began executing, unless it
> synchronizes somehow with the other thread, any asynchronous deletion of
> y must be a race.
>
> On the other hand, if we have:
>
> int &y = ...;
> ...
> f(y);
>
> do we know that, when f(y) begins executing, the reference points to a
> valid object? My reading of 9.3.3p2, which says, "Argument passing
> (7.6.1.2) and function value return (8.6.3) are initializations.",
> combined with the statement above, implies that, perhaps surprisingly,
> the same holds here. When the argument to f is initialized, it must
> refer to a valid object (even if the initializer is another reference).
>
> Richard, what do you think?

First, see also core issue 453, under the guise of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from

  "A reference shall be initialized to refer to a valid object or function."

to something like

  "If an lvalue to which a reference is directly bound designates neither an existing object or function of an appropriate type (11.6.3 [dcl.init.ref]), nor a region of storage of suitable size and alignment to contain an object of the reference's type (4.5 [intro.object], 6.8 [basic.life], 6.9 [basic.types]), the behavior is undefined."

My take is that, if the end of the duration of the region of storage is unsequenced with respect to the binding of the reference, then behavior is undefined. Generally when we refer to a thing happening while some condition is true, we mean that the execution point when the condition became true is sequenced before the thing happening, and the execution point where it becomes not true again is sequenced after.

So the behavior of that program is undefined regardless of whether 'f' actually loads through 'x'.
 
> Thanks again,
> Hal
>
> P.S. If I'm right, then I might be happy, but it's also somewhat scary
> (although we've been doing this optimization for multiple releases and I
> don't think we have a bug along these lines), and I'd at least smell the
> need for a sanitizer.

>
> -- Sanjoy


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev

Thanks, Richard.

Based on the feedback from this thread, I'll move forward with the patches for nofree and nosynch, adding a new corresponding dereferenceable attribute (my suggestion is to name this dereferenceable_on_entry; suggestions welcome), and updating Clang to emit this new attribute instead of the current one.
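
One way to picture how the pieces fit together is the following sketch. It is a toy model and not LLVM's API: FunctionFacts, ArgFacts, and dereferenceableThroughout are hypothetical names, and dereferenceable_on_entry is only the suggested spelling. The idea is that the weaker attribute proves dereferenceability for the whole call only when the enclosing function is both nofree and nosynch, while the existing, stronger attribute needs no extra facts.

```cpp
// Hypothetical facts known about the function being compiled.
struct FunctionFacts {
  bool nofree;   // does not, directly or indirectly, deallocate memory
  bool nosynch;  // does not synchronize with other threads
};

// Hypothetical facts known about one pointer argument.
struct ArgFacts {
  bool dereferenceable;           // current, stronger semantics
  bool dereferenceable_on_entry;  // proposed, weaker semantics
};

// True if the argument can be assumed dereferenceable at every point
// during the function's execution, not just on entry.
bool dereferenceableThroughout(const ArgFacts &arg, const FunctionFacts &fn) {
  if (arg.dereferenceable) {
    return true;  // the stronger attribute already promises this
  }
  // The weaker attribute needs both function-level guarantees.
  return arg.dereferenceable_on_entry && fn.nofree && fn.nosynch;
}
```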

 -Hal


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev
+1 to plan

It would be good to include inference logic to promote to full `dereferenceable` when possible, so that transforms which need that anyway can just use it.

And/or we should ensure the APIs used always query both so that transform authors don't have to remember to do so...
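
A rough sketch of what such promotion logic could look like, using a toy data model and not LLVM's pass infrastructure. The types Arg and Fn and the function promoteDereferenceable are hypothetical. When a function is known to be both nofree and nosynch, every argument that is dereferenceable on entry is upgraded to the full attribute, so later queries only need to check one flag.

```cpp
#include <vector>

// Hypothetical per-argument attribute flags.
struct Arg {
  bool dereferenceable;         // full, whole-call guarantee
  bool dereferenceableOnEntry;  // weaker, entry-only guarantee
};

// Hypothetical function summary with its inferred attributes.
struct Fn {
  bool nofree;
  bool nosynch;
  std::vector<Arg> args;
};

// Promotes entry-only arguments to the full attribute when the function
// can neither free memory nor synchronize. Returns how many were promoted.
int promoteDereferenceable(Fn &fn) {
  if (!fn.nofree || !fn.nosynch) {
    return 0;  // nothing can be concluded beyond function entry
  }
  int promoted = 0;
  for (Arg &arg : fn.args) {
    if (arg.dereferenceableOnEntry && !arg.dereferenceable) {
      arg.dereferenceable = true;
      ++promoted;
    }
  }
  return promoted;
}
```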

I still somewhat wish we could do with a single `dereferenceable`.

On Wed, Jul 11, 2018 at 5:21 PM Hal Finkel via llvm-dev <[hidden email]> wrote:

Thanks, Richard.

Based on the feedback from this thread, I'll move forward with the patches for nofree and nosynch, adding a new corresponding dereferenceable attribute (my suggestion is to name this dereferenceable_on_entry; suggestions welcome), and updating Clang to emit this new attribute instead of the current one.

 -Hal


On 07/11/2018 06:43 PM, Richard Smith wrote:
On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev <[hidden email]> wrote:
[+Richard]


On 07/11/2018 08:29 AM, Sanjoy Das wrote:
> I'm not sure if nosynch is sufficient.  What if we had:
>
> void f(int& x) {
>   if (false) {
>     int r0 = x;
>   }
> }
>
> // other thread
> free(<pointer to x>);
>
> The source program is race free, but LLVM may speculate the read from
> x (seeing that it is dereferenceable) creating a race.

Interestingly, I'm not sure. I trust that Richard can answer this
question. :-)

So, if we had:

int y = ...;
...
f(y);

then I think that Clang's use of dereferenceable is almost certainly
okay (because the standard explicitly says, 9.2.3.2p5, "A reference
shall be initialized to refer to a valid object or function."). Because
the reference must have been valid when f(y) began executing, unless it
synchronizes somehow with the other thread, any asynchronous deletion of
y must be a race.

On the other hand, if we have:

int &y = ...;
...
f(y);

do we know that, when f(y) begins executing, the reference points to a
valid object? My reading of 9.3.3p2, which says, "Argument passing
(7.6.1.2) and function value return (8.6.3) are initializations.",
combined with the statement above, implies that, perhaps surprisingly,
the same holds here. When the argument to f is initialized, it must
refer to a valid object (even if the initializer is another reference).

Richard, what do you think?

First, see also core issue 453, under the guise of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from

  "A reference shall be initialized to refer to a valid object or function."

to something like

  "If an lvalue to which a reference is directly bound designates neither an existing object or function of an appropriate type (11.6.3 [dcl.init.ref]), nor a region of storage of suitable size and alignment to contain an object of the reference's type (4.5 [intro.object], 6.8 [basic.life], 6.9 [basic.types]), the behavior is undefined."

My take is that, if the end of the duration of the region of storage is unsequenced with respect to the binding of the reference, then behavior is undefined. Generally when we refer to a thing happening while some condition is true, we mean that the execution point when the condition became true is sequenced before the thing happening, and the execution point where it becomes not true again is sequenced after.

So the behavior of that program is undefined regardless of whether 'f' actually loads through 'x'.
 
Thanks again,
Hal

P.S. If I'm right, then I might be happy, but it's also somewhat scary
(although we've been doing this optimization for multiple releases and I
don't think we have a bug along these lines), and I'd at least smell the
need for a sanitizer.

>
> -- Sanjoy

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RFC] A nofree (and nosynch) function attribute: Mixing dereferenceable and delete

Bruce Hoult via llvm-dev

On 07/11/2018 07:38 PM, Chandler Carruth wrote:
+1 to plan

It would be good to include inference logic to promote to full `dereferenceable` when possible so that transforms which need that anyways can just use that.

And/or we should ensure the APIs used always query both so that transform authors don't have to remember to do so...

I still somewhat wish we could do with a single `dereferenceable`.

I definitely plan on having the APIs check both attributes appropriately. No one should need to remember to query both and, hopefully, there should be no changes necessary to existing passes (except enhancements to attribute inference, of course).
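A minimal sketch of what "checking both attributes" could look like, assuming a toy model rather than LLVM's real classes: `FunctionFacts`, `ArgumentFacts`, and `dereferenceableBytesThroughout` are hypothetical names invented for illustration, and `DerefOnEntryBytes` stands in for the proposed on-entry attribute. The idea is that a pass asks a single question and the helper decides whether the weaker on-entry guarantee can be promoted.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for IR-level facts; not LLVM's classes.
struct FunctionFacts {
  bool NoFree = false;  // never calls a deallocation function, even indirectly
  bool NoSync = false;  // never synchronizes with another thread
};

struct ArgumentFacts {
  std::uint64_t DerefBytes = 0;         // strong form: valid for the whole call
  std::uint64_t DerefOnEntryBytes = 0;  // weak form: valid when the call begins
};

// Number of bytes known to stay dereferenceable for the entire execution
// of the function, combining the strong and the weak guarantee.
std::uint64_t dereferenceableBytesThroughout(const ArgumentFacts &A,
                                             const FunctionFacts &F) {
  std::uint64_t Bytes = A.DerefBytes;
  // The on-entry guarantee extends to the whole body only if nothing in the
  // function can free the memory or let another thread free it.
  if (F.NoFree && F.NoSync && A.DerefOnEntryBytes > Bytes)
    Bytes = A.DerefOnEntryBytes;
  return Bytes;
}
```

With a helper of this shape, existing transforms would keep calling one query, and only attribute inference would need to know that two attributes exist.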

Thanks again,
Hal


On Wed, Jul 11, 2018 at 5:21 PM Hal Finkel via llvm-dev <[hidden email]> wrote:

Thanks, Richard.

Based on the feedback from this thread, I'll move forward with the patches for nofree and nosync, add a new corresponding dereferenceable attribute (my suggestion is to name this dereferenceable_on_entry; suggestions welcome), and update Clang to emit this new attribute instead of the current one.

 -Hal


On 07/11/2018 06:43 PM, Richard Smith wrote:
On Wed, 11 Jul 2018 at 16:13, Hal Finkel via llvm-dev <[hidden email]> wrote:
[+Richard]


On 07/11/2018 08:29 AM, Sanjoy Das wrote:
> I'm not sure if nosynch is sufficient.  What if we had:
>
> void f(int& x) {
>   if (false) {
>     int r0 = x;
>   }
> }
>
> // other thread
> free(<pointer to x>);
>
> The source program is race free, but LLVM may speculate the read from
> x (seeing that it is dereferenceable) creating a race.
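To make the concern above concrete, here is a hand-written sketch of the transformation in question. It is not compiler output; `f_source`, `f_speculated`, and the `cond` parameter are names made up for illustration (the example above uses a constant `false` condition). The two functions return the same value whenever `x` stays dereferenceable for the whole call, but only the second one reads `x` when `cond` is false.

```cpp
// What the source says: x is read only when cond is true.
int f_source(int &x, bool cond) {
  int r0 = 0;
  if (cond)
    r0 = x;
  return r0;
}

// What speculation could produce: the load is hoisted above the branch,
// which is legal only if x is dereferenceable at that point.
int f_speculated(int &x, bool cond) {
  int loaded = x;  // executes unconditionally
  return cond ? loaded : 0;
}
```

If another thread frees the storage behind `x` while the call is in progress, `f_source(x, false)` never touches it, whereas `f_speculated(x, false)` reads freed memory. That introduced read is the race being described.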

Interestingly, I'm not sure. I trust that Richard can answer this
question. :-)

So, if we had:

int y = ...;
...
f(y);

then I think that Clang's use of dereferenceable is almost certainly
okay (because the standard explicitly says, 9.2.3.2p5, "A reference
shall be initialized to refer to a valid object or function."). Because
the reference must have been valid when f(y) began executing, unless it
synchronizes somehow with the other thread, any asynchronous deletion of
y must be a race.

On the other hand, if we have:

int &y = ...;
...
f(y);

do we know that, when f(y) begins executing, the reference points to a
valid object? My reading of 9.3.3p2, which says, "Argument passing
(7.6.1.2) and function value return (8.6.3) are initializations.",
combined with the statement above, implies that, perhaps surprisingly,
the same holds here. When the argument to f is initialized, it must
refer to a valid object (even if the initializer is another reference).
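The two call forms discussed above can be put side by side in a small sketch. The helper names `addressSeenByCallee` and `bothCallsBindSameObject` are hypothetical. The sketch only shows that passing a reference as the argument initializes the parameter so that it is bound to the original object; it does not settle the standards question of when that object must be valid.

```cpp
// Returns the address of the object the parameter reference is bound to.
const int *addressSeenByCallee(const int &x) { return &x; }

// Passing the object directly and passing another reference to it both
// initialize the parameter so that it is bound to the same object.
bool bothCallsBindSameObject() {
  int y = 42;
  int &alias = y;
  return addressSeenByCallee(y) == &y && addressSeenByCallee(alias) == &y;
}
```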

Richard, what do you think?

First, see also core issue 453, under the guise of which we're fixing the wording in [dcl.ref](9.2.3.2)p5 from

  "A reference shall be initialized to refer to a valid object or function."

to something like

  "If an lvalue to which a reference is directly bound designates neither an existing object or function of an appropriate type (11.6.3 [dcl.init.ref]), nor a region of storage of suitable size and alignment to contain an object of the reference's type (4.5 [intro.object], 6.8 [basic.life], 6.9 [basic.types]), the behavior is undefined."

My take is that, if the end of the duration of the region of storage is unsequenced with respect to the binding of the reference, then behavior is undefined. Generally when we refer to a thing happening while some condition is true, we mean that the execution point when the condition became true is sequenced before the thing happening, and the execution point where it becomes not true again is sequenced after.

So the behavior of that program is undefined regardless of whether 'f' actually loads through 'x'.
 
Thanks again,
Hal

P.S. If I'm right, then I might be happy, but it's also somewhat scary
(although we've been doing this optimization for multiple releases and I
don't think we have a bug along these lines), and I'd at least smell the
need for a sanitizer.

>
> -- Sanjoy
