[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
Dear LLVM community,

LLVM IR has a concept of 'LoopID' [1] which is a misnomer:

(a) LoopIDs are not unique: Any pass that duplicates IR will do it
including its metadata (e.g. LoopVersioning) such that thereafter
multiple loops are linked with the same LoopID. There is even a test
case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple
loops with the same LoopID.

(b) LoopIDs are not persistent: Adding or removing an item from a
LoopID can only be done by creating a new MDNode and assigning it to
the loop's branch(es). Passes such as LoopUnroll
(llvm.loop.unroll.disable) and LoopVectorize (llvm.loop.isvectorized)
use this to mark loops to be not transformed multiple times or to
avoid that a LoopVersioning original loop is transformed.

The only mechanism in LLVM that uses LoopIDs as identifiers is
llvm.mem.parallel_loop_access [2]. It is added to memory-accessing
instructions and points to one or multiple LoopIDs it is parallel to.
Because of (b), transformation passes can change a loop's llvm.loop
metadata such that llvm.mem.parallel_loop_access may point to an
orphaned LoopID.

Patch [3] fixes this issue. Instead of instructions referencing the
loops they are parallel to, it adds a new loop attribute
"llvm.loop.parallel_accesses" that references the instructions. Any
loop transformation that changes the loop metadata will by default
carry-over the llvm.loop.parallel_accesses property and the
information retained.

Since global metadata cannot reference instructions, [3] also
introduces a new distinct metadata type that represents a group of
instructions. An instruction will reference the a group using the
llvm.access.group metadata, and llvm.loop.parallel_accesses uses the
access group MDNode to reference to the instructions.

Currently, the metadata for parallel memory accesses looks like:

loop:
  load ... !llvm.mem.parallel_loop_access !0
  br label %loop, !llvm.loop !0

!0 = distinct !{!0}                                      ; 'LoopID'

My suggestion uses this pattern instead:

loop:
  load ... !llvm.access.group !2
  br label %loop, !llvm.loop !0

!0 = distinct !{!0, !1}                                  ; 'LoopID'
!1 = !{!"llvm.loop.parallel_accesses", !2}
!2 = distinct !{}                                        ; access group

Patch [4] modifies Clang to generate this kind of metadata. Accesses
in the same loop nest level are assigned to the same access group.


Fixing (a) and (b) instead would require much more work. For (a), any
code that copies entire basic blocks (this includes the inliner since
the same function can be inlined multiple times) would need to make a
copy of the LoopID and update any llvm.mem.parallel_loop_access in the
copy. For (b), in addition to MDNode::setOperand, we need a method to
change the MDNode's size without changing its address. Currently,
MDNode is optimized for an unchangeable number of operands.

I suggest to stop seeing 'LoopID' as an identifier entirely, but to
use it as a bag of loop attributes. Patch [5] updates the reference
manual. We may also think about renaming LoopID in the source code,
and dropping the 'distinct' and 'first item is self-reference'. The
latter two were intended to avoid separate LoopIDs into into one
because their content is identical, i.e. uphold (a). As we have seen,
there are other reasons for loops to have identical LoopIDs. With
patches [3,4], llvm.loop metadata can be collapsed (unlike access
groups), thus the 'distinct' is not necessary anymore. Unfortunately,
there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
first item, i.e. we cannot get rid of it that easily.


Regards,
Michael


[1] https://llvm.org/docs/LangRef.html#llvm-loop
[2] https://llvm.org/docs/LangRef.html#llvm-mem-parallel-loop-access-metadata
[3] https://reviews.llvm.org/D52116
[4] https://reviews.llvm.org/D52117
[5] https://reviews.llvm.org/D55290
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
On 12/4/18 6:50 PM, Michael Kruse wrote:

> Dear LLVM community,
>
> LLVM IR has a concept of 'LoopID' [1] which is a misnomer:
>
> (a) LoopIDs are not unique: Any pass that duplicates IR will do it
> including its metadata (e.g. LoopVersioning) such that thereafter
> multiple loops are linked with the same LoopID. There is even a test
> case (Transforms/LoopUnroll/unroll-pragmas-disabled.ll) for multiple
> loops with the same LoopID.
>
> (b) LoopIDs are not persistent: Adding or removing an item from a
> LoopID can only be done by creating a new MDNode and assigning it to
> the loop's branch(es). Passes such as LoopUnroll
> (llvm.loop.unroll.disable) and LoopVectorize (llvm.loop.isvectorized)
> use this to mark loops to be not transformed multiple times or to
> avoid that a LoopVersioning original loop is transformed.
>
> The only mechanism in LLVM that uses LoopIDs as identifiers is
> llvm.mem.parallel_loop_access [2]. It is added to memory-accessing
> instructions and points to one or multiple LoopIDs it is parallel to.
> Because of (b), transformation passes can change a loop's llvm.loop
> metadata such that llvm.mem.parallel_loop_access may point to an
> orphaned LoopID.
>
> Patch [3] fixes this issue. Instead of instructions referencing the
> loops they are parallel to, it adds a new loop attribute
> "llvm.loop.parallel_accesses" that references the instructions. Any
> loop transformation that changes the loop metadata will by default
> carry-over the llvm.loop.parallel_accesses property and the
> information retained.
>
> Since global metadata cannot reference instructions, [3] also
> introduces a new distinct metadata type that represents a group of
> instructions. An instruction will reference the a group using the
> llvm.access.group metadata, and llvm.loop.parallel_accesses uses the
> access group MDNode to reference to the instructions.
>
> Currently, the metadata for parallel memory accesses looks like:
>
> loop:
>   load ... !llvm.mem.parallel_loop_access !0
>   br label %loop, !llvm.loop !0
>
> !0 = distinct !{!0}                                      ; 'LoopID'
>
> My suggestion uses this pattern instead:
>
> loop:
>   load ... !llvm.access.group !2
>   br label %loop, !llvm.loop !0
>
> !0 = distinct !{!0, !1}                                  ; 'LoopID'
> !1 = !{!"llvm.loop.parallel_accesses", !2}
> !2 = distinct !{}                                        ; access group
>
> Patch [4] modifies Clang to generate this kind of metadata. Accesses
> in the same loop nest level are assigned to the same access group.

I think that we should move forward with this solution. It will make
LLVM both more robust and most consistent in this regard:

 1. Robust: It will become less likely that we'll lose track of loops
marked as parallel because of an update to the loop ID. Given that any
change to the loop's properties (which happens, for example, when
user-requested transformations are performed) changes the metadata
identify of the loop id, this is not uncommon. As a result, we really
need the loop id to, in effect, point to the parallel access groupings
instead of the other-way around.

 2. Consistent: The only reason we need, AFAIK, the loop ids to be
distinct, semantically, is for parallel loop annotations in loop nests.
If I have loop-parallel annotations point at the loop ids, and loop ids
are not distinct, then there might be no way to distinguish an inner and
outer loop (and, thus, no way to make with respect to which level in a
loop nest a loop has been annotated as parallel). This new scheme
removes that requirement, and thus, the requirement that all loop ids
must be marked distinct.

>
>
> Fixing (a) and (b) instead would require much more work. For (a), any
> code that copies entire basic blocks (this includes the inliner since
> the same function can be inlined multiple times) would need to make a
> copy of the LoopID and update any llvm.mem.parallel_loop_access in the
> copy. For (b), in addition to MDNode::setOperand, we need a method to
> change the MDNode's size without changing its address. Currently,
> MDNode is optimized for an unchangeable number of operands.
>
> I suggest to stop seeing 'LoopID' as an identifier entirely, but to
> use it as a bag of loop attributes. Patch [5] updates the reference
> manual. We may also think about renaming LoopID in the source code,
> and dropping the 'distinct' and 'first item is self-reference'. The
> latter two were intended to avoid separate LoopIDs into into one
> because their content is identical, i.e. uphold (a). As we have seen,
> there are other reasons for loops to have identical LoopIDs. With
> patches [3,4], llvm.loop metadata can be collapsed (unlike access
> groups), thus the 'distinct' is not necessary anymore. Unfortunately,
> there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
> first item, i.e. we cannot get rid of it that easily.

I don't think it's worth changing this first element, unless we have
some other reason to do so.

 -Hal

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
Am Mi., 12. Dez. 2018 um 10:10 Uhr schrieb Finkel, Hal J. <[hidden email]>:
> > As we have seen,
> > there are other reasons for loops to have identical LoopIDs. With
> > patches [3,4], llvm.loop metadata can be collapsed (unlike access
> > groups), thus the 'distinct' is not necessary anymore. Unfortunately,
> > there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
> > first item, i.e. we cannot get rid of it that easily.
>
> I don't think it's worth changing this first element, unless we have
> some other reason to do so.

Would it be worthwhile to update the metadata uniquing algorithm to
consider shallow self-references?

Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
On 12/12/18 10:29 AM, Michael Kruse wrote:

> Am Mi., 12. Dez. 2018 um 10:10 Uhr schrieb Finkel, Hal J. <[hidden email]>:
>>> As we have seen,
>>> there are other reasons for loops to have identical LoopIDs. With
>>> patches [3,4], llvm.loop metadata can be collapsed (unlike access
>>> groups), thus the 'distinct' is not necessary anymore. Unfortunately,
>>> there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
>>> first item, i.e. we cannot get rid of it that easily.
>> I don't think it's worth changing this first element, unless we have
>> some other reason to do so.
> Would it be worthwhile to update the metadata uniquing algorithm to
> consider shallow self-references?

What benefit would that bring?

 -Hal

>
> Michael

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
Am Mi., 12. Dez. 2018 um 11:11 Uhr schrieb Finkel, Hal J. <[hidden email]>:

>
> On 12/12/18 10:29 AM, Michael Kruse wrote:
> > Am Mi., 12. Dez. 2018 um 10:10 Uhr schrieb Finkel, Hal J. <[hidden email]>:
> >>> As we have seen,
> >>> there are other reasons for loops to have identical LoopIDs. With
> >>> patches [3,4], llvm.loop metadata can be collapsed (unlike access
> >>> groups), thus the 'distinct' is not necessary anymore. Unfortunately,
> >>> there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
> >>> first item, i.e. we cannot get rid of it that easily.
> >> I don't think it's worth changing this first element, unless we have
> >> some other reason to do so.
> > Would it be worthwhile to update the metadata uniquing algorithm to
> > consider shallow self-references?
>
> What benefit would that bring?

Fewer metadata nodes by uniquing them. I would expect there might be
quite a few loops that have identical metadata, e.g. all loops that
have "setAlreaduUnrolled()" after loop unrolling.

Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
On 12/12/18 11:35 AM, Michael Kruse wrote:

> Am Mi., 12. Dez. 2018 um 11:11 Uhr schrieb Finkel, Hal J. <[hidden email]>:
>> On 12/12/18 10:29 AM, Michael Kruse wrote:
>>> Am Mi., 12. Dez. 2018 um 10:10 Uhr schrieb Finkel, Hal J. <[hidden email]>:
>>>>> As we have seen,
>>>>> there are other reasons for loops to have identical LoopIDs. With
>>>>> patches [3,4], llvm.loop metadata can be collapsed (unlike access
>>>>> groups), thus the 'distinct' is not necessary anymore. Unfortunately,
>>>>> there is code in LLVM (and maybe elsewhere) that depends on LoopIDs'
>>>>> first item, i.e. we cannot get rid of it that easily.
>>>> I don't think it's worth changing this first element, unless we have
>>>> some other reason to do so.
>>> Would it be worthwhile to update the metadata uniquing algorithm to
>>> consider shallow self-references?
>> What benefit would that bring?
> Fewer metadata nodes by uniquing them. I would expect there might be
> quite a few loops that have identical metadata, e.g. all loops that
> have "setAlreaduUnrolled()" after loop unrolling.

Okay. I have no idea how much code this would break (from none to a
lot). Any thoughts? I think that, in general, you should have a separate
RFC if you'd like to change them metadata uniquing algorithm.

Thanks again,

Hal

>
> Michael

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
Am Mi., 12. Dez. 2018 um 11:44 Uhr schrieb Finkel, Hal J. <[hidden email]>:
> > Fewer metadata nodes by uniquing them. I would expect there might be
> > quite a few loops that have identical metadata, e.g. all loops that
> > have "setAlreaduUnrolled()" after loop unrolling.
>
> Okay. I have no idea how much code this would break (from none to a
> lot). Any thoughts?

My guess: nothing breaks.

I have only seen this kind of nodes with LoopIDs and this RFC would
remove the need for them to be unique (as explained in (a), this
wasn't even a guaranteed property).

Using self-referencing nodes seem to be a workaround when no
'distinct' nodes existed, exploiting that the uniquing is not using a
graph minimization algorithm.


> I think that, in general, you should have a separate
> RFC if you'd like to change them metadata uniquing algorithm.

I am not yet sure myself whether I want this. As a maybe easier
alternative to changing the first element of a loop metadata node
(e.g. to 'null'), it is strongly related to this RFC.

Michael
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] RFC: LoopIDs are not identifiers (and better loop-parallel metadata)

Alberto Barbaro via llvm-dev
On 12/12/18 12:25 PM, Michael Kruse wrote:

> Am Mi., 12. Dez. 2018 um 11:44 Uhr schrieb Finkel, Hal J. <[hidden email]>:
>>> Fewer metadata nodes by uniquing them. I would expect there might be
>>> quite a few loops that have identical metadata, e.g. all loops that
>>> have "setAlreaduUnrolled()" after loop unrolling.
>> Okay. I have no idea how much code this would break (from none to a
>> lot). Any thoughts?
> My guess: nothing breaks.
>
> I have only seen this kind of nodes with LoopIDs and this RFC would
> remove the need for them to be unique (as explained in (a), this
> wasn't even a guaranteed property).
>
> Using self-referencing nodes seem to be a workaround when no
> 'distinct' nodes existed,

Correct.

>  exploiting that the uniquing is not using a
> graph minimization algorithm.
>
>
>> I think that, in general, you should have a separate
>> RFC if you'd like to change them metadata uniquing algorithm.
> I am not yet sure myself whether I want this. As a maybe easier
> alternative to changing the first element of a loop metadata node
> (e.g. to 'null'), it is strongly related to this RFC.

Maybe the following is easier: Remove the first self-referential
element, and remove it from existing metadata in auto-upgrade? This
seems like an easy thing to auto upgrade.

 -Hal

>
> Michael

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev