[llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE


Tim Northover via llvm-dev

Hi,

I am extending LLVM for HPC applications.
As part of that work, I am trying to make MachinePipeliner available
for the AArch64 + Scalable Vector Extension (SVE) environment.

MachinePipeliner is currently used only by the Hexagon target.
Since the implementation is very portable, I think it will work on
many CPUs with only a little additional code (see Code [2]).

The current MachinePipeliner is written on the premise that
DFAPacketizer is used for resource management.
However, I'd like to use MachinePipeliner without DFAPacketizer, for
the reasons described below (*).
Only a small part of the MachinePipeliner implementation depends on
DFAPacketizer or instruction itineraries.
Therefore, I think that one of the following implementations is
possible:

(a) creating a path in MachinePipeliner that does not use DFAPacketizer
(b) making MachinePipeliner inheritable so that anyone can write code
    that does not use DFAPacketizer
   
Since an implementation can use instruction itineraries without
DFAPacketizer, I don't think that I can use
TargetSchedModel::hasInstrItineraries to select the execution path.
Personally, I think that implementation (b) is better.
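
A minimal sketch of what (b) could look like, assuming the resource
check is moved behind a virtual interface. All names here
(ResourceModel, TableResourceModel, placeInstruction) are hypothetical
and do not exist in LLVM; the table-based model stands in for any
implementation that does not use DFAPacketizer.

```cpp
#include <vector>

// Hypothetical interface; none of these names exist in LLVM.
class ResourceModel {
public:
  virtual ~ResourceModel() = default;
  // True if functional unit `fu` is still free in `cycle`.
  virtual bool canReserve(unsigned fu, unsigned cycle) const = 0;
  virtual void reserve(unsigned fu, unsigned cycle) = 0;
};

// A DFA-free model: a modulo reservation table with one slot per
// (cycle mod II, functional unit) pair.
class TableResourceModel : public ResourceModel {
  unsigned II;
  unsigned NumFUs;
  std::vector<bool> Busy;

public:
  TableResourceModel(unsigned ii, unsigned numFUs)
      : II(ii), NumFUs(numFUs), Busy(ii * numFUs, false) {}

  bool canReserve(unsigned fu, unsigned cycle) const override {
    return !Busy[(cycle % II) * NumFUs + fu];
  }
  void reserve(unsigned fu, unsigned cycle) override {
    Busy[(cycle % II) * NumFUs + fu] = true;
  }
};

// The scheduler core only sees the abstract interface. Returns the first
// cycle in [start, start + ii) where `fu` is free, or -1 if none is.
int placeInstruction(ResourceModel &rm, unsigned fu, unsigned start,
                     unsigned ii) {
  for (unsigned c = start; c < start + ii; ++c) {
    if (rm.canReserve(fu, c)) {
      rm.reserve(fu, c);
      return static_cast<int>(c);
    }
  }
  return -1;
}
```

A target that keeps DFAPacketizer would provide another subclass of
the same interface, so the choice would no longer depend on
hasInstrItineraries.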

Also, if predicated instructions such as those in SVE are available,
it may be possible to generate prologue and epilogue code using
predicated execution, as shown in reference [1].
In this case, if we choose implementation (b) and
SwingSchedulerDAG::generatePipelinedLoop can be overridden, I think
that this extension will be easy.
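
The idea of replacing the prologue and epilogue with predicated
execution can be illustrated with a small simulation. This is only a
sketch of the kernel-only schema, not code from the patch: the
function name runKernelOnly is made up, and it assumes that stage s of
the kernel is guarded by a predicate that is true in kernel iteration
k exactly when source iteration k - s exists.

```cpp
#include <vector>

// Simulates a kernel-only software-pipelined loop. The kernel runs
// numIters + numStages - 1 times; stage s in kernel iteration k works on
// source iteration k - s and is predicated off when that iteration does
// not exist. Returns how often each (iteration, stage) pair executed.
std::vector<std::vector<int>> runKernelOnly(int numIters, int numStages) {
  std::vector<std::vector<int>> executed(numIters,
                                         std::vector<int>(numStages, 0));
  for (int k = 0; k < numIters + numStages - 1; ++k) {
    for (int s = 0; s < numStages; ++s) {
      int iter = k - s;
      bool pred = iter >= 0 && iter < numIters;  // stage predicate
      if (pred)
        ++executed[iter][s];
    }
  }
  return executed;
}
```

Every stage of every source iteration executes exactly once, so the
ramp-up and ramp-down phases need no separate prologue or epilogue
blocks; the predicates disable the stages that have no work.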

Comments or suggestions are welcome.

Thank you very much.

Best regards,
--
--------------------------------------
Masaki Arai

========================================

(*) Currently, many CPU scheduling models are defined without
instruction itineraries.
That is, they fall into category 1 or 2 of the following comment in
TargetSchedule.td:

// The SchedMachineModel is defined by subtargets for three categories of data:
// 1. Basic properties for coarse grained instruction cost model.
// 2. Scheduler Read/Write resources for simple per-opcode cost model.
// 3. Instruction itineraries for detailed reservation tables.

If MachinePipeliner also works without instruction itineraries, we
will be able to run MachinePipeliner's execution tests on various
machines, even if we do not use it on those machines.
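
As an illustration of what a category 2 model already provides, the
resource-constrained lower bound on the initiation interval (ResMII)
can be computed from per-instruction resource usage alone. The sketch
below is not LLVM code: ResourceUse and resourceMII are hypothetical
names, and it assumes that any unit of a resource kind can serve any
request for that kind.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// One resource request of an instruction, similar in spirit to a
// per-opcode write-resource entry. Hypothetical type, not an LLVM class.
struct ResourceUse {
  std::string resource;
  unsigned cycles;
};

// ResMII = max over resources of ceil(total cycles requested / units).
// Throws std::out_of_range if a resource has no entry in unitsPerResource.
unsigned resourceMII(const std::vector<std::vector<ResourceUse>> &loopBody,
                     const std::map<std::string, unsigned> &unitsPerResource) {
  std::map<std::string, unsigned> total;
  for (const auto &inst : loopBody)
    for (const auto &use : inst)
      total[use.resource] += use.cycles;

  unsigned mii = 1;
  for (const auto &entry : total) {
    unsigned units = unitsPerResource.at(entry.first);
    mii = std::max(mii, (entry.second + units - 1) / units);
  }
  return mii;
}
```

For example, three one-cycle ALU operations on two ALUs plus one
two-cycle load/store operation on a single load/store unit give a
bound of two cycles.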

Instruction itineraries essentially express the following
correspondence:

  opcode ==> {FU1, FU2, ...}

and DFAPacketizer uses a DFA over opcodes.
In order to schedule predicated instructions such as those in SVE
strictly, we need to take into account that the following two
instructions use pipeline resources mutually exclusively in the same
cycle:

  MI1 if P ==> {FU1, FU2, ...}
  MI2 if Q ==> {FU1, FU2, ...}

where the predicates P and Q satisfy P == not Q.
However, I don't think that the current DFAPacketizer can represent
this situation.
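
One way to express this in a reservation table is to record the guard
of each instruction that occupies a slot. The sketch below is an
assumption about how such a table might look, not part of the patch:
Guard, complementary and PredicatedMRT are hypothetical names, and it
assumes that the target really allows two instructions with
complementary predicates to share a functional unit in one cycle.

```cpp
#include <vector>

// Hypothetical guard description: reg < 0 means the instruction is
// unpredicated; otherwise it executes when predicate register `reg`
// is true (or false, if `negated` is set).
struct Guard {
  int reg;
  bool negated;
};

// Two guards are complementary if they test the same predicate register
// with opposite polarity, so at most one of the instructions executes.
bool complementary(const Guard &a, const Guard &b) {
  return a.reg >= 0 && a.reg == b.reg && a.negated != b.negated;
}

// Modulo reservation table that lets complementary instructions share
// a (cycle mod II, functional unit) slot.
class PredicatedMRT {
  unsigned II;
  unsigned NumFUs;
  std::vector<std::vector<Guard>> Slots;

public:
  PredicatedMRT(unsigned ii, unsigned numFUs)
      : II(ii), NumFUs(numFUs), Slots(ii * numFUs) {}

  // Reserves the slot and returns true if every instruction already in
  // the slot is complementary to `g`; otherwise leaves the table as is.
  bool tryReserve(unsigned fu, unsigned cycle, Guard g) {
    std::vector<Guard> &slot = Slots[(cycle % II) * NumFUs + fu];
    for (const Guard &other : slot)
      if (!complementary(g, other))
        return false;
    slot.push_back(g);
    return true;
  }
};
```

With this table, MI1 guarded by P and MI2 guarded by not P can occupy
the same unit in the same cycle, while an unpredicated instruction or
one guarded by an unrelated predicate is rejected.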

References:

[1] Code Generation Schemas for Modulo Scheduled DO-loops and WHILE-loops
http://www.hpl.hp.com/techreports/92/HPL-92-47.pdf?jumpid=reg_R1002_USEN

Code:

  The sample patch for origin/release_60 [2], which doesn't use
  DFAPacketizer, can generate executable files from sample-code.c for
  both AArch64 and x86_64.

  [AArch64]% clang -O2 -mcpu=thunderx2t99 -mllvm -enable-pipeliner -mllvm -pipeliner-max=100 sample-code.c
  [x86_64] % clang -O2 -march=sandybridge -mllvm -enable-pipeliner -mllvm -pipeliner-max=100 sample-code.c

[2] https://reviews.llvm.org/D47943


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

sample-code.c (640 bytes) Download Attachment

Re: [llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE

Tim Northover via llvm-dev
Hi,

Masaki Arai via llvm-dev <[hidden email]> writes:
> Code:
>
> The sample patch for origin/release_60 [2], which doesn't use
> DFAPacketizer, can generate executable files from sample-code.c for
> both AArch64 and x86_64.
  ...

I am sorry. I mistakenly assumed that `origin/release_60' means
`LLVM 6.0.0', so the above link included many irrelevant differences.

I made a new review:

   https://reviews.llvm.org/D47948

so please check this instead.

Best regards,
--
--------------------------------------
Masaki Arai


Re: [llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE

Tim Northover via llvm-dev
On 8 June 2018 at 18:04, Masaki Arai via llvm-dev
<[hidden email]> wrote:
> I made new
>
>    https://reviews.llvm.org/D47948
>
> so please check this instead.

Hi Masaki,

You can update the diff on the old review, I think it'll be easier, as
we don't have to keep adding all the people to it.

Also, make sure the review is against trunk, not a release.

--
cheers,
--renato

Re: [llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE

Tim Northover via llvm-dev
Hi Renato,

Renato Golin <[hidden email]> writes:
> You can update the diff on the old review, I think it'll be easier, as
> we don't have to keep adding all the people to it.

Thank you very much for your advice.

> Also, make sure the review is against trunk, not a release.

OK.
I will also update it after running tests on trunk.

Thanks.

Best regards,
--
--------------------------------------
Masaki Arai



Re: [llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE

Tim Northover via llvm-dev
Hi,

On 08/06/2018 15:11, Masaki Arai via llvm-dev wrote:
> Hi,
>
> I am extending LLVM for HPC applications.
> As one of them, I am trying to make MachinePipeliner available on
> AArch64 + Scalable Vector Extension environment.
>

Great, thanks for looking into that.

IIUC from having a first look at your patch, there is nothing SVE
specific there so far. Although it potentially will be very useful for
SVE, it should also be beneficial for AArch64 without SVE and X86,
right? As there are no scheduling models available for SVE in LLVM yet,
I suppose it would be a good motivation if you could show some benefit
on existing AArch64 or X86 cores with your proposed modelling.

> MachinePipeliner is currently used only by Hexagon CPU.
> Since it is a very portable implementation, I think that it will
> actually work just by adding a little code for many CPUs(See Code [2]).
>
> The current MachinePipeliner is written on the premise that
> DFAPacketizer is used for resource management.
> However, I'd like to use MachinePipeliner in a way that does not use
> DFAPacketizer for the reasons described below(*).
> In MachinePipeliner implementation, only a small part is dependent on
> DFAPacketizer or Instruction itineraries.
> Therefore, I think that one of the following implementations is
> possible:
>
> (a) creating a path in MachinePipeliner that does not use DFAPacketizer
> (b) making MachinePipeliner inheritable so that anyone can write code
>      that does not use DFAPacketizer
>
> Since implementations using only Instruction itineraries without
> DFAPacketizer are possible, I don't think that I can use
> TargetSchedModel::hasInstrItineraries to select the execution path.
> Personally, I think that implementation of (b) is better.
>

IMO it makes sense to go with (b), given that the dispatch overhead
should be tiny compared to the other work that's going on and we also
added similar hooks to the generic machine scheduler recently. But it
seems like this is a smaller implementation detail and making sure we
are getting the modelling aspect right is more important.

Thanks,
Florian

Re: [llvm-dev] [RFC] Porting MachinePipeliner to AArch64+SVE

Tim Northover via llvm-dev
Hi,

Thank you very much for your comments.

Florian Hahn <[hidden email]> writes:
> IIUC from having a first look at your patch, there is nothing SVE
> specific there so far. Although it potentially will be very useful for
> SVE, it should also be beneficial for AArch64 without SVE and X86,
> right?

Yes.
Our main target is FUJITSU's AArch64+SVE CPU, but I think
MachinePipeliner is also beneficial for AArch64 without SVE and for
any ILP RISC CPU.
However, I'm not sure about x86.

> As there are no scheduling models available for SVE in LLVM
> yet, I suppose it would be a good motivation if you could show some
> benefit on existing AArch64 or X86 cores with your proposed modelling.

It is easy to make a small test set that confirms a performance
improvement.
However, I think there are many challenges in making MachinePipeliner
really beneficial for actual applications on AArch64 without SVE.
For example,
(a) Preparing the appropriate machine model for scheduling
(b) Consideration of register pressure in AArch64
    (Coordination with register allocation pass)
(c) Extending iteration dependence distance (2 or more)
(d) Consideration of the impact of VPlan's estimation
    (Coordination with VPlan)
(e) Consideration of the impact of loop optimizations
    (especially loop distribution)
(f) Consideration of the impact of flang

Until these issues are solved, I would like MachinePipeliner to run
only when the option `-enable-pipeliner' is specified.
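
To make item (c) concrete: the dependence distance enters the
recurrence-constrained lower bound on the initiation interval
(RecMII), which for one dependence cycle is the total latency divided
by the total distance, rounded up. The sketch below uses hypothetical
names (DepEdge, recurrenceMII) and is only meant to show how a
distance of 2 relaxes the bound compared with a distance of 1.

```cpp
#include <vector>

// One edge of a dependence cycle: the latency of the producer and the
// number of loop iterations the dependence spans. Hypothetical type.
struct DepEdge {
  unsigned latency;
  unsigned distance;
};

// RecMII contribution of a single dependence cycle:
// ceil(sum of latencies / sum of distances).
// Returns 0 if the edges carry no loop-carried distance at all.
unsigned recurrenceMII(const std::vector<DepEdge> &cycle) {
  unsigned latency = 0;
  unsigned distance = 0;
  for (const DepEdge &e : cycle) {
    latency += e.latency;
    distance += e.distance;
  }
  if (distance == 0)
    return 0;
  return (latency + distance - 1) / distance;
}
```

A cycle with a total latency of 4 gives a bound of 4 when the
loop-carried edge has distance 1, and a bound of 2 when it has
distance 2.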

> IMO it makes sense to go with (b), given that the dispatch overhead
> should be tiny compared to the other work that's going on and we also
> added similar hooks to the generic machine scheduler recently. But it
> seems like this is a smaller implementation detail and making sure we
> are getting the modelling aspect right is more important.

One of the reasons for posting this RFC is that MachinePipeliner is
updated frequently.
Therefore, I would like to hear the opinions of the MachinePipeliner
developers.
I am happy to prepare any patches, but since I do not have a Hexagon
environment, I'm worried about whether I can test them thoroughly.

Best regards,
--
--------------------------------------
Masaki Arai
