[llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter


Roger Ferrer Ibáñez via llvm-dev
Hi all,

I'm looking at generating PIC code for RISC-V in the context of Linux. I'm not sure if anyone is working on this already; any input is very welcome.

I'm now looking at function calls, which the RISCV backend represents with two pseudoinstructions, RISCV::TAIL and RISCV::CALL.

Currently those pseudos are lowered in MCCodeEmitter. They are expanded into an AUIPC and a JALR instruction, and the first one needs a relocation, which for a static reloc model is R_RISCV_CALL but for PIC code should be R_RISCV_CALL_PLT.
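
As a rough illustration of the expansion described above, here is a standalone sketch (not the actual MCCodeEmitter code) that encodes the AUIPC/JALR pair. The helper names are hypothetical, and the register choice (ra for a call, t1 with no link register for a tail call) is an assumption based on the usual assembler convention.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; only the opcodes and register numbers are taken from
// the RISC-V base ISA (AUIPC = 0x17, JALR = 0x67, x1 = ra, x6 = t1).
constexpr std::uint32_t encodeAUIPC(std::uint32_t Rd, std::uint32_t Imm20) {
  return ((Imm20 & 0xFFFFFu) << 12) | (Rd << 7) | 0x17u;
}

constexpr std::uint32_t encodeJALR(std::uint32_t Rd, std::uint32_t Rs1,
                                   std::uint32_t Imm12) {
  return ((Imm12 & 0xFFFu) << 20) | (Rs1 << 15) | (Rd << 7) | 0x67u;
}

struct ExpandedCall {
  std::uint32_t Auipc; // carries the R_RISCV_CALL or R_RISCV_CALL_PLT relocation
  std::uint32_t Jalr;
};

// Both immediates stay zero here; the relocation on the AUIPC asks the
// linker to fill in the real PC-relative offset.
// Assumption: a call links through ra, while a tail call jumps through t1
// and discards the return address (rd = x0).
constexpr ExpandedCall expandCallPseudo(bool IsTail) {
  return IsTail ? ExpandedCall{encodeAUIPC(6, 0), encodeJALR(0, 6, 0)}
                : ExpandedCall{encodeAUIPC(1, 0), encodeJALR(1, 1, 0)};
}

static_assert(expandCallPseudo(false).Auipc == 0x00000097u, "auipc ra, 0");
static_assert(expandCallPseudo(false).Jalr == 0x000080E7u, "jalr ra, 0(ra)");
```

Note that the encoded bytes are identical for both relocation types; only the relocation record attached to the AUIPC differs.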

The problem I find is that at this point it is too late to tell which relocation is needed: as far as I can tell, MCCodeEmitter has no way to determine the relocation model. Perhaps this is on purpose and the MCCodeEmitter should not have that knowledge. Or maybe it is just a matter of "pushing" a TargetMachine to it, but given the way the class is constructed, this approach does not look workable.

So I was considering lowering these pseudoinstructions in AsmPrinter instead. There I can tell the exact kind of MCOperand I want, because the AsmPrinter is constructed with a TargetMachine.

That said, perhaps there are extra constraints that require doing the lowering in MCCodeEmitter; unfortunately, I can't tell exactly what the advantage of lowering that late is.

These pseudos are marked as isCodegenOnly = 0, so if I lower them in AsmPrinter, my understanding is that I have to change them to isCodegenOnly = 1 and then teach AsmParser to recognize them (I would need to use the reloc model there too). Does this make sense?

Alternatively, I was considering adding two new pseudos, such as RISCV::CALL_PLT and RISCV::TAIL_PLT, and also lowering them in MCCodeEmitter. But this looks a bit too bulky to me, and I think I would still have the issue that the "call" and "tail" pseudos in the assembler would need some extra magic (e.g. when assembling a "call" pseudoinstruction with -fPIC) so they don't end up being parsed as the non-PIC counterparts. I might be wrong here though.

Is this reasonable, or are there other downsides to consider here?

Thank you very much,

--
Roger Ferrer Ibáñez

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Grang, Mandeep Singh via llvm-dev

Take a look at Alex's review comments for https://reviews.llvm.org/D45395. I had similar questions while implementing tail call lowering for RISCV.

--Mandeep



Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Friedman, Eli
On 7/10/2018 9:51 AM, Roger Ferrer Ibáñez via llvm-dev wrote:

> Hi all,
>
> I'm looking at generating PIC code for RISC-V in the context of Linux.
> Not sure if anyone is working on this already, any inputs are very
> welcome.
>
> I'm now looking at function calls which in the RISCV backend are
> represented via two pseudoinstructions RISCV::TAIL and RISCV::CALL.
>
> Currently those pseudos are lowered in MCCodeEmitter. They are
> expanded into AUIPC and JALR instructions and the first one needs a
> relocation, which for a static reloc model is R_RISCV_CALL but for PIC
> code should be R_RISCV_CALL_PLT.

This is not really correct.  A direct call is represented using
R_RISCV_CALL; a call through a PLT is represented with
R_RISCV_CALL_PLT.  In a static relocation model, no calls use the PLT;
with a PIC relocation model, some, but not all, calls use the PLT.

In assembly, this is represented by changing the operand of the call
instruction: a direct call is "call f", a call through the PLT is "call
f@plt". In the MC layer, this is represented using
MCSymbolRefExpr::VK_PLT.  In the LLVM backend, this is represented with
a target-specific flag (for example, X86II::MO_PLT).
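
To make the distinction concrete, a minimal standalone sketch follows. The enum and function names are hypothetical stand-ins rather than LLVM API, and the rule used to decide which PIC calls go through the PLT (only callees that may live outside the current module) is an assumption for illustration, not something stated in this thread.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the variant attached to a call operand.
enum class CallVariant { Direct, PLT };

// Assumption (not from the thread): a call needs the PLT only when the code
// is PIC and the callee may be defined outside the current module.
constexpr CallVariant chooseCallVariant(bool IsPIC, bool CalleeIsLocal) {
  return (IsPIC && !CalleeIsLocal) ? CallVariant::PLT : CallVariant::Direct;
}

// The relocation follows from the operand's variant, not from the
// relocation model itself.
constexpr const char *relocationFor(CallVariant V) {
  return V == CallVariant::PLT ? "R_RISCV_CALL_PLT" : "R_RISCV_CALL";
}
```

With this split, the code emitter only has to look at the operand; it never needs to know the relocation model.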

-Eli

--
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Alex Bradbury
On 10 July 2018 at 17:51, Roger Ferrer Ibáñez via llvm-dev
<[hidden email]> wrote:
> Hi all,
>
> I'm looking at generating PIC code for RISC-V in the context of Linux. Not
> sure if anyone is working on this already, any inputs are very welcome.

Great, that would be a useful contribution.

> I'm now looking at function calls which in the RISCV backend are represented
> via two pseudoinstructions RISCV::TAIL and RISCV::CALL.
>
> Currently those pseudos are lowered in MCCodeEmitter. They are expanded into
> AUIPC and JALR instructions and the first one needs a relocation, which for
> a static reloc model is R_RISCV_CALL but for PIC code should be
> R_RISCV_CALL_PLT.
>
> The problem I find is that at this point it is too late to tell the exact
> relocation needed: as far as I can tell there is no way to determine the
> relocation model. Perhaps this is on purpose and the MCCodeEmitter should
> not have that knowledge. Or maybe not and it is just a matter to "push" a
> TargetMachine to it, but the way the class is constructed does not look like
> this approach is workable.
>
> So I was considering lowering these pseudo-instructions in AsmPrinter
> instead. There I can tell the exact kind of the MCOperand I want thanks to
> the fact that the AsmPrinter is constructed with a TargetMachine.
>
> That said perhaps there are extra constraints that require doing the
> lowering in MCCodeEmitter, unfortunately I can't tell exactly what is the
> advantage of lowering that late.

As there is no way of generating an R_RISCV_CALL relocation in
assembly other than using the call pseudoinstruction, the desire is
that you can produce an ELF with that relocation regardless of whether
you emit .s and then assemble it or emit the .o directly. This pushes
you towards lowering at rather a late stage. There may be better ways
of structuring the current logic to achieve that aim of course.

> Alternatively I was considering adding two new pseudos like RISCV::CALL_PLT
> and RISCV::TAIL_PLT and also lower them at MCCodeEmitter. But this looks a
> bit too bulky to me and I think I would still have the issue that the "call"
> and "tail" pseudos in the assembler would need some extra magic (i.e. when
> assembling a "call" pseudoinstruction with -fPIC) so they don't end being
> parsed as the non-PIC counterparts. I might be wrong here though.

As Eli suggests, using the same instruction with different VariantKind
and/or MachineOperand flags would be the right way to go. It seems
that `call foo` in binutils gas always produces an R_RISCV_CALL
relocation while `call foo@plt` will produce R_RISCV_CALL_PLT.
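
The gas behaviour described above could be modelled roughly as follows. This is a standalone sketch with hypothetical names, not the RISCVAsmParser implementation; it only shows how the "@plt" suffix on the operand, rather than any -fPIC setting, selects the variant.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; these are not LLVM types.
enum class CallVariant { Direct, PLT };

struct CallOperand {
  std::string Symbol;
  CallVariant Variant;
};

// Split an assembly call operand such as "foo" or "foo@plt" into the bare
// symbol name and the variant requested by the suffix.
inline CallOperand parseCallOperand(const std::string &Text) {
  const std::string Suffix = "@plt";
  if (Text.size() > Suffix.size() &&
      Text.compare(Text.size() - Suffix.size(), Suffix.size(), Suffix) == 0)
    return {Text.substr(0, Text.size() - Suffix.size()), CallVariant::PLT};
  return {Text, CallVariant::Direct};
}
```

Under this model `call foo` and `call foo@plt` parse to the same pseudoinstruction and differ only in the operand's variant, so no separate CALL_PLT or TAIL_PLT pseudos are needed.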

Best,

Alex

Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Roger Ferrer Ibáñez via llvm-dev
Thanks for the pointer, Mandeep!


--
Roger Ferrer Ibáñez


Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Roger Ferrer Ibáñez via llvm-dev
Hi,

Thanks a lot, Eli.

Now I see that the relocation model is not the right piece of information here; what matters is the target flags (when coming from the backend) and the variant kind (when coming from the assembler). It looks like the former should somehow be lowered to the latter. Does this make sense?
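
In other words, something along these lines, as a standalone sketch only. The enum names are hypothetical and merely modeled on X86II::MO_PLT and MCSymbolRefExpr::VK_PLT; the point is that the codegen path and the assembler path end up with the same variant on the MC operand.

```cpp
#include <cassert>

// Hypothetical names, modeled on X86II::MO_PLT and MCSymbolRefExpr::VK_PLT.
enum TargetFlag : unsigned { MO_None = 0, MO_PLT = 1 };
enum class VariantKind { None, PLT };

// Codegen path: translate the MachineOperand target flag when the
// MachineInstr is lowered to an MCInst.
constexpr VariantKind lowerTargetFlag(TargetFlag Flag) {
  return Flag == MO_PLT ? VariantKind::PLT : VariantKind::None;
}

// Assembler path: the parser sets the variant directly from the "@plt"
// suffix on the operand.
constexpr VariantKind variantFromAsm(bool HasPltSuffix) {
  return HasPltSuffix ? VariantKind::PLT : VariantKind::None;
}

// Both paths must agree so that .s and .o output carry the same relocation.
static_assert(lowerTargetFlag(MO_PLT) == variantFromAsm(true), "PLT agrees");
static_assert(lowerTargetFlag(MO_None) == variantFromAsm(false), "direct agrees");
```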

Kind regards,

--
Roger Ferrer Ibáñez


Re: [llvm-dev] [RISCV][PIC] Lowering pseudo instructions in MCCodeEmitter vs AsmPrinter

Roger Ferrer Ibáñez via llvm-dev
Hi Alex,

Thanks a lot for the explanation. I think I now better understand the reasoning. I will look into the VariantKind and the MachineOperand flags to convey the needed information to the MCCodeEmitter.

Kind regards,

--
Roger Ferrer Ibáñez
