Backend: 2 address + 17bit immediate


Backend: 2 address + 17bit immediate

Andy Nisbet
Hello,
I'm trying to write a backend for a simple 32-bit processor architecture
with a single instruction format and no condition code registers.
www.docm.mmu.ac.uk/STAFF/A.Nisbet/Sabre.pdf is the short 15-page document
describing the architecture of Sabre. It is a Celoxica-developed
research/teaching processor; pages 5-8 contain the information relevant to
targeting it from a new compiler backend, i.e. it is trivially simple, with
25 actual instructions. Note the typo on page 5: operand A is clearly bits 9-5.

The general form for instructions is:

opcode %a, %b, 17bit signed immediate.

%b is a source register.
%a is typically both a source and the destination register for the operation,
i.e. %a = operation %a, %b, immediate.
%b and the immediate act like a virtual operand c that is the sum
of register b's contents and the immediate value.
%b can be omitted if it refers to the "zero valued register %0".
The immediate can be omitted if it has a zero value.
The exceptions to this are the various forms of conditional branch
instructions, which must compare the contents of 2 registers and specify a
branch target address using the immediate (textually the immediate is a
label; in machine code the immediate is a PC-relative offset).
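
As a rough illustration of the general form described above, here is a minimal C++ sketch of the "virtual operand c" and of the 17-bit signed immediate range. The helper names (fitsImm17, operandC, execAdd) are made up for this example and do not come from the Sabre document, and the wrap-around behaviour modulo 2^32 is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names; only the 17-bit signed width comes from the post.
constexpr int32_t kImmMin = -(1 << 16);     // -65536
constexpr int32_t kImmMax = (1 << 16) - 1;  //  65535

// True if v can be encoded as a 17-bit signed immediate.
bool fitsImm17(int64_t v) { return v >= kImmMin && v <= kImmMax; }

// The "virtual operand c": contents of register b plus the immediate.
// Assumption: register arithmetic wraps modulo 2^32.
uint32_t operandC(uint32_t regB, int32_t imm) {
  assert(fitsImm17(imm));
  return regB + static_cast<uint32_t>(imm);
}

// One instance of the general form  %a = op(%a, c), here an add.
uint32_t execAdd(uint32_t regA, uint32_t regB, int32_t imm) {
  return regA + operandC(regB, imm);
}
```

With %b as the zero register, operandC reduces to the immediate alone, which is the register-immediate case; with a zero immediate it reduces to the register-register case.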


I have spent some time looking at the PPC and SPARC backends, but obviously
these are much more complicated than what I need to implement.
Consequently, I am not correctly grasping the interactions between
ARCHInstrInfo.td and ARCHDAGToDAGISel.cpp. I did manage to hack something
together based on a copy of SPARC (with a SABRE namespace etc.), but the
instruction selection was incorrect and I obtained a "Cannot yet
select:0x..." assertion failure from SABREDAGToDAGIsel::SelectCode when I
attempted a
llc -march sabre helloworld.bc -o helloworld.s

Can anyone offer any guidance on how to proceed with debugging instruction
selection issues? Or perhaps some description of how the pattern matching
and the instruction selection works with a verbose explanation for a single
instruction (this would probably be more beneficial), relating the
Processor instruction set to the LLVM supported instruction set and the
actual code generation/printing.


WRT defining the instructions themselves: am I right in thinking that it is
sensible (for instruction selection) to represent the instruction set as a
collection of register-register and register-immediate instructions, so for
example I would create defs for
ADDrr to match ADD %a, %b
ADDri to match ADD %a, immediate
I have used multiclass to achieve this. Previously I was attempting to
match the general opcode %a, %b, immediate form.

Clearly I also need a way to load a 32 bit constant value into a register
in order to be able to address  more than 64K of memory. I know the PPC
does something similar ...

So, for example, for SABRE this instruction sequence would do what is
needed:
MOVri %a, HI16(32 bit constant)
LSHri %a, 16
ORri %a, LO16(same 32 bit constant)
LD %d, %a   // i.e. load the contents of the memory at the address stored in %a into register %d

where the HI/LO16 are performed at code generation by LLVM. I'm a little
confused as to how to specify this as a pattern in tablegen syntax, even
with the PPC example.

Apologies for the naivety of these questions.

Thanks,
         Andy



      Dr. Andy Nisbet: URL http://www.docm.mmu.ac.uk/STAFF/A.Nisbet
Department of Computing and Mathematics, John Dalton Building, Manchester
        Metropolitan University, Chester Street, Manchester M1 5GD, UK.
Email: [hidden email], Phone:(+44)-161-247-1556; Fax:(+44)-161-247-1483.

"Before acting on this email or opening any attachments you
should read the Manchester Metropolitan University's email
disclaimer available on its website
http://www.mmu.ac.uk/emaildisclaimer "

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Backend: 2 address + 17bit immediate

Christopher Lamb
Hi Andy,

I've been working through a backend for the first time over the last several weeks, so I thought I'd share what insights I have into the subjects you mention.

On Mar 22, 2007, at 9:38 AM, Andy Nisbet wrote:

> I have spent some time looking at the PPC and SPARC backends, but obviously
> these are much more complicated than what I require to implement.
> Consequently, I am not correctly grasping the interactions between
> ARCHInstrInfo.td and ARCHDAGToDAGISel.cpp I did manage to hack something
> together based on a copy of SPARC (with a SABRE namespace etc) but the
> instruction selection was incorrect and I obtained a "Cannot yet
> select:0x..." assertion failure from SABREDAGToDAGIsel::SelectCode when I
> attempted a
> llc -march sabre helloworld.bc -o helloworld.s
>
> Can anyone offer any guidance on how to proceed with debugging instruction
> selection issues?

The *InstrInfo.td file is used to generate a file called *GenDAGISel.inc in the build directory, under the lib/Target subdirectory that you're working in. This is a C++ file that is included into the instruction selector's .cpp file (*ISelDAGToDAG.cpp in the existing targets), and it contains the instruction selection rules specified in *InstrInfo.td.

When instruction selection is performed, control flow first enters the Select() method of your instruction selector object (its class is usually named something like *DAGToDAGISel). If that method doesn't select the DAG node itself, it calls another method, SelectCode(), which calls into the tblgen-generated instruction selection code in the .inc file.

The specific assert you mention is in the tblgen-generated .inc file; it fires when no pattern matches your DAG. My suggestion would be to step through the Select() function and then into the .inc file to see where your instruction selection is going awry.
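
To make that control flow concrete, here is a toy C++ model of the two-stage dispatch. It is only a sketch: Node, selectNode and selectGenerated are invented stand-ins, not LLVM types or functions, and the opcode and instruction names are placeholders.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy stand-in for a DAG node: just an opcode name (not an LLVM type).
struct Node { std::string opcode; };

// Stand-in for the tblgen-generated matcher in the .inc file. It only
// knows the patterns that were declared in the .td file.
// Returns an empty string when no pattern matches.
std::string selectGenerated(const Node &n) {
  if (n.opcode == "add") return "ADDrr";
  if (n.opcode == "or")  return "ORrr";
  return "";
}

// Stand-in for the hand-written Select(): handle special cases first,
// then fall back to the generated matcher.
std::string selectNode(const Node &n) {
  assert(!n.opcode.empty());
  if (n.opcode == "frameindex") return "ADDri";  // hypothetical custom case
  std::string matched = selectGenerated(n);
  if (!matched.empty()) return matched;
  // Corresponds to the "Cannot yet select" failure in the generated code.
  throw std::runtime_error("Cannot yet select: " + n.opcode);
}
```

In this model, a node such as "mul" reaches the failure path simply because no pattern for it exists, which is the situation to look for when stepping through the real generated code.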


> WRT defining the instructions themselves: am I right in thinking that it is
> sensible (for instruction selection) to represent the instruction set as a
> collection of instructions targetting register register and register
> immediate, so for example I would create defs for
> ADDrr to match ADD %a,%b
> ADDri to match ADD %a, immediate
> I have used multiclass to achieve this. Previously I was attempting to
> match the opcode %a,%b,immediate general form.

This seems the sensible way to proceed, to me.

> Clearly I also need a way to load a 32 bit constant value into a register
> in order to be able to address  more than 64K of memory. I know the PPC
> does something similar ...
>
> So for example for SABRE  this instruction output would perform the
> necessary ...
> MOVri %a, HI16(32 bit constant)
> LSHri %a,16
> ORri %a, LO16(same 32 bit constant)
> LD %d, %a // ie load the contents of the memory at the address stored in %a
> into register %d
>
> where the HI/LO16 are performed at code generation by LLVM. I'm a little
> confused as to how to specify this as a pattern in tablegen syntax, even
> with the PPC example.

Code generating immediates like this for global variables addresses and constant pool addresses involves an interaction between instruction selection and the target lowering implementation. Generating numeric immediates only requires some patterns in the instruction selector. The key is that the global addresses and the numeric immediates follow two separate but similar paths to being code gen'd.

Numeric Immediates:

You'll need SDNodeXForms in your InstrInfo.td that implement the LO16/HI16 part, like:

def LO16 : SDNodeXForm<imm, [{
  // Transformation function: get the low 16 bits.
  return CurDAG->getTargetConstant((unsigned)N->getValue() & 0xFFFF, MVT::i32);
}]>;

At this point numeric immediates are simply going to be a pattern like:

def : Pat<(i32 imm:$imm), (ORri (LSHri (MOVri RZero, (HI16 imm:$imm)), 16), (LO16 imm:$imm))>;
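
As a sanity check on that pattern, the following standalone C++ sketch evaluates the same nesting on plain integers. The HI16 transform is not shown above, so hi16 here is an assumption (the mirror image of LO16: shift right by 16), and MOVri, LSHri and ORri are toy models of the instructions rather than anything generated by tblgen.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the LO16 transform: keep the low 16 bits.
uint32_t lo16(uint32_t v) { return v & 0xFFFFu; }
// Assumed HI16 counterpart: the high 16 bits, shifted down.
uint32_t hi16(uint32_t v) { return v >> 16; }

// Toy models of the three instructions used by the pattern.
uint32_t MOVri(uint32_t rb, uint32_t imm) { return rb + imm; }  // rb is RZero here
uint32_t LSHri(uint32_t ra, uint32_t imm) { return ra << imm; }
uint32_t ORri(uint32_t ra, uint32_t imm)  { return ra | imm; }

// Same nesting as the Pat<> above: ORri(LSHri(MOVri(RZero, HI16), 16), LO16).
uint32_t materialize(uint32_t c) {
  // Both halves lie in 0..65535, so each fits a 17-bit signed immediate.
  assert(hi16(c) <= 65535u && lo16(c) <= 65535u);
  return ORri(LSHri(MOVri(0, hi16(c)), 16), lo16(c));
}
```

Because each half is at most 65535, it stays within the positive range of a 17-bit signed immediate, which is why a 16/16 split works for this instruction format.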

Global/Constant Pool Addresses:

I don't completely understand why it's not possible to simply instruction select these as one does with integer immediates, but all the targets I've looked at follow a similar approach: custom lowering of these values using two target-specific DAG nodes. If you look at the Sparc target lowering call (LowerOperation()) for ISD::GlobalAddress, you'll see how it gets split into a DAG that includes some target-specific nodes (SPISD::Hi/Lo).

Then, in the InstrInfo.td, look for the SDNode declarations for these target-specific nodes and then the selection patterns that match them. It's pretty similar to the pattern for integer immediates.

Hope this helps.
--
Christopher Lamb


