Use of LLVM in a Machine Simulator.


Use of LLVM in a Machine Simulator.

Ralph Corderoy

Hi,

I'm slowly getting to grips with what makes up LLVM.  I intend to use it
in a machine simulator, e.g. processor, clock, RAM, UART, and other
devices, where the processor will be one of several.  It would take a
block of target instructions, e.g. ARM, and produce LLVM to simulate
those on the target machine state, and then JIT them to host
instructions and then execute.

The peripheral simulations would be in C and end up as LLVM too so
optimisations could occur across the ARM->LLVM/peripheral->LLVM
boundary.

Does this sound a good fit so far?
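To make the idea concrete, here is a minimal sketch of the translate-and-cache loop I have in mind (all names are hypothetical, and the JIT-produced host code is stood in for by a plain callable):

```cpp
#include <cstdint>
#include <functional>
#include <map>

// Hypothetical sketch: simulated ARM machine state.
struct CpuState {
    uint32_t r[16] = {};  // general-purpose registers
};

// A translated block: in the real design this would be a host-code
// pointer produced by the LLVM JIT; here it is just a callable.
using CompiledBlock = std::function<void(CpuState&)>;

class TranslationCache {
    std::map<uint32_t, CompiledBlock> cache_;  // keyed by guest PC
public:
    // Look up the block for `pc`, translating (via `translate`) on a miss.
    CompiledBlock& get(uint32_t pc,
                       const std::function<CompiledBlock(uint32_t)>& translate) {
        auto it = cache_.find(pc);
        if (it == cache_.end())
            it = cache_.emplace(pc, translate(pc)).first;
        return it->second;
    }
};
```

The point of the cache is that a hot guest block is translated once and executed many times, which is where JITting pays for itself.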

My main question relates to TableGen and decoding the target
instructions.  I was initially going to use something specific to the
task of decoding, e.g. New Jersey Machine Code Toolkit, but wonder if I
could/should make use of the *.td for the various processors already
known to LLVM with a new TableGen back-end?  (I know there isn't support
for ARM yet in LLVM.)  And perhaps the DAG selector is of use in
matching patterns in ARM instructions to the desired LLVM rather than
just doing one ARM instruction at a time production?  (For ARM,
substitute other ISAs, some of which aren't in LLVM.)
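For comparison, the sort of mask/match table I would otherwise write by hand (or hope a new TableGen back-end could emit) looks something like this simplified sketch; the two ARM data-processing patterns shown ignore condition codes, the S bit, and most other fields:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative only: a table-driven decoder in the style of the
// mask/match tables a TableGen back-end might generate.
struct Pattern {
    uint32_t mask;   // which bits participate in the match
    uint32_t value;  // required values of those bits
    std::string mnemonic;
};

static const std::vector<Pattern> kPatterns = {
    {0x0DE00000, 0x00800000, "add"},  // data-processing opcode 0100
    {0x0DE00000, 0x00400000, "sub"},  // data-processing opcode 0010
};

std::string decode(uint32_t insn) {
    for (const auto& p : kPatterns)
        if ((insn & p.mask) == p.value)
            return p.mnemonic;
    return "unknown";
}
```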

I'm looking for guidance so I avoid a dead-end.

Cheers,


Ralph.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Use of LLVM in a Machine Simulator.

Joseph Altea
I would look at the back end of the GCC ARM cross-compiler, specifically the pass that generates .s files. It runs from a table interface and from the AST of GCC's intermediate language. It is not LLVM, but it is a mapping mechanism that maps to ARM on one side (half your problem); the other half, the map from LLVM bytecode to the AST, you do not need.

Probably taking an ARM assembler map and hand-mapping single instructions for all the common cases would get you going; you could then implement something more optimal if you need to. In addition, you can optimize LLVM bytecode and then generate assembly or object code, depending on your model (JIT, I think).

GCC's intermediate code is quite simple, and you should be able to remap it into LLVM using their ARM back end as a model. Perhaps you could even feed the output of the test suite into a mapper program that takes their instruction cases, looks them up in a symbol table, and returns the LLVM equivalent, so you could possibly automate the transcoding into ARM assembler (JIT).

Anyway, just some ideas to add to your investigation. Please let me know what you find. I am working on getting LLVM going on MacPPC under OpenBSD because of the secure environment.

regards, and good luck, please let me know how it's going... Joseph Altea (IVO)



Re: Use of LLVM in a Machine Simulator.

Chris Lattner
In reply to this post by Ralph Corderoy
On Sun, 16 Apr 2006, Ralph Corderoy wrote:
> I'm slowly getting to grips with what makes up LLVM.  I intend to use it
> in a machine simulator, e.g. processor, clock, RAM, UART, and other
> devices, where the processor will be one of several.  It would take a
> block of target instructions, e.g. ARM, and produce LLVM to simulate
> those on the target machine state, and then JIT them to host
> instructions and then execute.

Ok.

> The peripheral simulations would be in C and end up as LLVM too so
> optimisations could occur across the ARM->LLVM/peripheral->LLVM
> boundary.
>
> Does this sound a good fit so far?

Sure, that makes sense.

> My main question relates to TableGen and decoding the target
> instructions.  I was initially going to use something specific to the
> task of decoding, e.g. New Jersey Machine Code Toolkit, but wonder if I
> could/should make use of the *.td for the various processors already
> known to LLVM with a new TableGen back-end?  (I know there isn't support
> for ARM yet in LLVM.)  And perhaps the DAG selector is of use in
> matching patterns in ARM instructions to the desired LLVM rather than
> just doing one ARM instruction at a time production?  (For ARM,
> substitute other ISAs, some of which aren't in LLVM.)

This would be an interesting direction to take, but it may not be the
easiest one.  The easiest direction would be to write a hand coded machine
instruction parser (or use something like the machine code toolkit) and
then have a switch statement on the opcode to emit LLVM instructions.
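Concretely, the hand-coded approach might look something like this sketch (the field names and the textual IR naming scheme are made up for illustration):

```cpp
#include <string>

// Sketch of the hand-coded approach: once a decoder has split out the
// fields, switch on the opcode and emit LLVM for each guest
// instruction.  Emitting textual IR here keeps the example
// self-contained; a real translator would build IR in memory.
struct DataProc {
    unsigned opcode;  // ARM data-processing opcode field (bits 24-21)
    unsigned rd, rn, rm;
};

std::string emitLLVM(const DataProc& i) {
    auto reg = [](unsigned r) { return "%r" + std::to_string(r); };
    switch (i.opcode) {
    case 0x4:  // ADD
        return reg(i.rd) + " = add i32 " + reg(i.rn) + ", " + reg(i.rm);
    case 0x2:  // SUB
        return reg(i.rd) + " = sub i32 " + reg(i.rn) + ", " + reg(i.rm);
    default:
        return "; unhandled opcode";
    }
}
```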

Of interest may be this thesis.  It talks about converting alpha code to
LLVM (among other things): http://llvm.org/pubs/2004-05-JoshiMSThesis.html

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Use of LLVM in a Machine Simulator.

ghost@cs.msu.su
In reply to this post by Ralph Corderoy
Ralph Corderoy wrote:

> I'm slowly getting to grips with what makes up LLVM.  I intend to use it
> in a machine simulator,

Interesting. We have a simulator that we plan to port to LLVM too ;-)

> e.g. processor, clock, RAM, UART, and other
> devices, where the processor will be one of several.  It would take a
> block of target instructions, e.g. ARM, and produce LLVM to simulate
> those on the target machine state, and then JIT them to host
> instructions and then execute.
>
> The peripheral simulations would be in C and end up as LLVM too so
> optimisations could occur across the ARM->LLVM/peripheral->LLVM
> boundary.
>
> Does this sound a good fit so far?

I'm not sure. I would suspect that your peripherals have some internal logic
that's independent of the CPU. This means that in order to simulate the whole
system, you need to use some discrete-event simulation engine, where the
processor and each peripheral generate events that are then executed in
order. With such a scheme, I don't think you can optimize across the
ARM->LLVM/peripheral->LLVM boundary. Well, unless your discrete-event engine
is also fully in LLVM, but even then, I'm not sure there's much potential
for optimization.
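A minimal sketch of what I mean by a discrete-event engine (names illustrative):

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// A minimal discrete-event scheduler: the CPU model and each
// peripheral post timestamped events, and the engine executes
// them in time order.
struct Event {
    uint64_t time;
    std::function<void()> action;
    bool operator>(const Event& o) const { return time > o.time; }
};

class Scheduler {
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> q_;
    uint64_t now_ = 0;
public:
    void post(uint64_t when, std::function<void()> action) {
        q_.push({when, std::move(action)});
    }
    void run() {
        while (!q_.empty()) {
            Event e = q_.top();
            q_.pop();
            now_ = e.time;   // advance simulated time to the event
            e.action();
        }
    }
    uint64_t now() const { return now_; }
};
```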

Maybe, I misunderstood what you're trying to do?

> My main question relates to TableGen and decoding the target
> instructions.  I was initially going to use something specific to the
> task of decoding, e.g. New Jersey Machine Code Toolkit, but wonder if I
> could/should make use of the *.td for the various processors already
> known to LLVM with a new TableGen back-end?  (I know there isn't support
> for ARM yet in LLVM.)  And perhaps the DAG selector is of use in
> matching patterns in ARM instructions to the desired LLVM rather than
> just doing one ARM instruction at a time production?  (For ARM,
> substitute other ISAs, some of which aren't in LLVM.)
>
> I'm looking for guidance so I avoid a dead-end.

As Chris already mentioned, it does not seem like the .td file format is very
suitable for that. It might be better to use some other tool, or to implement
the bit-pattern matching manually.

- Volodya


Re: Use of LLVM in a Machine Simulator.

Ralph Corderoy
In reply to this post by Chris Lattner

Hi Chris,

> Of interest may be this thesis.  It talks about converting alpha code
> to LLVM (among other things):
> http://llvm.org/pubs/2004-05-JoshiMSThesis.html

Thanks, it was of interest.  I didn't spot its relevance from the title.

Cheers,


Ralph.



Re: Use of LLVM in a Machine Simulator.

Chris Lattner
On Tue, 18 Apr 2006, Ralph Corderoy wrote:
>> Of interest may be this thesis.  It talks about converting alpha code
>> to LLVM (among other things):
>> http://llvm.org/pubs/2004-05-JoshiMSThesis.html
>
> Thanks, it was of interest.  I didn't spot its relevance from the title.

Note that the approach is still sound, but the paper is a bit dated (like
anything talking about LLVM published more than 3 months ago :) ).  In
particular, the spirit of this:

"Certain Alpha instructions have no direct and easy mapping to the LLVM
IR. One option is to represent such instructions by complex pieces of LLVM
code. For example, a ctpop instructions counts the number of set bits in a
registers. Although it can be converted into a loop in LLVM, it would be
very difficult for the back end to recognize such loops and regenerate the
ctpop instructions. Secondly, such detailed translation is not required
for optimizations like dead code elimination. Hence, we chose to represent
such instructions in a simpler manner using function calls. These function
calls can be easily recognized by the back end and translated to a single
instruction."

... is right, but the details are no longer true (LLVM does now have
support for ctpop).  Likewise, the spirit of this is correct:

"The LLVM compiler framework has the ability to define intrinsic
functions: functions about which the compiler knows nothing and hence
makes conservative transformations. One drawback of using such intrinsics
is that they interfere with the dead code elimination process: because of
conservative assumptions, LLVM cannot eliminate such function calls.
Specifically, these functions can, in theory, write to memory and hence
cannot be eliminated."

... but the details are no longer right.  In particular, you can specify
whether intrinsics have side effects, etc now, which allows them to be
DCE'd, CSE'd, hoisted out of loops, etc.
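As a sketch of the function-call representation the thesis describes (helper name hypothetical; current LLVM would use the llvm.ctpop intrinsic instead):

```cpp
#include <cstdint>

// Sketch of the thesis's approach: an Alpha CTPOP becomes a call to a
// runtime helper rather than an IR loop, so a back end can trivially
// recognise the call and re-emit a single population-count instruction.
extern "C" uint64_t sim_ctpop(uint64_t x) {
    uint64_t count = 0;
    while (x) {
        x &= x - 1;  // Kernighan's trick: clear the lowest set bit
        ++count;
    }
    return count;
}
```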

Also, note that the alpha backend described in the paper is quite
different from the current alpha backend.

If you have any specific questions, this list is the place to ask. :)

-Chris

--
http://nondot.org/sabre/
http://llvm.org/
