LLVM based Virtual Machine "Environment" idea sanity check.


Shawn "AutoDMC" Boles
I've got an idea for a program, and after reading about 1/3 of your
documentation, I think LLVM is what I'm looking for.

What I'd like now is some help to see if my idea is "sane," plus any
light and direction that could be provided.

I want to build a simplified "Virtual Machine" containing:

A Terminal
Hard Drives (image files)
Some Kind Of Networking Device

LLVM programs would be run "inside" this Virtual Machine, accessing the
terminal and hard drive images and networking device... but not having
access to ANYTHING on the host computer (except through the "virtual"
devices).

I had originally planned on writing my own "processor core" for this
project... but I'd rather use LLVM (mainly because then I don't have to
write my own high-level tools).

Here's what I'm thinking I need to do.  It seems to me that I have to
"port" LLVM using the System Library to my "Virtual Machine" (which also
includes a bit of magic of the "Exokernel" operating system stuff).
Then I can run LLVM programs in my "port" on my "environment" to get
what I want.

Then LLVM can JIT compile programs but still only have access to my
"Virtual Machine."  I think.

Any pointers on where to read, or ideas on how to move forward, would be
much appreciated.

Thanks.


--
   +----+       Shawn Boles        +------+/
  /(  )/|     "Chief Engineer"     |     \/
+----+ +      AutoDMC Labs        |Cert. |
|oooo|/                           |Video |
+----+   [hidden email]  +------+
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: LLVM based Virtual Machine "Environment" idea sanity check.

Domagoj Babic
Hi Shawn,

The idea reminds me a lot of the research on Java operating systems. If
you google for it, you'll find plenty of references.

In my opinion, the main problem you're going to run into if you want to
design a serious system is interprocess communication. With Java, that's
a huge problem, because each app traditionally runs in a separate virtual
machine. Garbage collection complicates this even further.

You might also want to check out the Singularity project:
http://research.microsoft.com/os/singularity/

--
    Domagoj


On 9/5/06, Shawn AutoDMC Boles <[hidden email]> wrote:

> I've got an idea for a program, and after reading about 1/3 of your
> documentation, I think LLVM is what I'm looking for.

Re: LLVM based Virtual Machine "Environment" idea sanity check.

John Criswell
Shawn "AutoDMC" Boles wrote:

> I've got an idea for a program, and after reading about 1/3 of your
> documentation, I think LLVM is what I'm looking for.
>
> What I'd like now is some help to see if my idea is "sane," plus any
> light and direction that could be provided.
>
> I want to build a simplified "Virtual Machine" containing:
>
> A Terminal
> Hard Drives (image files)
> Some Kind Of Networking Device
>
> LLVM programs would be run "inside" this Virtual Machine, accessing the
> terminal and hard drive images and networking device... but not having
> access to ANYTHING on the host computer (except through the "virtual"
> devices).
>  
If you don't mind my asking, can you tell us a little more about your
overall goal for this project?  How stringent is your isolation
requirement?  What types of programs do you want to run on this VM
(programs written in a special language, small C applications, a full
operating system as under Xen/VMware, etc.)?  LLVM can probably make
development go faster, but the sanity of your project greatly depends on
what it is for and just how much it will do.

> I had originally planned on writing my own "processor core" for this
> project... but I'd rather use LLVM (Mainly because I don't have to write
> my own high level tools).
>  
I think LLVM would probably make a good "processor core" for the very
reasons you mention.

> Here's what I'm thinking I need to do.  It seems to me that I have to
> "port" LLVM using the System Library to my "Virtual Machine" (which also
> includes a bit of magic of the "Exokernel" operating system stuff).
> Then I can run LLVM programs in my "port" on my "environment" to get
> what I want.
>  
This doesn't make sense to me.  If you're going to build your virtual
machine out of LLVM components, why would you make those components run
on the virtual machine itself?

> Then LLVM can JIT compile programs but still only have access to my
> "Virtual Machine."  I think.
>  
The difficulty of this depends on the scope of your virtual machine.  
Unlike other bytecode languages, LLVM can represent programs with buffer
overruns and type-unsafe casts, both of which could undermine any
restrictions that your VM places on programs that it runs.

Now, you could use LLVM to build a safe virtual machine, but you would
have to figure out how to solve these problems. There are various
solutions of varying degrees of sanity, but I can't really give any
feedback on which is more appropriate until I know more about the
constraints of your project.
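
To make one such direction concrete: a commonly discussed option is to
check every guest memory access against the bounds of the guest's own
memory, so that an overrun traps instead of reaching the host.  The
Python sketch below only illustrates that idea.  GuestMemory and
GuestMemoryFault are hypothetical names, nothing here is LLVM API, and
this is not necessarily one of the solutions referred to above; a real
system would have to insert equivalent checks into the generated code.

```python
class GuestMemoryFault(Exception):
    """Raised when guest code touches an address outside its own memory."""


class GuestMemory:
    """A fixed-size linear memory; every access is checked against its bounds."""

    def __init__(self, size):
        self._bytes = bytearray(size)

    def _check(self, addr, length):
        # Reject negative addresses and any range that runs past the end.
        if addr < 0 or length < 0 or addr + length > len(self._bytes):
            raise GuestMemoryFault(
                f"access [{addr}, {addr + length}) is out of bounds"
            )

    def load(self, addr, length):
        self._check(addr, length)
        return bytes(self._bytes[addr:addr + length])

    def store(self, addr, data):
        self._check(addr, len(data))
        self._bytes[addr:addr + len(data)] = data


mem = GuestMemory(16)
mem.store(0, b"hello")
try:
    mem.store(12, b"overrun!")  # 8 bytes at offset 12 would run past byte 16
except GuestMemoryFault as fault:
    print("trapped:", fault)
```

Checking every access costs time, which is one of the tradeoffs such
schemes have to weigh against the isolation they buy.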

-- John T.



Re: LLVM based Virtual Machine "Environment" idea sanity check.

Shawn "AutoDMC" Boles
> If you don't mind my asking, can you tell us a little more about your
> overall goal for this project?
> [snip]
> I can't really give any
> feedback on which is more appropriate until I know more about the
> constraints of your project.
>
> -- John T.

Hopefully I can explain my project more fully (without being too wordy):


"CRAZY CRACKPOT IDEA"

What I want to do is create an idealized "processing node."  This
virtual machine would include:

1) A processing core
2) Access to "permanent storage"
3) Access to a "Network device"

With any luck, you should be able to run this virtual "processing node"
on an old Pentium II, on an Apple G3, on a soon-to-be-extinct PS2
(because everyone'll be selling them to get the PS3), on a Gumstix
Waysmall computer... basically anything that you can get your hands on.

Each of the computers above would run the "idealized processing node" in
their native operating system.

An outside observer would see a group of heterogeneous computers
networked with each other through the internet.

An "inside" observer would see a group of homogeneous "processing nodes"
networked with each other through Jabber (see below).


"WHAT I WANT TO USE"

I'd like to use LLVM for the Processing Core part of this "idealized
processing node."  This would allow me to exploit the programming
language frontends available and the optimizing and JIT compiling
abilities on the backend.  The key here is that the "node operator"
might want to help a project, but might not want to fully trust the code
for the project.  For example, if I run Folding@home, I have to fully
trust the programmers of the Folding@home client... trust that they
haven't hidden something evil in their program, and that it has no
hidden vulnerability that hurts me.

However, if the Folding@home process was running in a virtual machine,
they could run just about any code they wanted in it, without me having
to worry (in general).  Of course, this is where JIT compiling should
help out...  Basically, the process should have access to the computing
abilities of the host... but no access to any of the host's actual
hardware devices.  As far as this hypothetical Folding@home process is
concerned... it's running directly on a simple LLVM processor with a
hard drive and a NIC... that's it.
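
A rough way to picture "only the virtual devices exist" is a table of
allowed device operations that every guest request must go through.
The Python sketch below is only an illustration under that assumption:
ProcessingNode, attach, and device_call are hypothetical names, and
the guest "program" is just a list of requests rather than LLVM code.
A JIT-compiled program runs as native code, so a table like this does
not contain it by itself.

```python
class ProcessingNode:
    """Routes guest requests to virtual devices; anything unlisted does not exist."""

    def __init__(self):
        self._operations = {}

    def attach(self, device, operation, handler):
        # The host decides which (device, operation) pairs the guest may use.
        self._operations[(device, operation)] = handler

    def device_call(self, device, operation, *args):
        handler = self._operations.get((device, operation))
        if handler is None:
            raise PermissionError(f"no virtual device operation {device}.{operation}")
        return handler(*args)

    def run(self, program):
        # Each guest request is (device, operation, args...); there is no other I/O.
        for device, operation, *args in program:
            self.device_call(device, operation, *args)


terminal_output = []
node = ProcessingNode()
node.attach("terminal", "write", terminal_output.append)

node.run([("terminal", "write", "hello from the guest")])
try:
    node.run([("hostfs", "open", "/etc/passwd")])  # never attached by the host
except PermissionError as error:
    print("refused:", error)
```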

Also, the Folding@home team has to be able to cope with each operating
system's way of handling files and permissions.  My "processing node"
would give them a complete "raw hard drive" that they could write to,
without worrying about permissions.  They could use whatever data
storage scheme they found useful, and I would know that no matter what
they wrote, they couldn't clobber any of MY data.  Worst case, their
hard drive image could fill my physical hard drive... but that can be
remedied in the somewhat harsh but forceful way of killing the offending
process and deleting their hard drive image.
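
The "raw hard drive" could look roughly like the Python sketch below:
fixed-size blocks addressed by number, backed by a single image.
RawDiskImage and its methods are hypothetical names.  The fixed
block_count is an added assumption that caps how large the image can
grow, and io.BytesIO stands in for a real image file (which could be
opened with open(path, "r+b") instead).

```python
import io


class RawDiskImage:
    """A block device backed by one file-like image of bounded size."""

    def __init__(self, backing, block_size=512, block_count=8):
        self._backing = backing
        self.block_size = block_size
        self.block_count = block_count

    def _seek(self, index):
        # The guest can only name blocks inside the image, never host paths.
        if not 0 <= index < self.block_count:
            raise IndexError(f"block {index} is outside the image")
        self._backing.seek(index * self.block_size)

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._seek(index)
        self._backing.write(data)

    def read_block(self, index):
        self._seek(index)
        data = self._backing.read(self.block_size)
        # Blocks that were never written read back as zeros.
        return data.ljust(self.block_size, b"\0")


disk = RawDiskImage(io.BytesIO(), block_size=4, block_count=2)
disk.write_block(1, b"data")
print(disk.read_block(1))
```

Whatever file system or storage scheme the guest builds on top of
these blocks stays inside the image.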

As for Networking... I was thinking of using something high level, like
Jabber.  While Jabber was designed for instant messaging... it could
easily be used for "interprocess communication."  The hypothetical
Folding@home client would contact "[hidden email]" to
get its data.  The processing node's "Jabber ID" would be something
like "node@domain/foldingathome".  The server could send updates to the
node, and the node could send updates to the server... all without
worrying about "IP addresses."
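
For reference, a Jabber ID like the one above has the shape
node@domain/resource, where the node and resource parts are optional.
The Python sketch below splits an ID into those parts.  JabberId and
parse_jid are hypothetical names, and the sketch skips the character
validation and normalization that the XMPP address rules require.

```python
from typing import NamedTuple, Optional


class JabberId(NamedTuple):
    node: Optional[str]
    domain: str
    resource: Optional[str]


def parse_jid(text):
    """Split 'node@domain/resource' into its parts (no character validation)."""
    # The resource is everything after the first "/".
    bare, slash, resource = text.partition("/")
    # The node is everything before the first "@" of what remains.
    node, at, domain = bare.partition("@")
    if not at:
        node, domain = None, bare
    if not domain or (at and not node) or (slash and not resource):
        raise ValueError(f"not a valid Jabber ID: {text!r}")
    return JabberId(node, domain, resource if slash else None)


print(parse_jid("node@domain/foldingathome"))
```

The resource part is what would let one node account run several
projects side by side, each under its own name.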


The general all around idea *from the user's side* is to make a way for
a node to be able to run trusted/untrusted code as fast as possible in
as safe a way as possible.

The general all around idea *from the project developer's side* is to
have a system that allows the developer to write one version of a client
and run it on as many "virtual processing nodes" as possible.

The general all around idea *from the point of view of the program being
run* is that it's the only process running on top of very simple
"hardware"... an LLVM processor, a hard drive, and a NIC that speaks Jabber.

My goal is to write the blank-slate VM that is as simple and powerful as
possible... and let the application writers string the "processing
nodes" up how they want.  If they want a more complex Interprocess
Communication... they can write it on top of the
plain-text-through-Jabber protocol (maybe with XML or YAML or something
even more exciting).  Or they can write a library that makes a full UNIX
style file system on top of my raw-block "hard drive."  They can make
the nodes run single file or in a tree or any way they want... I'm just
building the "hardware platform."

As for the "processing node" itself... the user can kill it, or restart
it if it seems to have become "hung..." or the server requesting the
processing job could ask to have its job reset or killed.  The most
important thing is that a badly written program can "nuke" and "crash"
the VM, but it shouldn't "nuke" or "crash" the host computer.

This is where I'm stumped... how can I use LLVM inside my "simplified
processing node," allowing it only access to my "virtual devices,"
getting it to run "as fast as possible" with as little possibility of it
clobbering the host computer should something untoward happen to the
LLVM code?  I'm still sifting through the documentation and feel like
I'm 90% there to understanding how LLVM works as a Virtual Machine, and
not a Compiler backend or Intermediate Language.  Or, if LLVM isn't my
"perfect match," where should I look?


Re: LLVM based Virtual Machine "Environment" idea sanity check.

Chris Lattner
On Wed, 6 Sep 2006, Shawn Boles wrote:
> This is where I'm stumped... how can I use LLVM inside my "simplified
> processing node," allowing it only access to my "virtual devices,"
> getting it to run "as fast as possible" with as little possibility of it
> clobbering the host computer should something untoward happen to the
> LLVM code?  I'm still sifting through the documentation and feel like
> I'm 90% there to understanding how LLVM works as a Virtual Machine, and
> not a Compiler backend or Intermediate Language.  Or, if LLVM isn't my
> "perfect match," where should I look?

LLVM isn't a perfect match today.  It does provide the ability to
transport code around in a portable way and execute it, but lacks these
features:

1. Given C input code, the output isn't guaranteed to be portable.  This
    is a C limitation, not an LLVM limitation.  In practice, if you avoid a
    few constructs, and stay away from complex system interfaces, the code
    you write will be mostly portable.
2. LLVM doesn't provide safety (it doesn't prevent the program from
    clobbering the machine).

#1 is a limitation of C, not LLVM.  If you use a portable source language,
you'll get portable LLVM code.  Several people are interested in #2 and
have solutions with various tradeoffs; if you ask here I'm sure they'll
tell you about them.

In practice, the hard part, and the value add, of your project seems to be
the higher-level design issues (what APIs / interfaces do you provide, how
to pull it together into a useful system, how to set up the middleware,
etc).  Building the initial system on top of LLVM without worrying about
#1/#2 seems like a good way to get a prototype up and running, and you
may find that #1/#2 get solved for you by other community
members.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/

Re: LLVM based Virtual Machine "Environment" idea sanity check.

Ralph Corderoy
Hi Shawn,

> What I want to do is create an idealized "processing node."  This
> virtual machine would include:
>
> 1) A processing core
> 2) Access to "permanent storage"
> 3) Access to a "Network device"
>
> With any luck, you should be able to run this virtual "processing node"
> on an old Pentium II, on an Apple G3, on a soon-to-be-extinct PS2
> (because everyone'll be selling them to get the PS3), on a Gumstix
> Waysmall computer... basically anything that you can get your hands on.
>
> Each of the computers above would run the "idealized processing node" in
> their native operating system.

Have you looked at the Inferno OS?  Originally from Bell Labs, now
developed by Vita Nuova.

    http://www.vitanuova.com/inferno/index.html

It sounds just like what you're thinking of.  Even if you do want to
roll your own, with or without LLVM, you'll pick up lots of useful
information from the papers on it.

Cheers,


Ralph.

