Newbie questions


Newbie questions

Archie Cobbs
Hi,

I'm just learning about LLVM (really interesting) and have some newbie
questions. Feel free to ignore or disparage them if they're inappropriate :-)

My area of interest is using LLVM in a Java JVM setting. These are
just some random questions relating to that...

1. What is the status of the LLVM+Java effort? Is it GCJ-specific?
    Is there a web page? I found one link via google but it was broken.

2. I'm curious why the LLVM language includes no instructions for memory
    barriers or any kind of compare-and-swap (bus locking operation). Were
    these considered? Were they deemed too platform-specific? What about
    some definition of the atomicity of instructions (e.g., is a write of
    a 32 bit value to memory guaranteed to be atomic)? More generally does
    the LLVM language define a specific (at least partial) memory model?

3. Would it make sense to extend the LLVM language so that modules,
    variables, functions, and instructions could be annotated with
    arbitrary application-specific annotations? These would be basically
    ignored by LLVM but otherwise "ride along" with their associated items.
    Of course, the impact of code transformations on annotations would have
    to be defined (e.g., if a function is inlined, any associated annotations
    are discarded).

    The thought here is that more optimization may be possible when
    information from the higher-level language is available. E.g. the
    application could participate in transformations, using the annotations
    to answer questions asked by the LLVM transformation via callbacks.

    An example (perhaps this is not a real one, because possibly it
    can already be captured by LLVM alone) is the use of a Java final instance
    field in a constructor. In Java we're guaranteed that the final field is
    assigned to only once, and any read of that field must follow the initial
    assignment, so even though the field is not read-only during the entire
    constructor, it can be treated as such by any optimizing transformation.

4. Has anyone written Java JNI classes+native code that "wrap" the LLVM API,
    so that the LLVM libraries can be utilized from Java code?

Thanks,
-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Reid Spencer
On Sun, 2006-04-23 at 09:43 -0500, Archie Cobbs wrote:
> Hi,
>
> I'm just learning about LLVM (really interesting) and have some newbie
> questions. Feel free to ignore or disparage them if they're inappropriate :-)

No worries.

>
> My area of interest is using LLVM in a Java JVM setting. These are
> just some random questions relating to that...
>
> 1. What is the status of the LLVM+Java effort?

Incomplete but significant progress has been made. Misha Brukman can
tell you more.
> Is it GCJ-specific?

No, it implements its own Java compiler and bytecode translator.

>     Is there a web page?

Not that I'm aware of. But you can obtain the source code from the
llvm-java repository via CVS; just replace "llvm" with "llvm-java" in the
usual CVS instructions.

> I found one link via google but it was broken.

Okay, sorry, I don't know about any web site.

>
> 2. I'm curious why the LLVM language includes no instructions for memory
>     barriers or any kind of compare-and-swap (bus locking operation). Were
>     these considered?

Yes.

> Were they deemed too platform-specific?

No.

> What about
>     some definition of the atomicity of instructions (e.g., is a write of
>     a 32 bit value to memory guaranteed to be atomic)? More generally does
>     the LLVM language define a specific (at least partial) memory model?

Currently the language doesn't support atomic instructions. However,
some work has been done at UIUC to implement sufficient fundamental
instructions that could permit an entire threading and synchronization
package to be constructed.  This work is not complete, and is not in the
LLVM repository yet. I'm not sure of the status at UIUC of this effort,
as it hasn't been discussed in a while. It is definitely something that
will be needed going forward.

>
> 3. Would it make sense to extend the LLVM language so that modules,
>     variables, functions, and instructions could be annotated with
>     arbitrary application-specific annotations?

Some of these things already inherit from the "Annotable" class which
permits Annotations on the object. However, their use is discouraged and
we will, eventually, remove them from the LLVM IR. The problem is that
Annotations create problems for the various passes that need to
understand them. We have decided, from a design perspective, that (a) if
it's important enough to be generally applicable, it should be part of
the LLVM IR, not tucked away in an Annotation, and (b) for things
specific to a language or system, Annotations are insufficient
anyway and a higher-level construction (possibly making reference to
LLVM IR objects) would be needed.

> These would be basically
>     ignored by LLVM but otherwise "ride along" with their associated items.
>     Of course, the impact on annotations of code transformations would have
>     to be defined (e.g., if a function is inlined, any associated annotations
>     are discarded).

Yes, we've been through this discussion many times before, and the
conclusion was not to support Annotations at all, as discussed above. There
are numerous issues with saving the annotations in the bytecode, how
they affect the passes, what happens to them after the code is modified
by a pass (as you noted), etc.
>
>     The thought here is that more optimization may be possible when
>     information from the higher-level language is available. E.g. the
>     application could participate in transformations, using the annotations
>     to answer questions asked by the LLVM transformation via callbacks.

It's my opinion that those things should be handled by the higher-level
language's own passes on its AST, where full semantic knowledge of the
language is available. Remember that LLVM provides an "Intermediate
Representation", not a high-level AST. The desire to support Annotations
is an attempt to force the IR into a higher level of abstraction than it
was designed for.

The use of callbacks is problematic as it would require LLVM to manage
numerous dynamic libraries that correspond to those callbacks, provide a
scheme for understanding which callbacks to call in various
circumstances, etc. Consider a bytecode file that was generated by
linking bytecode files from four or five different languages and then
delivered to another environment for further optimization and
execution. Are all those languages' dynamic libraries available so the
callbacks can be called?
>
>     To give an example (perhaps this is not a real one because possibly it
>     can already be captured by LLVM alone) is the use of a Java final instance
>     field in a constructor. In Java we're guaranteed that the final field is
>     assigned to only once, and any read of that field must follow the initial
>     assignment, so even though the field is not read-only during the entire
>     constructor, it can be treated as such by any optimizing transformation.

LLVM would already recognize such a case and permit the appropriate
optimizations.

>
> 4. Has anyone written Java JNI classes+native code that "wrap" the LLVM API,
>     so that the LLVM libraries can be utilized from Java code?

No, there's no Java interface at this time. Patches accepted :)  

There is, however, a burgeoning PyPy interface being
developed.
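
A minimal sketch of what the native half of such a JNI wrapper could look
like; the Java class (org.llvm.Module), its native method, and the handle
scheme are invented here for illustration, and the actual LLVM calls are
left out:

   // bridge.cpp -- compiled as C++, loaded by the JVM via System.loadLibrary()
   #include <jni.h>
   #include <string>

   // Imagined Java side:  package org.llvm;  class Module { static native long create(String name); }
   extern "C" JNIEXPORT jlong JNICALL
   Java_org_llvm_Module_create(JNIEnv *env, jclass, jstring name) {
     const char *utf = env->GetStringUTFChars(name, 0);
     std::string moduleName(utf ? utf : "");
     if (utf)
       env->ReleaseStringUTFChars(name, utf);
     // A real wrapper would construct an LLVM Module named moduleName here
     // and return its address to Java as an opaque handle.
     return 0;
   }
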
>
> Thanks,
> -Archie

You're welcome. Hope it was useful. I'm sure others will respond as
well, so stay tuned.

Reid Spencer.


Re: Newbie questions

Archie Cobbs
Reid Spencer wrote:
>> 1. What is the status of the LLVM+Java effort?
>
> Incomplete but significant progress has been made. Misha Brukman can
> tell you more.
>> Is it GCJ-specific?
>
> No, it implements its own Java compiler and bytecode translator.

Has it been hooked up to a JVM? If so, how and which ones?

Thanks for your other answers re annotations and memory model.

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Chris Lattner
In reply to this post by Reid Spencer
On Sun, 23 Apr 2006, Reid Spencer wrote:
>> My area of interest is using LLVM in a Java JVM setting. These are
>> just some random questions relating to that...
>>
>> 1. What is the status of the LLVM+Java effort?
>
> Incomplete but significant progress has been made. Misha Brukman can
> tell you more.

There are actually two different LLVM + Java projects.  The 'llvm-java'
project, developed primarily by Alkis Evlogimenos <[hidden email]>,
is in LLVM CVS.  It has basic stuff working, but is far from complete.  It
is also stalled: no one is working on it any longer.

The second project is an LLVM JIT backend for GCJX.  GCJX (developed by Tom
Tromey) is far more complete, but the LLVM backend is quite new.

>> 2. I'm curious why the LLVM language includes no instructions for memory
>>     barriers or any kind of compare-and-swap (bus locking operation). Were
>>     these considered?
>
> Yes.

Someone did actually develop intrinsics for these, but they were never
contributed back to LLVM.

>> What about
>>     some definition of the atomicity of instructions (e.g., is a write of
>>     a 32 bit value to memory guaranteed to be atomic)? More generally does
>>     the LLVM language define a specific (at least partial) memory model?
>
> Currently the language doesn't support atomic instructions. However,
...
> LLVM repository yet. I'm not sure of the status at UIUC of this effort,
> as it hasn't been discussed in a while. It is definitely something that
> will be needed going forward.

My understanding is that it is stalled.  If someone wanted to do something
regarding this, it would be quite welcome.  In particular, for atomic
operations, starting with an implementation of the new GCC atomic builtins
would make a lot of sense.
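
For concreteness, a sketch of the primitives involved, expressed with GCC's
__sync builtins (available since GCC 4.1); this only illustrates the
semantics an LLVM intrinsic set would need to capture; it is not existing
LLVM functionality:

   #include <cstdint>

   // Atomically: if *addr == expected, store desired; returns true on success.
   // The builtin also acts as a full memory barrier.
   static bool cas32(volatile int32_t *addr, int32_t expected, int32_t desired) {
     return __sync_bool_compare_and_swap(addr, expected, desired);
   }

   // Full memory barrier: no memory access may be reordered across it.
   static void full_fence() {
     __sync_synchronize();
   }

   // Example use: a minimal spin lock built from the two primitives.
   static void spin_lock(volatile int32_t *lock) {
     while (!cas32(lock, 0, 1))
       ;                    // busy-wait until we win the CAS
   }

   static void spin_unlock(volatile int32_t *lock) {
     full_fence();          // make the critical section's writes visible first
     *lock = 0;
   }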

>>     To give an example (perhaps this is not a real one because possibly it
>>     can already be captured by LLVM alone) is the use of a Java final instance
>>     field in a constructor. In Java we're guaranteed that the final field is
>>     assigned to only once, and any read of that field must follow the initial
>>     assignment, so even though the field is not read-only during the entire
>>     constructor, it can be treated as such by any optimizing transformation.
>
> LLVM would already recognize such a case and permit the appropriate
> optimizations.

This specific optimization can also be easily handled in an LLVM JIT
environment.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Chris Lattner
In reply to this post by Archie Cobbs
On Sun, 23 Apr 2006, Archie Cobbs wrote:
>> No, it implements its own Java compiler and bytecode translator.
>
> Has it been hooked up to a JVM? If so, how and which ones?

llvm-java has been hooked up to a class library (classpath), and
implements all of the VM (AFAIK).  That said, you'd probably be better off
working on GCJX right now, unless you'd like to do a lot of fundamental
development on the llvm-java front-end.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Reid Spencer
In reply to this post by Archie Cobbs
On Sun, 2006-04-23 at 15:09 -0500, Archie Cobbs wrote:

> Reid Spencer wrote:
> >> 1. What is the status of the LLVM+Java effort?
> >
> > Incomplete but significant progress has been made. Misha Brukman can
> > tell you more.
> >> Is it GCJ-specific?
> >
> > No, it implements its own Java compiler and bytecode translator.
>
> Has it been hooked up to a JVM? If so, how and which ones?
I think the point of llvm-java was to avoid a JVM. That is, it converts
either Java source or Java bytecode into equivalent LLVM bytecode. I
think the big thing lacking so far are the Java library and support for
things that LLVM doesn't natively support (threading, synchronization
come to mind).  If you need more detail, Alkis (author of llvm-java) is
going to have to respond. Otherwise, you'll need to take a look at the
code.

>
> Thanks for your other answers re annotations and memory model.

You're welcome.

>
> -Archie


Re: Newbie questions

Chris Lattner
On Sun, 23 Apr 2006, Reid Spencer wrote:
>> Has it been hooked up to a JVM? If so, how and which ones?
>
> I think the point of llvm-java was to avoid a JVM. That is, it converts

llvm-java is the JVM.

> either Java source or Java bytecode into equivalent LLVM bytecode. I

llvm-java only supports input from Java bytecode.

> think the big thing lacking so far are the Java library and support for

llvm-java uses classpath for its library.

> things that LLVM doesn't natively support (threading, synchronization
> come to mind).  If you need more detail, Alkis (author of llvm-java) is
> going to have to respond. Otherwise, you'll need to take a look at the
> code.

It's actually missing quite a bit.  It is missing too much to support
programs that use System.out, for example.  Alkis is definitely the person
to talk to if you're interested in it.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Archie Cobbs
Chris Lattner wrote:

>> I think the point of llvm-java was to avoid a JVM. That is, it converts
>
> llvm-java is the JVM.
>
>> either Java source or Java bytecode into equivalent LLVM bytecode. I
>
> llvm-java only supports input from Java bytecode.
>
>> think the big thing lacking so far are the Java library and support for
>
> llvm-java uses classpath for its library.
>
>> things that LLVM doesn't natively support (threading, synchronization
>> come to mind).  If you need more detail, Alkis (author of llvm-java) is
>> going to have to respond. Otherwise, you'll need to take a look at the
>> code.
>
> It's actually missing quite a bit.  It is missing too much to support
> programs that use System.out, for example.  Alkis is definitely the
> person to talk to if you're interested in it.

Thanks.. I'm actually more interested in what would be involved to
hook up LLVM to an existing JVM. In particular JCVM (http://jcvm.sf.net).
JCVM analyzes bytecode using Soot, emits C code, compiles that with GCC,
and then loads executable code from the resulting ELF files.. given this
design, using LLVM/modules instead of Soot/GCC/ELF would not be very much
different, but would allow more cool things to happen.

The main barrier to this idea for me (besides the usual: limited time
for fun projects) is understanding how it could work. In particular, how
would one bridge the C vs. C++ gap. JCVM is written in C, and I have lots
of C and Java experience, but zero with C++. Dumb question: can a C program
link with and invoke C++ libraries? Or perhaps a little C++ starter program
is all that is needed, then the existing code can be used via extern "C"?
Alternately, if there were Java JNI wrappers, I could invoke those... Etc.

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Chris Lattner
On Sun, 23 Apr 2006, Archie Cobbs wrote:

>> It's actually missing quite a bit.  It is missing too much to support
>> programs that use System.Out, for example.  Alkis is definitely the person
>> to talk to if you're interested in it.
>
> Thanks.. I'm actually more interested in what would be involved to
> hook up LLVM to an existing JVM. In particular JCVM (http://jcvm.sf.net).
> JCVM analyzes bytecode using Soot, emits C code, compiles that with GCC,
> and then loads executable code from the resulting ELF files.. given this
> design, using LLVM/modules instead of Soot/GCC/ELF would not be very much
> different, but would allow more cool things to happen.

Okay.

> The main barrier to this idea for me are (besides the usual: limited time
> for fun projects) is understanding how it could work. In particular, how
> would one bridge the C vs. C++ gap. JCVM is written in C, and I have lots
> of C and Java experience, but zero with C++. Dumb question: can a C program
> link with and invoke C++ libraries? Or perhaps a little C++ starter program
> is all that is needed, then the existing code can be used via extern "C"?
> Alternately, if there were Java JNI wrappers, I could invoke those... Etc.

C programs certainly can use C++ libraries, as you say, with extern "C".
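
A minimal sketch of that linkage pattern, with a hypothetical entry point
(this shows only the extern "C" idiom, not a real LLVM interface):

   // bridge.cpp -- built with a C++ compiler, exposes a C-callable function
   #include <cstdio>

   extern "C" int jcvm_llvm_compile(const char *bytecode_path) {
     // A real shim would include the LLVM headers here and drive the
     // optimizer and code generator; this stub only demonstrates the linkage.
     std::printf("would hand %s to LLVM here\n", bytecode_path);
     return 0;   // 0 = success
   }

On the C side all that's needed is a plain prototype such as
int jcvm_llvm_compile(const char *path); plus linking the final executable
against the C++ runtime (e.g. by linking with g++).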

I don't know how well JCVM would work with llvm-java; I guess you'd have
to try it and see.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Vikram S. Adve
In reply to this post by Archie Cobbs

On Apr 23, 2006, at 9:32 PM, Archie Cobbs wrote:

> Chris Lattner wrote:
>>> I think the point of llvm-java was to avoid a JVM. That is, it converts
>> llvm-java is the JVM.
>>> either Java source or Java bytecode into equivalent LLVM bytecode. I
>> llvm-java only supports input from Java bytecode.
>>> think the big thing lacking so far are the Java library and support for
>> llvm-java uses classpath for its library.
>>> things that LLVM doesn't natively support (threading, synchronization
>>> come to mind).  If you need more detail, Alkis (author of llvm-java) is
>>> going to have to respond. Otherwise, you'll need to take a look at the
>>> code.
>> It's actually missing quite a bit.  It is missing too much to support
>> programs that use System.out, for example.  Alkis is definitely the
>> person to talk to if you're interested in it.
>
> Thanks.. I'm actually more interested in what would be involved to
> hook up LLVM to an existing JVM. In particular JCVM (http://jcvm.sf.net).
> JCVM analyzes bytecode using Soot, emits C code, compiles that with GCC,
> and then loads executable code from the resulting ELF files.. given this
> design, using LLVM/modules instead of Soot/GCC/ELF would not be very much
> different, but would allow more cool things to happen.

If you're only interested in using LLVM for "cool things" (such as  
optimization), you could use it directly on the C code you emit.

Either way, one issue that you will have to deal with is preserving the
behavior of Java exceptions (assuming you care about that).  LLVM does not
preserve the order of potentially excepting instructions (e.g., a divide or
a load).  This would have to be handled explicitly, whether you use
llvm-java or simply use LLVM on the C code from Soot.  I don't know if/how
libgcj handles this, but Tom may be able to say more about that.


--Vikram
http://www.cs.uiuc.edu/~vadve
http://llvm.cs.uiuc.edu/



Re: Newbie questions

Archie Cobbs
Vikram Adve wrote:
> If you're only interested in using LLVM for "cool things" (such as  
> optimization), you could use it directly on the C code you emit.

Yes... though the translation to C loses some efficiency due to
"impedance mismatch". More ideal would be to go from bytecode -> LLVM
directly (I understand this part has already been done more or less).

> Either way, one issue that you will have to deal with is preserving the
> behavior of Java exceptions (assuming you care about that).  LLVM does
> not preserve the order of potentially excepting instructions (e.g., a
> divide or a load).  This would have to be handled explicitly, whether
> you use llvm-java or simply used LLVM on the C code from Soot.  I don't
> know if/how libgcj handles this but Tom may be able to say more about
> that.

Right.. I think we'd have to revert to signal-less exception checking
for null pointer and divide-by-zero for the time being.

But this brings up a good point.. should LLVM have an "instruction
barrier" instruction? I.e., an instruction which, within the context
of one basic block, would prevent any instructions before the barrier
from being reordered after any instructions after the barrier?

Then a JVM could use signals and still guarantee Java's "exactness"
of exceptions by bracketing each potentially-signal-generating instruction
with instruction barriers.

Someone must have already invented this and given it a better name.
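
The closest existing analogue at the C level is GCC's compiler-only barrier:
an empty asm with a "memory" clobber, which the compiler will not move loads
or stores across (it constrains instruction scheduling only, not the CPU):

   // Compiler-only scheduling barrier (GCC extension); no code is emitted.
   static inline void compiler_barrier() {
     asm volatile("" ::: "memory");
   }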

Related idea.. what if all instructions (not just "invoke") could be
allowed to have an optional "except label ..."?

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Chris Lattner
On Mon, 24 Apr 2006, Archie Cobbs wrote:
> Related idea.. what if all instructions (not just "invoke") could be
> allowed to have an optional "except label ..."?

This is the direction that we plan to go, when someone is interested
enough to implement it.  There are some rough high-level notes about this
idea here:
http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Archie Cobbs
Chris Lattner wrote:
> On Mon, 24 Apr 2006, Archie Cobbs wrote:
>> Related idea.. what if all instructions (not just "invoke") could be
>> allowed to have an optional "except label ..."?
>
> This is the direction that we plan to go, when someone is interested
> enough to implement it.  There are some rough high-level notes about
> this idea here:
> http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt

Those ideas make sense.. one question though:

> Note that this bit is sufficient to represent many possible scenarios.  In
> particular, a Java compiler would mark just about every load, store and other
> exception inducing operation as trapping.  If a load is marked potentially
> trapping, the optimizer is required to preserve it (even if its value is not
> used) unless it can prove that it dynamically cannot trap.  In practice, this
> means that standard LLVM analyses would be used to prove that exceptions
> cannot happen, then clear the bit.  As the bits are cleared, exception handlers
> can be deleted and dead loads (for example) can also be removed.

The idea of the optimizer computing that a trap can't happen is obviously
desirable, but how does the front end tell the optimizer how to figure
that out? I.e., consider this java:

   void foo(SomeClass x) {
     x.field1 = 123;
     x.field2 = 456;      // no NullPointerException possible here
   }

Clearly an exception can happen with the first statement -- iff x is null.
But no exception is possible on the second statement. But how does the
optimizer "know" this without being Java specific? It seems like LLVM
will have to have some built-in notion of a "null pointer" generated
exception. Similarly for divide by zero, e.g.:

   void bar(int x) {
     if (x != 0)
       this.y = 100/x;   // no ArithmeticException possible here
   }

How will the optimizer "know" the exception can't happen?

------

Another random question: can a global variable be considered variable
in one function but constant in another?

Motivation: Java's "first active use" requirement for class initialization.
When invoking a static method, it's possible that a class may need to
be initialized; however, when invoking an instance method, that's not
possible.

Perhaps there should be a way in LLVM to specify predicates (or at least
properties of global variables and parameters) that are known to be true
at the start of each function... ?

-----

In general, I agree with the idea that front-end annotations are fraught
with questions and complexity. But the alternative requires expressing all
that same information explicitly in LLVM, which is what I'm wondering about.

-----

Trying to summarize this thread a bit, here is a list of some of the
issues brought up relating to the goal of "best case" Java support...

  1. Definition and clarification of the memory model.
  2. Need some instructions for atomic operations.
  3. Explicit support for exceptions from instructions other than invoke.
  4. Ensuring there are mechanisms for passing through all appropriate
     optimization-useful information from the front end to LLVM in a
     non-Java-specific way (e.g., see "active use" check above).

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Alkis Evlogimenos
On 4/25/06, Archie Cobbs <[hidden email]> wrote:

> Chris Lattner wrote:
> > On Mon, 24 Apr 2006, Archie Cobbs wrote:
> >> Related idea.. what if all instructions (not just "invoke") could be
> >> allowed to have an optional "except label ..."?
> >
> > This is the direction that we plan to go, when someone is interested
> > enough to implement it.  There are some rough high-level notes about
> > this idea here:
> > http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
>
> Those ideas make sense.. one question though:
>
> > Note that this bit is sufficient to represent many possible scenarios.  In
> > particular, a Java compiler would mark just about every load, store and other
> > exception inducing operation as trapping.  If a load is marked potentially
> > trapping, the optimizer is required to preserve it (even if its value is not
> > used) unless it can prove that it dynamically cannot trap.  In practice, this
> > means that standard LLVM analyses would be used to prove that exceptions
> > cannot happen, then clear the bit.  As the bits are cleared, exception handlers
> > can be deleted and dead loads (for example) can also be removed.
>
> The idea of the optimizer computing that a trap can't happen is obviously
> desirable, but how does the front end tell the optimizer how to figure
> that out? I.e., consider this java:
>
>    void foo(SomeClass x) {
>      x.field1 = 123;
>      x.field2 = 456;      // no nNullPointerException possible here
>    }
>
> Clearly an exception can happen with the first statement -- iff x is null.
> But no exception is possible on the second statement. But how does the
> optimizer "know" this without being Java specific? It seems like LLVM
> will have to have some built-in notion of a "null pointer" generated
> exception. Similarly for divide by zero, e.g.:

I think this is feasible to optimize in LLVM. It would require writing
an optimization pass (one that can be used by any language
implementation). Since x.field1 and x.field2 will involve
getelementptr instructions, we can have some logic in that pass
to prove what you are saying: if x is null, then only the first
memop through a getelementptr on x will trap.
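
A rough sketch of the core of such a pass, written against current LLVM
headers; markCannotTrap() is a hypothetical hook standing in for clearing
the proposed "trapping" bit, which does not exist in the IR today:

   #include "llvm/ADT/SmallPtrSet.h"
   #include "llvm/IR/BasicBlock.h"
   #include "llvm/IR/Instructions.h"

   using namespace llvm;

   // Hypothetical: would clear the proposed "may trap" bit on I.
   static void markCannotTrap(Instruction &I) { (void)I; }

   // Within one basic block, once a load or store through some base pointer
   // has executed, a later access whose address is a getelementptr off the
   // same base cannot be the first faulting access, so its implicit null
   // check is redundant.
   static void simplifyTrapBits(BasicBlock &BB) {
     SmallPtrSet<Value *, 16> DereferencedBases;
     for (Instruction &I : BB) {
       Value *Addr = nullptr;
       if (auto *LI = dyn_cast<LoadInst>(&I))
         Addr = LI->getPointerOperand();
       else if (auto *SI = dyn_cast<StoreInst>(&I))
         Addr = SI->getPointerOperand();
       if (!Addr)
         continue;
       Value *Base = Addr;
       if (auto *GEP = dyn_cast<GetElementPtrInst>(Addr))
         Base = GEP->getPointerOperand();
       if (!DereferencedBases.insert(Base).second)
         markCannotTrap(I);   // base already dereferenced earlier in this block
     }
   }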

>    void bar(int x) {
>      if (x != 0)
>        this.y = 100/x;   // no ArithmeticException possible here
>    }
>
> How will the optimizer "know" the exception can't happen?

This should be pretty straightforward to implement as well (by
writing the proper optimization pass).

> ------
>
> Another random question: can a global variable be considered variable
> in one function but constant in another?

No.

> Motivation: Java's "first active use" requirement for class initialization.
> When invoking a static method, it's possible that a class may need to
> be initialized, However, when invoking an instance method, that's not
> possible.
>
> Perhaps there should be a way in LLVM to specify predicates (or at least
> properties of global variables and parameters) that are known to be true
> at the start of each function... ?

I think this will end up being the same as the null pointer trapping
instruction optimization. The implementation will very likely involve
some pointer to the description of the class. To make this fast this
pointer will be null if the class is not loaded and you trap when you
try to use it and perform initialization. So in the end the same
optimization pass that was used for successive field accesses can be
used for class initialization as well.



--

Alkis


Re: Newbie questions

Archie Cobbs
Alkis Evlogimenos wrote:

> On 4/25/06, Archie Cobbs <[hidden email]> wrote:
>> Motivation: Java's "first active use" requirement for class initialization.
>> When invoking a static method, it's possible that a class may need to
>> be initialized, However, when invoking an instance method, that's not
>> possible.
>>
>> Perhaps there should be a way in LLVM to specify predicates (or at least
>> properties of global variables and parameters) that are known to be true
>> at the start of each function... ?
>
> I think this will end up being the same as the null pointer trapping
> instruction optimization. The implementation will very likely involve
> some pointer to the description of the class. To make this fast this
> pointer will be null if the class is not loaded and you trap when you
> try to use it and perform initialization. So in the end the same
> optimization pass that was used for successive field accesses can be
> used for class initialization as well.

If that were the implementation then yes that could work. But using
a null pointer like this probably wouldn't be the case. In Java you have
to load a class before you initialize it, so the pointer to the type
structure will already be non-null.

In JCVM for example, there is a bit in type->flags that determines
whether the class is initialized or not. This bit has to be checked
before every static method invocation or static field access. You could
reserve an entire byte instead of a bit, but I don't know if that would
make it any easier to do this optimization.
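
The shape of that check, with invented names (JCVM's real identifiers and
flag values differ); this branch is exactly what an LLVM pass would need to
prove redundant and delete:

   // Invented stand-ins for the JCVM type descriptor and its flag word.
   struct jc_type { unsigned flags; /* ... */ };
   enum { JC_TYPE_INITIALIZED = 0x01 };

   void jc_initialize_class(jc_type *type);   // slow path: runs <clinit> once

   // Conceptually emitted before every static method call or static field access.
   static inline void ensure_initialized(jc_type *type) {
     if (!(type->flags & JC_TYPE_INITIALIZED))
       jc_initialize_class(type);
   }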

------

I'm not entirely convinced about (or don't fully understand) how the "no
annotations" approach is supposed to work -- for example, for optimizing
away Java's "active use" checks as discussed above. How specifically does
this optimization get done? Saying that the implementation will "likely"
use a null pointer is not an answer, because what if the implementation
doesn't use a null pointer? I.e., my question is the more general one:
how do optimizations that are specific to the front-end language get
done? How does the front-end "secret knowledge" get passed through
somehow so it can be used for optimization purposes?

Apologies for sounding skeptical; I'm just trying to nail down an
answer to a kind of philosophical question.

------

Another question: does LLVM know about or handle signal frames? What
if code wants to unwind across a signal frame? This is another thing
that would be required for Java if e.g. you wanted to detect null
pointer access via signals. Note setjmp/longjmp works OK across signal
frames.

Thanks,
-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Chris Lattner
In reply to this post by Archie Cobbs
On Mon, 24 Apr 2006, Archie Cobbs wrote:
> The idea of the optimizer computing that a trap can't happen is obviously
> desirable, but how does the front end tell the optimizer how to figure

It depends on the front end.  If you're coming from C, there is no good
way; C can't express these properties well.  However, for this specific
case:

> that out? I.e., consider this java:
>
>  void foo(SomeClass x) {
>    x.field1 = 123;
>    x.field2 = 456;      // no NullPointerException possible here
>  }
>
> Clearly an exception can happen with the first statement -- iff x is null.
> But no exception is possible on the second statement. But how does the
> optimizer "know" this without being Java specific?

This isn't specific to Java.  LLVM pointers can't wrap around the end of
the address space, so if the first access succeeded, the second must also.
LLVM won't guarantee exceptions for bogus pointers, just null pointers, so
it could do this without a problem.

> Another random question: can a global variable be considered variable
> in one function but constant in another?

No.

> Motivation: Java's "first active use" requirement for class initialization.
> When invoking a static method, it's possible that a class may need to
> be initialized, However, when invoking an instance method, that's not
> possible.

You need to modify the isConstant flag on the global after the initializer
has been run.  This requires JIT compilation.  Note that the LLVM
optimizer should already do a reasonable job of optimizing some common
cases of this, as a similar thing happens when initializing C++ static
variables.
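
For comparison, the guarded initialization a C++ compiler emits for a local
static -- conceptually the same "initialized yet?" check Java needs for
first active use, and one the optimizer can sometimes hoist or remove:

   int expensive_compute();        // assumed to be defined elsewhere

   int &cached_value() {
     // The compiler guards this with a hidden "already initialized?" flag
     // that is checked on every call until initialization has run once.
     static int value = expensive_compute();
     return value;
   }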

> Perhaps there should be a way in LLVM to specify predicates (or at least
> properties of global variables and parameters) that are known to be true
> at the start of each function... ?

We don't have something like this currently.  It sounds tricky to get
right.

> Trying to summarize this thread a bit, here is a list of some of the
> issues brought up relating to the goal of "best case" Java support...
>
> 1. Definition and clarification of the memory model.
> 2. Need some instructions for atomic operations.
> 3. Explicit support for exceptions from instructions other than invoke.
> 4. Ensuring there are mechanisms for passing through all appropriate
>    optimization-useful information from the front end to LLVM in a
>    non-Java-specific way (e.g., see "active use" check above).

Yup.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Chris Lattner
In reply to this post by Archie Cobbs
On Tue, 25 Apr 2006, Archie Cobbs wrote:
> Another question: does LLVM know about or handle signal frames? What
> if code wants to unwind across a signal frame? This is another thing
> that would be required for Java if e.g. you wanted to detect null
> pointer access via signals. Note setjmp/longjmp works OK across signal
> frames.

This is up to the unwinding library implementation for the target in
question.  LLVM will support it if your unwinder library supports it.

-Chris

--
http://nondot.org/sabre/
http://llvm.org/


Re: Newbie questions

Alkis Evlogimenos
In reply to this post by Archie Cobbs
On 4/25/06, Archie Cobbs <[hidden email]> wrote:

> Alkis Evlogimenos wrote:
> > On 4/25/06, Archie Cobbs <[hidden email]> wrote:
> >> Motivation: Java's "first active use" requirement for class initialization.
> >> When invoking a static method, it's possible that a class may need to
> >> be initialized, However, when invoking an instance method, that's not
> >> possible.
> >>
> >> Perhaps there should be a way in LLVM to specify predicates (or at least
> >> properties of global variables and parameters) that are known to be true
> >> at the start of each function... ?
> >
> > I think this will end up being the same as the null pointer trapping
> > instruction optimization. The implementation will very likely involve
> > some pointer to the description of the class. To make this fast this
> > pointer will be null if the class is not loaded and you trap when you
> > try to use it and perform initialization. So in the end the same
> > optimization pass that was used for successive field accesses can be
> > used for class initialization as well.
>
> If that were the implementation then yes that could work. But using
> a null pointer like this probably wouldn't be the case. In Java you have
> to load a class before you initialize it, so the pointer to the type
> structure will already be non-null.

That is why I said if you want it to be fast :-). My point was that if
you want this to be fast you need to find a way to make it trap when a
class is not initialized. If you employ the method you mention below
for JCVM then you need to perform optimizations to simplify the
conditionals.

> In JCVM for example, there is a bit in type->flags that determines
> whether the class is initialized or not. This bit has to be checked
> before every static method invocation or static field access. You could
> reserve an entire byte instead of a bit, but I don't know if that would
> make it any easier to do this optimization.


--

Alkis


Re: Newbie questions

Archie Cobbs
Alkis Evlogimenos wrote:

> On 4/25/06, Archie Cobbs <[hidden email]> wrote:
>> Alkis Evlogimenos wrote:
>>> On 4/25/06, Archie Cobbs <[hidden email]> wrote:
>>>> Motivation: Java's "first active use" requirement for class initialization.
>>>> When invoking a static method, it's possible that a class may need to
>>>> be initialized, However, when invoking an instance method, that's not
>>>> possible.
>>>>
>>>> Perhaps there should be a way in LLVM to specify predicates (or at least
>>>> properties of global variables and parameters) that are known to be true
>>>> at the start of each function... ?
>>> I think this will end up being the same as the null pointer trapping
>>> instruction optimization. The implementation will very likely involve
>>> some pointer to the description of the class. To make this fast this
>>> pointer will be null if the class is not loaded and you trap when you
>>> try to use it and perform initialization. So in the end the same
>>> optimization pass that was used for successive field accesses can be
>>> used for class initialization as well.
>> If that were the implementation then yes that could work. But using
>> a null pointer like this probably wouldn't be the case. In Java you have
>> to load a class before you initialize it, so the pointer to the type
>> structure will already be non-null.
>
> That is why I said if you want it to be fast :-). My point was that if
> you want this to be fast you need to find a way to make it trap when a
> class is not initialized. If you employ the method you mention below
> for JCVM then you need to perform optimizations to simplify the
> conditionals.

I get it. My point, however, is larger than just this one example.
You can't say "just use a null pointer" for every possible optimization
based on front-end information. Maybe that happens to work for
active class checks, but it's not a general answer.

Requoting myself:

 > I.e., my question is the more general one:
 > how do optimizations that are specific to the front-end language get
 > done? How does the front-end "secret knowledge" get passed through
 > somehow so it can be used for optimization purposes?

-Archie

__________________________________________________________________________
Archie Cobbs      *        CTO, Awarix        *      http://www.awarix.com


Re: Newbie questions

Reid Spencer
On Wed, 2006-04-26 at 09:01 -0500, Archie Cobbs wrote:

> Requoting myself:
>
>  > I.e., my question is the more general one:
>  > how do optimizations that are specific to the front-end language get
>  > done? How does the front-end "secret knowledge" get passed through
>  > somehow so it can be used for optimization purposes?
>
> -Archie

Archie,

The quick answer is that it doesn't. The front end is responsible for
having its own AST (higher-level representation) and running its own
optimizations on that. From there you generate the LLVM intermediate
representation (IR) and run on it whatever LLVM optimization passes
are appropriate for your language and the level of optimization you want
to achieve. The "secret knowledge" is retained by the language's front
end. However, your front end is in control of two things: what LLVM IR
gets generated, and what passes get run on it. You can create your own
LLVM passes to "clean up" things that you generate (assuming there's a
pattern).

We have tossed around a few ideas about how to retain front-end
information in the bytecode. The current Annotation/Annotable construct
in 1.7 is scheduled for removal in 1.8. There are numerous problems with
it. One option is to just leave it up to the front end. Another option
is to allow a "blob" to be attached to the end of a bytecode file.

On another front, you might be interested in http://hlvm.org/ where a
few interested LLVM developers are thinking about just these kinds of
things and ways to bring high level support to the excellent low level
framework that LLVM provides. Note: this effort has just begun, so don't
expect to find much there for another few weeks.

Reid.
