New JIT APIs


Lang Hames
Hi All,

The attached patch (against r225842) contains some new JIT APIs that I've been working on. I'm going to start breaking it up, tidying it up, and submitting patches to llvm-commits soon, but while I'm working on that I thought I'd put the whole patch out for the curious to start playing around with and/or commenting on.

The aim of these new APIs is to cleanly support a wider range of JIT use cases in LLVM, and to recover some of the functionality lost when the legacy JIT was removed. In particular, I wanted to see if I could re-enable lazy compilation while following MCJIT's design philosophy of relying on the MC layer and module-at-a-time compilation. The attached patch goes some way to addressing these aims, though there's a lot still to do.

The 20,000 ft overview, for those who want to get straight to the code:

The new APIs are not built on top of the MCJIT class, as I didn't want a single class trying to be all things to all people. Instead, the new APIs consist of a set of software components for building JITs. The idea is that you should be able to take these off the shelf and compose them reasonably easily to get the behavior that you want. In the future I hope that people who are working on LLVM-based JITs, if they find this approach useful, will contribute back components that they've built locally and that they think would be useful for a wider audience.

As a demonstration of the practicality of this approach the attached patch contains a class, MCJITReplacement, that composes some of the components to re-create the behavior of MCJIT. This works well enough to pass all MCJIT regression and unit tests on Darwin, and all but four regression tests on Linux.

The patch also contains the desired "new" feature: Function-at-a-time lazy jitting in roughly the style of the legacy JIT. The attached lazydemo.tgz file contains a program which composes the new JIT components (including the lazy-jitting component) to lazily execute bitcode. I've tested this program on Darwin and it can run non-trivial benchmark programs, e.g. 401.bzip2 from SPEC2006.

These new APIs are named after the motivating feature: On Request Compilation, or ORC. I believe the logo potential is outstanding. I'm picturing an Orc riding a Dragon. If I'm honest this was at least 45% of my motivation for doing this project*.

You'll find the new headers in llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the implementation files in lib/ExecutionEngine/OrcJIT/*.

I imagine there will be a number of questions about the design and implementation. I've tried to preempt a few below, but please fire away with anything I've left out.

Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie, Pete Cooper, Eric Christopher, and Louis Gerbarg for taking time out to review, discuss and test this thing as I've worked on it.

Cheers,
Lang.

Possible questions:

(1)
Q. Are you trying to kill off MCJIT?
A. There are no plans to remove MCJIT. The new APIs are designed to live alongside it.

(2)
Q. What do "JIT components" look like, and how do you compose them?
A. The classes and functions you'll find in OrcJIT/*.h fall into two rough categories: Layers and Utilities. Layers are classes that implement a small common interface that makes them easy to compose:

class SomeLayer {
private:
  // Implementation details
public:
  // Implementation details

  typedef ??? Handle;

  template <typename ModuleSet>
  Handle addModuleSet(ModuleSet&& Ms);

  void removeModuleSet(Handle H);

  uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);

  uint64_t lookupSymbolAddressIn(Handle H, StringRef Name, bool ExportedSymbolsOnly);
};

Layers are usually designed to sit one on top of another, with each doing some sort of useful work before handing off to the layer below it. The layers that are currently included in the patch are the CompileOnDemandLayer, which breaks up modules and redirects calls to not-yet-compiled functions back into the JIT; the LazyEmitLayer, which defers adding modules to the layer below until a symbol in the module is actually requested; the IRCompilingLayer, which compiles bitcode to objects; and the ObjectLinkingLayer, which links sets of objects in memory using RuntimeDyld.

Utilities are everything that's not a layer. Ideally the heavy lifting is done by the utilities. Layers just wrap certain use cases to make them easy to compose.

Clients are free to use utilities directly, or compose layers, or implement new utilities or layers.
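
To make the stacking idea concrete, here is a minimal sketch of two toy layers composed in the way described above. It does not use the classes from the patch: ToyModule, ToyLinkingLayer and ToyLazyEmitLayer are hypothetical stand-ins, a "module" is just a list of symbol names, and addresses are made-up numbers. The lazy layer also omits removeModuleSet and lookupSymbolAddressIn for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module: just the names of the symbols it defines.
using ToyModule = std::vector<std::string>;

// Bottom layer: "links" a module set by giving each symbol a fake address.
class ToyLinkingLayer {
public:
  using Handle = unsigned;

  template <typename ModuleSet>
  Handle addModuleSet(ModuleSet &&Ms) {
    Handle H = NextHandle++;
    for (const ToyModule &M : Ms)
      for (const std::string &Name : M) {
        NextAddress += 16;
        Sets[H][Name] = NextAddress;
      }
    return H;
  }

  void removeModuleSet(Handle H) { Sets.erase(H); }

  // Returns 0 for unknown symbols.
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets) {
      auto I = KV.second.find(Name);
      if (I != KV.second.end())
        return I->second;
    }
    return 0;
  }

private:
  Handle NextHandle = 1;
  std::uint64_t NextAddress = 0x1000;
  std::map<Handle, std::map<std::string, std::uint64_t>> Sets;
};

// Upper layer: holds module sets back until one of their symbols is requested.
template <typename BaseLayerT>
class ToyLazyEmitLayer {
public:
  using Handle = unsigned;

  explicit ToyLazyEmitLayer(BaseLayerT &B) : Base(B) {}

  Handle addModuleSet(std::vector<ToyModule> Ms) {
    Handle H = NextHandle++;
    Pending[H] = std::move(Ms);
    return H;
  }

  std::uint64_t getSymbolAddress(const std::string &Name) {
    for (auto I = Pending.begin(); I != Pending.end(); ++I) {
      if (defines(I->second, Name)) {
        Base.addModuleSet(std::move(I->second)); // emit on first request
        Pending.erase(I);
        break;
      }
    }
    return Base.getSymbolAddress(Name);
  }

  std::size_t pendingCount() const { return Pending.size(); }

private:
  static bool defines(const std::vector<ToyModule> &Ms,
                      const std::string &Name) {
    for (const ToyModule &M : Ms)
      for (const std::string &S : M)
        if (S == Name)
          return true;
    return false;
  }

  BaseLayerT &Base;
  Handle NextHandle = 1;
  std::map<Handle, std::vector<ToyModule>> Pending;
};
```

Because the upper layer is templated on the layer below it, swapping in a different base layer needs no change to the lazy layer itself, which is the kind of composition the common interface is meant to allow.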

(3)
Q. Why "addModuleSet" rather than "addModule"?
A. Allowing multiple modules to be passed around together allows layers lower in the stack to perform interesting optimizations. E.g. direct calls between objects that are allocated sufficiently close in memory. To add a single Module you just add a single-element set.
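
As an illustration of the single-element-set idiom, a convenience wrapper might look roughly like the sketch below. The free function addModule and the CountingLayer used to exercise it are hypothetical and are not taken from the patch; the only assumption about a real layer is that it exposes a Handle type and an addModuleSet member as in the interface above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: wraps one module in a single-element set and
// forwards it to any layer that provides Handle and addModuleSet.
template <typename LayerT, typename ModuleT>
typename LayerT::Handle addModule(LayerT &Layer, ModuleT M) {
  std::vector<ModuleT> Ms;
  Ms.push_back(std::move(M)); // move so move-only module types also work
  return Layer.addModuleSet(std::move(Ms));
}

// Toy layer used only to show the helper in action.
struct CountingLayer {
  using Handle = unsigned;

  template <typename ModuleSet>
  Handle addModuleSet(ModuleSet &&Ms) {
    LastSetSize = Ms.size();
    return ++NextHandle;
  }

  std::size_t LastSetSize = 0;
  Handle NextHandle = 0;
};
```

A call such as addModule(Layer, std::string("my module")) then reaches the layer as a set of size one, so the layer interface itself does not need a separate single-module entry point.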

(4)
Q. What happened to "finalize"?
A. In the Orc APIs, getSymbolAddress automatically finalizes as necessary before returning addresses to the client. When you get an address back from getSymbolAddress, that address is ready to call.
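
The behavior described here can be pictured with the following toy sketch, in which finalization happens at most once, on the first lookup. ToyFinalizingLayer, addSymbol and finalizeCount are made-up names for illustration, and "finalizing" is reduced to flipping a flag; the real layers do the equivalent work through RuntimeDyld.

```cpp
#include <cstdint>
#include <map>
#include <string>

class ToyFinalizingLayer {
public:
  // Registers a symbol whose memory has not been finalized yet.
  void addSymbol(const std::string &Name, std::uint64_t Addr) {
    Symbols[Name] = Entry{Addr, false};
  }

  // Finalizes on demand, so any non-zero address returned is ready to call.
  std::uint64_t getSymbolAddress(const std::string &Name) {
    auto I = Symbols.find(Name);
    if (I == Symbols.end())
      return 0;
    if (!I->second.Finalized) {
      // A real layer would apply relocations and set memory permissions here.
      I->second.Finalized = true;
      ++FinalizeCount;
    }
    return I->second.Addr;
  }

  unsigned finalizeCount() const { return FinalizeCount; }

private:
  struct Entry {
    std::uint64_t Addr;
    bool Finalized;
  };
  std::map<std::string, Entry> Symbols;
  unsigned FinalizeCount = 0;
};
```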

(5)
Q. What does "removeModuleSet" do?
A. It removes the modules represented by the handle from the JIT. The meaning of this is specific to each layer, but generally speaking it means that any memory allocated for those modules (and their corresponding Objects, linked sections, etc) has been freed, and the symbols those modules provided are now undefined. Calling getSymbolAddress for a symbol that was defined in a module that has been removed is expected to return '0'.

(5a)
Q. How are the linked sections freed? RTDyldMemoryManager doesn't have any "free.*Section" methods.
A. Each ModuleSet gets its own RTDyldMemoryManager, and that is destroyed when the module set is freed. The choice of RTDyldMemoryManager is up to the client, but the standard memory managers will free the memory allocated for the linked sections when they're destroyed.
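
The ownership scheme in (5) and (5a) can be sketched as follows. ToyMemoryManager stands in for a client-chosen RTDyldMemoryManager and ToyObjectLayer for a linking layer; both are hypothetical, and each symbol simply gets a 64-byte "section". The point is only that erasing a handle destroys that set's memory manager, which releases its sections and leaves its symbols undefined.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy memory manager: owns its sections and releases them on destruction.
class ToyMemoryManager {
public:
  explicit ToyMemoryManager(std::size_t &Live) : LiveBytes(Live) {}
  ~ToyMemoryManager() { LiveBytes -= Allocated; }

  std::uint64_t allocateSection(std::size_t Size) {
    Sections.emplace_back(Size);
    Allocated += Size;
    LiveBytes += Size;
    return reinterpret_cast<std::uintptr_t>(Sections.back().data());
  }

private:
  std::size_t &LiveBytes; // shared counter, for demonstration only
  std::size_t Allocated = 0;
  std::vector<std::vector<unsigned char>> Sections;
};

class ToyObjectLayer {
public:
  using Handle = unsigned;

  // Each module set gets its own memory manager.
  Handle addModuleSet(const std::vector<std::string> &SymbolNames) {
    Handle H = NextHandle++;
    Set &S = Sets[H];
    S.MemMgr.reset(new ToyMemoryManager(LiveBytes));
    for (const std::string &Name : SymbolNames)
      S.Symbols[Name] = S.MemMgr->allocateSection(64);
    return H;
  }

  // Destroying the set destroys its memory manager and frees its sections.
  void removeModuleSet(Handle H) { Sets.erase(H); }

  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets) {
      auto I = KV.second.Symbols.find(Name);
      if (I != KV.second.Symbols.end())
        return I->second;
    }
    return 0; // undefined, including symbols from removed sets
  }

  std::size_t liveBytes() const { return LiveBytes; }

private:
  struct Set {
    std::unique_ptr<ToyMemoryManager> MemMgr;
    std::map<std::string, std::uint64_t> Symbols;
  };
  std::size_t LiveBytes = 0;
  Handle NextHandle = 1;
  std::map<Handle, Set> Sets;
};
```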

(6)
Q. How does the CompileOnDemand layer redirect calls to the JIT?
A. It currently uses double-indirection: Function bodies are extracted into new modules, and the body of the original function is replaced with an indirect call to the extracted body. The pointer for the indirect call is initialized by the JIT to point at some inline assembly which is injected into the module, and this calls back in to the JIT to trigger compilation of the extracted body. In the future I plan to make the redirection strategy a parameter of the CompileOnDemand layer. Double-indirection is the safest: It preserves function-pointer equality and works with non-writable executable memory. However, there's no reason we couldn't use single indirection (for extra speed where pointer-equality isn't required), or patchpoints (for clients who can allocate writable/executable memory), or any combination of the three. My intent is that this should be up to the client.

Note that the CompileOnDemand layer doesn't handle lazy compilation itself, just lazy symbol resolution (i.e. symbols are resolved on first call, not when compiling). If you've put the CompileOnDemand layer on top of the LazyEmitLayer then deferring symbol lookup automatically defers compilation. (E.g. you can remove the LazyEmitLayer in main.cpp of the lazydemo and you'll get indirection and callbacks, but no lazy compilation.)
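
The double-indirection scheme can be imitated in plain C++ with a function pointer, as in the sketch below. This is only an analogy: in the patch the callback is injected inline assembly that re-enters the JIT, whereas here squareCompileCallback is an ordinary function and "compiling" just bumps a counter. All names (square, squareBody, SquarePtr, CompileCount) are hypothetical.

```cpp
using FnPtr = int (*)(int);

// Stands in for the function body that was extracted into its own module.
int squareBody(int X) { return X * X; }

int CompileCount = 0; // how many times the body was "compiled"

int squareCompileCallback(int X);

// Indirection slot: initially points at the compile callback.
FnPtr SquarePtr = &squareCompileCallback;

// Replacement for the original function: an indirect call through the slot.
// Its own address never changes, so function-pointer equality is preserved.
int square(int X) { return SquarePtr(X); }

// First call lands here: "compile" the body, update the slot (plain data,
// so no executable memory is rewritten), then run the real body.
int squareCompileCallback(int X) {
  ++CompileCount;
  SquarePtr = &squareBody;
  return SquarePtr(X);
}
```

Callers only ever hold the address of square; after the first call the extra cost is one indirect jump through SquarePtr, which is the overhead that single indirection or patchpoints would trade away.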

(7)
Q. Do the new APIs support cross-target JITing like MCJIT does?
A. Yes.

(7.a)
Q. Do the new APIs support cross-target (or cross process) lazy-jitting?
A. Not yet, but all that is required is for us to add a small amount of runtime to the JIT'd process to call back in to the JIT via some RPC mechanism. There are no significant barriers to implementing this that I'm aware of.

(8)
Q. Do any of the components implement the ExecutionEngine interface?
A. None of the components do, but the MCJITReplacement class does.

(9)
Q. Does this address any of the long-standing issues with MCJIT - Stackmap parsing? Debugging? Thread-local-storage?
A. No, but it doesn't get in the way either. These features are still on the road-map (such as it exists) and I'm hoping that the modular nature of Orc will allow us to play around with new features like this without any risk of disturbing existing clients, and so allow us to make faster progress.

(10)
Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
A. I'm still tidying the patch up - please save patch-specific feedback for llvm-commits, otherwise we'll get cross-talk between the threads. The patches should be coming soon.

---

As mentioned above, I'm happy to answer further general questions about what these APIs can do, or where I see them going. Feedback on the patch itself should be directed to the llvm-commits list when I start posting patches there for discussion.


* Marketing slogans abound: "Very MachO". "Some warts". "Surprisingly friendly with ELF". "Not yet on speaking terms with DWARF".

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

lazydemo.tgz (3K) Download Attachment
Orc_2015_01_13.patch (177K) Download Attachment

Re: New JIT APIs

Philip Reames-4
On 01/14/2015 12:05 AM, Lang Hames wrote:
> The aim of these new APIs is to cleanly support a wider range of JIT use cases in LLVM, and to recover some of the functionality lost when the legacy JIT was removed. In particular, I wanted to see if I could re-enable lazy compilation while following MCJIT's design philosophy of relying on the MC layer and module-at-a-time compilation. The attached patch goes some way to addressing these aims, though there's a lot still to do.

In terms of the overall idea, I like what you're proposing. However, I want to be very clear: you are not planning on removing any functionality from the existing (fairly low level) MCJIT interface, right? We've built our own infrastructure around that and require a few features it doesn't sound like you're planning on supporting in the new abstractions. (The biggest one is that we "install" code into a different location from where it was compiled.)

I really like the idea of having a low level JIT interface for advanced users and an easy starting point for folks getting started.


> (3)
> Q. Why "addModuleSet" rather than "addModule"?
> A. Allowing multiple modules to be passed around together allows layers lower in the stack to perform interesting optimizations. E.g. direct calls between objects that are allocated sufficiently close in memory. To add a single Module you just add a single-element set.

Please add a utility function for a single Module if you haven't already. For a method based JIT use case, multiple Modules just aren't that useful.

> (4)
> Q. What happened to "finalize"?
> A. In the Orc APIs, getSymbolAddress automatically finalizes as necessary before returning addresses to the client. When you get an address back from getSymbolAddress, that address is ready to call.

As long as this is true for the high level API and *not* the low level one (as is true today), this seems fine. I don't really like the finalize mechanism we have, but we do need a mechanism to get at the code before relocations have been applied.


Re: New JIT APIs

Lang Hames
Hi Philip,

> In terms of the overall idea, I like what you're proposing. However, I want to be very clear: you are not planning on removing any functionality from the existing (fairly low level) MCJIT interface, right?

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.
 
> We've built our own infrastructure around that and require a few features it doesn't sound like you're planning on supporting in the new abstractions. (The biggest one is that we "install" code into a different location from where it was compiled.)

Can you clarify what you mean by "install" here? As it stands in the patch, Orc already supports cross-target and out-of-process JITing. The ObjectLinkingLayer exposes the mapSectionAddress call for mapping sections to new locations, and the MCJITReplacement demo passes all remote-jitting regression tests on Darwin.

The intent is that Orc should eventually provide a superset of the functionality provided by MCJIT, but with the various features broken out into separate components. I'd be interested to hear about anything that's missing so that I can, if possible, add support for it.
 
> I really like the idea of having a low level JIT interface for advanced users and an easy starting point for folks getting started.

Ideally I would like Orc to cover the whole spectrum. I'm hoping we can quickly advance to the point where new JIT developers would use Orc by default, rather than MCJIT, and not miss any features. I expect advanced users will want to compose their stacks directly, while beginners would take some common configuration off the shelf. Once these patches are in tree I'm planning to add a basic lazy-jitting stack for Kaleidoscope that beginners could use as a starting point.

>> (3)
>> Q. Why "addModuleSet" rather than "addModule"?
>> A. Allowing multiple modules to be passed around together allows layers lower in the stack to perform interesting optimizations. E.g. direct calls between objects that are allocated sufficiently close in memory. To add a single Module you just add a single-element set.
>
> Please add a utility function for a single Module if you haven't already. For a method based JIT use case, multiple Modules just aren't that useful.

Sure. This problem should be tackled in the wrappers around the components, rather than the components themselves. See MCJITReplacement::addModule for an example. 
 
>> (4)
>> Q. What happened to "finalize"?
>> A. In the Orc APIs, getSymbolAddress automatically finalizes as necessary before returning addresses to the client. When you get an address back from getSymbolAddress, that address is ready to call.
>
> As long as this is true for the high level API and *not* the low level one (as is true today), this seems fine. I don't really like the finalize mechanism we have, but we do need a mechanism to get at the code before relocations have been applied.

You should still be able to intercept events to access memory before finalization. If you can be more specific about your use-case I'd be keen to figure out if/how it could be supported in Orc.

- Lang.


Re: New JIT APIs

Philip Reames-4
On 01/14/2015 11:49 AM, Lang Hames wrote:

 
>> We've built our own infrastructure around that and require a few features it doesn't sound like you're planning on supporting in the new abstractions. (The biggest one is that we "install" code into a different location from where it was compiled.)
>
> Can you clarify what you mean by "install" here? As it stands in the patch, Orc already supports cross-target and out-of-process JITing. The ObjectLinkingLayer exposes the mapSectionAddress call for mapping sections to new locations, and the MCJITReplacement demo passes all remote-jitting regression tests on Darwin.

I don't have the code in front of me right now, but I'll take a look later today and try to summarize clearly.

Re: New JIT APIs

David Blaikie
In reply to this post by Lang Hames


On Wed, Jan 14, 2015 at 11:49 AM, Lang Hames <[hidden email]> wrote:
> Hi Philip,
>
>> In terms of the overall idea, I like what you're proposing. However, I want to be very clear: you are not planning on removing any functionality from the existing (fairly low level) MCJIT interface, right?
>
> To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 
 

Re: New JIT APIs

Lang Hames
Hi Dave,

>> To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.
>
> Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.
>
> You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

The bar for transitioning is higher now, since MCJIT has more substantial clients than the legacy JIT had. The impetus for transitioning is also lower: The legacy JIT required a lot of custom infrastructure to be kept around. MCJIT is much more lightweight, and shares most of its foundation (RuntimeDyld) with Orc.

If MCJITReplacement reaches full feature and performance parity with MCJIT (which I do actually want to see), and the transition can be done either transparently (by having ExecutionEngineBuilder return an MCJITReplacement instead of an MCJIT), or in a manual way that all clients are happy to buy into, then I'd be ok with deprecating and eventually removing MCJIT. That's a discussion for the future though.

So clients should rest easy: We just went through a difficult transition from the legacy JIT, and I don't want to put you through that again any time soon.

- Lang.


Re: New JIT APIs

David Blaikie


On Wed, Jan 14, 2015 at 2:22 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

Not necessarily - it doesn't seem unreasonable to make a plan to ensure we don't end up with duplicate functionality to debug/test/fix indefinitely before adding the duplicate. Seems to be common in the project to make replacements, introduce them as an alternative but with an explicit goal/plan from the start that this is not a perpetual state. (for example, Chandler's pass manager work and I think most of the bits that Chandler's rewritten (shuffling, inlining, etc) were this way - maybe there are counterexamples where similar/duplicate functionality was introduced without such a goal, but none come to my mind)

But I dunno, maybe other people find that to be an OK state of affairs, I'm not a code owner/authority in much/any of this.

- David
 

Re: New JIT APIs

Philip Reames-4

On 01/14/2015 02:33 PM, David Blaikie wrote:


On Wed, Jan 14, 2015 at 2:22 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

Not necessarily - it doesn't seem unreasonable to make a plan to ensure we don't end up with duplicate functionality to debug/test/fix indefinitely before adding the duplicate. Seems to be common in the project to make replacements, introduce them as an alternative but with an explicit goal/plan from the start that this is not a perpetual state. (for example, Chandler's pass manager work and I think most of the bits that Chandler's rewritten (shuffling, inlining, etc) were this way - maybe there are counterexamples where similar/duplicate functionality was introduced without such a goal, but none come to my mind)

But I dunno, maybe other people find that to be an OK state of affairs, I'm not a code owner/authority in much/any of this.

As a user of the JIT, I am *very* strongly in favour of Lang's espoused position.

p.s. I don't think we know what the "right" interface is for the JIT yet.  Until we do, having multiple interfaces (with a single shared implementation, based on the rest of LLVM) seems entirely reasonable and appropriate. 

p.p.s. If Lang was proposing the replacement of MCJIT - he's not! - the review barrier would be far far higher.  He'd have to satisfy all existing - well, all vocal - users of the old interface that his new one met their needs.  This would be a much slower process and we want to let things evolve more quickly than that.  We *don't* want to be looking at an old-JIT retirement v2.  That took literally years and blocked a lot of useful work on the JIT infrastructure. 



Re: New JIT APIs

David Blaikie
In reply to this post by David Blaikie


On Wed, Jan 14, 2015 at 2:33 PM, David Blaikie <[hidden email]> wrote:


On Wed, Jan 14, 2015 at 2:22 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

Not necessarily - it doesn't seem unreasonable to make a plan to ensure we don't end up with duplicate functionality to debug/test/fix indefinitely before adding the duplicate. Seems to be common in the project to make replacements, introduce them as an alternative but with an explicit goal/plan from the start that this is not a perpetual state. (for example, Chandler's pass manager work and I think most of the bits that Chandler's rewritten (shuffling, inlining, etc) were this way - maybe there are counterexamples where similar/duplicate functionality was introduced without such a goal, but none come to my mind)

Well, I suppose we've had a couple of register allocators banging around for a while now (:
 

But I dunno, maybe other people find that to be an OK state of affairs, I'm not a code owner/authority in much/any of this.

- David
 

Re: New JIT APIs

David Blaikie
In reply to this post by Philip Reames-4


On Wed, Jan 14, 2015 at 2:46 PM, Philip Reames <[hidden email]> wrote:

On 01/14/2015 02:33 PM, David Blaikie wrote:


On Wed, Jan 14, 2015 at 2:22 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

Not necessarily - it doesn't seem unreasonable to make a plan to ensure we don't end up with duplicate functionality to debug/test/fix indefinitely before adding the duplicate. Seems to be common in the project to make replacements, introduce them as an alternative but with an explicit goal/plan from the start that this is not a perpetual state. (for example, Chandler's pass manager work and I think most of the bits that Chandler's rewritten (shuffling, inlining, etc) were this way - maybe there are counterexamples where similar/duplicate functionality was introduced without such a goal, but none come to my mind)

But I dunno, maybe other people find that to be an OK state of affairs, I'm not a code owner/authority in much/any of this.
As a user of the JIT, I am *very* strongly in favour of Lang's espoused position. 

p.s. I don't think we know what the "right" interface is for the JIT yet.  Until we do, having multiple interfaces (with a single shared implementation, based on the rest of LLVM) seems entirely reasonable and appropriate. 

p.p.s. If Lang was proposing the replacement of MCJIT - he's not! - the review barrier would be far far higher.  He'd have to satisfy all existing - well, all vocal - users of the old interface that his new one met their needs.

Not necessarily - It could simply be the stated plan (& he has stated it) to reach feature parity. At that point it seems it'd be hard to justify keeping both around when one has a superset of features of the other.
 
  This would be a much slower process and we want to let things evolve more quickly than that.  We *don't* want to be looking at an old-JIT retirement v2.  That took literally years and blocked a lot of useful work on the JIT infrastructure. 

Not sure I follow quite why the old JIT retirement was a blocker, but introducing a new JIT API with the intention to maintain both would block less work. Could you describe this in more detail?

- David
 



Re: New JIT APIs

Lang Hames
In reply to this post by Philip Reames-4

> p.s. I don't think we know what the "right" interface is for the JIT yet.

Orc was actually motivated in part by this. Having composable components makes it much easier for clients to experiment with different JIT API designs, as well as new features. Hopefully any helpful lessons learned can be fed back into the tree.

Again - keeping the existing MCJIT API is a fundamental requirement for the foreseeable future*.

- Lang.

* The caveat, which I mentioned in a previous message, is that I'm happy to discuss replacing the *implementation* of MCJIT with MCJITReplacement if/when clients are satisfied that the latter provides full feature and performance parity and compatible behavior. Such a discussion is premature though.


Re: New JIT APIs

Lang Hames
In reply to this post by David Blaikie
Hi Dave,

Not necessarily - It could simply be the stated plan (& he has stated it) to reach feature parity. At that point it seems it'd be hard to justify keeping both around when one has a superset of features of the other.
 

It's worth noting the distinction between API/feature replacement (e.g. the removal of the old JIT, which was seriously disruptive) and replacement of MCJIT's implementation, which should be a no-op.

I have thought far enough ahead to imagine replacing the implementation of MCJIT with MCJITReplacement. I just wanted to emphatically re-assure people that I'm not going to break anything by replacing MCJIT's implementation hastily, or without consultation.

As for API changes though, I can't imagine LLVM without the MCJIT API any time in the near future. There are big clients who are happy with this API. It has some warts, mostly to do with it deriving from ExecutionEngine, but it basically makes sense given MCJIT's purpose.  

If, in the distant future, all clients have moved onto some new Orc-based API then we could consider discarding the MCJIT API, but it wouldn't buy us much if the implementation has already been moved over to MCJITReplacement. Any new Orc-based API that supports laziness is likely to use a superset of the components required to compose MCJITReplacement, so the only thing you'd save yourself is a page or two of glue code.

I'd be inclined to defer any further discussion of MCJIT succession for now. We'll definitely follow best practices: We won't leave two redundant JITs in tree, but we also won't consider removing MCJIT until it is actually redundant, and that's not on the horizon.

- Lang.


Re: New JIT APIs

Philip Reames-4
In reply to this post by David Blaikie

On 01/14/2015 02:56 PM, David Blaikie wrote:


On Wed, Jan 14, 2015 at 2:46 PM, Philip Reames <[hidden email]> wrote:

On 01/14/2015 02:33 PM, David Blaikie wrote:


On Wed, Jan 14, 2015 at 2:22 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

To confirm - I have no plans to remove MCJIT. I don't want to change any behavior for existing clients. The new stuff is opt-in.

Why not? We did work to remove the legacy JIT in favor of MCJIT for the usual reasons (less code/maintenance burden/etc) - it'd seem unfortunate to then go back to maintaining two JITs again.

You mention the intent to provide a superset of MCJIT's behavior, at which point it seems it'd be preferable to kill off MCJIT in favor of ORC (heck, we killed off the legacy JIT before MCJIT had feature parity).
 

Not having plans at the moment doesn't preclude making plans in the future, it's just premature to think about replacing MCJIT when the "replacement" hasn't even been submitted to llvm-commits yet. :)

Not necessarily - it doesn't seem unreasonable to make a plan to ensure we don't end up with duplicate functionality to debug/test/fix indefinitely before adding the duplicate. Seems to be common in the project to make replacements, introduce them as an alternative but with an explicit goal/plan from the start that this is not a perpetual state. (for example, Chandler's pass manager work and I think most of the bits that Chandler's rewritten (shuffling, inlining, etc) were this way - maybe there are counterexamples where similar/duplicate functionality was introduced without such a goal, but none come to my mind)

But I dunno, maybe other people find that to be an OK state of affairs, I'm not a code owner/authority in much/any of this.
As a user of the JIT, I am *very* strongly in favour of Lang's espoused position. 

p.s. I don't think we know what the "right" interface is for the JIT yet.  Until we do, having multiple interfaces (with a single shared implementation, based on the rest of LLVM) seems entirely reasonable and appropriate. 

p.p.s. If Lang was proposing the replacement of MCJIT - he's not! - the review barrier would be far far higher.  He'd have to satisfy all existing - well, all vocal - users of the old interface that his new one met their needs.

Not necessarily - It could simply be the stated plan (& he has stated it) to reach feature parity. At that point it seems it'd be hard to justify keeping both around when one has a superset of features of the other.
 
  This would be a much slower process and we want to let things evolve more quickly than that.  We *don't* want to be looking at an old-JIT retirement v2.  That took literally years and blocked a lot of useful work on the JIT infrastructure. 

Not sure I follow quite why the old JIT retirement was a blocker, but introducing a new JIT API with the intention to maintain both would block less work. Could you describe this in more detail?

I think Lang's response covered all the relevant points and we're far off topic at this point.  Let me know if there was something you think Lang's comments didn't address that you'd like me to.


Re: New JIT APIs

David Blaikie
In reply to this post by Lang Hames


On Wed, Jan 14, 2015 at 3:30 PM, Lang Hames <[hidden email]> wrote:
Hi Dave,

Not necessarily - It could simply be the stated plan (& he has stated it) to reach feature parity. At that point it seems it'd be hard to justify keeping both around when one has a superset of features of the other.
 

It's worth noting the distinction between API/feature replacement (e.g. the removal of the old JIT, which was seriously disruptive) and replacement of MCJIT's implementation, which should be a no-op.

I have thought far enough ahead to imagine replacing the implementation of MCJIT with MCJITReplacement. I just wanted to emphatically re-assure people that I'm not going to break anything by replacing MCJIT's implementation hastily, or without consultation.

OK - that sounds great/reasonable/etc.
 
As for API changes though, I can't imagine LLVM without the MCJIT API any time in the near future. There are big clients who are happy with this API. It has some warts, mostly to do with it deriving from ExecutionEngine, but it basically makes sense given MCJIT's purpose.  

If, in the distant future, all clients have moved onto some new Orc-based API then we could consider discarding the MCJIT API, but it wouldn't buy us much if the implementation has already been moved over to MCJITReplacement.

Yeah - providing a convenience utility based on ORC for some common (even just historic) use case, hopefully it's cheap enough (& if it isn't, we can improve ORC until it is /really/ easy to write that wrapper & comes as close to zero cost to keep as possible) that there's no cost to keeping it around.
 
Any new Orc-based API that supports laziness is likely to use a superset of the components required to compose MCJITReplacement, so the only thing you'd save yourself is a page or two of glue code.

I'd be inclined to defer any further discussion of MCJIT succession for now. We'll definitely follow best practices: We won't leave two redundant JITs in tree, but we also won't consider removing MCJIT until it is actually redundant, and that's not on the horizon.

- Lang.



Re: New JIT APIs

Armin Steinhoff-2
In reply to this post by Lang Hames

Hi Lang,

We are using the JIT API of TCC and the MCJIT API in order to import external code into a running control application process.

The MCJIT API can only be used once to JIT compile external sources to executable code in the address space of a running process.

Does your JIT API have the same restriction? It would be very nice if your JIT API could provide functionality similar to that provided by TCC.

Best Regards

Armin


Lang Hames wrote:
Hi All,

The attached patch (against r225842) contains some new JIT APIs that I've been working on. I'm going to start breaking it up, tidying it up, and submitting patches to llvm-commits soon, but while I'm working on that I thought I'd put the whole patch out for the curious to start playing around with and/or commenting on.

The aim of these new APIs is to cleanly support a wider range of JIT use cases in LLVM, and to recover some of the functionality lost when the legacy JIT was removed. In particular, I wanted to see if I could re-enable lazy compilation while following MCJIT's design philosophy of relying on the MC layer and module-at-a-time compilation. The attached patch goes some way to addressing these aims, though there's a lot still to do.

The 20,000 ft overview, for those who want to get straight to the code:

The new APIs are not built on top of the MCJIT class, as I didn't want a single class trying to be all things to all people. Instead, the new APIs consist of a set of software components for building JITs. The idea is that you should be able to take these off the shelf and compose them reasonably easily to get the behavior that you want. In the future I hope that people who are working on LLVM-based JITs, if they find this approach useful, will contribute back components that they've built locally and that they think would be useful for a wider audience. As a demonstration of the practicality of this approach the attached patch contains a class, MCJITReplacement, that composes some of the components to re-create the behavior of MCJIT. This works well enough to pass all MCJIT regression and unit tests on Darwin, and all but four regression tests on Linux. The patch also contains the desired "new" feature: Function-at-a-time lazy jitting in roughly the style of the legacy JIT. The attached lazydemo.tgz file contains a program which composes the new JIT components (including the lazy-jitting component) to lazily execute bitcode. I've tested this program on Darwin and it can run non-trivial benchmark programs, e.g. 401.bzip2 from SPEC2006.

These new APIs are named after the motivating feature: On Request Compilation, or ORC. I believe the logo potential is outstanding. I'm picturing an Orc riding a Dragon. If I'm honest this was at least 45% of my motivation for doing this project*.

You'll find the new headers in llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the implementation files in lib/ExecutionEngine/OrcJIT/*.

I imagine there will be a number of questions about the design and implementation. I've tried to preempt a few below, but please fire away with anything I've left out.

Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie, Pete Cooper, Eric Christopher, and Louis Gerbarg for taking time out to review, discuss and test this thing as I've worked on it.

Cheers,
Lang.

Possible questions:

(1)
Q. Are you trying to kill off MCJIT?
A. There are no plans to remove MCJIT. The new APIs are designed to live alongside it.

(2)
Q. What do "JIT components" look like, and how do you compose them?
A. The classes and functions you'll find in OrcJIT/*.h fall into two rough categories: Layers and Utilities. Layers are classes that implement a small common interface that makes them easy to compose:

class SomeLayer {
private:
  // Implementation details
public:
  // Implementation details

  typedef ??? Handle;

  template <typename ModuleSet>
  Handle addModuleSet(ModuleSet&& Ms);

  void removeModuleSet(Handle H);

  uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);

  uint64_t lookupSymbolAddressIn(Handle H, StringRef Name, bool ExportedSymbolsOnly);
};

Layers are usually designed to sit one-on-top-of-another, with each doing some sort of useful work before handing off to the layer below it. The layers that are currently included in the patch are the CompileOnDemandLayer, which breaks up modules and redirects calls to not-yet-compiled functions back into the JIT; the LazyEmitLayer, which defers adding modules to the layer below until a symbol in the module is actually requested; the IRCompilingLayer, which compiles bitcode to objects; and the ObjectLinkingLayer, which links sets of objects in memory using RuntimeDyld.

Utilities are everything that's not a layer. Ideally the heavy lifting is done by the utilities. Layers just wrap certain use-cases to make them easy to compose.

Clients are free to use utilities directly, or compose layers, or implement new utilities or layers.
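
To make the layering idea concrete, here is a minimal sketch of two layers stacked on top of each other. It is a toy, not code from the patch: ToyModule, ToyBaseLayer and ToyCountingLayer are hypothetical names, a "module" is just a name-to-address table, and the ExportedSymbolsOnly parameter is left out for brevity. The only thing it tries to show is the shape of the common interface and how an upper layer does a little work before handing off to the layer below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module: a symbol name -> address table.
using ToyModule = std::map<std::string, std::uint64_t>;

// Bottom layer: owns the module sets and answers symbol queries.
class ToyBaseLayer {
public:
  using Handle = unsigned;

  Handle addModuleSet(std::vector<ToyModule> Ms) {
    assert(!Ms.empty() && "a module set holds at least one module");
    Sets[NextHandle] = std::move(Ms);
    return NextHandle++;
  }

  void removeModuleSet(Handle H) { Sets.erase(H); }

  // Search every module set; 0 means "not found".
  std::uint64_t getSymbolAddress(const std::string &Name) const {
    for (const auto &KV : Sets)
      if (std::uint64_t Addr = lookupIn(KV.second, Name))
        return Addr;
    return 0;
  }

  // Search only the module set identified by H.
  std::uint64_t lookupSymbolAddressIn(Handle H, const std::string &Name) const {
    auto I = Sets.find(H);
    return I == Sets.end() ? 0 : lookupIn(I->second, Name);
  }

private:
  static std::uint64_t lookupIn(const std::vector<ToyModule> &Ms,
                                const std::string &Name) {
    for (const auto &M : Ms) {
      auto I = M.find(Name);
      if (I != M.end())
        return I->second;
    }
    return 0;
  }

  std::map<Handle, std::vector<ToyModule>> Sets;
  Handle NextHandle = 1;
};

// Upper layer: does its own work (here, just counting modules) and then
// forwards every call to the layer below it.
template <typename BaseLayerT> class ToyCountingLayer {
public:
  using Handle = typename BaseLayerT::Handle;

  explicit ToyCountingLayer(BaseLayerT &Base) : Base(Base) {}

  Handle addModuleSet(std::vector<ToyModule> Ms) {
    ModulesSeen += Ms.size();
    return Base.addModuleSet(std::move(Ms));
  }

  void removeModuleSet(Handle H) { Base.removeModuleSet(H); }

  std::uint64_t getSymbolAddress(const std::string &Name) const {
    return Base.getSymbolAddress(Name);
  }

  std::uint64_t lookupSymbolAddressIn(Handle H, const std::string &Name) const {
    return Base.lookupSymbolAddressIn(H, Name);
  }

  std::size_t ModulesSeen = 0;

private:
  BaseLayerT &Base;
};
```

Because the upper layer is a template over the layer beneath it, any class with the same handful of members can be slotted in, which is roughly what "easy to compose" is meant to convey here.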

(3)
Q. Why "addModuleSet" rather than "addModule"?
A. Allowing multiple modules to be passed around together allows layers lower in the stack to perform interesting optimizations. E.g. direct calls between objects that are allocated sufficiently close in memory. To add a single Module you just add a single-element set.

(4)
Q. What happened to "finalize"?
A. In the Orc APIs, getSymbolAddress automatically finalizes as necessary before returning addresses to the client. When you get an address back from getSymbolAddress, that address is ready to call.

(5)
Q. What does "removeModuleSet" do?
A. It removes the modules represented by the handle from the JIT. The meaning of this is specific to each layer, but generally speaking it means that any memory allocated for those modules (and their corresponding Objects, linked sections, etc) has been freed, and the symbols those modules provided are now undefined. Calling getSymbolAddress for a symbol that was defined in a module that has been removed is expected to return '0'.

(5a)
Q. How are the linked sections freed? RTDyldMemoryManager doesn't have any "free.*Section" methods.
A. Each ModuleSet gets its own RTDyldMemoryManager, and that is destroyed when the module set is freed. The choice of RTDyldMemoryManager is up to the client, but the standard memory managers will free the memory allocated for the linked sections when they're destroyed.
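
The ownership scheme in (5a) can be sketched with plain RAII. This is only an illustration under stated assumptions: ToyMemoryManager and ToyObjectLayer are hypothetical classes, not RTDyldMemoryManager or the ObjectLinkingLayer, and the shared LiveBytes counter exists purely so the effect of removal can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory manager: it owns the bytes of every
// section allocated through it and releases them when it is destroyed.
class ToyMemoryManager {
public:
  explicit ToyMemoryManager(std::size_t &LiveBytes) : LiveBytes(LiveBytes) {}
  ToyMemoryManager(const ToyMemoryManager &) = delete;
  ToyMemoryManager &operator=(const ToyMemoryManager &) = delete;

  // Keep the shared counter in step; the buffers themselves are freed when
  // the Sections member is destroyed right after this body runs.
  ~ToyMemoryManager() { LiveBytes -= OwnedBytes; }

  uint8_t *allocateSection(std::size_t Size) {
    assert(Size > 0 && "sections in this sketch are never empty");
    Sections.emplace_back(new uint8_t[Size]());
    OwnedBytes += Size;
    LiveBytes += Size;
    return Sections.back().get();
  }

private:
  std::size_t &LiveBytes;
  std::size_t OwnedBytes = 0;
  std::vector<std::unique_ptr<uint8_t[]>> Sections;
};

// Hypothetical layer: one memory manager per module set, keyed by handle.
class ToyObjectLayer {
public:
  using Handle = unsigned;

  Handle addModuleSet(std::unique_ptr<ToyMemoryManager> MM,
                      std::size_t CodeSize) {
    MM->allocateSection(CodeSize); // pretend to link one code section
    Managers[Next] = std::move(MM);
    return Next++;
  }

  // Erasing the entry destroys the manager, which frees its sections.
  void removeModuleSet(Handle H) { Managers.erase(H); }

private:
  std::map<Handle, std::unique_ptr<ToyMemoryManager>> Managers;
  Handle Next = 1;
};
```

With this shape no explicit "free section" call is needed: removing the handle drops the only owner of the manager, and the manager's destructor does the rest.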

(6)
Q. How does the CompileOnDemand layer redirect calls to the JIT?
A. It currently uses double-indirection: function bodies are extracted into new modules, and the body of the original function is replaced with an indirect call to the extracted body. The pointer for the indirect call is initialized by the JIT to point at some inline assembly which is injected into the module, and this calls back in to the JIT to trigger compilation of the extracted body. In the future I plan to make the redirection strategy a parameter of the CompileOnDemand layer. Double-indirection is the safest: it preserves function-pointer equality and works with non-writable executable memory. However, there's no reason we couldn't use single indirection (for extra speed where pointer-equality isn't required), or patchpoints (for clients who can allocate writable/executable memory), or any combination of the three. My intent is that this should be up to the client.

A brief note: the CompileOnDemand layer doesn't handle lazy compilation itself, just lazy symbol resolution (i.e. symbols are resolved on first call, not when compiling). If you've put the CompileOnDemand layer on top of the LazyEmitLayer, then deferring symbol lookup automatically defers compilation. (E.g. you can remove the LazyEmitLayer in main.cpp of the lazydemo and you'll get indirection and callbacks, but no lazy compilation.)
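
The double-indirection scheme in (6) can be mimicked in ordinary C++ with a function pointer. This is a sketch of the control flow only, not the mechanism in the patch: there is no inline assembly and no real compilation, and the names square, squareBody, SquarePtr, compileCallback and CompileCount are all hypothetical.

```cpp
#include <cassert>

using BodyFn = int (*)(int);

// The "extracted body". In a real JIT this would not exist as machine code
// until the callback asked for it to be compiled.
static int squareBody(int X) { return X * X; }

static int CompileCount = 0;
static int compileCallback(int X);

// The pointer used by the indirect call; it initially targets the callback.
static BodyFn SquarePtr = compileCallback;

// Stands in for the injected stub that calls back in to the JIT.
static int compileCallback(int X) {
  assert(SquarePtr == compileCallback && "body should be compiled only once");
  ++CompileCount;         // pretend to compile the extracted body
  SquarePtr = squareBody; // retarget the pointer at the compiled body
  return SquarePtr(X);    // complete the call that triggered compilation
}

// The original function. Its address never changes, so function-pointer
// equality is preserved; its body is just an indirect call.
int square(int X) { return SquarePtr(X); }
```

The first call to square goes through the callback and bumps CompileCount; later calls reach squareBody directly through the pointer. Only the data pointer is ever rewritten, which is why this approach does not need writable executable memory.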

(7)
Q. Do the new APIs support cross-target JITing like MCJIT does?
A. Yes.

(7.a)
Q. Do the new APIs support cross-target (or cross process) lazy-jitting?
A. Not yet, but all that is required is for us to add a small amount of runtime to the JIT'd process to call back in to the JIT via some RPC mechanism. There are no significant barriers to implementing this that I'm aware of.

(8)
Q. Do any of the components implement the ExecutionEngine interface?
A. None of the components do, but the MCJITReplacement class does.

(9)
Q. Does this address any of the long-standing issues with MCJIT - Stackmap parsing? Debugging? Thread-local-storage?
A. No, but it doesn't get in the way either. These features are still on the road-map (such as it exists) and I'm hoping that the modular nature of Orc will allow us to play around with new features like this without any risk of disturbing existing clients, and so allow us to make faster progress.

(10)
Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
A. I'm still tidying the patch up - please save patch-specific feedback for llvm-commits, otherwise we'll get cross-talk between the threads. The patches should be coming soon.

---

As mentioned above, I'm happy to answer further general questions about what these APIs can do, or where I see them going. Feedback on the patch itself should be directed to the llvm-commits list when I start posting patches there for discussion.


* Marketing slogans abound: "Very MachO". "Some warts". "Surprisingly friendly with ELF". "Not yet on speaking terms with DWARF".


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



Re: New JIT APIs

Lang Hames
Hi Armin,

> The MCJIT API can only be used once to JIT compile external sources to executable code into the address space of a running process.

I'm not sure exactly what you mean by "can only be used once" in this context. Regardless, the new APIs are definitely designed to make it easier to load, unload and replace modules, and I hope they will support a wider range of use cases off-the-shelf than MCJIT does.

Cheers,
Lang.

On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff <[hidden email]> wrote:

Hi Lang,

We are using the JIT API of TCC and the MCJIT API to import external code into a running control application process.

The MCJIT API can only be used once to JIT compile external sources to executable code into the address space of a running process.

Does your JIT API have the same restriction? It would be very nice if your JIT API could provide functionality similar to that provided by TCC.

Best Regards

Armin



Re: New JIT APIs

Armin Steinhoff-2
Hi Lang,

Lang Hames wrote:
> Hi Armin,
>
>> The MCJIT API can only be used once to JIT compile external sources to executable code into the address space of a running process.

That means: after the first successful JIT compile it isn't possible to do it again (within the same active process) ... because of some resource issues.

> I'm not sure exactly what you mean by "can only be used once" in this context. Regardless, the new APIs are definitely designed to make it easier to load, unload and replace modules, and I hope they will support a wider range of use cases off-the-shelf than MCJIT does.

OK ... sounds interesting, I will test it.


Regards

Armin




Re: New JIT APIs

Caldarale, Charles R
> From: [hidden email] [mailto:[hidden email]]
> On Behalf Of Armin Steinhoff
> Subject: Re: [LLVMdev] New JIT APIs

> The MCJIT API can only be used once to JIT compile external sources to executable code
> into the address space of a running process.

> That means: after the first successful JIT compile it isn't possible to do it again
> (within the same active process) ... because of some resource issues.

We compile many thousands of modules and execute them in the same process without running out of resources.  We do recycle the LLVMContext and IRBuilder objects after some number of compilations to keep the constant pool from getting out of hand.

 - Chuck



Re: New JIT APIs

Philip Reames-4
In reply to this post by Armin Steinhoff-2

On 01/16/2015 02:08 PM, Armin Steinhoff wrote:
> Hi Lang,
>
> Lang Hames wrote:
>> Hi Armin,
>>
>>> The MCJIT API can only be used once to JIT compile external sources to executable code into the address space of a running process.
>
> That means: after the first successful JIT compile it isn't possible to do it again (within the same active process) ... because of some resource issues.

Er, this is definitely something specific to your use case or environment. I'm doing thousands of compiles in the same process on an extremely regular basis with no problems.


I'm not sure exactly what you mean by "can only be used once" in this context. Regardless, the new APIs are definitely designed to make it easier to lead, unload and replace modules, and I hope they will support a wider range of use cases off-the-shelf than MCJIT does.

OK ... sound interesting,  I will test it.


Regards

Armin



Cheers,
Lang.

On Fri, Jan 16, 2015 at 2:41 AM, Armin Steinhoff <[hidden email]> wrote:

Hi Lang,

we are using the JIT API of TCC and  the MCJIT API in order to import external code into a running control application process.

The MCJIT API can only be used once to JIT compile external souces to excuteable code into the address space of a running process.

Has your JIT API the same restriction ?  It would be very nice if your JIT API could provide a similar functionalty as provided by TCC.

Best Regards

Armin


Lang Hames schrieb:
Hi All,

The attached patch (against r225842) contains some new JIT APIs that I've been working on. I'm going to start breaking it up, tidying it up, and submitting patches to llvm-commits soon, but while I'm working on that I thought I'd put the whole patch out for the curious to start playing around with and/or commenting on.

The aim of these new APIs is to cleanly support a wider range of JIT use cases in LLVM, and to recover some of the functionality lost when the legacy JIT was removed. In particular, I wanted to see if I could re-enable lazy compilation while following MCJIT's design philosophy of relying on the MC layer and module-at-a-time compilation. The attached patch goes some way to addressing these aims, though there's a lot still to do.

The 20,000 ft overview, for those who want to get straight to the code:

The new APIs are not built on top of the MCJIT class, as I didn't want a single class trying to be all things to all people. Instead, the new APIs consist of a set of software components for building JITs. The idea is that you should be able to take these off the shelf and compose them reasonably easily to get the behavior that you want. In the future I hope that people who are working on LLVM-based JITs, if they find this approach useful, will contribute back components that they've built locally and that they think would be useful for a wider audience. As a demonstration of the practicality of this approach the attached patch contains a class, MCJITReplacement, that composes some of the components to re-create the behavior of MCJIT. This works well enough to pass all MCJIT regression and unit tests on Darwin, and all but four regression tests on Linux. The patch also contains the desired "new" feature: Function-at-a-time lazy jitting in roughly the style of the legacy JIT. The attached lazydemo.tgz file contains a program which composes the new JIT components (including the lazy-jitting component) to lazily execute bitcode. I've tested this program on Darwin and it can run non-trivial benchmark programs, e.g. 401.bzip2 from SPEC2006.

These new APIs are named after the motivating feature: On Request Compilation, or ORC. I believe the logo potential is outstanding. I'm picturing an Orc riding a Dragon. If I'm honest this was at least 45% of my motivation for doing this project*.

You'll find the new headers in llvm/include/llvm/ExecutionEngine/OrcJIT/*.h, and the implementation files in lib/ExecutionEngine/OrcJIT/*.

I imagine there will be a number of questions about the design and implementation. I've tried to preempt a few below, but please fire away with anything I've left out.

Also, thanks to Jim Grosbach, Michael Illseman, David Blaikie, Pete Cooper, Eric Christopher, and Louis Gerbarg for taking time out to review, discuss and test this thing as I've worked on it.

Cheers,
Lang.

Possible questions:

(1)
Q. Are you trying to kill off MCJIT?
A. There are no plans to remove MCJIT. The new APIs are designed to live alongside it.

(2)
Q. What do "JIT components" look like, and how do you compose them?
A. The classes and functions you'll find in OrcJIT/*.h fall into two rough categories: Layers and Utilities. Layers are classes that implement a small common interface that makes them easy to compose:

class SomeLayer {
private:
  // Implementation details
public:
  // Implementation details

  typedef ??? Handle;

  template <typename ModuleSet>
  Handle addModuleSet(ModuleSet&& Ms);

  void removeModuleSet(Handle H);

  uint64_t getSymbolAddress(StringRef Name, bool ExportedSymbolsOnly);

  uint64_t lookupSymbolAddressIn(Handle H, StringRef Name, bool ExportedSymbolsOnly);
};

Layers are usually designed to sit one-on-top-of-another, with each doing some sort of useful work before handing off to the layer below it. The layers that are currently included in the patch are the the CompileOnDemandLayer, which breaks up modules and redirects calls to not-yet-compiled functions back into the JIT; the LazyEmitLayer, which defers adding modules to the layer below until a symbol in the module is actually requested; the IRCompilingLayer, which compiles bitcode to objects; and the ObjectLinkingLayer, which links sets of objects in memory using RuntimeDyld.

Utilities are everything that's not a layer. Ideally the heavy lifting is done by the utilities. Layers just wrap certain uses-cases to make them easy to compose.

Clients are free to use utilities directly, or compose layers, or implement new utilities or layers.

(3)
Q. Why "addModuleSet" rather than "addModule"?
A. Allowing multiple modules to be passed around together allows layers lower in the stack to perform interesting optimizations. E.g. direct calls between objects that are allocated sufficiently close in memory. To add a single Module you just add a single-element set.

(4)
Q. What happened to "finalize"?
A. In the Orc APIs, getSymbolAddress automatically finalizes as necessary before returning addresses to the client. When you get an address back from getSymbolAddress, that address is ready to call.
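
The behavior described above can be sketched with a toy class. This is an assumption-laden illustration (ToyFinalizingJIT, addSymbol and finalizeCount are invented names), not the patch's implementation: the point is only that lookup triggers any pending finalization, so the client never calls a separate "finalize".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

class ToyFinalizingJIT {
public:
  // Record a symbol whose memory has been allocated but not yet finalized.
  void addSymbol(const std::string &Name, std::uint64_t Addr) {
    assert(Addr != 0 && "0 is reserved for 'not found'");
    Symbols[Name] = Addr;
    NeedsFinalize = true;
  }

  // Finalizes on demand, so a non-zero result is always ready to call.
  std::uint64_t getSymbolAddress(const std::string &Name) {
    auto I = Symbols.find(Name);
    if (I == Symbols.end())
      return 0;
    if (NeedsFinalize)
      finalize();
    return I->second;
  }

  unsigned finalizeCount() const { return FinalizeCount; }

private:
  // Stand-in for applying relocations and making memory executable.
  void finalize() {
    ++FinalizeCount;
    NeedsFinalize = false;
  }

  std::map<std::string, std::uint64_t> Symbols;
  bool NeedsFinalize = false;
  unsigned FinalizeCount = 0;
};
```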

(5)
Q. What does "removeModuleSet" do?
A. It removes the modules represented by the handle from the JIT. The meaning of this is specific to each layer, but generally speaking it means that any memory allocated for those modules (and their corresponding Objects, linked sections, etc) has been freed, and the symbols those modules provided are now undefined. Calling getSymbolAddress for a symbol that was defined in a module that has been removed is expected to return '0'.

(5a)
Q. How are the linked sections freed? RTDyldMemoryManager doesn't have any "free.*Section" methods.
A. Each ModuleSet gets its own RTDyldMemoryManager, and that is destroyed when the module set is freed. The choice of RTDyldMemoryManager is up to the client, but the standard memory managers will free the memory allocated for the linked sections when they're destroyed.
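
To make the ownership model in (5) and (5a) concrete, here is a minimal sketch in which each module set owns its memory manager, so removing the set destroys the manager and frees its sections. ToyMemoryManager and ToyLayer are hypothetical types written for this example; they do not model RTDyldMemoryManager's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical memory manager: owns section buffers and releases them when
// it is destroyed. LiveBytes is an external counter used only to observe this.
class ToyMemoryManager {
public:
  explicit ToyMemoryManager(std::size_t &LiveBytes) : LiveBytes(LiveBytes) {}
  ~ToyMemoryManager() { LiveBytes -= Allocated; }

  unsigned char *allocateSection(std::size_t Size) {
    Sections.emplace_back(Size); // a zero-filled buffer of Size bytes
    Allocated += Size;
    LiveBytes += Size;
    return Sections.back().data();
  }

private:
  std::size_t &LiveBytes;
  std::size_t Allocated = 0;
  std::vector<std::vector<unsigned char>> Sections;
};

// Hypothetical layer: one memory manager per module set, keyed by handle.
class ToyLayer {
public:
  using Handle = unsigned;

  Handle addModuleSet(std::unique_ptr<ToyMemoryManager> MM) {
    assert(MM && "each module set needs a memory manager");
    Handle H = NextHandle++;
    Sets[H] = std::move(MM);
    return H;
  }

  // Erasing the entry destroys the manager, which frees the linked sections.
  void removeModuleSet(Handle H) { Sets.erase(H); }

private:
  std::map<Handle, std::unique_ptr<ToyMemoryManager>> Sets;
  Handle NextHandle = 1;
};
```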

(6)
Q. How does the CompileOnDemand layer redirect calls to the JIT?
A. It currently uses double-indirection: Function bodies are extracted into new modules, and the body of the original function is replaced with an indirect call to the extracted body. The pointer for the indirect call is initialized by the JIT to point at some inline assembly which is injected into the module, and this calls back in to the JIT to trigger compilation of the extracted body. In the future I plan to make the redirection strategy a parameter of the CompileOnDemand layer. Double-indirection is the safest: It preserves function-pointer equality and works with non-writable executable memory. However, there's no reason we couldn't use single indirection (for extra speed where pointer-equality isn't required), or patchpoints (for clients who can allocate writable/executable memory), or any combination of the three. My intent is that this should be up to the client.

Note that the CompileOnDemand layer doesn't handle lazy compilation itself, just lazy symbol resolution (i.e. symbols are resolved on first call, not when compiling). If you've put the CompileOnDemand layer on top of the LazyEmitLayer then deferring symbol lookup automatically defers compilation. (E.g. you can remove the LazyEmitLayer in main.cpp of the lazydemo and you'll get indirection and callbacks, but no lazy compilation.)
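
The double-indirection scheme can be mimicked in plain C++ to show why function-pointer equality survives. This is only an analogy under stated assumptions: the real layer rewrites IR and uses injected inline assembly, whereas here an ordinary function (squareCompileCallback, a made-up name) plays the role of the callback into the JIT, and "compiling" just means swapping a pointer.

```cpp
#include <cassert>

using FnPtr = int (*)(int);

// Plays the role of the injected code that calls back into the JIT.
int squareCompileCallback(int X);

// The pointer used for the indirect call. The "JIT" initializes it to point
// at the callback rather than at the real body.
static FnPtr SquareBodyPtr = &squareCompileCallback;
static unsigned CompileCount = 0;

// The extracted body, which in the real scheme lives in a separate module.
int squareBody(int X) { return X * X; }

// The original function, reduced to an indirect call. Its own address never
// changes, so pointers taken to it stay valid and compare equal.
int square(int X) { return SquareBodyPtr(X); }

int squareCompileCallback(int X) {
  assert(SquareBodyPtr == &squareCompileCallback && "callback should fire once");
  ++CompileCount;              // stand-in for compiling the extracted body
  SquareBodyPtr = &squareBody; // later calls go straight to the body
  return squareBody(X);
}
```

Only the data pointer is written after the first call, which is why this approach does not need writable executable memory.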

(7)
Q. Do the new APIs support cross-target JITing like MCJIT does?
A. Yes.

(7.a)
Q. Do the new APIs support cross-target (or cross process) lazy-jitting?
A. Not yet, but all that is required is for us to add a small amount of runtime to the JIT'd process to call back in to the JIT via some RPC mechanism. There are no significant barriers to implementing this that I'm aware of.

(8)
Q. Do any of the components implement the ExecutionEngine interface?
A. None of the components do, but the MCJITReplacement class does.

(9)
Q. Does this address any of the long-standing issues with MCJIT - Stackmap parsing? Debugging? Thread-local-storage?
A. No, but it doesn't get in the way either. These features are still on the road-map (such as it exists) and I'm hoping that the modular nature of Orc will allow us to play around with new features like this without any risk of disturbing existing clients, and so allow us to make faster progress.

(10)
Q. Why is part X of the patch (ugly | buggy | in the wrong place) ?
A. I'm still tidying the patch up - please save patch-specific feedback for llvm-commits, otherwise we'll get cross-talk between the threads. The patches should be coming soon.

---

As mentioned above, I'm happy to answer further general questions about what these APIs can do, or where I see them going. Feedback on the patch itself should be directed to the llvm-commits list when I start posting patches there for discussion.


* Marketing slogans abound: "Very MachO". "Some warts". "Surprisingly friendly with ELF". "Not yet on speaking terms with DWARF".


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev





Re: New JIT APIs

Armin Steinhoff-2
In reply to this post by Caldarale, Charles R

Hi all,

Is

  delete EE;   // execution engine
  llvm_shutdown();

sufficient?

Regards

Armin


Armin Steinhoff wrote:
Caldarale, Charles R wrote:
From: [hidden email] [[hidden email]]
On Behalf Of Armin Steinhoff
Subject: Re: [LLVMdev] New JIT APIs

The MCJIT API can only be used once to JIT-compile external sources to executable code
in the address space of a running process.
That means that after the first successful JIT compile it isn't possible to do it again
within the same active process, because of some resource issues.

We compile many thousands of modules and execute them in the same process without running out of resources.  We do recycle the LLVMContext and IRBuilder objects after some number of compilations to keep the constant pool from getting out of hand.

 - Chuck
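
The recycling strategy mentioned above (replacing the context after some number of compilations) might be structured roughly as follows. This is a sketch under assumptions: ToyContext stands in for LLVMContext, ContextRecycler is an invented helper, and the limit is arbitrary. With the real classes, everything created in the old context would have to be destroyed before the context itself.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a context whose constant pool only ever grows.
struct ToyContext {
  unsigned InternedConstants = 0;
};

class ContextRecycler {
public:
  explicit ContextRecycler(unsigned Limit)
      : Limit(Limit), Ctx(new ToyContext()) {
    assert(Limit > 0 && "limit must be positive");
  }

  // Returns the context for the next compilation, swapping in a fresh one
  // once the current context has been used Limit times.
  ToyContext &next() {
    if (Uses == Limit) {
      Ctx.reset(new ToyContext()); // drops the old context and its pool
      Uses = 0;
      ++Recycles;
    }
    ++Uses;
    return *Ctx;
  }

  unsigned recycles() const { return Recycles; }

private:
  unsigned Limit;
  std::unique_ptr<ToyContext> Ctx;
  unsigned Uses = 0;
  unsigned Recycles = 0;
};
```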

