RFC: Separate machine IR from lib/CodeGen into lib/MIR

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
20 messages Options
Reply | Threaded
Open this post in threaded view
|

RFC: Separate machine IR from lib/CodeGen into lib/MIR

Alex L
Hi all,

The CodeGen library is a big bag of interdependent bits. This caused
a circular dependency in the MIR serialization commit (r237954), which got
reverted in r238007.

I propose separating the machine IR out of CodeGen and into its own
MIR library, living at lib/MIR. This touches every target but it's mostly a
mechanical change that renames the header files, although a couple of pre-rename
commits are needed to decouple several things in CodeGen. I've attached a proof
of concept patch that renames the files and applies cleanly to r238038.

The MIR serialization code will go under the new MIR library and not under
CodeGen which will break the dependency. The CodeGen library will link MIR
automatically so the build files for tools that link with CodeGen won't have to
be changed.

Thanks for reading,
Alex


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

mir-move.patch (391K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Duncan P. N. Exon Smith

> On 2015-May-26, at 09:46, Alex L <[hidden email]> wrote:
>
> Hi all,
>
> The CodeGen library is a big bag of interdependent bits. This caused
> a circular dependency in the MIR serialization commit (r237954), which got
> reverted in r238007.
>
> I propose separating the machine IR out of CodeGen and into its own
> MIR library, living at lib/MIR. This touches every target but it's mostly a
> mechanical change that renames the header files, although a couple of pre-rename
> commits are needed to decouple several things in CodeGen. I've attached a proof
> of concept patch that renames the files and applies cleanly to r238038.
>
> The MIR serialization code will go under the new MIR library and not under
> CodeGen which will break the dependency. The CodeGen library will link MIR
> automatically so the build files for tools that link with CodeGen won't have to
> be changed.
>
> Thanks for reading,
> Alex
>

FWIW, I'm in favour of this.  I'd be interested to hear if there's
a reason *not* to do this.

Here's some more context:

  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Owen Anderson-2

On May 26, 2015, at 1:37 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:


On 2015-May-26, at 09:46, Alex L <[hidden email]> wrote:

Hi all,

The CodeGen library is a big bag of interdependent bits. This caused
a circular dependency in the MIR serialization commit (r237954), which got
reverted in r238007.

I propose separating the machine IR out of CodeGen and into its own
MIR library, living at lib/MIR. This touches every target but it's mostly a
mechanical change that renames the header files, although a couple of pre-rename
commits are needed to decouple several things in CodeGen. I've attached a proof
of concept patch that renames the files and applies cleanly to r238038.

The MIR serialization code will go under the new MIR library and not under
CodeGen which will break the dependency. The CodeGen library will link MIR
automatically so the build files for tools that link with CodeGen won't have to
be changed.

Thanks for reading,
Alex


FWIW, I'm in favour of this.  I'd be interested to hear if there's
a reason *not* to do this.

Here's some more context:

 http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html

I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)

—Owen

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Chris Lattner-2
On May 26, 2015, at 1:46 PM, Owen Anderson <[hidden email]> wrote:
The MIR serialization code will go under the new MIR library and not under
CodeGen which will break the dependency. The CodeGen library will link MIR
automatically so the build files for tools that link with CodeGen won't have to
be changed.

Thanks for reading,
Alex


FWIW, I'm in favour of this.  I'd be interested to hear if there's
a reason *not* to do this.

Here's some more context:

 http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html

I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)

+1.  Codegen should also be renamed to something else that starts with “Machine” as well.

-Chris


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Quentin Colombet
+1.

Could those two be subdirectories of one “Machine-Related-Stuff” directory?
E.g.,
MachineStuff/IR
MachineStuff/CodeGen

Where MachineStuff is something meaningful :).

That way, they keep a logic bound, more formal than the naming convention.

My 2c.

Q.
On May 26, 2015, at 9:28 PM, Chris Lattner <[hidden email]> wrote:

On May 26, 2015, at 1:46 PM, Owen Anderson <[hidden email]> wrote:
The MIR serialization code will go under the new MIR library and not under
CodeGen which will break the dependency. The CodeGen library will link MIR
automatically so the build files for tools that link with CodeGen won't have to
be changed.

Thanks for reading,
Alex


FWIW, I'm in favour of this.  I'd be interested to hear if there's
a reason *not* to do this.

Here's some more context:

 http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html

I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)

+1.  Codegen should also be renamed to something else that starts with “Machine” as well.

-Chris

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Chris Lattner-2

On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:

+1.

Could those two be subdirectories of one “Machine-Related-Stuff” directory?
E.g.,
MachineStuff/IR
MachineStuff/CodeGen

Where MachineStuff is something meaningful :).

That way, they keep a logic bound, more formal than the naming convention.

Something like?

lib/Machine/IR
lib/Machine/Passes

Unless there will be many subdirectories, it seems slightly better to flatten out the layer.

Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)

-Chris




My 2c.

Q.
On May 26, 2015, at 9:28 PM, Chris Lattner <[hidden email]> wrote:

On May 26, 2015, at 1:46 PM, Owen Anderson <[hidden email]> wrote:
The MIR serialization code will go under the new MIR library and not under
CodeGen which will break the dependency. The CodeGen library will link MIR
automatically so the build files for tools that link with CodeGen won't have to
be changed.

Thanks for reading,
Alex


FWIW, I'm in favour of this.  I'd be interested to hear if there's
a reason *not* to do this.

Here's some more context:

 http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html

I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)

+1.  Codegen should also be renamed to something else that starts with “Machine” as well.

-Chris

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Reid Kleckner-2
In reply to this post by Owen Anderson-2
On Tue, May 26, 2015 at 1:46 PM, Owen Anderson <[hidden email]> wrote:
I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)

+1 for this. I think you probably want to limit the scope of your change to just splitting out MachineIR and not trying to rename CodeGen, AsmPrinter, etc. :) 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Duncan P. N. Exon Smith

> On 2015 May 27, at 10:13, Reid Kleckner <[hidden email]> wrote:
>
> On Tue, May 26, 2015 at 1:46 PM, Owen Anderson <[hidden email]> wrote:
> I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)
>
> +1 for this. I think you probably want to limit the scope of your change to just splitting out MachineIR and not trying to rename CodeGen, AsmPrinter, etc. :)

Agreed :).  The rest seems out-of-scope for Alex's internship.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Chandler Carruth-2
In reply to this post by Chris Lattner-2
On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:

+1.

Could those two be subdirectories of one “Machine-Related-Stuff” directory?
E.g.,
MachineStuff/IR
MachineStuff/CodeGen

Where MachineStuff is something meaningful :).

That way, they keep a logic bound, more formal than the naming convention.

Something like?

lib/Machine/IR
lib/Machine/Passes

Unless there will be many subdirectories, it seems slightly better to flatten out the layer.

I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:

1) It will grow.
2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.

However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.

My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.

My suggested layering would be:

Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.

However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.

While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?
 

Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)

I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Chandler Carruth-2
In reply to this post by Duncan P. N. Exon Smith
On Wed, May 27, 2015 at 10:32 AM Duncan P. N. Exon Smith <[hidden email]> wrote:

> On 2015 May 27, at 10:13, Reid Kleckner <[hidden email]> wrote:
>
> On Tue, May 26, 2015 at 1:46 PM, Owen Anderson <[hidden email]> wrote:
> I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)
>
> +1 for this. I think you probably want to limit the scope of your change to just splitting out MachineIR and not trying to rename CodeGen, AsmPrinter, etc. :)

Agreed :).  The rest seems out-of-scope for Alex's internship.

I think even this is going to be out-of-scope unless the *existing* code happens to have a clean layering separation. I don't see one currently, but I might be missing it.

Specifically, I'm *strongly* opposed to splitting this out while there are circular references between any MI implementation and any CodeGen implementation. Not all of our linkers or platforms even allow this, but the errors tend to be brittle and sporadic rather than consistent. 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Duncan P. N. Exon Smith
In reply to this post by Chris Lattner-2

> On 2015 May 27, at 08:08, Chris Lattner <[hidden email]> wrote:
>
>
>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
>>
>> +1.
>>
>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
>> E.g.,
>> MachineStuff/IR
>> MachineStuff/CodeGen
>>
>> Where MachineStuff is something meaningful :).
>>
>> That way, they keep a logic bound, more formal than the naming convention.
>
> Something like?
>
> lib/Machine/IR
> lib/Machine/Passes
>
> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.

There are two different layers we can flatten.  We already have two
layers in lib/CodeGen.  If we flatten the second layer and go with
Quentin's idea, we get something like this:

lib/Machine/MIR
lib/Machine/Passes
lib/Machine/AsmPrinter (or MCGen)
lib/Machine/SelectionDAG (or MIRGen?)

(IMO, lib/Machine/IR is too close to lib/IR.)

If we flatten the first, we get this:

lib/MachineIR
lib/CodeGen (or MachineIRGen)
lib/CodeGen/AsmPrinter (or MCGen)
lib/CodeGen/SelectionDAG (or ...)

Both seem pretty reasonable to me, but the first one seems a little
cleaner.

>
> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)

+1

>
> -Chris
>
>>
>> My 2c.
>>
>> Q.
>>> On May 26, 2015, at 9:28 PM, Chris Lattner <[hidden email]> wrote:
>>>
>>> On May 26, 2015, at 1:46 PM, Owen Anderson <[hidden email]> wrote:
>>>>>> The MIR serialization code will go under the new MIR library and not under
>>>>>> CodeGen which will break the dependency. The CodeGen library will link MIR
>>>>>> automatically so the build files for tools that link with CodeGen won't have to
>>>>>> be changed.
>>>>>>
>>>>>> Thanks for reading,
>>>>>> Alex
>>>>>>
>>>>>
>>>>> FWIW, I'm in favour of this.  I'd be interested to hear if there's
>>>>> a reason *not* to do this.
>>>>>
>>>>> Here's some more context:
>>>>>
>>>>>  http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150518/278031.html
>>>>
>>>> I have no objections, though I think lib/MachineIR would make a prettier bikeshed ;-)
>>>
>>> +1.  Codegen should also be renamed to something else that starts with “Machine” as well.
>>>
>>> -Chris
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> [hidden email]         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Duncan P. N. Exon Smith
In reply to this post by Chandler Carruth-2
> On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
>
>> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
>>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
>>>
>>> +1.
>>>
>>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
>>> E.g.,
>>> MachineStuff/IR
>>> MachineStuff/CodeGen
>>>
>>> Where MachineStuff is something meaningful :).
>>>
>>> That way, they keep a logic bound, more formal than the naming convention.
>>
>> Something like?
>>
>> lib/Machine/IR
>> lib/Machine/Passes
>>
>> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
>
> I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
>
> 1) It will grow.
> 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
> 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
>
> However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
>
> My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
>
> My suggested layering would be:
>
> Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
>

(Oops, missed this until after I sent my own response.

One thing I'd add to this from my email is that I think
lib/Machine/IR is likely to get confused with lib/IR for the same
reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
lib/Machine/MIR is "safer".)

> However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
>
> While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?

Not if it's as hard as you're saying.  My impression was that Alex
was able to move the IR stuff he needed into a separately library
pretty trivially (based on the P.O.C. patch he posted), but if it's
not trivial he should just move on.

>> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
>
> I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.

I'm a really big fan of this.  If you supply the perl scripts
somehow (attached to a PR that you reference in the commit?) then
out-of-tree users shouldn't have a problem.  Unless I'm missing
something?
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Hal Finkel
----- Original Message -----

> From: "Duncan P. N. Exon Smith" <[hidden email]>
> To: "Chandler Carruth" <[hidden email]>
> Cc: "LLVM Developers Mailing List" <[hidden email]>
> Sent: Wednesday, May 27, 2015 12:59:23 PM
> Subject: Re: [LLVMdev] RFC: Separate machine IR from lib/CodeGen into lib/MIR
>
> > On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]>
> > wrote:
> >
> >> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]>
> >> wrote:
> >>> On May 26, 2015, at 11:20 PM, Quentin Colombet
> >>> <[hidden email]> wrote:
> >>>
> >>> +1.
> >>>
> >>> Could those two be subdirectories of one “Machine-Related-Stuff”
> >>> directory?
> >>> E.g.,
> >>> MachineStuff/IR
> >>> MachineStuff/CodeGen
> >>>
> >>> Where MachineStuff is something meaningful :).
> >>>
> >>> That way, they keep a logic bound, more formal than the naming
> >>> convention.
> >>
> >> Something like?
> >>
> >> lib/Machine/IR
> >> lib/Machine/Passes
> >>
> >> Unless there will be many subdirectories, it seems slightly better
> >> to flatten out the layer.
> >
> > I strongly prefer breaking it out into subdirectories. There are a
> > bunch of reasons:
> >
> > 1) It will grow.
> > 2) Without it, we cannot have separate libraries, which will lose
> > some options for shrinking the size of libraries.
> > 3) Without separate libraries we can't as easily enforce the
> > layering separations between these things. For example, making
> > this split will be *extremely* hard currently because there is a
> > lot of inappropriate dependencies that will block splitting things
> > out.
> >
> > However, "IR" and "Passes" cover only two of the things in CodeGen.
> > There is also the implementation of a lot of common infrastructure
> > used by targets' code generators.
> >
> > My initial suggestion would be to just sink CodeGen into
> > Machine/CodeGen, add the .../IR and .../Passes directories, and
> > then start extracting things from CodeGen into the two more narrow
> > directories. I think there is likely some stuff that should
> > continue to live in a "code generator" infrastructure directory as
> > it is neither part of the machine IR, nor is it part of any
> > particular pass.
> >
> > My suggested layering would be:
> >
> > Passes depend on IR, CodeGen depends on both Passes and IR. The
> > idea is that anything passes require should be embedded into the
> > IR.
> >
>
> (Oops, missed this until after I sent my own response.
>
> One thing I'd add to this from my email is that I think
> lib/Machine/IR is likely to get confused with lib/IR for the same
> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
> lib/Machine/MIR is "safer".)

+1 -- Please only use IR to refer to LLVM IR. We should use some different abbreviation for any other IR we have in the backend.

>
> > However, this won't currently work. There are things that seem to
> > be parallel but independent of the machine IR and are used by any
> > machine passes. There are also things that clearly use the machine
> > passes. Currently, I'm not sure how to cleanly divide this library
> > up without really significant refactoring of every part of the
> > code generator.
> >
> > While I would like to see this happen, is it really a good idea to
> > put this in the critical path of getting MIR serialized and
> > deserialized?
>
> Not if it's as hard as you're saying.  My impression was that Alex
> was able to move the IR stuff he needed into a separately library
> pretty trivially (based on the P.O.C. patch he posted), but if it's
> not trivial he should just move on.
>
> >> Also, if we’re getting crazy here, CodeGen in clang should be
> >> renamed to IRGen, AsmPrinter should be renamed to MCGen, and
> >> SelectionDAG should be replaced ;-)
> >
> > I'm happy to actually do the CodeGen -> IRGen rename. I actually
> > built the change but didn't submit it because of the concerns of
> > some out-of-tree users of Clang. I still have all the perl scripts
> > and such I used sitting around.
>
> I'm a really big fan of this.

+1 -- Me too.

 -Hal

 If you supply the perl scripts
> somehow (attached to a PR that you reference in the commit?) then
> out-of-tree users shouldn't have a problem.  Unless I'm missing
> something?
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Alex L
In reply to this post by Duncan P. N. Exon Smith


2015-05-27 10:59 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:
> On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
>
>> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
>>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
>>>
>>> +1.
>>>
>>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
>>> E.g.,
>>> MachineStuff/IR
>>> MachineStuff/CodeGen
>>>
>>> Where MachineStuff is something meaningful :).
>>>
>>> That way, they keep a logic bound, more formal than the naming convention.
>>
>> Something like?
>>
>> lib/Machine/IR
>> lib/Machine/Passes
>>
>> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
>
> I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
>
> 1) It will grow.
> 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
> 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
>
> However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
>
> My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
>
> My suggested layering would be:
>
> Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
>

(Oops, missed this until after I sent my own response.

One thing I'd add to this from my email is that I think
lib/Machine/IR is likely to get confused with lib/IR for the same
reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
lib/Machine/MIR is "safer".)

> However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
>
> While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?

Not if it's as hard as you're saying.  My impression was that Alex
was able to move the IR stuff he needed into a separately library
pretty trivially (based on the P.O.C. patch he posted), but if it's
not trivial he should just move on.

While it's true that there are things that prevent a straightforward library division, I don't think that they will pose such a big problem.
I have a brief list of things that have to be done before I can move the Machine IR stuff from CodeGen to some other library:
- Move the SplitCriticalEdge method from MachineBasicBlock (http://reviews.llvm.org/D10064).
- Move the UnpackMachineBundles and FinalizeMachineBundles passes from MachineInstrBundle.cpp. (http://reviews.llvm.org/D10070 + 1 upcoming patch).
- Refactor SlotIndexes.h: keep the SlotIndexes pass and move the rest to SlotIndex.h. Introduce a new container class in SlotIndex.h that will extract the map
  between machine instructions and slot indexes from the SlotIndexes pass and remove the dependency on the pass from MachineBasicBlock's print method.
- WinEHFuncInfo - Either move parseEHInfo from WinEHPrepare.cpp to new file WinEHFuncInfo.cpp or move the declaration of parseEHInfo from WinEHFuncInfo.h
  to a new header WinEHPrepare.h
- Get rid of several unused #includes in MachineIR cpp files that include passes.

After that it should be possible to move the Machine IR files out of CodeGen without them actually including any CodeGen headers.
And all the Machine IR passes like MachineFunctionPass, MachineFunctionAnalysis, etc. will remain in CodeGen.

Alex.

 

>> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
>
> I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.

I'm a really big fan of this.  If you supply the perl scripts
somehow (attached to a PR that you reference in the commit?) then
out-of-tree users shouldn't have a problem.  Unless I'm missing
something?
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Duncan P. N. Exon Smith

> On 2015 May 27, at 14:01, Alex L <[hidden email]> wrote:
>
>
>
> 2015-05-27 10:59 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:
> > On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
> >
> >> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
> >>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
> >>>
> >>> +1.
> >>>
> >>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
> >>> E.g.,
> >>> MachineStuff/IR
> >>> MachineStuff/CodeGen
> >>>
> >>> Where MachineStuff is something meaningful :).
> >>>
> >>> That way, they keep a logic bound, more formal than the naming convention.
> >>
> >> Something like?
> >>
> >> lib/Machine/IR
> >> lib/Machine/Passes
> >>
> >> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
> >
> > I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
> >
> > 1) It will grow.
> > 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
> > 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
> >
> > However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
> >
> > My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
> >
> > My suggested layering would be:
> >
> > Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
> >
>
> (Oops, missed this until after I sent my own response.
>
> One thing I'd add to this from my email is that I think
> lib/Machine/IR is likely to get confused with lib/IR for the same
> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
> lib/Machine/MIR is "safer".)
>
> > However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
> >
> > While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?
>
> Not if it's as hard as you're saying.  My impression was that Alex
> was able to move the IR stuff he needed into a separately library
> pretty trivially (based on the P.O.C. patch he posted), but if it's
> not trivial he should just move on.
>
> While it's true that there are things that prevent a straightforward library division, I don't think that they will pose such a big problem.
> I have a brief list of things that have to be done before I can move the Machine IR stuff from CodeGen to some other library:
> - Move the SplitCriticalEdge method from MachineBasicBlock (http://reviews.llvm.org/D10064).
> - Move the UnpackMachineBundles and FinalizeMachineBundles passes from MachineInstrBundle.cpp. (http://reviews.llvm.org/D10070 + 1 upcoming patch).
> - Refactor SlotIndexes.h: keep the SlotIndexes pass and move the rest to SlotIndex.h. Introduce a new container class in SlotIndex.h that will extract the map
>   between machine instructions and slot indexes from the SlotIndexes pass and remove the dependency on the pass from MachineBasicBlock's print method.
> - WinEHFuncInfo - Either move parseEHInfo from WinEHPrepare.cpp to new file WinEHFuncInfo.cpp or move the declaration of parseEHInfo from WinEHFuncInfo.h
>   to a new header WinEHPrepare.h
> - Get rid of several unused #includes in MachineIR cpp files that include passes.
>
> After that it should be possible to move the Machine IR files out of CodeGen without them actually including any CodeGen headers.
> And all the Machine IR passes like MachineFunctionPass, MachineFunctionAnalysis, etc. will remain in CodeGen.
>
> Alex.
>

Something Chandler pointed out on IRC is that #include dependencies
aren't enough -- we also need to root out any symbol dependencies.
Have you checked for symbols (mostly, functions) that are declared
in a header you're planning to move, but defined in a source file
you're *not* planning to move?

(Apparently there have been failed attempts in the past to bring
sanity here that I wasn't aware of...)

>  
>
> >> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
> >
> > I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.
>
> I'm a really big fan of this.  If you supply the perl scripts
> somehow (attached to a PR that you reference in the commit?) then
> out-of-tree users shouldn't have a problem.  Unless I'm missing
> something?
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Alex L
I didn't specifically check for them, but the WinEHFuncInfo header file problem that I described above
is an example of a symbol dependency that can be resolved in a fairly straight forward manner.

The remaining files seem pretty sane, but there might be some symbol dependencies that I haven't
caught yet.

Alex.

2015-05-27 14:18 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:

> On 2015 May 27, at 14:01, Alex L <[hidden email]> wrote:
>
>
>
> 2015-05-27 10:59 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:
> > On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
> >
> >> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
> >>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
> >>>
> >>> +1.
> >>>
> >>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
> >>> E.g.,
> >>> MachineStuff/IR
> >>> MachineStuff/CodeGen
> >>>
> >>> Where MachineStuff is something meaningful :).
> >>>
> >>> That way, they keep a logic bound, more formal than the naming convention.
> >>
> >> Something like?
> >>
> >> lib/Machine/IR
> >> lib/Machine/Passes
> >>
> >> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
> >
> > I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
> >
> > 1) It will grow.
> > 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
> > 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
> >
> > However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
> >
> > My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
> >
> > My suggested layering would be:
> >
> > Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
> >
>
> (Oops, missed this until after I sent my own response.
>
> One thing I'd add to this from my email is that I think
> lib/Machine/IR is likely to get confused with lib/IR for the same
> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
> lib/Machine/MIR is "safer".)
>
> > However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
> >
> > While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?
>
> Not if it's as hard as you're saying.  My impression was that Alex
> was able to move the IR stuff he needed into a separately library
> pretty trivially (based on the P.O.C. patch he posted), but if it's
> not trivial he should just move on.
>
> While it's true that there are things that prevent a straightforward library division, I don't think that they will pose such a big problem.
> I have a brief list of things that have to be done before I can move the Machine IR stuff from CodeGen to some other library:
> - Move the SplitCriticalEdge method from MachineBasicBlock (http://reviews.llvm.org/D10064).
> - Move the UnpackMachineBundles and FinalizeMachineBundles passes from MachineInstrBundle.cpp. (http://reviews.llvm.org/D10070 + 1 upcoming patch).
> - Refactor SlotIndexes.h: keep the SlotIndexes pass and move the rest to SlotIndex.h. Introduce a new container class in SlotIndex.h that will extract the map
>   between machine instructions and slot indexes from the SlotIndexes pass and remove the dependency on the pass from MachineBasicBlock's print method.
> - WinEHFuncInfo - Either move parseEHInfo from WinEHPrepare.cpp to new file WinEHFuncInfo.cpp or move the declaration of parseEHInfo from WinEHFuncInfo.h
>   to a new header WinEHPrepare.h
> - Get rid of several unused #includes in MachineIR cpp files that include passes.
>
> After that it should be possible to move the Machine IR files out of CodeGen without them actually including any CodeGen headers.
> And all the Machine IR passes like MachineFunctionPass, MachineFunctionAnalysis, etc. will remain in CodeGen.
>
> Alex.
>

Something Chandler pointed out on IRC is that #include dependencies
aren't enough -- we also need to root out any symbol dependencies.
Have you checked for symbols (mostly, functions) that are declared
in a header you're planning to move, but defined in a source file
you're *not* planning to move?

(Apparently there have been failed attempts in the past to bring
sanity here that I wasn't aware of...)

>
>
> >> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
> >
> > I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.
>
> I'm a really big fan of this.  If you supply the perl scripts
> somehow (attached to a PR that you reference in the commit?) then
> out-of-tree users shouldn't have a problem.  Unless I'm missing
> something?
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

David Majnemer
In reply to this post by Hal Finkel


On Wed, May 27, 2015 at 11:11 AM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Duncan P. N. Exon Smith" <[hidden email]>
> To: "Chandler Carruth" <[hidden email]>
> Cc: "LLVM Developers Mailing List" <[hidden email]>
> Sent: Wednesday, May 27, 2015 12:59:23 PM
> Subject: Re: [LLVMdev] RFC: Separate machine IR from lib/CodeGen into lib/MIR
>
> > On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]>
> > wrote:
> >
> >> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]>
> >> wrote:
> >>> On May 26, 2015, at 11:20 PM, Quentin Colombet
> >>> <[hidden email]> wrote:
> >>>
> >>> +1.
> >>>
> >>> Could those two be subdirectories of one “Machine-Related-Stuff”
> >>> directory?
> >>> E.g.,
> >>> MachineStuff/IR
> >>> MachineStuff/CodeGen
> >>>
> >>> Where MachineStuff is something meaningful :).
> >>>
> >>> That way, they keep a logic bound, more formal than the naming
> >>> convention.
> >>
> >> Something like?
> >>
> >> lib/Machine/IR
> >> lib/Machine/Passes
> >>
> >> Unless there will be many subdirectories, it seems slightly better
> >> to flatten out the layer.
> >
> > I strongly prefer breaking it out into subdirectories. There are a
> > bunch of reasons:
> >
> > 1) It will grow.
> > 2) Without it, we cannot have separate libraries, which will lose
> > some options for shrinking the size of libraries.
> > 3) Without separate libraries we can't as easily enforce the
> > layering separations between these things. For example, making
> > this split will be *extremely* hard currently because there is a
> > lot of inappropriate dependencies that will block splitting things
> > out.
> >
> > However, "IR" and "Passes" cover only two of the things in CodeGen.
> > There is also the implementation of a lot of common infrastructure
> > used by targets' code generators.
> >
> > My initial suggestion would be to just sink CodeGen into
> > Machine/CodeGen, add the .../IR and .../Passes directories, and
> > then start extracting things from CodeGen into the two more narrow
> > directories. I think there is likely some stuff that should
> > continue to live in a "code generator" infrastructure directory as
> > it is neither part of the machine IR, nor is it part of any
> > particular pass.
> >
> > My suggested layering would be:
> >
> > Passes depend on IR, CodeGen depends on both Passes and IR. The
> > idea is that anything passes require should be embedded into the
> > IR.
> >
>
> (Oops, missed this until after I sent my own response.
>
> One thing I'd add to this from my email is that I think
> lib/Machine/IR is likely to get confused with lib/IR for the same
> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
> lib/Machine/MIR is "safer".)

+1 -- Please only use IR to refer to LLVM IR. We should use some different abbreviation for any other IR we have in the backend.

>
> > However, this won't currently work. There are things that seem to
> > be parallel but independent of the machine IR and are used by any
> > machine passes. There are also things that clearly use the machine
> > passes. Currently, I'm not sure how to cleanly divide this library
> > up without really significant refactoring of every part of the
> > code generator.
> >
> > While I would like to see this happen, is it really a good idea to
> > put this in the critical path of getting MIR serialized and
> > deserialized?
>
> Not if it's as hard as you're saying.  My impression was that Alex
> was able to move the IR stuff he needed into a separately library
> pretty trivially (based on the P.O.C. patch he posted), but if it's
> not trivial he should just move on.
>
> >> Also, if we’re getting crazy here, CodeGen in clang should be
> >> renamed to IRGen, AsmPrinter should be renamed to MCGen, and
> >> SelectionDAG should be replaced ;-)
> >
> > I'm happy to actually do the CodeGen -> IRGen rename. I actually
> > built the change but didn't submit it because of the concerns of
> > some out-of-tree users of Clang. I still have all the perl scripts
> > and such I used sitting around.
>
> I'm a really big fan of this.

+1 -- Me too.

Renaming CodeGen to IRGen will make it easier for newcomers to grok clang, big +1 from me.
 

 -Hal

 If you supply the perl scripts
> somehow (attached to a PR that you reference in the commit?) then
> out-of-tree users shouldn't have a problem.  Unless I'm missing
> something?
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Pete Cooper
In reply to this post by Duncan P. N. Exon Smith

> On May 27, 2015, at 2:18 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
>
>> On 2015 May 27, at 14:01, Alex L <[hidden email]> wrote:
>>
>>
>>
>> 2015-05-27 10:59 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:
>>> On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
>>>
>>>> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
>>>>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
>>>>>
>>>>> +1.
>>>>>
>>>>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
>>>>> E.g.,
>>>>> MachineStuff/IR
>>>>> MachineStuff/CodeGen
>>>>>
>>>>> Where MachineStuff is something meaningful :).
>>>>>
>>>>> That way, they keep a logic bound, more formal than the naming convention.
>>>>
>>>> Something like?
>>>>
>>>> lib/Machine/IR
>>>> lib/Machine/Passes
>>>>
>>>> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
>>>
>>> I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
>>>
>>> 1) It will grow.
>>> 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
>>> 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
>>>
>>> However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
>>>
>>> My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
>>>
>>> My suggested layering would be:
>>>
>>> Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
>>>
>>
>> (Oops, missed this until after I sent my own response.
>>
>> One thing I'd add to this from my email is that I think
>> lib/Machine/IR is likely to get confused with lib/IR for the same
>> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
>> lib/Machine/MIR is "safer".)
>>
>>> However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
>>>
>>> While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?
>>
>> Not if it's as hard as you're saying.  My impression was that Alex
>> was able to move the IR stuff he needed into a separately library
>> pretty trivially (based on the P.O.C. patch he posted), but if it's
>> not trivial he should just move on.
>>
>> While it's true that there are things that prevent a straightforward library division, I don't think that they will pose such a big problem.
>> I have a brief list of things that have to be done before I can move the Machine IR stuff from CodeGen to some other library:
>> - Move the SplitCriticalEdge method from MachineBasicBlock (http://reviews.llvm.org/D10064).
>> - Move the UnpackMachineBundles and FinalizeMachineBundles passes from MachineInstrBundle.cpp. (http://reviews.llvm.org/D10070 + 1 upcoming patch).
>> - Refactor SlotIndexes.h: keep the SlotIndexes pass and move the rest to SlotIndex.h. Introduce a new container class in SlotIndex.h that will extract the map
>>  between machine instructions and slot indexes from the SlotIndexes pass and remove the dependency on the pass from MachineBasicBlock's print method.
>> - WinEHFuncInfo - Either move parseEHInfo from WinEHPrepare.cpp to new file WinEHFuncInfo.cpp or move the declaration of parseEHInfo from WinEHFuncInfo.h
>>  to a new header WinEHPrepare.h
>> - Get rid of several unused #includes in MachineIR cpp files that include passes.
>>
>> After that it should be possible to move the Machine IR files out of CodeGen without them actually including any CodeGen headers.
>> And all the Machine IR passes like MachineFunctionPass, MachineFunctionAnalysis, etc. will remain in CodeGen.
>>
>> Alex.
>>
>
> Something Chandler pointed out on IRC is that #include dependencies
> aren't enough -- we also need to root out any symbol dependencies.
> Have you checked for symbols (mostly, functions) that are declared
> in a header you're planning to move, but defined in a source file
> you're *not* planning to move?
>
> (Apparently there have been failed attempts in the past to bring
> sanity here that I wasn't aware of…)
The solution to this is to have a tool which links only the parts we need.  So for example, write a tool which only verifies machine IR.  We parse it and verify it, and exit.  That only needs the MIR and parser and will root out any dependencies on IR/CodeGen.

+1 for everything else BTW, not that we needed another +1.

Cheers,
Pete

>
>>
>>
>>>> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
>>>
>>> I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.
>>
>> I'm a really big fan of this.  If you supply the perl scripts
>> somehow (attached to a PR that you reference in the commit?) then
>> out-of-tree users shouldn't have a problem.  Unless I'm missing
>> something?
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email]         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Sean Silva-2
In reply to this post by Hal Finkel


On Wed, May 27, 2015 at 11:11 AM, Hal Finkel <[hidden email]> wrote:
----- Original Message -----
> From: "Duncan P. N. Exon Smith" <[hidden email]>
> To: "Chandler Carruth" <[hidden email]>
> Cc: "LLVM Developers Mailing List" <[hidden email]>
> Sent: Wednesday, May 27, 2015 12:59:23 PM
> Subject: Re: [LLVMdev] RFC: Separate machine IR from lib/CodeGen into lib/MIR
>
> > On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]>
> > wrote:
> >
> >> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]>
> >> wrote:
> >>> On May 26, 2015, at 11:20 PM, Quentin Colombet
> >>> <[hidden email]> wrote:
> >>>
> >>> +1.
> >>>
> >>> Could those two be subdirectories of one “Machine-Related-Stuff”
> >>> directory?
> >>> E.g.,
> >>> MachineStuff/IR
> >>> MachineStuff/CodeGen
> >>>
> >>> Where MachineStuff is something meaningful :).
> >>>
> >>> That way, they keep a logic bound, more formal than the naming
> >>> convention.
> >>
> >> Something like?
> >>
> >> lib/Machine/IR
> >> lib/Machine/Passes
> >>
> >> Unless there will be many subdirectories, it seems slightly better
> >> to flatten out the layer.
> >
> > I strongly prefer breaking it out into subdirectories. There are a
> > bunch of reasons:
> >
> > 1) It will grow.
> > 2) Without it, we cannot have separate libraries, which will lose
> > some options for shrinking the size of libraries.
> > 3) Without separate libraries we can't as easily enforce the
> > layering separations between these things. For example, making
> > this split will be *extremely* hard currently because there is a
> > lot of inappropriate dependencies that will block splitting things
> > out.
> >
> > However, "IR" and "Passes" cover only two of the things in CodeGen.
> > There is also the implementation of a lot of common infrastructure
> > used by targets' code generators.
> >
> > My initial suggestion would be to just sink CodeGen into
> > Machine/CodeGen, add the .../IR and .../Passes directories, and
> > then start extracting things from CodeGen into the two more narrow
> > directories. I think there is likely some stuff that should
> > continue to live in a "code generator" infrastructure directory as
> > it is neither part of the machine IR, nor is it part of any
> > particular pass.
> >
> > My suggested layering would be:
> >
> > Passes depend on IR, CodeGen depends on both Passes and IR. The
> > idea is that anything passes require should be embedded into the
> > IR.
> >
>
> (Oops, missed this until after I sent my own response.
>
> One thing I'd add to this from my email is that I think
> lib/Machine/IR is likely to get confused with lib/IR for the same
> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
> lib/Machine/MIR is "safer".)

+1 -- Please only use IR to refer to LLVM IR. We should use some different abbreviation for any other IR we have in the backend.

Also, "MIR" is also widely used in the literature and compiler textbooks to refer to what we call "IR", so "MIR" is confusing anyway.

-- Sean Silva
 

>
> > However, this won't currently work. There are things that seem to
> > be parallel but independent of the machine IR and are used by any
> > machine passes. There are also things that clearly use the machine
> > passes. Currently, I'm not sure how to cleanly divide this library
> > up without really significant refactoring of every part of the
> > code generator.
> >
> > While I would like to see this happen, is it really a good idea to
> > put this in the critical path of getting MIR serialized and
> > deserialized?
>
> Not if it's as hard as you're saying.  My impression was that Alex
> was able to move the IR stuff he needed into a separately library
> pretty trivially (based on the P.O.C. patch he posted), but if it's
> not trivial he should just move on.
>
> >> Also, if we’re getting crazy here, CodeGen in clang should be
> >> renamed to IRGen, AsmPrinter should be renamed to MCGen, and
> >> SelectionDAG should be replaced ;-)
> >
> > I'm happy to actually do the CodeGen -> IRGen rename. I actually
> > built the change but didn't submit it because of the concerns of
> > some out-of-tree users of Clang. I still have all the perl scripts
> > and such I used sitting around.
>
> I'm a really big fan of this.

+1 -- Me too.

 -Hal

 If you supply the perl scripts
> somehow (attached to a PR that you reference in the commit?) then
> out-of-tree users shouldn't have a problem.  Unless I'm missing
> something?
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: RFC: Separate machine IR from lib/CodeGen into lib/MIR

Alex L
In reply to this post by Pete Cooper


2015-05-27 16:53 GMT-07:00 Pete Cooper <[hidden email]>:

> On May 27, 2015, at 2:18 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
>
>> On 2015 May 27, at 14:01, Alex L <[hidden email]> wrote:
>>
>>
>>
>> 2015-05-27 10:59 GMT-07:00 Duncan P. N. Exon Smith <[hidden email]>:
>>> On 2015 May 27, at 10:24, Chandler Carruth <[hidden email]> wrote:
>>>
>>>> On Wed, May 27, 2015 at 8:15 AM Chris Lattner <[hidden email]> wrote:
>>>>> On May 26, 2015, at 11:20 PM, Quentin Colombet <[hidden email]> wrote:
>>>>>
>>>>> +1.
>>>>>
>>>>> Could those two be subdirectories of one “Machine-Related-Stuff” directory?
>>>>> E.g.,
>>>>> MachineStuff/IR
>>>>> MachineStuff/CodeGen
>>>>>
>>>>> Where MachineStuff is something meaningful :).
>>>>>
>>>>> That way, they keep a logic bound, more formal than the naming convention.
>>>>
>>>> Something like?
>>>>
>>>> lib/Machine/IR
>>>> lib/Machine/Passes
>>>>
>>>> Unless there will be many subdirectories, it seems slightly better to flatten out the layer.
>>>
>>> I strongly prefer breaking it out into subdirectories. There are a bunch of reasons:
>>>
>>> 1) It will grow.
>>> 2) Without it, we cannot have separate libraries, which will lose some options for shrinking the size of libraries.
>>> 3) Without separate libraries we can't as easily enforce the layering separations between these things. For example, making this split will be *extremely* hard currently because there is a lot of inappropriate dependencies that will block splitting things out.
>>>
>>> However, "IR" and "Passes" cover only two of the things in CodeGen. There is also the implementation of a lot of common infrastructure used by targets' code generators.
>>>
>>> My initial suggestion would be to just sink CodeGen into Machine/CodeGen, add the .../IR and .../Passes directories, and then start extracting things from CodeGen into the two more narrow directories. I think there is likely some stuff that should continue to live in a "code generator" infrastructure directory as it is neither part of the machine IR, nor is it part of any particular pass.
>>>
>>> My suggested layering would be:
>>>
>>> Passes depend on IR, CodeGen depends on both Passes and IR. The idea is that anything passes require should be embedded into the IR.
>>>
>>
>> (Oops, missed this until after I sent my own response.
>>
>> One thing I'd add to this from my email is that I think
>> lib/Machine/IR is likely to get confused with lib/IR for the same
>> reasons that lib/CodeGen is confusing between LLVM and Clang.  IMO
>> lib/Machine/MIR is "safer".)
>>
>>> However, this won't currently work. There are things that seem to be parallel but independent of the machine IR and are used by any machine passes. There are also things that clearly use the machine passes. Currently, I'm not sure how to cleanly divide this library up without really significant refactoring of every part of the code generator.
>>>
>>> While I would like to see this happen, is it really a good idea to put this in the critical path of getting MIR serialized and deserialized?
>>
>> Not if it's as hard as you're saying.  My impression was that Alex
>> was able to move the IR stuff he needed into a separately library
>> pretty trivially (based on the P.O.C. patch he posted), but if it's
>> not trivial he should just move on.
>>
>> While it's true that there are things that prevent a straightforward library division, I don't think that they will pose such a big problem.
>> I have a brief list of things that have to be done before I can move the Machine IR stuff from CodeGen to some other library:
>> - Move the SplitCriticalEdge method from MachineBasicBlock (http://reviews.llvm.org/D10064).
>> - Move the UnpackMachineBundles and FinalizeMachineBundles passes from MachineInstrBundle.cpp. (http://reviews.llvm.org/D10070 + 1 upcoming patch).
>> - Refactor SlotIndexes.h: keep the SlotIndexes pass and move the rest to SlotIndex.h. Introduce a new container class in SlotIndex.h that will extract the map
>>  between machine instructions and slot indexes from the SlotIndexes pass and remove the dependency on the pass from MachineBasicBlock's print method.
>> - WinEHFuncInfo - Either move parseEHInfo from WinEHPrepare.cpp to new file WinEHFuncInfo.cpp or move the declaration of parseEHInfo from WinEHFuncInfo.h
>>  to a new header WinEHPrepare.h
>> - Get rid of several unused #includes in MachineIR cpp files that include passes.
>>
>> After that it should be possible to move the Machine IR files out of CodeGen without them actually including any CodeGen headers.
>> And all the Machine IR passes like MachineFunctionPass, MachineFunctionAnalysis, etc. will remain in CodeGen.
>>
>> Alex.
>>
>
> Something Chandler pointed out on IRC is that #include dependencies
> aren't enough -- we also need to root out any symbol dependencies.
> Have you checked for symbols (mostly, functions) that are declared
> in a header you're planning to move, but defined in a source file
> you're *not* planning to move?
>
> (Apparently there have been failed attempts in the past to bring
> sanity here that I wasn't aware of…)
The solution to this is to have a tool which links only the parts we need.  So for example, write a tool which only verifies machine IR.  We parse it and verify it, and exit.  That only needs the MIR and parser and will root out any dependencies on IR/CodeGen.

This might be a bit of a problem as such a tool would need the LLVMTargetMachine which is currently implemented in CodeGen. 
As well as that, every target depends on CodeGen, thus we won't be able to create any machine IR tools without linking in CodeGen given the current proposal.
Such a tool would require much bigger changes IMHO.

Cheers,
Alex

 

+1 for everything else BTW, not that we needed another +1.

Cheers,
Pete
>
>>
>>
>>>> Also, if we’re getting crazy here, CodeGen in clang should be renamed to IRGen, AsmPrinter should be renamed to MCGen, and SelectionDAG should be replaced ;-)
>>>
>>> I'm happy to actually do the CodeGen -> IRGen rename. I actually built the change but didn't submit it because of the concerns of some out-of-tree users of Clang. I still have all the perl scripts and such I used sitting around.
>>
>> I'm a really big fan of this.  If you supply the perl scripts
>> somehow (attached to a PR that you reference in the commit?) then
>> out-of-tree users shouldn't have a problem.  Unless I'm missing
>> something?
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email]         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev