[RFC] Embedding command line options in bitcode (PR21471)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
41 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Bob Wilson-3

On Nov 19, 2014, at 3:28 PM, Eric Christopher <[hidden email]> wrote:

So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

I’m not sure whether you’re concerned about the “oddly named” part or the “shouldn’t be IR” part. If the latter, this is why I previously suggested that we stage this so that after this patch goes in (assuming that we can reach consensus on that), we should consider each option separately. If there are some that don’t belong in IR, then we certainly shouldn’t do that. We need review each of them carefully to make that decision.


If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.

From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>
> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
In reply to this post by Eric Christopher
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?

From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>
> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher
In reply to this post by Bob Wilson-3


On Wed Nov 19 2014 at 4:38:02 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 3:28 PM, Eric Christopher <[hidden email]> wrote:

So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

I’m not sure whether you’re concerned about the “oddly named” part or the “shouldn’t be IR” part. If the latter, this is why I previously suggested that we stage this so that after this patch goes in (assuming that we can reach consensus on that), we should consider each option separately. If there are some that don’t belong in IR, then we certainly shouldn’t do that. We need review each of them carefully to make that decision.


I don't think nearly all of them need to be in the IR :)

How about the rest of my mail?

-eric
 

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.

From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>
> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher
In reply to this post by Akira Hatanaka


On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>
> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Chandler Carruth-2
In reply to this post by Chris Bieneman-2
So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Bob Wilson-3
In reply to this post by Eric Christopher

On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.

It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).

 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.

Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.

Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Bob Wilson-3
In reply to this post by Chandler Carruth-2

On Nov 19, 2014, at 6:06 PM, Chandler Carruth <[hidden email]> wrote:

So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

I have heard several people express strong opinions that even when those options are represented as function attributes, we will want a mechanism to override those with llc command line options for experimentation and debugging. If so, they will still need to exist as cl::opt (or some other equivalent). Are you suggesting otherwise?


2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Chandler Carruth-2
On Wed, Nov 19, 2014 at 8:44 PM, Bob Wilson <[hidden email]> wrote:

On Nov 19, 2014, at 6:06 PM, Chandler Carruth <[hidden email]> wrote:

So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

I have heard several people express strong opinions that even when those options are represented as function attributes, we will want a mechanism to override those with llc command line options for experimentation and debugging. If so, they will still need to exist as cl::opt (or some other equivalent). Are you suggesting otherwise?

Yes and no.

You might have a debug override option using cl::opt (or whatever we replace it with).

Or you might have an LLC option (which might be implemented with cl::opt but is categorically different from the debugging options in common LLVM libraries) that explicitly constructs passes with options, and the passes might skip the function attribute when given a specific override.

Probably lots of other alternative ways of achieving this goal I haven't thought of really....

My point is that regardless of what technique you use to override options, the things that can be serialized through the IR should *not* include debugging only 'cl::opt' options in the common LLVM libraries.
 


2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Jim Grosbach

On Nov 19, 2014, at 6:51 PM, Chandler Carruth <[hidden email]> wrote:

On Wed, Nov 19, 2014 at 8:44 PM, Bob Wilson <[hidden email]> wrote:

On Nov 19, 2014, at 6:06 PM, Chandler Carruth <[hidden email]> wrote:

So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

I have heard several people express strong opinions that even when those options are represented as function attributes, we will want a mechanism to override those with llc command line options for experimentation and debugging. If so, they will still need to exist as cl::opt (or some other equivalent). Are you suggesting otherwise?

Yes and no.

You might have a debug override option using cl::opt (or whatever we replace it with).

Or you might have an LLC option (which might be implemented with cl::opt but is categorically different from the debugging options in common LLVM libraries) that explicitly constructs passes with options, and the passes might skip the function attribute when given a specific override.

Probably lots of other alternative ways of achieving this goal I haven't thought of really....

My point is that regardless of what technique you use to override options, the things that can be serialized through the IR should *not* include debugging only 'cl::opt' options in the common LLVM libraries.

I think I agree (I’m one of those “several people” Bob mentioned) with this. Depends on which options you consider debugging only and what that implies for when/how they interact with the rest of the stuff. Maybe a specific example or two on each side of the divide to help make sure we’re all defining terms in a somewhat compatible way?

 


2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Bob Wilson-3

On Nov 19, 2014, at 7:00 PM, Jim Grosbach <[hidden email]> wrote:


On Nov 19, 2014, at 6:51 PM, Chandler Carruth <[hidden email]> wrote:

On Wed, Nov 19, 2014 at 8:44 PM, Bob Wilson <[hidden email]> wrote:

On Nov 19, 2014, at 6:06 PM, Chandler Carruth <[hidden email]> wrote:

So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

I have heard several people express strong opinions that even when those options are represented as function attributes, we will want a mechanism to override those with llc command line options for experimentation and debugging. If so, they will still need to exist as cl::opt (or some other equivalent). Are you suggesting otherwise?

Yes and no.

You might have a debug override option using cl::opt (or whatever we replace it with).

Or you might have an LLC option (which might be implemented with cl::opt but is categorically different from the debugging options in common LLVM libraries) that explicitly constructs passes with options, and the passes might skip the function attribute when given a specific override.

Probably lots of other alternative ways of achieving this goal I haven't thought of really....

My point is that regardless of what technique you use to override options, the things that can be serialized through the IR should *not* include debugging only 'cl::opt' options in the common LLVM libraries.

I think I agree (I’m one of those “several people” Bob mentioned) with this. Depends on which options you consider debugging only and what that implies for when/how they interact with the rest of the stuff. Maybe a specific example or two on each side of the divide to help make sure we’re all defining terms in a somewhat compatible way?

Yes, I think a specific example would help.

I keep saying this, but repetition might help: this is one piece of a much bigger problem and this proposal is NOT about moving all the cl::opt options into the IR. That would be insane. (Akira showed a bunch of them as an example because Eric asked what they would look like, but that wasn’t meant to imply that all those options would necessarily be put into the IR.) We need to look carefully at each option to decide whether it should be serialized in the IR.

This proposal is about the mechanism that we use to do that for one subset of the relevant options.

Let’s make it more specific and only consider the “-arm-long-calls” option. (I’m picking that arbitrarily as a simple example of something that seems like a good fit for a function attribute.) My assertion is that there are a number of other options that need to be handled in a similar way. We should figure out how to handle that kind of option and then we can talk later about which specific options fall into the category. If it turns out that “-arm-long-calls” is controversial or a bad example, we can pick a different one to discuss.

* The clang driver translates the “-mlong-calls” option into “-backend-option -arm-long-calls”. Nothing in clang outside the driver knows anything about “-arm-long-calls". We need a way for clang to recognize “-arm-long-calls” and translate it to a function attribute. As Eric suggested (and as I had originally assumed we would do), one option would be to teach clang to recognize that as a cc1 option and then to translate it to a function attribute. That is certainly a possibility, but there are quite of lot of target-specific options like this, and I’m not convinced there is much to be gained by baking them all into clang.

* There are a bunch of tests in llvm that pass “-arm-long-calls” as a command line option to llc. We could rewrite all those tests to add the function attribute instead, but it is useful to have the llc option available for debugging and experiments, and it would be nice to avoid the test churn to rewrite them all.

* There are a few places in ARMFastISel.cpp and ARMISelLowering.cpp that check the EnableARMLongCalls option. These should be changed to check for the function attribute once it is available. It would be good to avoid having to check BOTH the option and the attribute. One way to do that is to have the the llc option cause the corresponding attribute to be added to all the functions. Then the backend can just check for the attribute.

* The same backend option parsing is shared between llc and clang. When they see the “-arm-long-calls” option, they both need to do the same thing. If the option is one that gets serialized into the IR as a function attribute, both llc and clang need to go through all the functions and add the attribute to each of them. This is exactly what Akira’s patch supports. For the backend options that we choose to mark with codegenoption::FunctionAttrCategory, it will translate them to function attributes.

Akira and I talked about this with Chris B. and Pete Cooper, and I think we reached agreement that this is mostly orthogonal to Chris’s work. Chris wants to have an OptionRegistry and an OptionStore. If we had those things, we would still need a way to translate “-arm-long-calls” into a function attribute.

The best alternative proposal I can see is to have clang understand all these backend options and have code to map each one to a function attribute. But, that doesn’t do anything to support keeping the llc option. Personally I think it would be a very nice design to just completely drop all those llc options (at least for the set of options that we choose to serialize as function attributes) and rely on the function attributes as the only way to specify those things. Jim and Evan seemed to think I was crazy for even suggesting that…. and I suspect they’re probably right.


 


2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

Yes, we will need to do this for some set of options, hopefully not too many. That’s outside the scope of this proposal, though. We can probably use a similar approach for module-level options, but let’s take it one step at a time.


3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
In reply to this post by Chandler Carruth-2
On Wed, Nov 19, 2014 at 6:06 PM, Chandler Carruth <[hidden email]> wrote:
So I am deeply concerned about the direction this is taking. I'm trying to catch up on the thread, but I think Chris already highlighted my issue:

On Fri, Nov 14, 2014 at 4:57 PM, Chris Bieneman <[hidden email]> wrote:
1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?

In discussions about the new cl::opt API I believe the general idea was that most of the options expressed using cl::opt are actually only relevant as debug options, so I think one big part of this work is really going to be identifying a subset of the current options which are actually relevant to expose in the IR.

I think this is critical.

The whole idea of cl::opt API is for *debugging* options. IE, not supported, expected variations on how passes behave. Those should always be controlled (at the LLVM API layer) through constructors and parameters, not through a side-layer.


Maybe there is some misunderstanding here. The patch I sent doesn't change llvm to control the behavior of passes through cl::opt options that are currently defined as static variables. The function attributes embedded in the IR control the behavior. The option variables (defined as CodeGenOpt, which is a subclass of cl:opt) have only two roles:

1. Enable parsing the option that appears in the command line.
2. Provide the default value of the option in case neither the bitcode nor command line had the attribute/option the pass is looking for.

It isn't absolutely necessary to keep those variables in the file, they can be safely moved to another file as long as there is a way to get the default value.

Also, I'm not sure customizing the behavior of passes through constructors is a good idea, Wouldn't that mean you have to instantiate the passes multiple times if there are functions that require different behaviors in the module?

There are parts of LLVM currently abusing the cl::opt mechanism to control fundamental functionality, but we should *absolutely* not bake any part of that or support for that into the IR! We should go find and fix those places to use reasonable APIs. Once we have that, I am very supportive of getting a good system for transmitting these options in bitcode and such in order to better support LTO. However, I think that in essentially every case there are going to be two options:

1) Turn these options into function attributes because they can reasonably live as function attributes and different variations can co-exist within a module.

2) Keep the options as module-level options, but insist that they match for all modules being merged in LTO.

3) (very rare) have clean, well specified merge semantics to merge different options from different modules in LTO. I think these only come up quite rarely. The only really good example I know of would be something like "library link dependencies" where it is a list that we clearly just take the union to merge.


3. Where should the command line options or module/function attributes be stored once they are read out from the IR?

My suggestion would be the OptionStore that I proposed here: http://reviews.llvm.org/D6207

Not to knock the option store (i quite like it), but I think that should be reserved for the cl::opt-style (but with your new API which is way better) debugging options, and never touch the IR.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher
In reply to this post by Bob Wilson-3
Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Bob Wilson-3
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..

On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.

That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).

-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).

-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?

Not yet, but the option can be a backend option rather than a subtarget feature - and I'm probably going to do that in the short term as it's no worse than the current status, see the next line here below :)

-eric
 
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).

-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <[hidden email]> wrote:


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?

Not yet, but the option can be a backend option rather than a subtarget feature - and I'm probably going to do that in the short term as it's no worse than the current status, see the next line here below :)

-eric
 
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).



If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass module-level CodeGenOpts.BackendOptions (backend options that don't need to change on a per-function basis) to Target::createTargetMachine. We can either pass it directly to the function or add a new field in llvm::TargetOptions that holds the options. If we can do that, we don't have to call ParseCommandLineOptions from EmitAssemblyHelper::CreateTargetMachine.

 
-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher


On Wed Dec 03 2014 at 11:39:23 AM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <[hidden email]> wrote:


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?

Not yet, but the option can be a backend option rather than a subtarget feature - and I'm probably going to do that in the short term as it's no worse than the current status, see the next line here below :)

-eric
 
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).



If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass module-level CodeGenOpts.BackendOptions (backend options that don't need to change on a per-function basis) to Target::createTargetMachine. We can either pass it directly to the function or add a new field in llvm::TargetOptions that holds the options. If we can do that, we don't have to call ParseCommandLineOptions from EmitAssemblyHelper::CreateTargetMachine.


Yeah, I've been debating something like that. I'm not a huge fan of strings as a method of passing options around, but I'm starting to get worn down for sure. As far as ParseCommandLineOptions, the only deal is either we'd need to invent a new option format or we'd basically be calling it in the constructor for TargetMachine. I think I'd rather the new option format than that though. :\

For my current problem I am going to end up just passing ABI as a string in TargetOptions, it's honestly the cleanest way to do it at the moment and it can be shared by multiple targets.

-eric
 
 
-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
On Fri, Dec 5, 2014 at 2:40 PM, Eric Christopher <[hidden email]> wrote:


On Wed Dec 03 2014 at 11:39:23 AM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <[hidden email]> wrote:


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?

Not yet, but the option can be a backend option rather than a subtarget feature - and I'm probably going to do that in the short term as it's no worse than the current status, see the next line here below :)

-eric
 
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).



If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass module-level CodeGenOpts.BackendOptions (backend options that don't need to change on a per-function basis) to Target::createTargetMachine. We can either pass it directly to the function or add a new field in llvm::TargetOptions that holds the options. If we can do that, we don't have to call ParseCommandLineOptions from EmitAssemblyHelper::CreateTargetMachine.


Yeah, I've been debating something like that. I'm not a huge fan of strings as a method of passing options around, but I'm starting to get worn down for sure. As far as ParseCommandLineOptions, the only deal is either we'd need to invent a new option format or we'd basically be calling it in the constructor for TargetMachine. I think I'd rather the new option format than that though. :\


If we make changes in clang to pass the options to TargetMachine (via createTargetMachine) or save them in the IR, we can eventually remove the calls to ParseCommandLineOptions in clang. Is that right?

For my current problem I am going to end up just passing ABI as a string in TargetOptions, it's honestly the cleanest way to do it at the moment and it can be shared by multiple targets.


I guess you can pass any other module-level options that you've mentioned that are passed via -backend-option in a similar fashion, perhaps using a key-value pair of strings. Is that what you have in mind?
 
Just to give you an update, I've been working on new patches based on the feedback I've received so far and discussing how I should proceed from here with Bob and Duncan. As the first step, we are trying to classify the command line options listed in the attachment to PR21471 into categories based on whether they are module-level or function level attributes or whether they affect subtarget creation, etc. I'll probably start new threads on llvm-dev soon to discuss the option categories individually to have a more focused discussion.

A couple of other questions:

- Subtarget lookup in XXXTargetMachine::getSubtargetImpl(const Function &F) seems a bit error-prone to me. If you add an option that is needed for subtargert creation as an attribute to Function, but forget to add it to the key that is used to search SubtargetMap, it can incorrectly return a cached subtarget when it should be creating a new one. Would it be better to generate a string key from the whole AttributeSet of the function (along with CPU and FeaturesStr and other options in TargetOptions, if necessary) and use it to do the lookup? The downside of course is that the key can get long as attributes that are not needed for subtarget creation are used to generate the key.

- My understanding is that the only target currently using per-function subtarget creation is mips16/32. If that's correct, is there a reason you haven't enabled it for other targets, perhaps because you ran into or saw some problems?

-eric
 
 
-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka
On Tue, Dec 9, 2014 at 12:38 PM, Akira Hatanaka <[hidden email]> wrote:
On Fri, Dec 5, 2014 at 2:40 PM, Eric Christopher <[hidden email]> wrote:


On Wed Dec 03 2014 at 11:39:23 AM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <[hidden email]> wrote:


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <[hidden email]> wrote:
On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <[hidden email]> wrote:
On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <[hidden email]> wrote:
Thanks for your feedback, Eric.

I still think we may be talking past each other a little bit, but rather than

Might be, sorry if so :(
 
delving further into the details right now, I’ve suggested to Akira that he look into how we should handle other kinds of options. I’m hoping that as we look at more of them, we will gain some insight into the approach that we want to take here. This patch really only deals with the easy cases, and whether we use Akira’s approach or not, I’m not especially worried about this kind of option.

Let’s come back to this in a little while after we’ve looked at what to do for some other options…..


I apologize if this seems a bit rambly, but I'm trying to get a lot of thoughts on this down:

FWIW I'm looking at some of this at the moment which is why I spoke to a few of the different options and needing to control TargetMachine creation. Just to continue with some of the discussion (at least to point some things the right direction and get feedback) let's think about the -target-abi option for ARM. It sets the ABI for the backend on the target machine. It probably shouldn't change on a per function basis (wait, except for the calling convention which can be changed via an attribute, fun!), so we want it to control global data layout and valid types which means we need it on the TargetMachine. The question is how to pass this information into the backend for such an option - right now, if we wanted, we could punt to using the command line options that currently exist and the existing ParseCommandLineOptions call from clang.


When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have anything else we have to pass to the backend in this case?

Not yet, but the option can be a backend option rather than a subtarget feature - and I'm probably going to do that in the short term as it's no worse than the current status, see the next line here below :)

-eric
 
 
That said, it's a pretty awful hack to use them. For some of my particular needs I've got some code around unifying the DataLayout creation between the front end and back end, but it's a partial solution for the general problem and I'll probably run into more issues in this arena.

So, trying to come up with ways to communicate all of this via the API so that we can set ABIs for targets that need it (ARM, MIPS, others), as well as probably other options (anything passed via -backend-option is a good candidate).

One thought is expanding and/or deriving target specific versions of TargetOptions and populating that on creation from the front end. We'd want to be careful with the target specific bits and constructing defaults, probably something off of the bits in Targets.cpp would be appropriate. A lot of this type of code leads us down the TargetSupport library or something of the sort - classes to help describe or create backend constructs.

Anyhow, this is my current thinking on how to do API building of TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds complicated in all of the same ways without any particular gain).



If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass module-level CodeGenOpts.BackendOptions (backend options that don't need to change on a per-function basis) to Target::createTargetMachine. We can either pass it directly to the function or add a new field in llvm::TargetOptions that holds the options. If we can do that, we don't have to call ParseCommandLineOptions from EmitAssemblyHelper::CreateTargetMachine.


Yeah, I've been debating something like that. I'm not a huge fan of strings as a method of passing options around, but I'm starting to get worn down for sure. As far as ParseCommandLineOptions, the only deal is either we'd need to invent a new option format or we'd basically be calling it in the constructor for TargetMachine. I think I'd rather the new option format than that though. :\


If we make changes in clang to pass the options to TargetMachine (via createTargetMachine) or save them in the IR, we can eventually remove the calls to ParseCommandLineOptions in clang. Is that right?

For my current problem I am going to end up just passing ABI as a string in TargetOptions, it's honestly the cleanest way to do it at the moment and it can be shared by multiple targets.


I guess you can pass any other module-level options that you've mentioned that are passed via -backend-option in a similar fashion, perhaps using a key-value pair of strings. Is that what you have in mind?
 
Just to give you an update, I've been working on new patches based on the feedback I've received so far and discussing how I should proceed from here with Bob and Duncan. As the first step, we are trying to classify the command line options listed in the attachment to PR21471 into categories based on whether they are module-level or function level attributes or whether they affect subtarget creation, etc. I'll probably start new threads on llvm-dev soon to discuss the option categories individually to have a more focused discussion.

A couple of other questions:

- Subtarget lookup in XXXTargetMachine::getSubtargetImpl(const Function &F) seems a bit error-prone to me. If you add an option that is needed for subtargert creation as an attribute to Function, but forget to add it to the key that is used to search SubtargetMap, it can incorrectly return a cached subtarget when it should be creating a new one. Would it be better to generate a string key from the whole AttributeSet of the function (along with CPU and FeaturesStr and other options in TargetOptions, if necessary) and use it to do the lookup? The downside of course is that the key can get long as attributes that are not needed for subtarget creation are used to generate the key.

- My understanding is that the only target currently using per-function subtarget creation is mips16/32. If that's correct, is there a reason you haven't enabled it for other targets, perhaps because you ran into or saw some problems?


I see derived classes of TargetTransformInfo are calling TargetMachine::getSubtarget(Function*) to get the per-function subtarget in a few places, but other than those cases, TargetMachine's subtarget is used. 
 
-eric
 
 
-eric
 
On Dec 1, 2014, at 10:53 AM, Eric Christopher <[hidden email]> wrote:

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <[hidden email]> wrote:
On Nov 19, 2014, at 4:52 PM, Eric Christopher <[hidden email]> wrote:



On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <[hidden email]> wrote:
On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <[hidden email]> wrote:
So my general concern here is that lots of command line options that don't need to be or shouldn't be IR level constructs become oddly named string options. It's bad enough that we're doing that with the target cpu and target feature string let alone the rest of it.

If we're talking about things that end up changing the subtarget used we definitely need to make this part of the lookup key - at which point we're going to want a unified interface to all of them. I haven't applied your patch to see what the IR changes look like here.


With the patches applied, the attributes clang embeds in the IR look like this:

$ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9 -menable-no-nans

attributes #0 = { nounwind readonly ssp "NoNaNsFPMath" "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs" "unsafe-fp-math"="false" "use-soft-float"="true" }

I'm guessing you want the front-end to convert command line options like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?


Right, that was part of the idea, the rest is that we define things that actually make sense as front end command line options and use those to communicate via IR constructs or api calls I.e. reserved registers don't need to be a specific back end option, but rather can be passed along when creating a target machine, etc.

-eric

No, reserved registers (I assume you’re referring to “arm-reserve-r9” in Akira’s example) don’t necessarily need to be backend options. But, those kinds of options do need to be recorded in the IR and the target machine needs to reflect the relevant IR settings. Otherwise, we’ll still be left with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9” and another without it, but the LTO compilation still needs to respect those options for the code in each object. The details of how a particular option should be handled will vary depending on the option, and we need to discuss each one separately.


ABI breaking options should probably be considered part of TargetMachine/Subtarget creation and handled accordingly (if it's subtarget creation it can be on the function as a feature string attribute, if it affects global code generation it needs to be at TargetMachine creation time).
 
It’s pretty clear, though, that there is a class of backend options that translate nicely to function attributes. This RFC is about the mechanism that we use to handle those (without specifying which specific options will be handled that way).


Fair enough. For function attributes I can see a desire to have (as Chris posted in an email from someone at synopsys) a generic target specific dictionary mapping on functions for backend specific uses. I just haven't seen anything in the thread that would be applicable for this rather than an actual function attribute. That said, I would like some sort of delineation for target function attributes as well. Thoughts on that? Perhaps it could look like the string dictionary we've mentioned, maybe not.
 
 
From looking at the patch it seems that the command line option for NaNs should be a general attribute on the function rather than a TargetMachine option anyhow, so it seems that should be migrated down. Whether or not it is exposed as an llvm attribute or a string based key value pair is an interesting discussion. Also, would we want to, instead, expose the backend options as clang IR level options? Seems to make sense from that perspective and then they can add the appropriate attributes to the functions themselves. (-mlongcalls for the long calls option for example).

Thanks!

-eric

Again, I think we can evaluate specific options on a case-by-case basis. For some of them, it might make sense to expose them as clang options. Like you, I had previously assumed we would teach the front-end about all the backend options. I was a little surprised by the approach Akira suggested here, but after giving it some thought, I think it’s a great idea. There are a LOT of target-specific backend options. Some of them do not need to be IR, but many do. Teaching clang about all of them would be a lot of work, and I don’t see any clear benefit from that. This approach lets clang remain relatively independent of the backend options. We still need to think carefully about which backend options use this mechanism, of course.


I'm unconvinced here...
 
Are there other reasons why you think we should add all those options to clang? I just don’t see much to be gained by that.


.... I think that anything you'd like to expose should probably be a front end option. If nothing else I don't see how you plan on annotating the IR. Basically you're constructing a parallel option hierarchy and that seems... redundant. I agree that it's nice for tools to be able to override things in the backend - for testing if nothing else. We do this a lot by checking options and values in the code. A way of streamlining this would be nice - but I don't think that's the goal of these patches no?
 
Do you have objections to the use of string-based key/value pairs? I believe the intention all along, going back to Bill’s work on this a few years ago, was that we would add new attributes for target-independent things and use strings for the target-specific options. That seems like a reasonable starting point to me.


Only some as I mentioned above - I haven't seen anything that needs to be target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for non-function/module level attributes and I think that's going to look like API calls into the backend for TargetMachine creation - there may be a list of things that would turn into module level attributes, but that is a direction fraught with danger and I'd like to avoid that as much as possible.

(As a note, the particular patch that Akira has is unacceptable because of the changes to the subtarget lookup, but that's a separate issue :)

-eric
 

On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <[hidden email]> wrote:
Updated patch is attached. Note this is just a work-in-progress patch and I plan to address the feedback comments later if this patch is in the right direction.

This is how the command line options are parsed and used by the backend passes:

1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of the options that should be written to the bitcode as function or module attributes (these options are declared as CodeGenOpt, which is a subclass of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
2. setFunctionAttribute is called to move the function and module options saved in step 1 to the bitcode. Also, options stored in TargetOptions are moved to the bitcode too. Pre-existing attributes in the bitcode are overwritten by the options in the command line. 
3. The backend passes and subtarget classes call getFunctionAttribute to read the attributes stored to the bitcode in step 2. Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath in the patch), if the default value is known to be "false".

I also made the following changes in the patch:

1. In the constructor of MachineFunction, call the version of getSubtargetImpl that takes a Function parameter, so that it gets the Subtarget specific to the Function being compiled.
2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary change to ease debugging and should be reverted to a StringMap or changed to another type of map later.


On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <[hidden email]> wrote:


On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <[hidden email]> wrote:

> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <[hidden email]> wrote:
>
> +chrisb
>
>> On 2014-Nov-13, at 16:33, Akira Hatanaka <[hidden email]> wrote:
>>
>> I'm working on fixing PR21471, which is about embedding codegen command line options into the bitcode as function or module-level attributes so that they don't get ignored when doing LTO.
>>
>> http://llvm.org/bugs/show_bug.cgi?id=21471
>>
>> I have an initial patch (attached to this email) which enables clang/llvm to recognize one command line option, write it to the IR, and read it out in a backend pass. I'm looking to get feedback from the community on whether I'm headed in the right direction or whether there are alternate ideas before I go all the way on fixing the PR. Specifically, I'd like to know the answers to the following questions:
>>
>> 1. How do we make sure we continue to be able to use the command line options we've been using for llc and other tools?
>> 2. How to handle cases where two functions in a module have different sets of command line options?
>> 3. Where should the command line options or module/function attributes be stored once they are read out from the IR?
>>
>> The short description of the approach I took in my patch is that command line options that are important to codegen are collected by cl::ParseCommandLineOptions, written to the bitcode as function or module attributes, and read out directly by the optimization passes that need them. cl::opt options are replaced with CodeGenOpt options (subclass of cl::opt), which are needed only to parse the command line and provide the default value when the corresponding options are not in the bitcode.
>
> I like this approach, since it means the frontend doesn't have to understand
> options in order to pass them on to the backend.

I agree. It’s not the approach that was I was expecting, but after reading the patches, I like it.

If we can get consensus on the approach, I suggest that we stage the work to first adopt Akira’s patches and then we can incrementally convert various options over to use it. There may be complications for some options, so I’d like to see separate patches for each one (possibly with a few similar options combined in one patch).


I'm not sold quite yet :)

Akira: Can you show an example of how you expect this to work in the IR and how the backend is going to process?

Thanks.

-eric
 
>

> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
>> new file mode 100644
>> index 0000000..2d74c2f
>> --- /dev/null
>> +++ b/lib/CodeGen/CodeGenOption.cpp
>> @@ -0,0 +1,59 @@
>> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>> +//===----------------------------------------------------------------------===//
>> +//
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm/CodeGen/CodeGenOption.h"
>> +#include "llvm/IR/Attributes.h"
>> +#include "llvm/IR/Module.h"
>> +
>> +using namespace llvm;
>> +
>> +static std::map<std::string, bool> FunctionBoolAttrs;
>> +static std::map<std::string, bool> ModuleBoolAttrs;
>> +
>
> @Chris, should these be using ManagedStatic?
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev





_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
123