[llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
Hi all,

A broad goal of many of the LLVM binary tools, such as llvm-objcopy and llvm-objdump, is to provide an alternative to their GNU equivalents, and as such, these tools have been developed to be command-line compatible. One tool where this hasn’t been the case up to now is llvm-readobj (aka llvm-readelf).

There was some discussion in https://reviews.llvm.org/D33872 about the purpose of llvm-readobj, so I’d like to ask the community's opinion. What is the purpose of llvm-readobj? Is it purely intended as an aid to testing? Should it be aiming to be GNU compatible, like most of the rest of the LLVM tools?

The main issue I discovered with GNU compatibility is that llvm-readobj has a few command-line flags that the two tools interpret differently:

* -s means dump symbols in GNU readelf, but dump sections in llvm-readobj
* -t means dump section details in GNU readelf, but dump symbols in llvm-readobj
* -a means dump all in GNU readelf, but dump arm attributes in llvm-readobj

There are also several missing aliases and some missing features, but we can implement those with no negative impact on the users of llvm-readobj, so I won't discuss those here.

Also relevant is how multi-letter options preceded by only a single dash are handled. My understanding of GNU’s behaviour is that each letter following the dash is treated as a separate option, whereas llvm-readobj treats the whole string as one single option (e.g. ‘readelf -abc’ is equivalent to ‘readelf -a -b -c’, but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at least partly related to the cl::opt/libOption issues discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.
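
To make the difference concrete, here is a minimal Python sketch of the GNU-style expansion described above. It assumes every short option is a plain boolean switch with no attached value; the function name `expand_grouped` and the default flag set are hypothetical and not part of either tool.

```python
def expand_grouped(args, short_flags=frozenset("ast")):
    """Split grouped short options the way getopt-style tools do.

    Assumption: every flag in short_flags is a boolean switch that takes
    no value. Arguments that are not a pure group of known flags are
    passed through untouched.
    """
    expanded = []
    for arg in args:
        letters = arg[1:]
        is_group = (
            arg.startswith("-")
            and not arg.startswith("--")
            and len(letters) > 1
            and all(ch in short_flags for ch in letters)
        )
        if is_group:
            # "-st" becomes "-s", "-t"
            expanded.extend("-" + ch for ch in letters)
        else:
            expanded.append(arg)
    return expanded
```

With this sketch, `expand_grouped(["-st", "a.o"])` yields `["-s", "-t", "a.o"]`, while a long option such as `--symbols` passes through unchanged.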

I'd like to propose that we fix the three switches above such that they match GNU readelf's interpretation, and to change short-option handling similarly. This would inevitably result in some test churn (there are approximately 200 tests between core llvm and lld that would need updating), but it is manageable. More of an issue is that any users would suddenly find the switches changing on them, if they have started using llvm-readobj. On the other hand, I think the benefit for those used to GNU readelf outweighs the cost.

We could do a few different things to mitigate the impact of changing over, roughly in my order of preference, if we decide against just taking the plunge and changing the meaning:

1) For the next release, add a deprecation warning, saying that the switches’ meanings will be changed in a following release, and then fix it after the next release has been created, along with release notes documenting the change.
2) Provide a “--gnu-mode” or similar switch that changes the meaning of the command-line switches above to match the GNU mode. This again provides an opt-in, but also allows downstream ports to enable it by default, should they wish.
3) Change the meaning of the switches only for llvm-readelf, and not for llvm-readobj. This is similar to the behaviour of --elf-output-style: it is GNU for llvm-readelf, and LLVM for llvm-readobj, but does have essentially the same potential for disrupting users as 1).
4) Provide a third user-facing driver (e.g. “llvm-gnu-readelf”) that provides a different CLI to the others. This makes it an opt-in feature, by using a different executable.
5) Just accept this divergence, although I personally would prefer not to, as this has the potential to confuse users migrating from GNU tools to LLVM tools.
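
Option 3 can be sketched as a lookup keyed on the name the tool was invoked under. The Python below is only an illustration: the flag meanings are copied from the three bullets earlier in this message, while the table names, the action labels and `flag_table` are hypothetical and do not mirror the actual llvm-readobj source.

```python
import os

# Meanings copied from the RFC's three bullets; the labels are illustrative.
GNU_STYLE = {"-s": "symbols", "-t": "section details", "-a": "all"}
LLVM_STYLE = {"-s": "sections", "-t": "symbols", "-a": "arm attributes"}


def flag_table(argv0):
    """Pick the short-flag meanings based on the invoked tool name."""
    name = os.path.basename(argv0)
    # Assumption: any invocation whose name contains "readelf" wants the
    # GNU meanings; everything else keeps the existing LLVM meanings.
    return GNU_STYLE if "readelf" in name else LLVM_STYLE
```

Under these assumptions, `flag_table("/usr/bin/llvm-readelf")["-s"]` gives `"symbols"`, whereas `flag_table("llvm-readobj")["-s"]` still gives `"sections"`, so existing llvm-readobj users are unaffected.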

Thoughts?

James

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Re: [llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
Hi James,

I also wanted to work on this discrepancy, but I just sent a patch instead of an RFC: https://reviews.llvm.org/D54124. Thanks for sending the RFC that I should have started myself :)

On Tue, Nov 6, 2018 at 4:53 AM James Henderson via llvm-dev <[hidden email]> wrote:
Hi all,

A broad goal of many of the LLVM binary tools, such as llvm-objcopy and llvm-objdump is to provide an alternative to the GNU equivalent, and as such, these tools have been developed to be command-line compatible. One tool where this hasn’t been the case up to now is llvm-readobj (aka llvm-readelf).
I don't want to digress too much, but llvm-objdump isn't compatible either. For instance, "-df" is an llvm-objdump flag that accepts a list of functions to disassemble, but objdump accepts "-df" as a merged form of "-d -f", i.e. "--disassemble --file-headers". So we may want to consider this as a meta-discussion for other tools like llvm-objdump.
 

There was some discussion in https://reviews.llvm.org/D33872 about the purpose of llvm-readobj, so I’d like to ask the community's opinion. What is the purpose of llvm-readobj? Is it purely intended as an aid to testing? Should it be aiming to be GNU compatible, like most of the rest of the LLVM tools?
From the source:
// This is a tool similar to readelf, except it works on multiple object file
// formats. The main purpose of this tool is to provide detailed output suitable
// for FileCheck.

My impression is that llvm-readobj is intended to provide information in the spirit of readelf, but not with any strong goal of keeping the format the same. Then, llvm-readelf (as a symlink wrapper) was added recently, to be more of a drop-in replacement, although still maybe not strict (same format, but maybe not char-for-char compatible). That's just what I've inferred from looking at code though, don't take my impression as judgement.

If that's the case, I think llvm-readelf should be relatively easy to make breaking changes to, as long as the changes increase GNU readelf compatibility. llvm-readobj, on the other hand, has been around long enough that folks might be relying on its flag parsing. I'd be happy if the latter were wrong and we could change llvm-readobj more freely though.
 

The main issue I discovered with GNU compatibility is that llvm-readobj has a few incompatible command-line flags with different interpretations between the two tools:

* -s means dump symbols in GNU readelf, but dump sections in llvm-readobj
* -t means dump section details in GNU readelf, but dump symbols in llvm-readobj
* -a means dump all in GNU readelf, but dump arm attributes in llvm-readobj

There are also several missing aliases and some missing features, but we can implement those with no negative impact on the users of llvm-readobj, so I won't discuss those here.

Also of relevance here are long options preceded with only a single dash. My understanding of GNU’s behaviour is that each letter following it is treated as a different option, whereas in llvm-readobj, we get one single option (e.g. ‘readelf -abc’ would be equivalent to ‘readelf -a -b -c’, but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at least partly related to the cl::opt/libOption issues discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.

I'd like to propose that we fix the three switches above such that they match GNU readelf's interpretation, and to change short-option handling similarly. This would inevitably result in some test churn (there are approximately 200 tests between core llvm and lld that would need updating), but it is manageable. More of an issue is that any users would suddenly find the switches changing on them, if they have started using llvm-readobj. On the other hand, I think the benefit for those used to GNU readelf outweighs the cost.
+1
 

We could do a few different things to mitigate the impact of changing over, roughly in my order of preference, if we decide against just taking the plunge and changing the meaning:

1) For the next release, add a deprecation warning, saying that the switches’ meanings will be changed in a following release, and then fix it after the next release has been created, along with release notes documenting the change.
2) Provide a “--gnu-mode” or similar switch that changes the meaning of the command-line switches above to match the GNU mode. This again provides an opt-in, but also allows downstream ports to enable it by default, should they wish.
3) Change the meaning of the switches only for llvm-readelf, and not for llvm-readobj. This is similar to the behaviour of --elf-output-style: it is GNU for llvm-readelf, and LLVM for llvm-readobj, but does have essentially the same potential for disrupting users as 1).
4) Provide a third user-facing driver (e.g. “llvm-gnu-readelf”) that provides a different CLI to the others. This makes it an opt-in feature, by using a different executable.
5) Just accept this divergence, although I personally would prefer not to, as this has the potential to confuse users migrating from GNU tools to LLVM tools.

Thoughts?

(3) SGTM (that's the approach I went with in my patch)
(2) Sounds like it could get messy to have dependencies between flags, e.g. "--gnu-mode --help" and "--help" would have to be programmed to print different things for what "-s" is an alias of.
(1) Means we would need to wait until the next release (March?) to do anything? I'd rather not be tied down to slow release cycles :( [btw, does LLVM have a deprecation policy anywhere?]
(4) I could live with this if it came to it, but I think it's assuming that someone would want llvm-readelf and *not* want readelf compatibility, enough to outweigh all the people that want llvm-readelf to be like readelf -- who is that?
(5) I think we should veto this option -- this discussion means we clearly don't accept divergence :)
 

James

Re: [llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
Pinging this thread to see if anyone else has opinions or objections -- if not, I plan to go ahead with stepping llvm-readelf towards readelf compatibility in https://reviews.llvm.org/D54124 on Monday.

On Tue, Nov 6, 2018 at 9:52 AM Jordan Rupprecht <[hidden email]> wrote:
Hi James,

I also wanted to work on this discrepancy, but I just sent a patch instead of an RFC: https://reviews.llvm.org/D54124. Thanks for sending the RFC that I should have started myself :)

Re: [llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
On Tue, Nov 6, 2018 at 9:53 PM James Henderson via llvm-dev <[hidden email]> wrote:
Hi all,

A broad goal of many of the LLVM binary tools, such as llvm-objcopy and llvm-objdump is to provide an alternative to the GNU equivalent, and as such, these tools have been developed to be command-line compatible. One tool where this hasn’t been the case up to now is llvm-readobj (aka llvm-readelf).

There was some discussion in https://reviews.llvm.org/D33872 about the purpose of llvm-readobj, so I’d like to ask the community's opinion. What is the purpose of llvm-readobj? Is it purely intended as an aid to testing? Should it be aiming to be GNU compatible, like most of the rest of the LLVM tools?

The main issue I discovered with GNU compatibility is that llvm-readobj has a few incompatible command-line flags with different interpretations between the two tools:

* -s means dump symbols in GNU readelf, but dump sections in llvm-readobj
* -t means dump section details in GNU readelf, but dump symbols in llvm-readobj
* -a means dump all in GNU readelf, but dump arm attributes in llvm-readobj

There are also several missing aliases and some missing features, but we can implement those with no negative impact on the users of llvm-readobj, so I won't discuss those here.

Also of relevance here are long options preceded with only a single dash. My understanding of GNU’s behaviour is that each letter following it is treated as a different option, whereas in llvm-readobj, we get one single option (e.g. ‘readelf -abc’ would be equivalent to ‘readelf -a -b -c’, but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at least partly related to the cl::opt/libOption issues discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.

I'd like to propose that we fix the three switches above such that they match GNU readelf's interpretation, and to change short-option handling similarly. This would inevitably result in some test churn (there are approximately 200 tests between core llvm and lld that would need updating), but it is manageable. More of an issue is that any users would suddenly find the switches changing on them, if they have started using llvm-readobj. On the other hand, I think the benefit for those used to GNU readelf outweighs the cost.

We use llvm-readobj in several lld tests, but they should be easily updated if we decide to make it GNU compatible. I see a benefit of making it command-line compatible with GNU, so I believe we should do that.

Re: [llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
I'm also in favor of (3), possibly in combination with (1). That is, we should make llvm-readelf flag-compatible with readelf, migrate uses of llvm-readobj with ELF files within LLVM to llvm-readelf, and then consider deprecating llvm-readobj for ELF (adding deprecation warnings in the next release).
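
As a rough illustration of the last step, a deprecation warning could be keyed on the tool name and the ELF magic bytes. This Python sketch is hypothetical: `elf_deprecation_warning` and the warning text are invented for illustration, and only the four-byte ELF magic (0x7f 'E' 'L' 'F') is a fixed fact.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def elf_deprecation_warning(tool_name, file_header):
    """Return a warning string for llvm-readobj on ELF input, else None."""
    if tool_name == "llvm-readobj" and file_header[:4] == ELF_MAGIC:
        # Hypothetical wording; no such message exists in the real tool.
        return "warning: llvm-readobj on ELF files is deprecated; use llvm-readelf"
    return None
```

A caller would print the returned string to stderr and carry on, so existing invocations keep working during the deprecation period.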

On Fri, Nov 9, 2018 at 2:51 PM Jordan Rupprecht via llvm-dev <[hidden email]> wrote:
Pinging this thread to see if anyone else has opinions or objections -- if not I plan to go ahead with stepping towards compatibility with readelf vs llvm-readelf in https://reviews.llvm.org/D54124 on Monday.

Re: [llvm-dev] [llvm-readobj][RFC]Making llvm-readobj GNU command-line compatible

David Jones via llvm-dev
I'm not sure I follow why we'd deprecate llvm-readobj specifically for ELF. It probably makes sense to keep it around for those who prefer the LLVM-style output; otherwise every ELF test that wants to dump more verbose output etc. has to add --elf-output-style=LLVM to the command line.

On Mon, 12 Nov 2018 at 05:20, Petr Hosek <[hidden email]> wrote:
I'm also in favor of (3), possibly in combination with (1). That is, we should make llvm-readelf flag-compatible with readelf, migrate uses of llvm-readobj with ELF files within LLVM to llvm-readelf, and then consider deprecating llvm-readobj for ELF (adding deprecation warnings in the next release).

On Fri, Nov 9, 2018 at 2:51 PM Jordan Rupprecht via llvm-dev <[hidden email]> wrote:
Pinging this thread to see if anyone else has opinions or objections -- if not I plan to go ahead with stepping towards compatibility with readelf vs llvm-readelf in https://reviews.llvm.org/D54124 on Monday.

On Tue, Nov 6, 2018 at 9:52 AM Jordan Rupprecht <[hidden email]> wrote:
Hi James,

I also wanted to work on this discrepancy, but I just sent a patch instead of an RFC: https://reviews.llvm.org/D54124. Thanks for sending the RFC that I should have started myself :)

On Tue, Nov 6, 2018 at 4:53 AM James Henderson via llvm-dev <[hidden email]> wrote:
Hi all,

A broad goal of many of the LLVM binary tools, such as llvm-objcopy and llvm-objdump is to provide an alternative to the GNU equivalent, and as such, these tools have been developed to be command-line compatible. One tool where this hasn’t been the case up to now is llvm-readobj (aka llvm-readelf).
I don't want to digress too much, but llvm-objdump isn't compatible either. For instance, "-df" is an llvm-objdump flag that accepts a list of functions to disassemble, but objdump accepts "-df" as a merged form of "-d -f" i.e. "--dissassemble --file-headers". So we may want to consider this as a meta-discussion for other tools like llvm-objdump.
 

There was some discussion in https://reviews.llvm.org/D33872 about the purpose of llvm-readobj, so I’d like to ask the community's opinion. What is the purpose of llvm-readobj? Is it purely intended as an aid to testing? Should it be aiming to be GNU compatible, like most of the rest of the LLVM tools?
From the source:
// This is a tool similar to readelf, except it works on multiple object file
// formats. The main purpose of this tool is to provide detailed output suitable
// for FileCheck.

My impression is that llvm-readobj is intended to provide information in the spirit of readelf, but not with any strong goal of keeping the format the same. Then, llvm-readelf (as a symlink wrapper) was added recently, to be more of a drop-in replacement, although still maybe not strict (same format, but maybe not char-for-char compatible). That's just what I've inferred from looking at code though, don't take my impression as judgement.

If that's the case, I think llvm-readelf should be relatively easy to make breaking changes to if it breaks in favor of increasing GNU readelf compatibility. llvm-readobj on the other hand has been around for a long time that folks might be relying on its flag parsing. I'd be happy if the latter were wrong and we could change llvm-readobj more freely though.
 

The main issue I discovered with GNU compatibility is that llvm-readobj has a few incompatible command-line flags with different interpretations between the two tools:

* -s means dump symbols in GNU readelf, but dump sections in llvm-readobj
* -t means dump section details in GNU readelf, but dump symbols in llvm-readobj
* -a means dump all in GNU readelf, but dump arm attributes in llvm-readobj

There are also several missing aliases and some missing features, but we can implement those with no negative impact on the users of llvm-readobj, so I won't discuss those here.

Also of relevance here are long options preceded with only a single dash. My understanding of GNU’s behaviour is that each letter following the dash is treated as a different option, whereas in llvm-readobj, we get one single option (e.g. ‘readelf -abc’ would be equivalent to ‘readelf -a -b -c’, but ‘llvm-readobj -abc’ is equivalent to ‘llvm-readobj --abc’). This is at least partly related to the cl::opt/libOption issues discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-October/127328.html.
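
To make the two single-dash interpretations described above concrete, here is a minimal sketch in Python. It is not how either tool is implemented: the function names are hypothetical, and it deliberately ignores short options that take a value (such as "-x <section>"), which a real parser would have to handle.

```python
def expand_gnu_short_group(arg):
    """GNU-style reading: '-abc' is a cluster of short options.

    Returns ['-a', '-b', '-c'] for '-abc'. Long options ('--abc'),
    a bare '-', and non-option arguments are returned unchanged.
    Simplification: short options that consume a value are not handled.
    """
    if arg.startswith("--") or not arg.startswith("-") or len(arg) < 2:
        return [arg]
    return ["-" + ch for ch in arg[1:]]


def as_single_long_option(arg):
    """Single-option reading: '-abc' is treated as '--abc'.

    One-letter options such as '-a' are left alone.
    """
    if arg.startswith("-") and not arg.startswith("--") and len(arg) > 2:
        return ["-" + arg]
    return [arg]
```

With these definitions, expand_gnu_short_group("-abc") yields three separate switches, while as_single_long_option("-abc") yields the single option "--abc", which is the divergence being discussed.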

I'd like to propose that we fix the three switches above such that they match GNU readelf's interpretation, and to change short-option handling similarly. This would inevitably result in some test churn (there are approximately 200 tests between core llvm and lld that would need updating), but it is manageable. More of an issue is that any users would suddenly find the switches changing on them, if they have started using llvm-readobj. On the other hand, I think the benefit for those used to GNU readelf outweighs the cost.
+1
 

We could do a few different things to mitigate the impact of changing over, roughly in my order of preference, if we decide against just taking the plunge and changing the meaning:

1) For the next release, add a deprecation warning, saying that the switches’ meanings will be changed in a following release, and then fix it after the next release has been created, along with release notes documenting the change.
2) Provide a “--gnu-mode” or similar switch that changes the meaning of the command-line switches above to match the GNU mode. This again provides an opt-in, but also allows downstream ports to enable it by default, should they wish.
3) Change the meaning of the switches only for llvm-readelf, and not for llvm-readobj. This is similar to the behaviour of --elf-output-style: it is GNU for llvm-readelf, and LLVM for llvm-readobj, but does have essentially the same potential for disrupting users as 1).
4) Provide a third user-facing driver (e.g. “llvm-gnu-readelf”) that provides a different CLI to the others. This makes it an opt-in feature, by using a different executable.
5) Just accept this divergence, although I personally would prefer not to, as this has the potential to confuse users migrating from GNU tools to LLVM tools.

Thoughts?

(3) SGTM (that's the approach I went with in my patch)
(2) Sounds like it could get messy to have dependencies between flags, e.g. "--gnu-mode --help" and "--help" would have to be programmed to print different things for what "-s" is an alias of.
(1) Means we would need to wait until the next release (March?) to do anything? I'd rather not be tied down to slow release cycles :( [btw, does LLVM have a deprecation policy anywhere?]
(4) I could live with this if it came to it, but I think it's assuming that someone would want llvm-readelf and *not* want readelf compatibility, enough to outweigh all the people that want llvm-readelf to be like readelf -- who is that?
(5) I think we should veto this option -- this discussion means we clearly don't accept divergence :)
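
For illustration only, approach (3) could be sketched roughly like this in Python: the meaning of the three conflicting switches is chosen from the name the tool was invoked as. This is an assumption-laden sketch, not the code from my patch; the function names and the string labels for each meaning are hypothetical, and only the three switches listed earlier in the thread are covered.

```python
import os

# Meanings of the three conflicting switches, as listed in the thread.
# The label strings are illustrative, not real option names.
GNU_MEANINGS = {"-s": "symbols", "-t": "section-details", "-a": "all"}
LLVM_MEANINGS = {"-s": "sections", "-t": "symbols", "-a": "arm-attributes"}


def flag_table_for(argv0):
    """Pick a flag table based on the executable name (argv[0])."""
    name = os.path.basename(argv0)
    if name.endswith(".exe"):
        name = name[: -len(".exe")]
    # Assumption: anything invoked as '...readelf' gets GNU semantics.
    return GNU_MEANINGS if name.endswith("readelf") else LLVM_MEANINGS


def interpret(argv0, flag):
    """Return the illustrative meaning of a flag for the given tool name."""
    return flag_table_for(argv0).get(flag, "unknown")
```

Under this sketch, interpret("llvm-readelf", "-s") gives "symbols" while interpret("llvm-readobj", "-s") gives "sections", so existing llvm-readobj users keep their current behaviour.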
 

James
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev