[llvm-dev] PR36144: X86 Intel syntax and masm flavor

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev


On Sep 12, 2018, at 1:48 PM, Reid Kleckner via llvm-dev <[hidden email]> wrote:

Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

Why is it relevant that the issue is contained? The 0b support in MS asm shouldn’t break the general intel assembly syntax. 

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
After looking more closely at the patch in question, I again think it should be reverted.

The patch sets a boolean to indicate that inline asm is being parsed in three places now:
1. true when parsing intel inline asm
2. true when .intel_syntax directives are encountered (BUG!)
3. false when .att_syntax is encountered

We should only set this to true when parsing *inline asm*, clearly. I sent my second email after I saw place 1 when re-reading the patch and thought, oh, everything is working as intended. I'll go ahead and do it, since it is clearly wrong. .intel_syntax should not imply IsParsingInlineAsm.

On Mon, Oct 22, 2018 at 1:13 PM Gerolf Hoflehner <[hidden email]> wrote:


On Sep 12, 2018, at 1:48 PM, Reid Kleckner via llvm-dev <[hidden email]> wrote:

Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

Why is it relevant that the issue is contained? The 0b support in MS asm shouldn’t break the general intel assembly syntax. 

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev


On Oct 22, 2018, at 1:44 PM, Reid Kleckner <[hidden email]> wrote:

After looking more closely at the patch in question, I again think it should be reverted.

The patch sets a boolean to indicate that inline asm is being parsed in three places now:
1. true when parsing intel inline asm
2. true when .intel_syntax directives are encountered (BUG!)
3. false when .att_syntax is encountered

We should only set this to true when parsing *inline asm*, clearly. I sent my second email after I saw place 1 when re-reading the patch and thought, oh, everything is working as intended. I'll go ahead and do it, since it is clearly wrong. .intel_syntax should not imply IsParsingInlineAsm.
Thanks, Reid!


On Mon, Oct 22, 2018 at 1:13 PM Gerolf Hoflehner <[hidden email]> wrote:


On Sep 12, 2018, at 1:48 PM, Reid Kleckner via llvm-dev <[hidden email]> wrote:

Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

Why is it relevant that the issue is contained? The 0b support in MS asm shouldn’t break the general intel assembly syntax. 

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
So, unfortunately it is not so easy to revert that patch. Previously, here is what would happen:

1. clang frontend parses __asm { ... } blob with the help of MC. IsParsingMSInlineAsm is true in the AsmLexer. 0b111 is recognized as a binary integer literal.
2. clang *rewrites* the user asm, including many integer literals, to gnu/intel syntax, something that llvm-mc could parse in standalone mode. This rewrites 0b111 to 7, for example.
3. Backend builds a string literal to parse from the inline asm blob. String literal says something like "and al, 7" instead of "and al, 0b111".
4. Backend parses and re-emits asm in AT&T syntax, and you get "andb $7, %al".

r311639 removed the infrastructure that rewrote immediate expressions in step 2 to plain decimal integers, so now this doesn't work. We do step 1 correctly, step 2 never happens, and step 3 can't parse 0b111, since it thinks it's a label reference.

I think we need to fix forward at this point. I think the best way forward would be some global option to control how 0b is interpreted, rather than a distinct masm vs intel asm dialect.

On Mon, Oct 22, 2018 at 2:08 PM Gerolf Hoflehner <[hidden email]> wrote:


On Oct 22, 2018, at 1:44 PM, Reid Kleckner <[hidden email]> wrote:

After looking more closely at the patch in question, I again think it should be reverted.

The patch sets a boolean to indicate that inline asm is being parsed in three places now:
1. true when parsing intel inline asm
2. true when .intel_syntax directives are encountered (BUG!)
3. false when .att_syntax is encountered

We should only set this to true when parsing *inline asm*, clearly. I sent my second email after I saw place 1 when re-reading the patch and thought, oh, everything is working as intended. I'll go ahead and do it, since it is clearly wrong. .intel_syntax should not imply IsParsingInlineAsm.
Thanks, Reid!


On Mon, Oct 22, 2018 at 1:13 PM Gerolf Hoflehner <[hidden email]> wrote:


On Sep 12, 2018, at 1:48 PM, Reid Kleckner via llvm-dev <[hidden email]> wrote:

Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

Why is it relevant that the issue is contained? The 0b support in MS asm shouldn’t break the general intel assembly syntax. 

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] PR36144: X86 Intel syntax and masm flavor

韩玉 via llvm-dev
Nevermind, I came up with a solution and sent it out here: https://reviews.llvm.org/D53535
That should at least get things back to normal.

On Mon, Oct 22, 2018 at 2:49 PM Reid Kleckner <[hidden email]> wrote:
So, unfortunately it is not so easy to revert that patch. Previously, here is what would happen:

1. clang frontend parses __asm { ... } blob with the help of MC. IsParsingMSInlineAsm is true in the AsmLexer. 0b111 is recognized as a binary integer literal.
2. clang *rewrites* the user asm, including many integer literals, to gnu/intel syntax, something that llvm-mc could parse in standalone mode. This rewrites 0b111 to 7, for example.
3. Backend builds a string literal to parse from the inline asm blob. String literal says something like "and al, 7" instead of "and al, 0b111".
4. Backend parses and re-emits asm in AT&T syntax, and you get "andb $7, %al".

r311639 removed the infrastructure that rewrote immediate expressions in step 2 to plain decimal integers, so now this doesn't work. We do step 1 correctly, step 2 never happens, and step 3 can't parse 0b111, since it thinks it's a label reference.

I think we need to fix forward at this point. I think the best way forward would be some global option to control how 0b is interpreted, rather than a distinct masm vs intel asm dialect.

On Mon, Oct 22, 2018 at 2:08 PM Gerolf Hoflehner <[hidden email]> wrote:


On Oct 22, 2018, at 1:44 PM, Reid Kleckner <[hidden email]> wrote:

After looking more closely at the patch in question, I again think it should be reverted.

The patch sets a boolean to indicate that inline asm is being parsed in three places now:
1. true when parsing intel inline asm
2. true when .intel_syntax directives are encountered (BUG!)
3. false when .att_syntax is encountered

We should only set this to true when parsing *inline asm*, clearly. I sent my second email after I saw place 1 when re-reading the patch and thought, oh, everything is working as intended. I'll go ahead and do it, since it is clearly wrong. .intel_syntax should not imply IsParsingInlineAsm.
Thanks, Reid!


On Mon, Oct 22, 2018 at 1:13 PM Gerolf Hoflehner <[hidden email]> wrote:


On Sep 12, 2018, at 1:48 PM, Reid Kleckner via llvm-dev <[hidden email]> wrote:

Sorry, I spoke too soon. This only happens for intel style inline assembly in LLVM IR. I don't have a good suggestion.

Why is it relevant that the issue is contained? The 0b support in MS asm shouldn’t break the general intel assembly syntax. 

On Wed, Sep 12, 2018 at 1:44 PM Reid Kleckner <[hidden email]> wrote:
I think we should revert r301390 just on principle from looking at the code. If I understand correctly, it flips the bit for "is parsing inline asm" to true when encountering a plain .intel_syntax directive. That's just wrong.

On Wed, Sep 12, 2018 at 6:34 AM Francis Visoiu Mistrih via llvm-dev <[hidden email]> wrote:
Hi,

We have a significant regression since llvm 5.0.0 in the x86 assembler.

The following snippets fail:

1)

.intel_syntax
0:
  jmp 0b

2)

.intel_syntax
  and edi, 0b010101

when running through `llvm-mc -arch x86-64`.

This regression was introduced in r301390, which was driven by PR27884.

I think https://llvm.org/PR36144 describes this very well, and I think we should
get this fixed, since it's a pretty basic thing to support in the assembler.

Here are a few solutions to this:

1) Introduce a new asm dialect/flavor/style to assemble masm files.

2) Only set the flags based on the target triple. Also suggested in PR27884.

3) Only set the flags based on a new command line flag.

Let me know if any other solution comes to mind.

While we get this issue fixed, is it reasonable to revert r301390?

Thanks,

--
Francis
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev