Intel asm syntax and variable names


Yatsina, Marina

Hi all,

 

I’ve encountered an issue with x86 Intel asm syntax when using certain variable names.

 

If you look at the following example, where I mov from a memory location named “flags2”, llvm-mc works fine:

 

>cat test_good.s
mov eax, flags2
>llvm-mc.exe -x86-asm-syntax=intel test_good.s -o -
        .text
        movl    flags2, %eax

 

But if the memory location is named “flags”, llvm-mc fails:

 

>cat test_bad.s
mov eax, flags
>llvm-mc.exe -x86-asm-syntax=intel test_bad.s -o -
test_bad.s:1:1: error: invalid operand for instruction
mov eax, flags
^
        .text

 

After investigation, I saw that the memory location named “flags” was matched to the EFLAGS register in the MatchRegisterName() function in the generated X86GenAsmMatcher.inc.

 

    case 'f': // 1 string to match.
      if (memcmp(Name.data()+1, "lags", 4))
        break;
      return 25;      // "flags"

 

So basically, what I’m seeing with “flags” (which should be a legit variable name) is that the X86AsmParser creates a reference to an implicit register instead of a reference to memory.
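
To make the failure mode concrete, here is a minimal, self-contained sketch of the lookup order described above. It is not LLVM's code: `matchRegisterNameSketch` and `classifyOperand` are hypothetical names, and only the `flags` entry with number 25 is taken from the generated excerpt; the `eax` entry and its number are made up for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for the generated matcher: returns a non-zero
// register number when Name spells a register, or 0 when it does not.
// Only "flags" (25) comes from the excerpt; "eax" is illustrative.
unsigned matchRegisterNameSketch(const std::string &Name) {
  switch (Name.size()) {
  case 3:
    if (std::memcmp(Name.data(), "eax", 3) == 0)
      return 1; // illustrative register number
    break;
  case 5:
    if (Name[0] == 'f' && std::memcmp(Name.data() + 1, "lags", 4) == 0)
      return 25; // "flags"
    break;
  }
  return 0; // not a register name
}

enum class OperandKind { Register, Memory };

// The register lookup runs first, so an identifier is treated as a memory
// symbol only when no register has the same spelling.
OperandKind classifyOperand(const std::string &Name) {
  return matchRegisterNameSketch(Name) != 0 ? OperandKind::Register
                                            : OperandKind::Memory;
}
```

With this lookup order, `flags2` falls through to the memory path while `flags` never reaches it, which matches the two llvm-mc runs shown earlier.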

There are additional issues here as well: what if we compile for SSE but use a variable named “ZMM0”, which is a register in AVX-512? Should this be allowed?

 

We probably need some way to mark the registers (using attributes or predicates?) so that we’d know which ones are part of the legal set of registers that can be referenced in the architecture we’re compiling to.

Do you think this is a good approach?
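
One way to read the proposal above is as a per-register predicate consulted before a name is accepted as a register. The sketch below is an assumption about how that could look, not an existing LLVM interface: the `Feature` bits, `RegisterInfoSketch`, and `classifyOperand` are hypothetical names, and the three table entries are only the registers mentioned in this message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical feature bits; the names are illustrative, not LLVM's.
enum Feature : std::uint32_t { FeatureNone = 0, FeatureAVX512 = 1u << 0 };

// Hypothetical per-register marking, as suggested in the message.
struct RegisterInfoSketch {
  bool ExplicitlyNameable;        // false for implicit-only registers (EFLAGS)
  std::uint32_t RequiredFeatures; // features needed before the name is legal
};

enum class OperandKind { Register, Memory };

// A name is a register only if it is in the table, may be written
// explicitly, and all of its required features are enabled. Anything
// else is treated as a memory symbol.
OperandKind classifyOperand(const std::string &Name,
                            std::uint32_t EnabledFeatures) {
  static const std::unordered_map<std::string, RegisterInfoSketch> Registers = {
      {"eax", {true, FeatureNone}},
      {"flags", {false, FeatureNone}},
      {"zmm0", {true, FeatureAVX512}},
  };
  auto It = Registers.find(Name);
  if (It == Registers.end())
    return OperandKind::Memory;
  const RegisterInfoSketch &Info = It->second;
  bool Available =
      (Info.RequiredFeatures & EnabledFeatures) == Info.RequiredFeatures;
  return (Info.ExplicitlyNameable && Available) ? OperandKind::Register
                                                : OperandKind::Memory;
}
```

Under these assumptions `eax` stays a register, `flags` is always a symbol because it cannot be named explicitly, and `zmm0` is a symbol unless the AVX-512 bit is enabled.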

 

Thanks,

Marina

 

 

---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Intel asm syntax and variable names

Reid Kleckner-2
Suppose I have a global variable named 'EAX'. How do Intel assemblers normally escape register names to access such a global variable?

On Thu, Jul 23, 2015 at 1:42 AM, Yatsina, Marina <[hidden email]> wrote:

[...]

Re: Intel asm syntax and variable names

Yatsina, Marina

The Microsoft assembler treats EAX in a mov as the register, even if there is also a global memory location named EAX, meaning the register takes precedence.

But here I have a slightly different situation: I have a global variable whose name happens to match an implicit register, or a register that does not exist in the current architecture, only in later ones. The Microsoft assembler treats these cases as memory locations, while LLVM treats them as registers, causing compilation errors.

 

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Thursday, July 23, 2015 18:54
To: Yatsina, Marina
Cc: [hidden email]
Subject: Re: [LLVMdev] Intel asm syntax and variable names

 

[...]

Re: Intel asm syntax and variable names

Reid Kleckner-2
So, there is no prior art for escaping the name of a global symbol with the same name as a register? If there is, I'd rather we just implement it and leave it at that.

We can probably fix the 'flags' case easily in LLVM, but I'd rather not bend over backwards to make ZMM0 be a global name when AVX is disabled.

On Thu, Jul 23, 2015 at 9:12 AM, Yatsina, Marina <[hidden email]> wrote:

[...]

Re: Intel asm syntax and variable names

Matthias Braun-2
Some targets don't have the problem because they prefix all names with an underscore. Apart from that I am not aware of any solution to the problem of keywords clashing with variable names in Intel syntax.

- Matthias
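
As a small illustration of the prefixing point, assuming a target whose global names get a leading underscore, the assembly-level symbol no longer has the same spelling as a register. `mangleGlobal` is a hypothetical helper written for this sketch, not an LLVM function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the assembly-level symbol for a source-level
// global. On a target that prefixes globals with '_', a variable named
// "flags" is emitted as "_flags" and cannot collide with a register name.
std::string mangleGlobal(const std::string &SourceName,
                         bool TargetUsesUnderscorePrefix) {
  if (TargetUsesUnderscorePrefix)
    return "_" + SourceName;
  return SourceName; // no prefix: the spelling may still clash
}
```

On a target without the prefix the name is emitted unchanged, so the clash discussed in this thread remains.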

On Jul 23, 2015, at 9:18 AM, Reid Kleckner <[hidden email]> wrote:

[...]

Re: Intel asm syntax and variable names

Yatsina, Marina

Hi,

 

I’ve uploaded a workaround for the issue.

Please let me know if you think it’s ok.

 

http://reviews.llvm.org/D11512

http://reviews.llvm.org/D11513

 

Thanks,

Marina

 

 

From: Matthias Braun [mailto:[hidden email]]
Sent: Thursday, July 23, 2015 20:12
To: Reid Kleckner
Cc: Yatsina, Marina; [hidden email]
Subject: Re: [LLVMdev] Intel asm syntax and variable names

 

[...]