[llvm-dev] [DbgInfo] Potential bug in location list address ranges


Adam Nemet via llvm-dev
Hi all,

Consider this ARM assembly code of a C function:

00008124 <foo>:
    8124:                   push    {r4, r6, r7, lr}
    8126:                   add     r7, sp, #8
    8128:                   mov     r4, r0
    812a:                   ldrsb.w r0, [r2]
    812e:                   cmp     r0, #1
    8130:                   itt     lt
    8132:                   movlt   r0, #85 ; 0x55
    8134:                   poplt   {r4, r6, r7, pc}            // a function return

    8136:                   ldrb.w  ip, [r1, #3]
    813a:                   ldrb.w  lr, [r4, #3]
    813e:                   movs    r0, #85 ; 0x55
    8140:                   cmp     lr, ip
    8142:                   bne.n   8168 <foo+0x44>

    8144:                   ldrb.w  ip, [r1, #2]
    8148:                   ldrb    r3, [r4, #2]
    814a:                   cmp     r3, ip
    814c:                   it      ne
    814e:                   popne   {r4, r6, r7, pc}          // a function return

    8150:                   ldrb.w  ip, [r1, #1]
    8154:                   ldrb    r3, [r4, #1]
    8156:                   cmp     r3, ip
    8158:                   bne.n   8168 <foo+0x44>

    815a:                   ldrb    r1, [r1, #0]
    815c:                   ldrb    r3, [r4, #0]
    815e:                   cmp     r3, r1
    8160:                   ittt    eq
    8162:                   moveq   r0, #3
    8164:                   strbeq  r0, [r2, #0]
    8166:                   moveq   r0, #170        ; 0xaa
    8168:                   pop     {r4, r6, r7, pc}          // a function return

I have a variable bar and here's its corresponding DWARF DIE:

 <2><3b>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <3c>   DW_AT_location    : 0x0 (location list)
    <40>   DW_AT_name        : (indirect string, offset: 0x9e): bar
    <44>   DW_AT_decl_file   : 1
    <45>   DW_AT_decl_line   : 34
    <46>   DW_AT_type        : <0x153>

 // Its location list
    00000000 00008124 0000812a (DW_OP_reg0 (r0))
    0000000b 0000812a 00008136 (DW_OP_reg4 (r4))
    00000016 <End of list>

As you can see, the list says that bar lives in r4 from 0x812a up to, but not including, 0x8136, so the range ends right after the poplt at 0x8134. However, r4 is only clobbered when the cmp instruction at 0x812e yields less than (lt) and the poplt actually executes. If the value in r0 is 1 or greater (which is the case for my input), the poplt is skipped, and we should still be able to read the value of bar from r4 in the rest of the function.
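To make the lookup concrete, here is a minimal sketch of how a consumer might resolve a pc against this list. It is not LLVM or debugger code; `loc_entry`, `lookup_reg`, and `bar_reg_at` are made-up names, and the only facts taken from the dump are the two address ranges and their registers. Each entry covers a half-open range [begin, end).

```c
#include <stddef.h>
#include <stdint.h>

/* One location-list entry: the variable lives in register `reg`
 * for every pc in the half-open range [lo, hi). */
struct loc_entry {
    uint32_t lo;
    uint32_t hi;
    int reg;
};

/* The two entries emitted for bar, copied from the dump above. */
static const struct loc_entry bar_loclist[] = {
    { 0x8124u, 0x812au, 0 }, /* DW_OP_reg0 (r0) */
    { 0x812au, 0x8136u, 4 }, /* DW_OP_reg4 (r4) */
};

/* Hypothetical helper: return the register that holds the variable at `pc`,
 * or -1 when no entry covers `pc` (the value would be reported as
 * unavailable). */
int lookup_reg(const struct loc_entry *list, size_t n, uint32_t pc)
{
    for (size_t i = 0; i < n; i++) {
        if (pc >= list[i].lo && pc < list[i].hi)
            return list[i].reg;
    }
    return -1;
}

/* Convenience wrapper for the list above. */
int bar_reg_at(uint32_t pc)
{
    return lookup_reg(bar_loclist,
                      sizeof bar_loclist / sizeof bar_loclist[0], pc);
}
```

With this model, `bar_reg_at(0x8134)` still yields r4 (the poplt itself is covered), while `bar_reg_at(0x8136)` and every later address yield -1, even on the path where the poplt was skipped and r4 is intact.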

I don't know if we can consider this a bug, because I don't even know what the correct location information for bar should be. However, in this case, since the conditional instruction that clobbers r4 is a function return, I'd expect to be able to read the value of bar from r4 in the rest of the function.

If the conditional instruction were, for example, addlt r4, r0, #3 instead of poplt, what would the correct location list for bar be?

For now, my only idea is to check, in DbgValueHistoryCalculator (which computes the end address of a location list entry), whether the clobbering MI is a conditional return. I am not convinced this is the correct fix, though.
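The idea can be sketched roughly as follows. This is a toy model only: `struct insn` and `ends_range_on_fallthrough` are hypothetical names that do not mirror LLVM's MachineInstr API or the real DbgValueHistoryCalculator logic, and the `is_return` flag assumes we can reliably tell that the instruction leaves the function when it executes.

```c
#include <stdbool.h>

/* Simplified, hypothetical model of a machine instruction. */
struct insn {
    bool predicated;   /* executes only if its condition holds (e.g. poplt) */
    bool is_return;    /* leaves the function whenever it executes */
    unsigned clobbers; /* bit i set => the instruction may write register ri */
};

/* Must the range of a variable held in register `reg` end after `in`,
 * as seen from the instruction that follows it in the function? */
bool ends_range_on_fallthrough(const struct insn *in, unsigned reg)
{
    if (!(in->clobbers & (1u << reg)))
        return false; /* register untouched: the range continues */
    if (in->predicated && in->is_return)
        return false; /* reaching the next instruction means it was skipped */
    return true;      /* any other write, predicated or not, ends the range */
}
```

Under this rule the poplt would not end the r4 range, while a predicated non-return such as addlt r4, r0, #3 still would, because on the fall-through path r4 may or may not have been overwritten.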

Looking forward to hearing your thoughts on this,

Thank you for reading this,

Son Tuan Vu

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev


On Apr 27, 2018, at 7:48 AM, Son Tuan VU <[hidden email]> wrote:

I don't know if we can consider this a bug, because I don't even know what the correct location information for bar should be. However, in this case, since the conditional instruction that clobbers r4 is a function return, I'd expect to be able to read the value of bar from r4 in the rest of the function.

I can't tell for sure whether the debug info is correct without also seeing the source code, but as a general point: Debug information is must-information that holds over all paths through the program. Debug information that is only accurate for some paths is a bug. A serious bug, because if the user can't rely on the debug info to be correct in some cases, they can't rely on any of the debug info to be correct.
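One way to picture "must-information" is as a join over paths: a location may be reported at a program point only if every path reaching that point agrees on it. The following is a minimal sketch under that reading; `LOC_UNKNOWN`, `must_join`, and `loc_after_predicated_write` are illustrative names, not LLVM APIs.

```c
#define LOC_UNKNOWN (-1) /* no location can be promised */

/* Where two paths meet, keep a register location only if both agree. */
int must_join(int loc_a, int loc_b)
{
    return (loc_a == loc_b) ? loc_a : LOC_UNKNOWN;
}

/* Location of a variable right after a predicated instruction that writes
 * `written_reg` and then falls through either way (i.e. not a return). */
int loc_after_predicated_write(int loc_before, int written_reg)
{
    /* Path 1: the condition held, so the write happened. */
    int if_executed = (loc_before == written_reg) ? LOC_UNKNOWN : loc_before;
    /* Path 2: the condition failed, so nothing changed. */
    int if_skipped = loc_before;
    return must_join(if_executed, if_skipped);
}
```

For a variable in r4, a predicated write to r4 gives `LOC_UNKNOWN` afterwards, since only one of the two paths still holds the value, whereas a predicated write to some other register leaves the location at r4.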

-- adrian



Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev

As Adrian said, we'd need to see the source of foo() to assess what the location-list for bar ought to be.

Without actually going to look, I would guess that 'poplt' is considered a conditional move, therefore r4's contents are not guaranteed after it executes (i.e. it is a clobber).  If one operand of 'poplt' is 'pc' then of course it is also a conditional indirect branch (which is probably but not necessarily a return).  This combination might be worth handling differently for location-list purposes.

But this is a tricky area, and we'd need to consider the consequences carefully.

--paulr

 

From: [hidden email] [mailto:[hidden email]]
Sent: Friday, April 27, 2018 11:22 AM
To: Son Tuan VU
Cc: Robinson, Paul; Vedant Kumar; [hidden email]; llvm-dev
Subject: Re: [DbgInfo] Potential bug in location list address ranges

 

 




Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev
Thank you all for taking a look at this.  I pasted the C source then deleted it because I was afraid that it was too long to read...

Here's the code of foo. Its real name is verifyPIN. The variable bar is userPin.

int verifyPIN(char *userPin, char *cardPin, int *cpt)
{
  int i;
  int status;
  int diff;

  if (*cpt > 0) {
    status = 0x55;
    diff = 0x55;

    for (i = 0; i < 4; i++) {
      if (userPin[i] != cardPin[i]) {
        diff = 0xAA;
      }
    }

    if (diff == 0x55) {
      status = 0xAA;
    }
    else {
      status = 0x55;
    }

    if (status == 0xAA) {
      *cpt = 3;
      return 0xAA;
    } else {
      *cpt--;
      return 0x55;
    }
  }

  return 0x55;
}

@paul: Yes, you are right. I have investigated the backend, and it all starts at the IfConversion pass: r4 is clobbered by poplt, and DbgValueHistoryCalculator has no logic to handle conditional instructions, hence the issue at the binary level.

Son Tuan Vu

On Fri, Apr 27, 2018 at 5:53 PM, <[hidden email]> wrote:


Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev
Hello,

Has anyone taken a look at this bug? I really want to fix this, but as Paul pointed out, this requires a lot of care...

Thank you for your help

Son Tuan Vu

On Fri, Apr 27, 2018 at 7:29 PM, Son Tuan VU <[hidden email]> wrote:

Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev
Could you file a bug report about this (bugs.llvm.org)? If you don't have an account on bugzilla, I'd be happy to file one for you. Please provide exact instructions to reproduce the issue including any compilation flags.

thanks,
vedant

On May 7, 2018, at 9:16 AM, Son Tuan VU <[hidden email]> wrote:


Re: [llvm-dev] [DbgInfo] Potential bug in location list address ranges

Adam Nemet via llvm-dev
Hello,

Here is the bug report for this thread: https://bugs.llvm.org/show_bug.cgi?id=37391. I didn't receive an e-mail when I created the bug, so I am posting it here in case you missed the notification.

Thanks,

Son Tuan Vu

On Mon, May 7, 2018 at 9:36 PM, Vedant Kumar <[hidden email]> wrote:
Could you file a bug report about this (bugs.llvm.org)? If you don't have an account on bugzilla, I'd be happy to file one for you. Please provide exact instructions to reproduce the issue including any compilation flags.

thanks,
vedant

On May 7, 2018, at 9:16 AM, Son Tuan VU <[hidden email]> wrote:

Hello,

Has anyone taken a look at this bug? I really want to fix this, but as Paul pointed out, this requires a lot of care...

Thank you for your help

Son Tuan Vu

On Fri, Apr 27, 2018 at 7:29 PM, Son Tuan VU <[hidden email]> wrote:
Thank you all for taking a look at this.  I pasted the C source then deleted it because I was afraid that it was too long to read...

Here's the code of foo. Its real name is verifyPIN. The variable bar is userPin.

int verifyPIN(char *userPin, char *cardPin, int *cpt)
{
  int i;
  int status;
  int diff;

  if (*cpt > 0) {
    status = 0x55;
    diff = 0x55;

    for (i = 0; i < 4; i++) {
      if (userPin[i] != cardPin[i]) {
        diff = 0xAA;
      }
    }

    if (diff == 0x55) {
      status = 0xAA;
    }
    else {
      status = 0x55;
    }

    if (status == 0xAA) {
      *cpt = 3;
      return 0xAA;
    } else {
      *cpt--;
      return 0x55;
    }
  }

  return 0x55;
}

@paul: Yes, you are right. I have investigated the backend, and it all starts at IfConversionPass. r4 is clobbered by poplt, and DbgValueHistoryCalculator has no logic to handle conditional instructions, hence the issue at the binary level.
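
To make the mechanism concrete, here is a minimal sketch in C. It is not LLVM's code: it assumes a simplified model in which a register-described range stays open until just past the first later instruction that writes the register, whether or not that instruction is predicated. The names struct insn, foo_head and range_end are hypothetical and exist only for this illustration; the addresses and sizes are taken from the listing of foo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG(n) (1u << (n))   /* bit n stands for register rn (sp = 13, pc = 15) */

/* Hypothetical, simplified instruction record (not an LLVM type). */
struct insn {
    uint32_t addr;        /* address of the instruction */
    uint32_t size;        /* encoding size in bytes */
    uint32_t defs;        /* set of registers the instruction writes */
    int      predicated;  /* nonzero if it executes only under a condition */
};

/* The first instructions of foo, transcribed from the listing. */
const struct insn foo_head[] = {
    { 0x8124, 2, REG(13),                                    0 }, /* push    */
    { 0x8126, 2, REG(7),                                     0 }, /* add r7  */
    { 0x8128, 2, REG(4),                                     0 }, /* mov r4  */
    { 0x812a, 4, REG(0),                                     0 }, /* ldrsb.w */
    { 0x812e, 2, 0,                                          0 }, /* cmp     */
    { 0x8130, 2, 0,                                          0 }, /* itt lt  */
    { 0x8132, 2, REG(0),                                     1 }, /* movlt   */
    { 0x8134, 2, REG(4) | REG(6) | REG(7) | REG(13) | REG(15), 1 }, /* poplt */
    { 0x8136, 4, REG(12),                                    0 }, /* ldrb.w  */
};

/* Scan forward from index `from` and return the exclusive end address of a
 * range describing a variable that lives in register `reg`. The range is
 * closed just past the first instruction that writes `reg`. The `predicated`
 * flag is deliberately ignored, which is the behaviour under discussion.
 * Returns 0 if no instruction in the array writes `reg`. */
uint32_t range_end(const struct insn *code, size_t n, size_t from, int reg)
{
    assert(code != NULL || n == 0);
    for (size_t i = from; i < n; i++)
        if (code[i].defs & REG(reg))
            return code[i].addr + code[i].size;
    return 0;
}
```

Starting the scan at index 3 (address 0x812a, where the r4 entry begins), range_end(foo_head, 9, 3, 4) stops at poplt and yields 0x8136, which matches the end address in the location list.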

Son Tuan Vu

On Fri, Apr 27, 2018 at 5:53 PM, <[hidden email]> wrote:

As Adrian said, we'd need to see the source of foo() to assess what the location-list for bar ought to be.

Without actually going to look, I would guess that 'poplt' is considered a conditional move, therefore r4's contents are not guaranteed after it executes (i.e. it is a clobber).  If one operand of 'poplt' is 'pc' then of course it is also a conditional indirect branch (which is probably but not necessarily a return).  This combination might be worth handling differently for location-list purposes.
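
The two observations above can be combined into a small decision function. The following is an illustrative C sketch under a simplified model, not LLVM code and not a proposed patch; struct insn, enum reg_effect and effect_on_fallthrough are hypothetical names. It assumes that a predicated instruction either executes completely or has no effect, and that an instruction which writes pc never reaches the next instruction when it executes.

```c
#include <assert.h>
#include <stdint.h>

#define REG(n) (1u << (n))   /* bit n stands for register rn */
#define PC 15

/* Hypothetical, simplified instruction record (not an LLVM type). */
struct insn {
    uint32_t defs;        /* set of registers the instruction writes */
    int      predicated;  /* nonzero if it executes only under a condition */
};

/* State of a register's old value at the fall-through instruction. */
enum reg_effect {
    REG_PRESERVED,        /* old value is intact whenever fall-through is reached */
    REG_CLOBBERED,        /* old value is gone on every path */
    REG_MAYBE_CLOBBERED   /* old value survives on some paths only */
};

enum reg_effect effect_on_fallthrough(const struct insn *mi, int reg)
{
    if (!(mi->defs & REG(reg)))
        return REG_PRESERVED;      /* the register is not written at all */
    if (!mi->predicated)
        return REG_CLOBBERED;      /* unconditional write */
    if (mi->defs & REG(PC))
        return REG_PRESERVED;      /* if it executes, control leaves;
                                      if it does not, nothing was written */
    return REG_MAYBE_CLOBBERED;    /* e.g. a predicated add into the register */
}
```

Under these assumptions, poplt {r4, r6, r7, pc} leaves r4 intact on the fall-through path, whereas a predicated write that does not touch pc leaves r4 valid on some paths only, so a location that must hold on all paths would still have to end there.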

But this is a tricky area, and we'd need to consider the consequences carefully.

--paulr

 

From: [hidden email] [mailto:[hidden email]]
Sent: Friday, April 27, 2018 11:22 AM
To: Son Tuan VU
Cc: Robinson, Paul; Vedant Kumar; [hidden email]; llvm-dev
Subject: Re: [DbgInfo] Potential bug in location list address ranges

 

 



On Apr 27, 2018, at 7:48 AM, Son Tuan VU <[hidden email]> wrote:

 

Hi all,

 

Consider this ARM assembly code of a C function:

 

00008124 <foo>:
    8124:                   push    {r4, r6, r7, lr}
    8126:                   add     r7, sp, #8
    8128:                   mov     r4, r0
    812a:                   ldrsb.w r0, [r2]
    812e:                   cmp     r0, #1
    8130:                   itt     lt
    8132:                   movlt   r0, #85 ; 0x55
    8134:                   poplt   {r4, r6, r7, pc}            // a function return

    8136:                   ldrb.w  ip, [r1, #3]
    813a:                   ldrb.w  lr, [r4, #3]
    813e:                   movs    r0, #85 ; 0x55
    8140:                   cmp     lr, ip
    8142:                   bne.n   8168 <foo+0x44>

    8144:                   ldrb.w  ip, [r1, #2]
    8148:                   ldrb    r3, [r4, #2]
    814a:                   cmp     r3, ip
    814c:                   it      ne
    814e:                   popne   {r4, r6, r7, pc}          // a function return

    8150:                   ldrb.w  ip, [r1, #1]
    8154:                   ldrb    r3, [r4, #1]
    8156:                   cmp     r3, ip
    8158:                   bne.n   8168 <foo+0x44>

    815a:                   ldrb    r1, [r1, #0]
    815c:                   ldrb    r3, [r4, #0]
    815e:                   cmp     r3, r1
    8160:                   ittt    eq
    8162:                   moveq   r0, #3
    8164:                   strbeq  r0, [r2, #0]
    8166:                   moveq   r0, #170        ; 0xaa
    8168:                   pop     {r4, r6, r7, pc}          // a function return

 

I have a variable bar and here's its corresponding DWARF DIE:

 

 <2><3b>: Abbrev Number: 3 (DW_TAG_formal_parameter)
    <3c>   DW_AT_location    : 0x0 (location list)
    <40>   DW_AT_name        : (indirect string, offset: 0x9e): bar
    <44>   DW_AT_decl_file   : 1
    <45>   DW_AT_decl_line   : 34
    <46>   DW_AT_type        : <0x153>

 // Its location list
    00000000 00008124 0000812a (DW_OP_reg0 (r0))
    0000000b 0000812a 00008136 (DW_OP_reg4 (r4))
    00000016 <End of list>

 

As you can see, it says that bar can be found in r4 from 0x812a up to and including 0x8134 (poplt). However, poplt only clobbers r4 when the cmp instruction at 0x812e yields less than (lt). So if the value in r0 is at least 1 (which is the case for my input), we should still be able to read the value of bar from r4 for the remainder of the function.
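
As a reading aid, the lookup a debugger performs on this list can be sketched in a few lines of C. This is a simplified illustration, not any debugger's actual code: struct loc_entry, bar_locs and lookup_reg are hypothetical names, and each location expression is reduced to a single DWARF register number. The one fact it relies on is that a location-list entry covers a half-open range, so the end address is the first address not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One location-list entry: the variable lives in `reg` for begin <= pc < end. */
struct loc_entry {
    uint32_t begin;  /* first address covered */
    uint32_t end;    /* first address NOT covered (exclusive bound) */
    int      reg;    /* DWARF register number, e.g. 4 for DW_OP_reg4 */
};

/* The two entries of the location list shown above for bar. */
const struct loc_entry bar_locs[] = {
    { 0x8124, 0x812a, 0 },   /* DW_OP_reg0 (r0) */
    { 0x812a, 0x8136, 4 },   /* DW_OP_reg4 (r4) */
};

/* Return the register that holds the variable at `pc`,
 * or -1 if no entry covers `pc` (the variable has no known location). */
int lookup_reg(const struct loc_entry *list, size_t n, uint32_t pc)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pc >= list[i].begin && pc < list[i].end)
            return list[i].reg;
    return -1;
}
```

With this list, a lookup at 0x8134 (poplt) still reports r4, while a lookup at 0x8136, the first instruction after poplt, finds no entry, so bar has no recorded location there even though r4 is unchanged on that path.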

 

I don't know if we can consider this a bug, because I don't even know what the correct location information for bar should be. However, in this case, since the conditional instruction that clobbers r4 is a function return, I'd expect to be able to read the value of bar from r4 for the remainder of the function.

 

I can't tell for sure whether the debug info is correct without also seeing the source code, but as a general point: Debug information is must-information that holds over all paths through the program. Debug information that is only accurate for some paths is a bug. A serious bug, because if the user can't rely on the debug info to be correct in some cases, they can't rely on any of the debug info to be correct.

 

-- adrian



 

If the conditional instruction were addlt r4, r0, 3 instead of poplt, for example, what would the correct location list for bar be?

 

For now, my only idea is to check whether the clobbering MI is a conditional return in DbgValueHistoryCalculator, which computes the end address of a location list entry. However, I do not feel that this is the correct fix.

 

Looking forward to hearing your thoughts on this,

 

Thank you for reading this,

 

Son Tuan Vu

 





