[llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
Some time ago I saw a report of strange single-stepping behavior given Swift code that looks like this:

1|  var input = [16, 14, 10, 9, 8, 7, 4, 3, 2, 1] //< Only number 1 is relevant.
2|  print("start")

TL;DR: The debugger steps from line 2 -> 1 -> 2 in this code example. One clean way to fix this bug is to remove debug locations from ConstantSDNodes, and I'm asking if there are any objections to doing that.

In more detail:

The problem appears to be that we assign a debug location to the ConstantSDNode for "1". When this constant is used again (as it happens to be in the lowered version of the call to print()) the debugger steps back to the first use of the constant. Here's a snippet of MIR output that illustrates the problem:

%19:gr64 = ADD64ri8 %18, 8, implicit-def dead %eflags; GR64:%19,%18 dbg:destroy-after-foreach.swift:2:7
%20:gr32 = MOV32ri64 1; GR32:%20 dbg:destroy-after-foreach.swift:1:44
%21:gr64 = SUBREG_TO_REG 0, killed %20, sub_32bit; GR64:%21 GR32:%20 dbg:destroy-after-foreach.swift:1:44
%rdi = COPY %21; GR64:%21 dbg:destroy-after-foreach.swift:2:7

The out-of-order stepping behavior is confusing and unexpected to users. ISTM that the simplest way to fix this bug is to not attach debug locations to constant SDNodes. This matches the model used by IR (Constants don't have DebugLocs). As an added benefit, it would let us delete a fair amount of code which traffics in SDLocs. I'm not sure what (if any) impact this would have on sampling-based profilers. SamplePGO, at least, has smoothing methods specifically designed to paper over this sort of "out-of-place" debug location issue (see https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45290.pdf, Section 4.1.4, "Sources of Profile Inaccuracies").

I brought all of this up a while ago on IRC: some feedback was positive, but I didn't get a clear +1 or -1 to proceed.

Please let me know what you think, if there are cleaner alternatives to consider, etc. Any comments/feedback would be appreciated.

Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

thanks,
vedant

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

Our use case was in fastisel, so things are different. I don't think my solution will help you.

In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}

We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)

Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.

I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev

Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.

 

Same here.  I have serious doubts that the fastisel tactic of bubbling constants up to the top of the block really helps code size or performance that much, and it distinctly hurts the debugging experience.  It's reordering code at –O0 and that is generally a Bad Thing™.

--paulr

 

From: Reid Kleckner [mailto:[hidden email]]
Sent: Tuesday, June 19, 2018 9:37 PM
To: Vedant Kumar; Robinson, Paul
Cc: llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes

 

On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:

Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

 

Our use case was in fastisel, so things are different. I don't think my solution will help you.

 

In our case, users were complaining about code like this:

volatile int do_something;

void f() {

  ++do_something;

  foo(1, 2, 3);

  ++do_something;
}

 

We'd generate locations like:

  .loc line 1

  incl do_something(%rip)

  movl $1, %ecx

  movl $2, %edx

  movl $3, %r8d

  .loc line 2 # line 2 starts here, instead of 3 instructions earlier

  callq foo

  .loc line 3

  incl do_something(%rip)

 

Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.

 

I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
[hidden email]  to comment on the potential impact. In general, I prefer the debug info is kept when sample PGO is on.

David


On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Some time ago I saw a report of strange single-stepping behavior given Swift code that looks like this:

1|  var input = [16, 14, 10, 9, 8, 7, 4, 3, 2, 1] //< Only number 1 is relevant.
2|  print("start")

TL;DR: The debugger steps from line 2 -> 1 -> 2 in this code example. One clean way to fix this bug is to remove debug locations from ConstantSDNodes, and I'm asking if there are any objections to doing that.

In more detail:

The problem appears to be that we assign a debug location to the ConstantSDNode for "1". When this constant is used again (as it happens to be in the lowered version of the call to print()) the debugger steps back to the first use of the constant. Here's a snippet of MIR output that illustrates the problem:

%19:gr64 = ADD64ri8 %18, 8, implicit-def dead %eflags; GR64:%19,%18 dbg:destroy-after-foreach.swift:2:7
%20:gr32 = MOV32ri64 1; GR32:%20 dbg:destroy-after-foreach.swift:1:44
%21:gr64 = SUBREG_TO_REG 0, killed %20, sub_32bit; GR64:%21 GR32:%20 dbg:destroy-after-foreach.swift:1:44
%rdi = COPY %21; GR64:%21 dbg:destroy-after-foreach.swift:2:7

The out-of-order stepping behavior is confusing and unexpected to users. ISTM that the simplest way to fix this bug is to not attach debug locations to constant SDNodes. This matches the model used by IR (Constants don't have DebugLocs). As an added benefit, it would let us delete a fair amount of code which traffics in SDLocs. I'm not sure what (if any) impact this would have on sampling-based profilers. SamplePGO, at least, has smoothing methods specifically designed to paper over this sort of "out-of-place" debug location issue (see https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45290.pdf, Section 4.1.4, "Sources of Profile Inaccuracies").

I brought all of this up a while ago on IRC: some feedback was positive, but I didn't get a clear +1 or -1 to proceed.

Please let me know what you think, if there are cleaner alternatives to consider, etc. Any comments/feedback would be appreciated.

Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

thanks,
vedant

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
On Tue, Jun 19, 2018 at 7:20 PM <[hidden email]> wrote:

Well, I hope I fixed that in FastISel in r327581, r331087, and r331088. Vedant is talking about SDISel.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:

On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

Our use case was in fastisel, so things are different. I don't think my solution will help you.

In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}

We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)

Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.

Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)


I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.

Interesting, I hadn't considered doing this.

What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?

It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?

thanks!
vedant

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev

DwarfDebug::beginInstruction sets the location to line 0 because otherwise the location is implicitly the same location as the last instruction of the physically preceding block.  That location is often completely unrelated to what's at the top of the new block.  Line 0 isn't great, but at least it's not a complete lie.

--paulr

 

From: [hidden email] [mailto:[hidden email]]
Sent: Wednesday, June 20, 2018 1:48 PM
To: Reid Kleckner
Cc: Robinson, Paul; llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes

 

On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:

 

On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:

Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.

 

Our use case was in fastisel, so things are different. I don't think my solution will help you.

 

In our case, users were complaining about code like this:

volatile int do_something;

void f() {

  ++do_something;

  foo(1, 2, 3);

  ++do_something;
}

 

We'd generate locations like:

  .loc line 1

  incl do_something(%rip)

  movl $1, %ecx

  movl $2, %edx

  movl $3, %r8d

  .loc line 2 # line 2 starts here, instead of 3 instructions earlier

  callq foo

  .loc line 3

  incl do_something(%rip)



Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.

 

Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)



I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.

 

Interesting, I hadn't considered doing this.

 

What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?

 

It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?

 

thanks!

vedant


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

That said I'm not familiar with the inner workings of dwarf or other debugger formats, so it may very well be reasonable to backfill the information in a late pass to avoid having assembler instructions without debug info as some people proposed.

- Matthias

On Jun 20, 2018, at 11:09 AM, via llvm-dev <[hidden email]> wrote:

DwarfDebug::beginInstruction sets the location to line 0 because otherwise the location is implicitly the same location as the last instruction of the physically preceding block.  That location is often completely unrelated to what's at the top of the new block.  Line 0 isn't great, but at least it's not a complete lie.
--paulr
From: [hidden email] [[hidden email]] 
Sent: Wednesday, June 20, 2018 1:48 PM
To: Reid Kleckner
Cc: Robinson, Paul; llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes
 
On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:
 
On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.
 
Our use case was in fastisel, so things are different. I don't think my solution will help you.
 
In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}
 
We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)


Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.
 
Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)



I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.
 
Interesting, I hadn't considered doing this.
 
What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?
 
It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?
 
thanks!
vedant
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
I don't think this would affect SamplePGO because it is unlikely that constants are the only instructions in the basic block. Other instructions in the same BB should be sufficient to reserve the debug info.

Dehao

On Wed, Jun 20, 2018 at 4:32 PM Matthias Braun via llvm-dev <[hidden email]> wrote:
FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

That said I'm not familiar with the inner workings of dwarf or other debugger formats, so it may very well be reasonable to backfill the information in a late pass to avoid having assembler instructions without debug info as some people proposed.

- Matthias

On Jun 20, 2018, at 11:09 AM, via llvm-dev <[hidden email]> wrote:

DwarfDebug::beginInstruction sets the location to line 0 because otherwise the location is implicitly the same location as the last instruction of the physically preceding block.  That location is often completely unrelated to what's at the top of the new block.  Line 0 isn't great, but at least it's not a complete lie.
--paulr
From: [hidden email] [[hidden email]] 
Sent: Wednesday, June 20, 2018 1:48 PM
To: Reid Kleckner
Cc: Robinson, Paul; llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes
 
On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:
 
On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.
 
Our use case was in fastisel, so things are different. I don't think my solution will help you.
 
In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}
 
We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)


Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.
 
Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)



I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.
 
Interesting, I hadn't considered doing this.
 
What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?
 
It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?
 
thanks!
vedant
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev

[hidden email] wrote:

 

FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

 

+1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.

--paulr

 


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev

Isn’t this a typcial situation when the is_stmt field in the DWARF line table should be used?

 

If we set is_stmt=0 for the instruction loading the constant, then a debugger can choose not to stop on that instruction when doing “step” on source level. That way we can keep the original source location for a ConstantSDNode, but also telling the debugger that this isn’t a recommended breakpoint location for that line. Using is_stmt=0 is ofcourse only interesting if the constant load I hoisted so that it isn’t adjacent to other is_stmt=1 instructions on the same line.

 

/Björn

 

From: llvm-dev <[hidden email]> On Behalf Of via llvm-dev
Sent: den 21 juni 2018 15:08
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

 

[hidden email] wrote:

 

FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

 

+1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.

--paulr

 


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev

Which brings up yet another big pain point: We don't carry an is_stmt annotation around on instructions, we just always set is_stmt=1 on the first instruction with a source location different from the previous one, without regard to whether it's an "interesting" place to stop.

--paulr

 

From: llvm-dev [mailto:[hidden email]] On Behalf Of Björn Pettersson A via llvm-dev
Sent: Thursday, June 21, 2018 12:28 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]
Subject: Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

 

Isn’t this a typcial situation when the is_stmt field in the DWARF line table should be used?

 

If we set is_stmt=0 for the instruction loading the constant, then a debugger can choose not to stop on that instruction when doing “step” on source level. That way we can keep the original source location for a ConstantSDNode, but also telling the debugger that this isn’t a recommended breakpoint location for that line. Using is_stmt=0 is ofcourse only interesting if the constant load I hoisted so that it isn’t adjacent to other is_stmt=1 instructions on the same line.

 

/Björn

 

From: llvm-dev <[hidden email]> On Behalf Of via llvm-dev
Sent: den 21 juni 2018 15:08
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

 

[hidden email] wrote:

 

FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

 

+1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.

--paulr

 


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
Yeah, I'm with you here. I don't think there's any particular debug info gain to having debug info on materializing a constant either.

Vedant: Ultimately I think I'm in favor of this plan.

Thanks!

-eric

On Wed, Jun 20, 2018 at 4:32 PM Matthias Braun via llvm-dev <[hidden email]> wrote:
FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

That said I'm not familiar with the inner workings of dwarf or other debugger formats, so it may very well be reasonable to backfill the information in a late pass to avoid having assembler instructions without debug info as some people proposed.

- Matthias

On Jun 20, 2018, at 11:09 AM, via llvm-dev <[hidden email]> wrote:

DwarfDebug::beginInstruction sets the location to line 0 because otherwise the location is implicitly the same location as the last instruction of the physically preceding block.  That location is often completely unrelated to what's at the top of the new block.  Line 0 isn't great, but at least it's not a complete lie.
--paulr
From: [hidden email] [[hidden email]] 
Sent: Wednesday, June 20, 2018 1:48 PM
To: Reid Kleckner
Cc: Robinson, Paul; llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes
 
On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:
 
On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.
 
Our use case was in fastisel, so things are different. I don't think my solution will help you.
 
In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}
 
We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)


Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.
 
Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)



I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.
 
Interesting, I hadn't considered doing this.
 
What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?
 
It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?
 
thanks!
vedant
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
Yes, this is quite the annoying thing, but consumers are... inconsistent with how they're using it. I believe I recall a case wehere a debugger would refuse to stop on lines that didn't have is_stmt on them at all.

-eric

On Thu, Jun 21, 2018 at 9:57 AM via llvm-dev <[hidden email]> wrote:

Which brings up yet another big pain point: We don't carry an is_stmt annotation around on instructions, we just always set is_stmt=1 on the first instruction with a source location different from the previous one, without regard to whether it's an "interesting" place to stop.

--paulr

 

From: llvm-dev [mailto:[hidden email]] On Behalf Of Björn Pettersson A via llvm-dev
Sent: Thursday, June 21, 2018 12:28 PM
To: [hidden email]
Cc: [hidden email]; [hidden email]


Subject: Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

 

Isn’t this a typcial situation when the is_stmt field in the DWARF line table should be used?

 

If we set is_stmt=0 for the instruction loading the constant, then a debugger can choose not to stop on that instruction when doing “step” on source level. That way we can keep the original source location for a ConstantSDNode, but also telling the debugger that this isn’t a recommended breakpoint location for that line. Using is_stmt=0 is ofcourse only interesting if the constant load I hoisted so that it isn’t adjacent to other is_stmt=1 instructions on the same line.

 

/Björn

 

From: llvm-dev <[hidden email]> On Behalf Of via llvm-dev
Sent: den 21 juni 2018 15:08
To: [hidden email]
Cc: [hidden email]; [hidden email]; [hidden email]
Subject: Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

 

[hidden email] wrote:

 

FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

 

+1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.

--paulr

 

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
Thanks for your feedback everyone.

I'll try to summarize what people have said:

- There seems to be general consensus that we should move away from applying a static debug location to constant SD nodes. This should improve single-stepping and be more consistent with how constants are modeled in IR. 
(I think Björn suggested working around this by applying an is_stmt attribute to constant materialization instructions, but there are a number of barriers to doing that (e.g we only have partial compiler/debugger support for doing this).)

- It may be helpful to have a late pass backfill missing locations. This would make a constant materialization look like a part of the expression it's for. I think this needs a bit more thought. For example: how would we avoid backfilling bogus locations onto instructions which aren't used to materialize constants?

For now, I'll get started on patches to drop debug locations from constant nodes.

vedant

On Jun 21, 2018, at 11:50 AM, Eric Christopher via llvm-dev <[hidden email]> wrote:

Yeah, I'm with you here. I don't think there's any particular debug info gain to having debug info on materializing a constant either.

Vedant: Ultimately I think I'm in favor of this plan.

Thanks!

-eric

On Wed, Jun 20, 2018 at 4:32 PM Matthias Braun via llvm-dev <[hidden email]> wrote:
FWIW: Debug information on constants feels odd to me. They are just values not something that is executed so conceptually I would not expect them to "happen" at a specific time/place in the program. That said most numbers are copied into registers or stored into memory and that is of course an interesting action. So in the original example I would hope to see debug info on whatever instructions are used to fill the array with values.

That said I'm not familiar with the inner workings of dwarf or other debugger formats, so it may very well be reasonable to backfill the information in a late pass to avoid having assembler instructions without debug info as some people proposed.

- Matthias

On Jun 20, 2018, at 11:09 AM, via llvm-dev <[hidden email]> wrote:

DwarfDebug::beginInstruction sets the location to line 0 because otherwise the location is implicitly the same location as the last instruction of the physically preceding block.  That location is often completely unrelated to what's at the top of the new block.  Line 0 isn't great, but at least it's not a complete lie.
--paulr
From: [hidden email] [[hidden email]] 
Sent: Wednesday, June 20, 2018 1:48 PM
To: Reid Kleckner
Cc: Robinson, Paul; llvm-dev; Justin Bogner; David Li; David Blaikie
Subject: Re: [RFC] Removing debug locations from ConstantSDNodes
 
On Jun 19, 2018, at 6:36 PM, Reid Kleckner <[hidden email]> wrote:
 
On Tue, Jun 19, 2018 at 5:46 PM Vedant Kumar <[hidden email]> wrote:
Someone (Reid?) mentioned that we could try sinking constants to their point of first use as an alternative, and (IIUC) create new nodes with distinct DebugLocs for each use of a constant. I don't recall the details of that alternative clearly. Based on my (likely incorrect) understanding of it, dropping locations from constants outright might be simpler.
 
Our use case was in fastisel, so things are different. I don't think my solution will help you.
 
In our case, users were complaining about code like this:
volatile int do_something;
void f() {
  ++do_something;
  foo(1, 2, 3);
  ++do_something;
}
 
We'd generate locations like:
  .loc line 1
  incl do_something(%rip)
  movl $1, %ecx
  movl $2, %edx
  movl $3, %r8d
  .loc line 2 # line 2 starts here, instead of 3 instructions earlier
  callq foo
  .loc line 3
  incl do_something(%rip)


Our users really wanted the line table entry for line 2 to include the constant materialization code for some VS debugger feature.
 
Got it. (For others following along, there's more discussion about this debugger feature in the commit description for r327581.)



I think if you remove locations from ConstantSDNodes, you might want to add a late pass that propagates source locations backwards onto location-less instructions. This would also avoid some special cases when a basic block starts with an instruction that lacks location information. See CodeViewDebug::beginInstruction and DwarfDebug::beginInstruction for what we do today.
 
Interesting, I hadn't considered doing this.
 
What happens in the special case where a block starts with a location-less instruction? Ah, I see that CodeViewDebug::beginInstruction does a forward scan to find the first instruction with location. This can fail though: there might not be such an instruction, in which case... I assume we either apply a line-0 location, or nothing at all, similar to Dwarf::beginInstruction?
 
It's a bit unclear to me what the benefits of a late pass that backwards-propagated locations would be. I suppose we'd have fewer forward scans in CodeViewDebug::beginInstruction, so is the benefit a compile-time win (which would offset the cost of a late pass)? Would the final debug info quality would be better in some way I'm just missing?
 
thanks!
vedant
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
Sorry to resurrect the old thread – I completely missed it!

On Thu, Jun 21, 2018 at 6:08 AM via llvm-dev <[hidden email]> wrote:

> +1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.
And now we have a clear problem. I'm seeing many cases when such
instruction got associated with the *previous* source statement which
does not make any sense at all. This happens because the constant now
is location-less and therefore the location of such
constant-materialization instruction is implicitly defined by the
previous available location.

This seems to be quite a serious regression in the quality of the
debug info produced by selection dag isel in LLVM 7 release as
compared to LLVM 6.

--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev


> On Oct 23, 2018, at 11:31 PM, Anton Korobeynikov via llvm-dev <[hidden email]> wrote:
>
> Sorry to resurrect the old thread – I completely missed it!
>
> On Thu, Jun 21, 2018 at 6:08 AM via llvm-dev <[hidden email]> wrote:
>
>> +1.  Constants are not inherently interesting, but loading one as an action preparatory to computing a value or setting up a parameter in a call, the instruction ought to be associated with the relevant source statement.
> And now we have a clear problem. I'm seeing many cases when such
> instruction got associated with the *previous* source statement which
> does not make any sense at all.

I see. The default behavior of the DWARF emitter is to reuse the previous location (within a basic block) for instructions without a location. If you need a quick workaround, you can use the cl::opt -use-unknown-locations=Enable to emit line 0 entries for constant materialization instructions. Note that this may significantly bloat line tables.


> This happens because the constant now
> is location-less and therefore the location of such
> constant-materialization instruction is implicitly defined by the
> previous available location.
>
> This seems to be quite a serious regression in the quality of the
> debug info produced by selection dag isel in LLVM 7 release as
> compared to LLVM 6.

What sorts of problems is this causing in practice? Would they be addressed by using line 0 locations?

My experience has been that this change resulted in significant improvements in the single-stepping & debugging experience for Swift users. I haven’t seen any new jumpy line table bugs when compiling with `swiftc -Onone`.

vedant

>
> --
> With best regards, Anton Korobeynikov
> Department of Statistical Modelling, Saint Petersburg State University
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev
Hi Vedant,

> I see. The default behavior of the DWARF emitter is to reuse the previous location (within a basic block) for instructions without a location. If you need a quick workaround, you can use the cl::opt -use-unknown-locations=Enable to emit line 0 entries for constant materialization instructions. Note that this may significantly bloat line tables.
Right. And therefore it is not an option.

> What sorts of problems is this causing in practice? Would they be addressed by using line 0 locations?
First of all, line 0 is not an option because it significantly bloats
the line table.

> My experience has been that this change resulted in significant improvements in the single-stepping & debugging experience for Swift users. I haven’t seen any new jumpy line table bugs when compiling with `swiftc -Onone`.
The debug information now is simply incorrect due to change, because
we associate constants with wrong lines. Therefore everything which is
connected with line locations could be wrong:

1. If you put breakpoint on the particular line, e.g. a function call
with several constant arguments then the debugger will stop in the
middle of call sequence – after some of the constants already loaded.

Funny enough, sometimes we have proper line info for constans – if
regalloc choose to rematerialize the constant, then the location will
be properly inherited. This is why you probably have not seen much
problems on x86 where all the constants are trivially materializable.
The situation is different on e.g. ARM and AArch64 where e.g. not all
floating points constants could be trivially materialized.

2. I expect that the sampling-based profiling could be somehow affected as well.

Probably all these problems Paul had in mind when he said about
"backfilling" constants, but it seems there was no further follow-up.

I think that we need to revert this change and figure our the solution
that will yield correct debug info.

--
With best regards, Anton Korobeynikov
Department of Statistical Modelling, Saint Petersburg State University
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] [RFC] Removing debug locations from ConstantSDNodes

Robin Eklind via llvm-dev


vedant (sent from my iPhone)

> On Oct 26, 2018, at 12:04 PM, Anton Korobeynikov <[hidden email]> wrote:
>
> Hi Vedant,
>
>> I see. The default behavior of the DWARF emitter is to reuse the previous location (within a basic block) for instructions without a location. If you need a quick workaround, you can use the cl::opt -use-unknown-locations=Enable to emit line 0 entries for constant materialization instructions. Note that this may significantly bloat line tables.
> Right. And therefore it is not an option.

What are the actual limits on the size of debug info in your uses cases? Do you surpass those limits with this option enabled?


>> What sorts of problems is this causing in practice? Would they be addressed by using line 0 locations?
> First of all, line 0 is not an option because it significantly bloats
> the line table.

If it solves your problem, I don’t see why not.


>> My experience has been that this change resulted in significant improvements in the single-stepping & debugging experience for Swift users. I haven’t seen any new jumpy line table bugs when compiling with `swiftc -Onone`.
> The debug information now is simply incorrect due to change, because
> we associate constants with wrong lines.

That happened before the change. All uses of a constant after the first would refer to the wrong line.


> Therefore everything which is
> connected with line locations could be wrong:
>
> 1. If you put breakpoint on the particular line, e.g. a function call
> with several constant arguments then the debugger will stop in the
> middle of call sequence – after some of the constants already loaded.

Prior to the patch, you would jump to the first use of the constant, then jump back to the call...

That’s much, much worse.


>
> Funny enough, sometimes we have proper line info for constans – if
> regalloc choose to rematerialize the constant, then the location will
> be properly inherited.

That only happens if the constant doesn’t have a location.


> This is why you probably have not seen much
> problems on x86 where all the constants are trivially materializable.
> The situation is different on e.g. ARM and AArch64 where e.g. not all
> floating points constants could be trivially materialized.
>
> 2. I expect that the sampling-based profiling could be somehow affected as well.

I double checked this with sample pgo folks earlier in the thread.

Again, the situation was much worse without this patch. Say you take a sample and see constant materialization code. If that code has a location associated with the first use of a constant, you’re marking the wrong block hot.


> Probably all these problems Paul had in mind when he said about
> "backfilling" constants, but it seems there was no further follow-up.

I pointed out various problems with the backfilling scheme. Happy to hear a workable idea.


> I think that we need to revert this change and figure our the solution
> that will yield correct debug info.

This would be a major regression for stepping behavior, and I don’t think we should do that.

best,


>
> --
> With best regards, Anton Korobeynikov
> Department of Statistical Modelling, Saint Petersburg State University
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev