[llvm-dev] RFC: libtrace


[llvm-dev] RFC: libtrace

Robin Eklind via llvm-dev

Hi all,


We have been thinking internally about a lightweight LLVM-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a narrower use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use only as much of the library as a given task requires is important for the scale at which we wish to use it.


We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.


LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying to componentize LLDB by sinking pieces into LLVM and rebasing LLDB, as well as these smaller tools, on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.  At the same time, we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine-grained testing of important subsystems.


A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.


Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this? Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.


Thanks,

Zach



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical?  For instance, adding a TypeSystem and its DWARF parser for C family languages that uses a different underlying representation than Clang AST's to store the results would be a lot of work that wouldn't be terribly interesting to lldb.  I don't think that's what you meant, but wanted to be sure.

Jim

> On Jun 26, 2018, at 11:58 AM, Zachary Turner via lldb-dev <[hidden email]> wrote:
>
> Hi all,
>
> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
>
> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
>
> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.
>
> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
>
> Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this?  Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
>
> Thanks,
> Zach
>
> _______________________________________________
> lldb-dev mailing list
> [hidden email]
> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev


Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
no expression parser or knowledge of any specific programming language.

Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope.  For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's DWARF parsing code.  Similarly, LLDB's type system could be built on top of it as well.  Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view; it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.

It will probably be useful someday to have an expression parser and language specific type system, but when that comes I don't think we'd want anything radically different than what LLDB already has.

On Tue, Jun 26, 2018 at 12:26 PM Jim Ingham <[hidden email]> wrote:
> Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical?  For instance, adding a TypeSystem and its DWARF parser for C family languages that uses a different underlying representation than Clang AST's to store the results would be a lot of work that wouldn't be terribly interesting to lldb.  I don't think that's what you meant, but wanted to be sure.
>
> Jim




Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?

Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?

Jim


> On Jun 26, 2018, at 12:48 PM, Zachary Turner <[hidden email]> wrote:
>
> no expression parser or knowledge of any specific programming language.
>
> Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope.  For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's dwarf parsing code.  Similarly, LLDB's type system could be built on top of it as well.  Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view, it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.
>
> It will probably be useful someday to have an expression parser and language specific type system, but when that comes I don't think we'd want anything radically different than what LLDB already has.


Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle.  I don't think you'd need much else?

Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger.  That seems more appropriate for a lightweight tool.

Jim


> On Jun 26, 2018, at 12:59 PM, Jim Ingham via lldb-dev <[hidden email]> wrote:
>
> So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?
>
> Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?
>
> Jim


Re: [llvm-dev] RFC: libtrace

Robin Eklind via llvm-dev


> On Jun 26, 2018, at 11:58 AM, Zachary Turner via llvm-dev <[hidden email]> wrote:
>
> Hi all,
>
> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
>
> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
>
> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.

Do you have a rough idea of what components specifically the new tool would need to function?

>  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.

Are you thinking of the new utility as something that would naturally live in llvm/tools or as something that would live in the LLDB repository?

>
> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.

As you are undoubtedly aware, we've been carefully rearchitecting LLVM's DWARF parser over the last few years to eventually become featureful enough so that LLDB could use it, so any help on that front would be most welcome. As long as we are careful to not regress in performance/laziness, features and fault-tolerance, deduplicating the implementations can only be good for LLVM and LLDB.

-- adrian

>
> Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this?  Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
>
> Thanks,
> Zach
>

Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
The various NativeProcess implementations are definitely a good starting point and I'll probably be looking at them to understand all the ins and outs of each platform.  I'm not sure if the API / interface we want will be the same, so I don't think we can just copy it all down.  But a lot of the core logic we probably can.  Depending on how much of it we end up implementing and how close we get to the current functionality of the NativeProcess classes, this could be another area for code reuse similar to what I mentioned with the DWARF reading.  i.e. we could write lots of low-level tests of the tracing functionality specifically, then update the NativeProcess implementations to use this.
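As a rough illustration of the "write low-level tests of the tracing functionality, then put real implementations behind the same interface" idea, here is a minimal C++17 sketch. Everything in it is an assumption made for illustration: `TraceBackend`, `FakeBackend`, `StopEvent` and `countTrapHits` are hypothetical names, not part of LLDB's NativeProcess classes or of any proposed libtrace API. The fake backend replays a scripted list of program-counter values so that platform-independent logic can be exercised without a real target process.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the NativeProcess API.
using Address = uint64_t;

enum class StopReason { Trap, Exited };

struct StopEvent {
  StopReason Reason;
  Address PC; // only meaningful when Reason == StopReason::Trap
};

// Narrow, platform-agnostic interface that per-OS backends would implement.
class TraceBackend {
public:
  virtual ~TraceBackend() = default;
  virtual bool installTrap(Address Addr) = 0;
  virtual bool removeTrap(Address Addr) = 0;
  // Run the target until the next trap is hit or the target exits.
  virtual StopEvent resume() = 0;
};

// Test double: "executes" a fixed sequence of PCs instead of a real process.
class FakeBackend : public TraceBackend {
public:
  explicit FakeBackend(std::vector<Address> Pcs) : Script(std::move(Pcs)) {}

  bool installTrap(Address Addr) override { return Traps.insert(Addr).second; }
  bool removeTrap(Address Addr) override { return Traps.erase(Addr) == 1; }

  StopEvent resume() override {
    assert(Next <= Script.size());
    while (Next < Script.size()) {
      Address PC = Script[Next++];
      if (Traps.count(PC))
        return {StopReason::Trap, PC};
    }
    return {StopReason::Exited, 0};
  }

private:
  std::vector<Address> Script;
  std::set<Address> Traps;
  std::size_t Next = 0;
};

// Platform-independent logic written only against the interface:
// run to completion and count how often each trap address was hit.
std::map<Address, unsigned> countTrapHits(TraceBackend &Backend) {
  std::map<Address, unsigned> Hits;
  for (StopEvent E = Backend.resume(); E.Reason == StopReason::Trap;
       E = Backend.resume())
    ++Hits[E.PC];
  return Hits;
}
```

With a script of `{0x10, 0x20, 0x10, 0x30}` and traps installed at `0x10` and `0x30`, `countTrapHits` would report two hits at `0x10` and one at `0x30`. A real backend would wrap ptrace or the platform's equivalent behind the same three virtual methods.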

On Tue, Jun 26, 2018 at 1:09 PM Jim Ingham <[hidden email]> wrote:
> You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle.  I don't think you'd need much else?
>
> Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger.  That seems more appropriate for a lightweight tool.
>
> Jim





Re: [llvm-dev] RFC: libtrace

Robin Eklind via llvm-dev


On Tue, Jun 26, 2018 at 1:28 PM Adrian Prantl <[hidden email]> wrote:


>> On Jun 26, 2018, at 11:58 AM, Zachary Turner via llvm-dev <[hidden email]> wrote:
>>
>> Hi all,
>>
>> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
>>
>> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
>>
>> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.
>
> Do you have a rough idea of what components specifically the new tool would need to function?

* process & thread control
* platform agnostic ptrace wrapper (not all platforms even have ptrace, and those that do the usage and capabilities vary quite a bit)
* install various kinds of traps
* monitor cpu performance counters
* symbol file parsing
* symbol resolution (name <-> addr and line <-> addr)
* unwinding and backtrace generation
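
To make the "symbol resolution (name <-> addr)" item concrete, here is a minimal C++17 sketch of the lookup such a component might perform once a symbol file has been parsed. The `Symbol` and `SymbolTable` types are hypothetical and exist only for this example; they are not LLVM or LLDB classes, and a real implementation would populate the table from the object file's symbol table or DWARF rather than by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not LLVM or LLDB classes.
struct Symbol {
  uint64_t Start;   // first address covered by the symbol
  uint64_t Size;    // number of bytes covered
  std::string Name;
};

class SymbolTable {
public:
  explicit SymbolTable(std::vector<Symbol> Syms) : Symbols(std::move(Syms)) {
    // Sort once by start address so address lookups can binary-search.
    std::sort(Symbols.begin(), Symbols.end(),
              [](const Symbol &A, const Symbol &B) { return A.Start < B.Start; });
  }

  // addr -> name: find the symbol whose [Start, Start + Size) contains Addr.
  std::optional<std::string> nameForAddress(uint64_t Addr) const {
    // First symbol that starts strictly after Addr.
    auto It = std::upper_bound(
        Symbols.begin(), Symbols.end(), Addr,
        [](uint64_t A, const Symbol &S) { return A < S.Start; });
    if (It == Symbols.begin())
      return std::nullopt; // Addr lies below every symbol
    --It;                  // last symbol starting at or before Addr
    assert(It->Start <= Addr);
    if (Addr - It->Start >= It->Size)
      return std::nullopt; // Addr falls in a gap between symbols
    return It->Name;
  }

  // name -> addr: linear scan; a real table would likely keep a name index.
  std::optional<uint64_t> addressForName(const std::string &Name) const {
    for (const Symbol &S : Symbols)
      if (S.Name == Name)
        return S.Start;
    return std::nullopt;
  }

private:
  std::vector<Symbol> Symbols;
};
```

For a table holding `foo` at `0x1000` (size `0x20`) and `bar` at `0x2000` (size `0x40`), `nameForAddress(0x1010)` yields `foo`, while `nameForAddress(0x1020)` yields nothing because that address is one past the end of `foo`. The line <-> addr direction could follow the same sorted-range pattern over line-table rows.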

 

>>  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.
>
> Are you thinking of the new utility as something that would naturally live in llvm/tools or as something that would live in the LLDB repository?

I would rather put it under LLVM and then link LLDB against certain pieces in cases where that makes sense.
 

>
> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.

As you are undoubtedly aware we've been carefully rearchitecting LLVM's DWARF parser over the last few years to eventually become featureful enough so that LLDB could use it, so any help on that front would be most welcome. As long as we are careful to not regress in performance/laziness, features and fault-tolerance, deduplicating the implementations can only be good for LLVM and LLDB.

Yea, this is the general idea.  Has anyone actively been working on this specific effort recently?  To my knowledge someone started and then never finished, and the efforts never made it upstream, so my understanding is that it's a goal, but one that nobody has made significant headway on.

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] RFC: libtrace

Robin Eklind via llvm-dev

> On Jun 26, 2018, at 1:38 PM, Zachary Turner <[hidden email]> wrote:
>
>> On Tue, Jun 26, 2018 at 1:28 PM Adrian Prantl <[hidden email]> wrote:
>>
>>> > On Jun 26, 2018, at 11:58 AM, Zachary Turner via llvm-dev <[hidden email]> wrote:
>>> > A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
>>>
>>> As you are undoubtedly aware we've been carefully rearchitecting LLVM's DWARF parser over the last few years to eventually become featureful enough so that LLDB could use it, so any help on that front would be most welcome. As long as we are careful to not regress in performance/laziness, features and fault-tolerance, deduplicating the implementations can only be good for LLVM and LLDB.
>>>
>> Yea, this is the general idea.   Has anyone actively been working on this specific effort recently?  To my knowledge someone started and then never finished, but the efforts also never made it upstream, so my understanding is that it's a goal, but one that nobody has made significant headway on.
>
That's not true. Greg Clayton started the effort in 2016 and landed many of the ground-breaking changes. The design ideas fleshed out during that initial effort (thanks to David Blaikie who spent a lot of time reviewing the new interfaces!), such as improved error handling, were then picked up by the entire team of contributors who worked on DWARF 5 support in LLVM, and we've continued down that path ever since. The greatly improved llvm-dwarfdump was also born out of this effort, for example. We also paid attention that every refactoring of LLDB DWARF parser code would bring it closer to the new LLVM parser interface to narrow the gaps between the implementations.

-- adrian


Re: [llvm-dev] RFC: libtrace

Robin Eklind via llvm-dev
Ahh, thanks.  I thought those changes never landed, but it's good to hear that they did.




Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
One important question: does this tool need to work remotely?

I'm guessing the answer to this is no, since if you are working remotely you won't have a performant enough solution to really be an effective tracer.  And if the guts of the debugger are remote, you care a lot less about the complexity of the remote part.  If you can always debug with the Host platform - in lldb terms - then it really does seem like you want to start with the NativeProcess classes.  That won't get you macOS hosting, but OTOH this would be a good reason to get macOS onto the NativeProcess classes/lldb-server and off of debugserver...

> On Jun 26, 2018, at 1:38 PM, Zachary Turner via lldb-dev <[hidden email]> wrote:
>
>
>
> On Tue, Jun 26, 2018 at 1:28 PM Adrian Prantl <[hidden email]> wrote:
>
>
> > On Jun 26, 2018, at 11:58 AM, Zachary Turner via llvm-dev <[hidden email]> wrote:
> >
> > Hi all,
> >
> > We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
> >
> > We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
> >
> > LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.
>
> Do you have a rough idea of what components specifically the new tool would need to function?
>
> * process & thread control
> * platform agnostic ptrace wrapper (not all platforms even have ptrace, and those that do the usage and capabilities vary quite a bit)
> * install various kinds of traps
> * monitor cpu performance counters

This part is all the job of the NativeProcess classes.  That's not terribly surprising, since their whole reason for being was as a low-level abstraction for process control without any of the higher-level work that lldb does.

> * symbol file parsing
> * symbol resolution (name <-> addr and line <-> addr)

This will involve getting object-file readers and a symbol file reader into your trace tool.  These should be pretty easy to extract from lldb, though you probably don't need the plugin architecture.
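As a rough illustration of the address-to-name half of symbol resolution, here is a minimal Python sketch. It is not code from lldb or LLVM; the symbol table contents and the `SymbolIndex` class are invented for the example, and a real tool would fill the table from an object-file or symbol-file reader.

```python
import bisect

# Hypothetical symbol table: (start_address, size, name) triples.
# A real tool would populate this from an object-file / symbol-file reader.
SYMBOLS = [
    (0x1000, 0x40, "main"),
    (0x1040, 0x20, "helper"),
    (0x2000, 0x100, "compute"),
]


class SymbolIndex:
    """Sorted index supporting name <-> address lookups."""

    def __init__(self, symbols):
        self._symbols = sorted(symbols)
        self._starts = [start for start, _, _ in self._symbols]
        self._by_name = {name: start for start, _, name in self._symbols}

    def name_for_address(self, addr):
        """Return (name, offset) for the symbol covering addr, or None."""
        # Find the last symbol whose start address is <= addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start, size, name = self._symbols[i]
        if addr >= start + size:
            return None  # addr falls in a gap between symbols
        return name, addr - start

    def address_for_name(self, name):
        """Return the start address of the named symbol, or None."""
        return self._by_name.get(name)
```

With this table, `SymbolIndex(SYMBOLS).name_for_address(0x1044)` yields `("helper", 4)`, while an address in the gap after `helper` yields `None`. A line <-> addr mapping could use the same sorted-array approach over line table rows.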

> * unwinding and backtrace generation

Jason says this will be somewhat tricky to pull out of lldb.  OTOH much of the complexity of unwind is reconstructing all the non-volatile registers, and if you don't care about values, you don't really need that.  So some kind of lightweight pc/sp only backtrace would be more appropriate, and probably faster for your needs.

Jim



Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev


> On Jun 26, 2018, at 2:00 PM, Jim Ingham via lldb-dev <[hidden email]> wrote:
>
>
>> * unwinding and backtrace generation
>
> Jason says this will be somewhat tricky to pull out of lldb.  OTOH much of the complexity of unwind is reconstructing all the non-volatile registers, and if you don't care about values, you don't really need that.  So some kind of lightweight pc/sp only backtrace would be more appropriate, and probably faster for your needs.

If it were me & performance were the utmost concern, and I had a restricted platform set that I needed to support where I can assume the presence of eh_frame and that it is trustworthy in prologues/epilogues, then I'd probably just write a simple Unwind/RegisterContext plugin pair that lives exclusively off of that.

If it's just stack walking, and we can assume no omit-frame-pointer code and we can assume the 0th function is always stopped in a non-prologue/epilogue location, then even simpler: the old RegisterContextMacOSXFrameBackchain plugin would get you there.  That's what we used before we had the modern unwind/registercontext plugin that we use today.  It doesn't track spilled registers at all, it just looks for saved pc/framepointer values on the stack.
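To make the backchain idea concrete, here is a minimal Python sketch of a pc/frame-pointer-only stack walk. It is not the RegisterContextMacOSXFrameBackchain code. It assumes an x86-64-like frame layout in which the word at `fp` holds the caller's saved frame pointer and the word at `fp + 8` holds the return address; `read_word` is a hypothetical callback standing in for reading tracee memory.

```python
WORD = 8  # assumed pointer size in bytes


def backchain(read_word, pc, fp, max_frames=64):
    """Walk saved frame pointers and return the list of pc values.

    read_word(addr) returns the word stored at addr, or None if unreadable.
    Assumed layout: [fp] = caller's fp, [fp + WORD] = return address.
    """
    frames = [pc]  # frame 0 is the current pc
    while fp != 0 and len(frames) < max_frames:
        saved_fp = read_word(fp)
        return_pc = read_word(fp + WORD)
        if saved_fp is None or return_pc is None or return_pc == 0:
            break  # unreadable memory or end of the chain
        frames.append(return_pc)
        if saved_fp != 0 and saved_fp <= fp:
            break  # stack grows down, so a caller's frame must be higher
        fp = saved_fp
    return frames


# Simulated stack memory for three nested frames (addresses are made up).
memory = {
    0x7000: 0x7020, 0x7008: 0x1044,  # innermost frame: caller fp, return pc
    0x7020: 0x7040, 0x7028: 0x1010,  # middle frame
    0x7040: 0x0,    0x7048: 0x0,     # outermost frame ends the chain
}
```

Calling `backchain(memory.get, pc=0x2004, fp=0x7000)` walks two saved frames and returns three pc values. The sanity check on `saved_fp` guards against looping forever on a corrupted or cyclic chain.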


A general problem with stopping the inferior process and examining things is that it is slow.  Even if you use a NativeHost approach and get debugserver/lldb-server out of the equation, if you stop in a hot location it's very difficult to make this performant.  We've prototyped things like this in the past and it was always far too slow.  I don't know what your use case looks like, but I do worry about having one process control an inferior process for fast-turnaround data collection/experiments in general; it doesn't seem like the best way to go about it.






Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev
Yes that’s what I’ve been thinking about as well.

One thing I’ve been giving a lot of thought to is whether to serialize the handling of trace events.  I want to balance the “this is a library and you should be able to get it to work for you no matter what your use case is” aspect with the “you really just don’t want to go there, we know what’s best for you” aspect.  Then there’s the fact that not all platforms behave the same, but we’d like a consistent set of expectations that makes it easy to use for everyone.

So I’m leaning towards having the library serialize all trace events, because it’s a nice common denominator that every platform can implement.

To be clear though, I don’t mean that if 2 processes are being traced simultaneously and A stops followed by B stopping, then the tool will necessarily block before handling B’s stop.  I just mean that A and B’s stop handlers will be invoked on a single thread (not the threads which are tracing A or B).

So A stops, posts its stop event on the blessed thread and waits.  Then B stops and does the same thing.  A’s handler runs, for whatever reason decides it will continue later, saves off the event somewhere, then processes B’s.  Later something happens, it decides to continue A, signals A’s thread which wakes up.

I think this kind of design eliminates a large class of race conditions without sacrificing any performance.  
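The scheme described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not a proposed API: `StopEvent`, `tracer_thread`, and `handler_loop` are hypothetical names, and each tracer thread merely simulates a stop instead of calling ptrace.

```python
import queue
import threading


class StopEvent:
    """A stop posted by a tracer thread; the poster blocks until resumed."""

    def __init__(self, process_name, reason):
        self.process_name = process_name
        self.reason = reason
        self._resume = threading.Event()

    def resume(self):
        self._resume.set()

    def wait_for_resume(self):
        self._resume.wait()


def tracer_thread(name, events, log):
    # Stand-in for a thread that would be blocked waiting on one tracee.
    event = StopEvent(name, "breakpoint")
    events.put(event)          # post the stop to the single handler thread
    event.wait_for_resume()    # wait until the handler decides to continue
    log.append(("resumed", name))


def handler_loop(events, expected, log):
    # All stop handlers run here, one at a time, on a single thread.
    deferred = []
    for _ in range(expected):
        event = events.get()
        log.append(("handled", event.process_name))
        if event.process_name == "A":
            deferred.append(event)  # decide to continue A later
        else:
            event.resume()
    for event in deferred:
        event.resume()


def run_demo():
    events, log = queue.Queue(), []
    threads = [threading.Thread(target=tracer_thread, args=(n, events, log))
               for n in ("A", "B")]
    threads.append(threading.Thread(target=handler_loop, args=(events, 2, log)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

In this sketch A's stop is saved off and resumed only after B's has been handled, without B's handling being blocked behind A's continuation.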

LLDB doesn’t currently work like this, but it would be nice not to end up with another split similar to the dwarf split, so I’m curious if you can think of any fundamental assumptions of LLDB’s architecture that this would violate.  This way we’d at least know that it’s possible to use the API in lldb (assuming it does everything lldb needs, obviously).

Thoughts?

On Tue, Jun 26, 2018 at 1:09 PM Jim Ingham <[hidden email]> wrote:
You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle.  I don't think you'd need much else?

Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger.  That seems more appropriate for a lightweight tool.

Jim


> On Jun 26, 2018, at 12:59 PM, Jim Ingham via lldb-dev <[hidden email]> wrote:
>
> So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?
>
> Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?
>
> Jim
>
>
>> On Jun 26, 2018, at 12:48 PM, Zachary Turner <[hidden email]> wrote:
>>
>> no expression parser or knowledge of any specific programming language.
>>
>> Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope.  For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's dwarf parsing code.  Similarly, LLDB's type system could be built on top of it as well.  Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view, it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.
>>
>> It will probably be useful someday to have an expression parser and language specific type system, but when that comes I don't think we'd want anything radically different than what LLDB already has.
>>
>> On Tue, Jun 26, 2018 at 12:26 PM Jim Ingham <[hidden email]> wrote:
>> Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical?  For instance, adding a TypeSystem and its DWARF parser for C family languages that uses a different underlying representation than Clang AST's to store the results would be a lot of work that wouldn't be terribly interesting to lldb.  I don't think that's what you meant, but wanted to be sure.
>>
>> Jim
>>
>>> On Jun 26, 2018, at 11:58 AM, Zachary Turner via lldb-dev <[hidden email]> wrote:
>>>
>>> Hi all,
>>>
>>> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
>>>
>>> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
>>>
>>> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.
>>>
>>> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
>>>
>>> Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this?  Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
>>>
>>> Thanks,
>>> Zach



Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
On Wed, 27 Jun 2018 at 01:14, Zachary Turner via lldb-dev
<[hidden email]> wrote:

>
> Yes that’s what I’ve been thinking about as well.
>
> One thing I’ve been giving a lot of thought to is whether to serialize the handling of trace events.  I want to balance the “this is a library and you should be able to get it to work for you no matter what your use case is” aspect with the “you really just don’t want to go there, we know what’s best for you” aspect.  Then there’s the  fact that not all platforms behave the same, but we’d like a consistent set of expectations that makes it easy to use for everyone.
>
> So I’m leaning towards having the library serialize all trace events, because it’s a nice common denominator that every platform can implement.
>
> To be clear though, I don’t mean that if 2 processes are being traced simultaneously and A stops followed by B stopping, then the tool will necessarily block before handling  B’s stop.  I just mean that A and B’s stop handlers will be invoked on a single thread (not the threads which are tracing  A or B).
>
> So A stops, posts its stop event on the blessed thread and waits.  Then B stops and does the same thing.  A’s handler runs, for whatever reason decides it will continue later, saves off the event somewhere, then processes B’s.  Later something happens, it decides to continue A, signals A’s thread which wakes up.
>
> I think this kind of design eliminates a large class of race conditions without sacrificing any performance.
>

Does this mean that you will always have to have at least two threads
(the one doing the tracing and the one where stop handlers are
invoked)? Because if that's true, then I'm not sure I buy the
no-performance-sacrifice part. Given that with ptrace (on Linux at
least, but I think that holds for some other OSs too), all debugging
operations have to happen on a specific thread, if that thread is not
the one where the core logic happens, you will have to do a lot of
ping-pong to do all the debugging operations (read/write
registers/memory, set breakpoints, etc.). Of all the use cases, the
one where this matters most may actually be yours -- I'm not sure I
understand it fully, but if the goal is to have as little impact on the
traced process as possible, then this is going to be a problem, because
every microsecond you spend context-switching between these two threads
is a microsecond when the target process is not executing. In
lldb-server we avoid these context switches (and race conditions!) by
being single threaded. I think it would be good to keep things this way
by having the new API (the lowest layers of it?) accessible in a
single-threaded manner, at least on platforms where this is possible
(everything except Windows, I guess).

Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
Suppose process A (single threaded) is tracing process B (2 threads). If trace events happen on both threads of B, then the second thread can’t continue until both threads’ trace events have been fully handled, synchronously. If process A has a second thread though, the tracer thread can enqueue work via a lock-free queue (or worst case scenario, a mutex), and continue immediately. So it seems like less overhead this way.

That said, there seems to be no harm in exposing the lowest levels of the API with all of their OS-specific quirks, and a layer could be built on top that standardizes the assumptions and requirements.


Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
On Wed, 27 Jun 2018 at 14:11, Zachary Turner <[hidden email]> wrote:
>
> suppose process A (single threaded) is tracing process B (2 threads). If trace events happen on both threads of B, then the second thread can’t continue until both threads’ trace events have been fully handled, synchronously. If process A has a second thread though, the tracer thread can enqueue work via a lock free queue (or worst case scenario, a mutex), and continue immediately. So it seems less overhead this way.

I think that depends a lot on the use case. I can certainly see how
multithreading would be beneficial if you have a lot of work to do
that can be done asynchronously. However, there isn't a ton of other
work in lldb-server. We always either wait for the process to stop (in
which case we quickly want to gather information and notify the client
about that), or we wait for a command from the client (in which case
we want to quickly execute it). Threading doesn't help either of those
cases.

If your tracer is doing some kind of asynchronous processing of the
process events (during which the inferior process can continue
running) then offloading that to a background thread makes sense.
However, even in that case, I think that the collection of the actual
data needed for the background processing would be best done
synchronously on the ptrace thread because: a) there will be no
context switching involved; b) it allows the client to specify exactly
the kind of data it wants to collect. Then the collected data can be
enqueued to the background thread or whatever, but this does not need
to be too tightly integrated with the core APIs.
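A minimal Python sketch of that split, under stated assumptions: `collect_snapshot`, `background_worker`, and `trace` are hypothetical names, and each dictionary of registers stands in for one stop of a real tracee. The data is copied synchronously on the tracing thread, and only the slower processing is handed to a background thread.

```python
import queue
import threading


def collect_snapshot(registers):
    # Runs on the tracing thread while the tracee is stopped.
    # Copies only the values this (hypothetical) client asked for.
    return {"pc": registers["pc"], "sp": registers["sp"]}


def background_worker(work, results):
    # Slow processing happens here, off the tracing thread.
    while True:
        snapshot = work.get()
        if snapshot is None:  # sentinel: no more stops
            break
        results.append("pc=0x%x sp=0x%x" % (snapshot["pc"], snapshot["sp"]))


def trace(stops):
    """Process a sequence of simulated stops; return the formatted results."""
    work, results = queue.Queue(), []
    worker = threading.Thread(target=background_worker, args=(work, results))
    worker.start()
    for registers in stops:
        # Collect synchronously, enqueue, then the tracee could be resumed.
        work.put(collect_snapshot(registers))
    work.put(None)
    worker.join()
    return results
```

Because the queue only ever carries plain copied data, it can live entirely outside the core tracing API, which stays usable from a single thread.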

Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
In reply to this post by Robin Eklind via llvm-dev

> On Jun 26, 2018, at 5:14 PM, Zachary Turner <[hidden email]> wrote:
>
> Yes that’s what I’ve been thinking about as well.
>
> One thing I’ve been giving a lot of thought to is whether to serialize the handling of trace events.  I want to balance the “this is a library and you should be able to get it to work for you no matter what your use case is” aspect with the “you really just don’t want to go there, we know what’s best for you” aspect.  Then there’s the  fact that not all platforms behave the same, but we’d like a consistent set of expectations that makes it easy to use for everyone.
>
> So I’m leaning towards having the library serialize all trace events, because it’s a nice common denominator that every platform can implement.
>
> To be clear though, I don’t mean that if 2 processes are being traced simultaneously and A stops followed by B stopping, then the tool will necessarily block before handling  B’s stop.  I just mean that A and B’s stop handlers will be invoked on a single thread (not the threads which are tracing  A or B).  
>
> So A stops, posts its stop event on the blessed thread and waits.  Then B stops and does the same thing.  A’s handler runs, for whatever reason decides it will continue later, saves off the event somewhere, then processes B’s.  Later something happens, it decides to continue A, signals A’s thread which wakes up.
>
> I think this kind of design eliminates a large class of race conditions without sacrificing any performance.  
>
> LLDB doesn’t currently work like this, but it would be nice not to end up with another split similar to the dwarf split, so I’m curious if you can think of any fundamental assumptions of LLDB’s architecture that this would violate.  This way we’d at least know that it’s possible to use the api in lldb (assuming it does everything lldb needs obviously)

What you describe is actually pretty much how the lldb driver works.  Every time the lower levels of the Process class (e.g. ProcessGDBRemote) notice something interesting happening to the process they are managing, they post an event to the Listener in charge of driving that process.  The process is then left either stopped or running, depending on the event (the event records whether a restart has occurred).  The upper levels only learn what happened to the process when they fetch an event off the event queue.  For a single process, that serializes the reporting of process state.

As to multiple processes, you can decide whether you want to serialize all the process events using the same mechanism or not, depending on your use case.

In the lldb driver, there's one Listener that waits on all processes (the Debugger's listener).  These events all get effectively serialized in its event loop.  So if you were just straight using lldb classes you could trivially implement what you want to achieve.
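
To make the serialization idea concrete, here is a minimal sketch of many tracing threads posting stop events to a single queue that is drained on one thread. It is not LLDB's Listener API; `StopEvent`, `EventQueue`, and `handleStopsSerially` are hypothetical names chosen for illustration, and the "tracing threads" only simulate a stop.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stop event; not an LLDB type.
struct StopEvent {
  int pid;
  std::string reason;
};

// Thread-safe queue: many tracing threads post, one handler thread drains.
class EventQueue {
public:
  void post(StopEvent ev) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      events_.push(std::move(ev));
    }
    cv_.notify_one();
  }

  // Blocks until an event is available, then returns it.
  StopEvent wait() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return !events_.empty(); });
    StopEvent ev = std::move(events_.front());
    events_.pop();
    return ev;
  }

private:
  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<StopEvent> events_;
};

// Simulates one tracing thread per pid, each posting a single stop event.
// All events are handled on the calling thread, one at a time.
// Returns the pids in the order in which their stops were handled.
std::vector<int> handleStopsSerially(const std::vector<int> &pids) {
  EventQueue queue;
  std::vector<std::thread> tracers;
  for (int pid : pids)
    tracers.emplace_back([&queue, pid] { queue.post({pid, "breakpoint"}); });

  std::vector<int> handled;
  for (std::size_t i = 0; i < pids.size(); ++i)
    handled.push_back(queue.wait().pid); // handlers run only on this thread

  for (auto &t : tracers)
    t.join();
  return handled;
}
```

The order in which stops arrive depends on thread scheduling, but every handler runs on the one draining thread, so handlers never race with each other.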

That being said, I don't think you want to use lldb's process event system for your ptracer.  It has a lot of complexity to support handling reactions to events (breakpoint commands and conditions) that have to operate in the same context as user commands even though they happen before the user has regained control, and which might or might not restart the process out from under you.  It also manages the task of concealing the vast majority of stops from the higher level clients, for instance to pretend that a single "source line step over" didn't actually require lots of stops and starts.  I don't think anything you have described requires handling either of these tasks.

But you could use the general event system to serialize the reporting without hooking into the lldb private/public state thread system.

Jim

>
> Thoughts?
>
> On Tue, Jun 26, 2018 at 1:09 PM Jim Ingham <[hidden email]> wrote:
> You'd probably need to pull the Unwinder in if you want backtraces, but that part shouldn't be that hard to disentangle.  I don't think you'd need much else?
>
> Basing your work on NativeProcess rather than lldb proper would also cut the number of observer processes in half and avoid the context switches between the server and the debugger.  That seems more appropriate for a lightweight tool.
>
> Jim
>
>
> > On Jun 26, 2018, at 12:59 PM, Jim Ingham via lldb-dev <[hidden email]> wrote:
> >
> > So you aren't planning to print values at all, just stop points (i.e. you are only interested in the line table and function symbols part of DWARF)?
> >
> > Given what you've described so far, I'm wondering if what you really want is the NativeProcess classes with some symbol-file reading pulled in? Is there anything that you couldn't do from there?
> >
> > Jim
> >
> >
> >> On Jun 26, 2018, at 12:48 PM, Zachary Turner <[hidden email]> wrote:
> >>
> >> no expression parser or knowledge of any specific programming language.
> >>
> >> Basically I just mean that the parsing of the native DWARF format itself is in scope, but anything beyond that is out of scope.  For symbolication we have things like llvm-symbolizer that already just work and are built on top of LLVM's dwarf parsing code.  Similarly, LLDB's type system could be built on top of it as well.  Given that I think everyone mostly agrees that unifying on one DWARF parser is a good idea in principle, this would mean no functional change from LLDB's point of view, it would just continue to do exactly what it does regarding parsing C++ expressions and converting these into types that clang understands.
> >>
> >> It will probably be useful someday to have an expression parser and language specific type system, but when that comes I don't think we'd want anything radically different than what LLDB already has.
> >>
> >> On Tue, Jun 26, 2018 at 12:26 PM Jim Ingham <[hidden email]> wrote:
> >> Just to be clear, by "no clang integration" do you mean "no expression parser" or do you mean something more radical?  For instance, adding a TypeSystem and its DWARF parser for C family languages that uses a different underlying representation than Clang AST's to store the results would be a lot of work that wouldn't be terribly interesting to lldb.  I don't think that's what you meant, but wanted to be sure.
> >>
> >> Jim
> >>
> >>> On Jun 26, 2018, at 11:58 AM, Zachary Turner via lldb-dev <[hidden email]> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> We have been thinking internally about a lightweight llvm-based ptracer.  To address one question up front: the primary way in which this differs from LLDB is that it targets a more narrow use case -- there is no scripting support, no clang integration, no dynamic extensibility, no support for running jitted code in the target, and no user interface.  We have several use cases internally that call for varying levels of functionality from such a utility, and being able to use as little as possible of the library as is necessary for the given task is important for the scale in which we wish to use it.
> >>>
> >>> We are still in early discussions and planning, but I think this would be a good addition to the LLVM upstream.  Since we’re approaching this as a set of small isolated components, my thinking is to work on this completely upstream, directly under the llvm project (as opposed to making a separate subproject), but I’m open to discussion if anyone feels differently.
> >>>
> >>> LLDB has solved a lot of the difficult problems needed for such a tool.  So in the spirit of code reuse, we think it’s worth trying componentize LLDB by sinking pieces into LLVM and rebasing LLDB as well as these smaller tools on top of these components, so that smaller tools can reduce code duplication and contribute to the overall health of the code base.  At the same time we think that in doing so we can break things up into more granular pieces, ultimately exposing a larger testing surface and enabling us to create exhaustive tests, giving LLDB more fine grained testing of important subsystems.
> >>>
> >>> A good example of this would be LLDB’s DWARF parsing code, which is more featureful than LLVM’s but has kind of evolved in parallel.  Sinking this into LLVM would be one early target of such an effort, although over time there would likely be more.
> >>>
> >>> Anyone have any thoughts / strong opinions on this proposal, or where the code should live?  Also, does anyone have any suggestions on things they’d like to see come out of this?  Whether it’s a specific new tool, new functionality to an existing tool, an architectural or design change to some existing tool or library, or something else entirely, all feedback and ideas are welcome.
> >>>
> >>> Thanks,
> >>> Zach
> >>>
> >>> _______________________________________________
> >>> lldb-dev mailing list
> >>> [hidden email]
> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
> >>
> >
> > _______________________________________________
> > lldb-dev mailing list
> > [hidden email]
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>


Re: [llvm-dev] [lldb-dev] RFC: libtrace

Robin Eklind via llvm-dev
The major difference between the way lldb works now and what a simple tracer library is going to want w.r.t. reporting events is that lldb currently assumes either

(a) that control gets returned to the user with the process stopped, and they will examine it at their leisure, or

(b) that the user restarted the process after gathering whatever data they needed out of band in the command callback before letting it go again.

So the stop events themselves carry very little data.  

If you decide to go with an event-based approach, you'll need to gather everything you want to be reported up, pack that into the event, and then let the process restart and post the event.  That would be pretty easy to do with the lldb events as currently constituted.  Event types are pretty flexible and also use the llvm isa stuff, so you can cast events to your special ptrace event type and call whatever reporting methods you want to add.

Doing it that way, you could for instance write another lldb-server-like tool that runs a bunch of NativeProcess sessions, the difference being that they post events rather than sending out gdb-remote T packets when they stop.
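
As a rough illustration of the "pack everything into the event" approach, the sketch below defines an event subclass that carries its own data and is recovered through a kind tag and a `classof` check, in the style of LLVM's hand-rolled RTTI. It is self-contained and does not use LLVM or LLDB headers: `Event`, `PtraceStopEvent`, `dynCast`, and `report` are hypothetical names, and in-tree code would presumably use `llvm::isa` / `llvm::dyn_cast` from `llvm/Support/Casting.h` instead of the stand-in helper.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Base event with a kind tag, mimicking LLVM-style RTTI (assumed layout).
class Event {
public:
  enum class Kind { Generic, PtraceStop };
  explicit Event(Kind kind) : kind_(kind) {}
  virtual ~Event() = default;
  Kind getKind() const { return kind_; }

private:
  Kind kind_;
};

// Hypothetical event type: everything the client wants reported is captured
// while the process is stopped, so the process can be restarted before the
// event is consumed.
class PtraceStopEvent : public Event {
public:
  PtraceStopEvent(int pid, std::uint64_t pc,
                  std::vector<std::uint64_t> backtrace)
      : Event(Kind::PtraceStop), pid_(pid), pc_(pc),
        backtrace_(std::move(backtrace)) {}

  // The hook that isa/dyn_cast-style helpers query.
  static bool classof(const Event *e) {
    return e->getKind() == Kind::PtraceStop;
  }

  std::string describe() const {
    return "pid " + std::to_string(pid_) + " stopped at pc " +
           std::to_string(pc_) + " with " +
           std::to_string(backtrace_.size()) + " frames";
  }

private:
  int pid_;
  std::uint64_t pc_;
  std::vector<std::uint64_t> backtrace_;
};

// Stand-in for llvm::dyn_cast: returns nullptr when the kind does not match.
template <typename To> const To *dynCast(const Event *e) {
  return (e && To::classof(e)) ? static_cast<const To *>(e) : nullptr;
}

// Consumer side: cast to the special event type and call its reporting method.
std::string report(const Event &ev) {
  if (const auto *stop = dynCast<PtraceStopEvent>(&ev))
    return stop->describe();
  return "unhandled event";
}
```

Because the event owns a copy of the data, the consumer never has to touch the traced process, which by then may already be running again.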

Jim
 

