[llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

[llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev
[Resending due to accidental markdown rendering - sorry]

Hi list,

Let's talk about adding a new type of debug info metadata. Here are the steps (at minimum - probably incomplete) one needs to take:

1. Create a new class in the hierarchy
2. Implement two forms of `MD_NODE_GET`
3. Specialize `MDNodeKeyImpl`
4. Modify `LLParser.cpp` and add serialization code for your special type
5. Modify `AsmWriter.cpp` and add serialization code for your special type 

I believe we can accomplish everything needed for debug info with just step 1 using a pattern found in Boost Serialization. Imagine a new API based on this concept:

```
class DIMyFancyType : public DINode {
  StringRef FileName;

public:
  template<typename Visitor>
  void visit(Visitor & v) {
     DINode::visit(v); // or not, if you stay true to boost.serialization
     v.name("DIMyFancyType");
     v.property("FileName", FileName);
  }
};
```

With this, we could implement steps 2-5 using a little bit of template meta-programming and we could also implement escape hatches where needed to get more specific, allowing us to keep many things in one place.
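
To make the idea concrete, here is a minimal, self-contained sketch of how a single `visit` definition could drive both text output and hashing. It is only an illustration and does not use any LLVM headers: `MyFancyType`, `TextWriter`, `Hasher`, `toText` and `hashOf` are hypothetical names, `std::string` stands in for `StringRef`, and the printed syntax only loosely resembles LLVM assembly.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>

// Hypothetical node type; std::string stands in for LLVM's StringRef.
struct MyFancyType {
  std::string FileName;
  unsigned Line;

  // The single description of the node that every visitor shares.
  template <typename Visitor> void visit(Visitor &V) {
    V.name("DIMyFancyType");
    V.property("FileName", FileName);
    V.property("Line", Line);
  }
};

// Visitor 1: renders the node as text.
class TextWriter {
  std::ostringstream OS;
  bool First = true;

  void key(const char *K) {
    OS << (First ? "" : ", ") << K << ": ";
    First = false;
  }

public:
  void name(const char *N) { OS << '!' << N << '('; }
  void property(const char *K, const std::string &S) {
    key(K);
    OS << '"' << S << '"';
  }
  void property(const char *K, unsigned U) {
    key(K);
    OS << U;
  }
  std::string str() const { return OS.str() + ")"; }
};

// Visitor 2: folds every property into one hash value.
class Hasher {
  std::size_t H = 0;
  void combine(std::size_t X) { H = H * 31 + X; }

public:
  void name(const char *) {}
  void property(const char *, const std::string &S) {
    combine(std::hash<std::string>{}(S));
  }
  void property(const char *, unsigned U) { combine(std::hash<unsigned>{}(U)); }
  std::size_t value() const { return H; }
};

template <typename Node> std::string toText(Node N) {
  TextWriter W;
  N.visit(W);
  return W.str();
}

template <typename Node> std::size_t hashOf(Node N) {
  Hasher H;
  N.visit(H);
  return H.value();
}

// Usage: both results come from the same visit() definition.
inline void example() {
  MyFancyType N{"a.c", 7};
  assert(toText(N) == "!DIMyFancyType(FileName: \"a.c\", Line: 7)");
  assert(hashOf(N) == hashOf(MyFancyType{"a.c", 7}));
}
```

A parser would be a third visitor whose `property` overloads take non-const references and fill the members in, which is why `visit` is not `const` in this sketch.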

I imagine since there is now a `.def` file for the metadata (very useful!) that this is on somebody's mind and not just my own, so I'm curious about what people think. I realize that "new forms of debug metadata" is possibly not a very popular use case as there has been only one new kind added in the last few years. However, in my humble opinion, it would make it easier to add richer information allowing those of us extending LLVM to create better debuggers/debugging experiences.

Thanks for your time!

--
Sohail Somani
Fizz Buzz Inc.


_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev
+some of the debug info cabal (& Duncan, as an emeritus member, and person who plumbed a lot of the current debug info syntax support in)

Visitor seems plausible though I haven't looked at the code in detail to see if it'd work perfectly.

On Tue, May 29, 2018 at 7:56 AM Sohail Somani (Fizz Buzz Inc.) via llvm-dev <[hidden email]> wrote:
[snip]

Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev


On May 29, 2018, at 12:28 PM, David Blaikie <[hidden email]> wrote:

[snip]

Something (anything) along these lines seems like a good idea to me. Besides lowering the cost of adding new nodes, having less repetitive hand-written code reduces the chance of bugs and improves readability. There are some irregularities in the existing code that I'm aware of that would still need to be handled separately:
- The MDNodeKeyImpl currently is manually tuned to only hash members that are likely to differ.
- The deserialization code also supports various older serialization formats.

-- adrian



Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev


On May 29, 2018, at 12:55, Adrian Prantl <[hidden email]> wrote:



[snip]

SGTM!  There was a hope (not quite a plan) that these would eventually be tablegen'ed, but visitor sounds fine to me too.

> Something (anything) along these lines seems like a good idea to me. In addition to the cost of adding new nodes, having less repetitive manually-written existing code reduces the chances for bugs and increases readability. There are some irregularities in the existing code that I'm aware of that would need to be still handled separately:
> - The MDNodeKeyImpl currently is manually tuned to only hash members that are likely to differ.
> - The deserialization code also supports various older serialization formats.

These are the two corner cases I was thinking of.



Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev
Thanks, all, for your responses.

On Tue, May 29, 2018, at 5:38 PM, Duncan P. N. Exon Smith wrote:

> SGTM!  There was a hope (not quite a plan) that these would eventually be tablegen'ed, but visitor sounds fine to me too.
>
>> Something (anything) along these lines seems like a good idea to me. In addition to the cost of adding new nodes, having less repetitive manually-written existing code reduces the chances for bugs and increases readability. There are some irregularities in the existing code that I'm aware of that would need to be still handled separately:
>> - The MDNodeKeyImpl currently is manually tuned to only hash members that are likely to differ.
>> - The deserialization code also supports various older serialization formats.
>
> These are the two corner cases I was thinking of.

I had these in mind as well. In my experience with the template visitor pattern for this kind of thing, the API can usually be annotated to carry that extra metadata (metadata-ception).

For example, the concept of members necessary for uniquifying an instance via a hash will map nicely to the concept of "composite primary keys" which has a clean solution that I like[1], and could be used as inspiration. Or even something as simple as:

template<typename Visitor>
void visit(Visitor & v) {
   v.keys("Property1","Property2","Property3");
}
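
As a rough, self-contained sketch of what a `keys` call could mean to a hashing visitor: the visitor below remembers the key names and then only mixes in properties whose names were listed. `MyNode`, `KeyHasher` and `hashKeys` are hypothetical names for illustration, not LLVM APIs, and the name lookup happens at run time here rather than through constexpr.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <initializer_list>
#include <set>
#include <string>

// Hypothetical node with three properties, two of which form the key.
struct MyNode {
  std::string Name;
  std::string File;
  unsigned Line;

  template <typename Visitor> void visit(Visitor &V) {
    V.keys("Name", "File"); // Line is deliberately left out of the key
    V.property("Name", Name);
    V.property("File", File);
    V.property("Line", Line);
  }
};

// Hashing visitor that ignores every property not named in keys().
class KeyHasher {
  std::set<std::string> Keys;
  std::size_t H = 0;
  unsigned Hashed = 0;

  template <typename T> void mix(const char *K, const T &V) {
    if (Keys.count(K) == 0)
      return; // not part of the key: skip it
    H = H * 31 + std::hash<T>{}(V);
    ++Hashed;
  }

public:
  template <typename... Names> void keys(Names... N) {
    for (const char *K : {N...})
      Keys.emplace(K);
  }
  void property(const char *K, const std::string &S) { mix(K, S); }
  void property(const char *K, unsigned U) { mix(K, U); }
  std::size_t value() const { return H; }
  unsigned hashedCount() const { return Hashed; }
};

template <typename Node> KeyHasher hashKeys(Node N) {
  KeyHasher H;
  N.visit(H);
  return H;
}

// Usage: nodes that differ only in a non-key property hash identically.
inline void example() {
  assert(hashKeys(MyNode{"n", "a.c", 1}).value() ==
         hashKeys(MyNode{"n", "a.c", 99}).value());
  assert(hashKeys(MyNode{"n", "a.c", 1}).hashedCount() == 2);
}
```

Other visitors (printing, parsing) would simply provide a `keys` that does nothing.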

I have never done this, but I expect we could also use constexpr to unroll many things at compile-time. That could be fun!

Backward compatibility for serialization is also achievable using a method very similar to Boost serialization[2]:

template<typename Visitor>
void visit(Visitor & v) {
   v.keys("Version1Property");
   if(v.dbgInfoVersion >= 2) // when writing, dbgInfoVersion is always the latest
      v.property("Version2Property",Version2Property);
   if(v.dbgInfoVersion >= 1) // when reading, it is the version of the input
      v.property("Version1Property",Version1Property);
}
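
Here is a small, self-contained sketch of that versioning scheme, under the assumption that a serialized node is just a string-to-string map. `Record`, `Writer`, `Reader`, `writeRecord`, `readRecord` and `LatestVersion` are hypothetical names; only `dbgInfoVersion` and the property names come from the snippet above.

```cpp
#include <cassert>
#include <map>
#include <string>

constexpr unsigned LatestVersion = 2; // hypothetical current format version

struct MyNode {
  std::string Version1Property;
  std::string Version2Property;

  template <typename Visitor> void visit(Visitor &V) {
    if (V.dbgInfoVersion >= 1)
      V.property("Version1Property", Version1Property);
    if (V.dbgInfoVersion >= 2)
      V.property("Version2Property", Version2Property);
  }
};

// Stand-in for a serialized node.
using Record = std::map<std::string, std::string>;

// Writing always uses the latest version.
struct Writer {
  unsigned dbgInfoVersion = LatestVersion;
  Record Out;
  void property(const char *K, const std::string &V) { Out[K] = V; }
};

// Reading uses whatever version the input was written with.
struct Reader {
  unsigned dbgInfoVersion;
  const Record &In;
  void property(const char *K, std::string &V) { V = In.at(K); }
};

inline Record writeRecord(MyNode N) {
  Writer W;
  N.visit(W);
  return W.Out;
}

inline MyNode readRecord(const Record &R, unsigned Version) {
  MyNode N{"", "<default>"}; // fields missing from old input keep defaults
  Reader Rd{Version, R};
  N.visit(Rd);
  return N;
}

// Usage: a version-1 record still loads; the newer field keeps its default.
inline void example() {
  Record Old{{"Version1Property", "x"}};
  MyNode N = readRecord(Old, 1);
  assert(N.Version1Property == "x");
  assert(N.Version2Property == "<default>");
}
```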

If either of the above two approaches doesn't work out, TableGen'ing is a viable option as well.

Anything sound completely out of whack here? Any other use cases?

[1] https://www.webtoolkit.eu/wt/doc/reference/html/structWt_1_1Dbo_1_1dbo__traits.html
[2] https://www.boost.org/doc/libs/1_67_0/libs/serialization/doc/tutorial.html#versioning


Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev


> On May 29, 2018, at 15:33, Sohail Somani (Fizz Buzz Inc.) <[hidden email]> wrote:
> [snip]
>
> Backward compatibility for serialization is also achievable using a method very similar to Boost serialization[2]:
>
> [snip]

As a heads up, bitcode records tend to be versioned somewhat implicitly based on their structure (e.g., number of fields).
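
For readers unfamiliar with that style, the following is a simplified, self-contained illustration of structure-based versioning: the reader infers the format from how many fields a record has instead of from an explicit version number. The record layout, `Parsed` and `parseRecord` are made up for this sketch and do not correspond to any actual LLVM record.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Parsed {
  std::uint64_t Line;
  std::uint64_t Column;
  bool HasColumn;
};

// Hypothetical layout: [Line] in the old format, [Line, Column] in the new
// one. The number of fields is the only version information available.
inline bool parseRecord(const std::vector<std::uint64_t> &Record, Parsed &Out) {
  if (Record.size() < 1 || Record.size() > 2)
    return false; // malformed record
  Out.Line = Record[0];
  Out.HasColumn = Record.size() >= 2;
  Out.Column = Out.HasColumn ? Record[1] : 0;
  return true;
}

// Usage: old and new records go through the same entry point.
inline void example() {
  Parsed P;
  assert(parseRecord({42}, P) && P.Line == 42 && !P.HasColumn);
  assert(parseRecord({42, 7}, P) && P.Column == 7 && P.HasColumn);
}
```

A visitor-based reader would need some way to express "this property exists only if the record has at least N fields", which is harder to state than a single version comparison.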

>
> If either of the above two use cases don't work out, TableGen'ing is a viable option as well.

Or we can leave some code written explicitly (like reading bitcode, which seems the hardest to deal with).

Re: [llvm-dev] Can creating new forms of debug info metadata be simplified? [formatting fixed]

韩玉 via llvm-dev
On Tue, May 29, 2018, at 7:41 PM, Duncan P. N. Exon Smith wrote:
> As a heads up, bitcode records tend to be versioned somewhat implicitly
> based on their structure (e.g., number of fields).

Yes, this is true. You want to do that to reduce space usage. Intuitively, I think we can work around this using similar logic, but it's something I'd have to think about some more. Alternatively, we could add some kind of prefix field for the tag version.

> > If either of the above two use cases don't work out, TableGen'ing is a viable option as well.
>
> Or we can leave some code written explicitly (like reading bitcode,
> which seems the hardest to deal with).

I think if this were to be done, v1 would be designed to handle the 80% use case, with an escape hatch to use the current approach. So the complex cases (there is some hairy stuff) would continue to be written as is today.
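
One possible shape for such an escape hatch, sketched here with plain tag dispatch: a trait that is false by default (use `visit`) and can be specialized for a node that needs hand-written code. All names below (`CustomWriter`, `serialize`, `SimpleNode`, `HairyNode`, `TextWriter`) are hypothetical, and this is just one of several ways it could be done.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Toy visitor that renders properties as "key=value;".
struct TextWriter {
  std::string Out;
  void property(const char *K, const std::string &V) {
    Out += K;
    Out += '=';
    Out += V;
    Out += ';';
  }
};

// Escape-hatch trait: false by default, meaning "generate from visit()".
template <typename Node> struct CustomWriter : std::false_type {};

// The common case: the node only describes itself.
struct SimpleNode {
  std::string File;
  template <typename Visitor> void visit(Visitor &V) {
    V.property("File", File);
  }
};

// A complex case that keeps its hand-written serialization.
struct HairyNode {
  std::string Payload;
};

template <> struct CustomWriter<HairyNode> : std::true_type {
  static std::string write(const HairyNode &N) { return "hairy:" + N.Payload; }
};

template <typename Node> std::string serializeImpl(Node &N, std::true_type) {
  return CustomWriter<Node>::write(N); // hand-written path
}

template <typename Node> std::string serializeImpl(Node &N, std::false_type) {
  TextWriter W;
  N.visit(W); // generated path
  return W.Out;
}

template <typename Node> std::string serialize(Node N) {
  return serializeImpl(N, typename CustomWriter<Node>::type());
}

// Usage: callers do not need to know which path a node takes.
inline void example() {
  assert(serialize(SimpleNode{"a.c"}) == "File=a.c;");
  assert(serialize(HairyNode{"xyz"}) == "hairy:xyz");
}
```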

Still, I don't want anyone to voluntarily have to write/read bitcode again, myself included, or perhaps especially myself :)

Sohail