debug metadata incomplete for array arguments to functions?

eliben
Hello,

Consider the following two functions:

void foo(int* arg_ptr) {
...
}

void bar(int arg_arr[42]) {
...
}

------

According to the C standard, both arguments will be passed to the function as pointers. In the debug metadata generated by LLVM, they also appear to be equivalent.
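
A minimal sketch of that equivalence, written as C++11 so the types can be checked at compile time. The helper names `foo_sees_pointer` and `bar_sees_pointer` are made up for illustration and return `bool` instead of `void`; the parameter lists match the two functions above.

```cpp
#include <type_traits>

// Hypothetical variants of foo and bar that report which type the
// function body sees for its parameter.
constexpr bool foo_sees_pointer(int* arg_ptr) {
    return std::is_same<decltype(arg_ptr), int*>::value;
}

constexpr bool bar_sees_pointer(int arg_arr[42]) {
    // The array declarator is adjusted to int*; the bound 42 is discarded.
    return std::is_same<decltype(arg_arr), int*>::value;
}

// After the adjustment the two functions have exactly the same type.
static_assert(std::is_same<decltype(foo_sees_pointer),
                           decltype(bar_sees_pointer)>::value,
              "an array parameter is adjusted to a pointer parameter");
```

Any consumer that looks only at the adjusted parameter type therefore has nothing left to tell the two declarations apart.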

More specifically, for both arg_ptr and arg_arr I get a derived type descriptor with the DW_TAG_pointer_type tag, referencing 'int' as the type derived from, in argument 8 of the relevant MDNode. There is no way to know arg_arr is in the user's eyes an array.

This reflects the compiler's view of things correctly, but is problematic for a debugger. The debugger should know that arg_arr refers to a 42-element array and isn't just a pointer into a buffer of unspecified length. This is something the user would expect.
On the other hand, the debugger should also get the information that arg_arr is actually a pointer, to be able to get to its values correctly.

Is there a way out?

Eli



_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: debug metadata incomplete for array arguments to functions?

Devang Patel

On Jul 13, 2011, at 9:56 PM, Eli Bendersky wrote:

> Hello,
>
> Consider the following two functions:
>
> void foo(int* arg_ptr) {
> ...
> }
>
> void bar(int arg_arr[42]) {
> ...
> }
>
> ------
>
> According to the C standard, both arguments will be passed to the function as pointers. In the debug metadata generated by LLVM, they also appear to be equivalent.
>
> More specifically, for both arg_ptr and arg_arr I get a derived type descriptor with the DW_TAG_pointer_type tag, referencing 'int' as the type derived from, in argument 8 of the relevant MDNode. There is no way to know arg_arr is in the user's eyes an array.
>
> This reflects the compiler's view of things correctly, but is problematic for a debugger. The debugger should know that arg_arr refers to a 42-element array and isn't just a pointer into a buffer of unspecified length. This is something the user would expect.
> On the other hand, the debugger should also get the information that arg_arr is actually a pointer, to be able to get to its values correctly.
>
> Is there a way out?

The only way out is to figure out how to distinguish between these two in front-end AST nodes while emitting LLVM IR. The part of front end that emits debug info for an argument is seeing arg_arr as a pointer to int.

If you manually patch metadata in llvm IR then you'll get debug info as you expect.
-
Devang

Re: debug metadata incomplete for array arguments to functions?

eliben

>> This reflects the compiler's view of things correctly, but is problematic for a debugger. The debugger should know that arg_arr refers to a 42-element array and isn't just a pointer into a buffer of unspecified length. This is something the user would expect.
>> On the other hand, the debugger should also get the information that arg_arr is actually a pointer, to be able to get to its values correctly.
>>
>> Is there a way out?
>
> The only way out is to figure out how to distinguish between these two in front-end AST nodes while emitting LLVM IR. The part of front end that emits debug info for an argument is seeing arg_arr as a pointer to int.
>
> If you manually patch metadata in llvm IR then you'll get debug info as you expect.
> -

Suppose one would start writing a patch to Clang to rectify this, how would this information be encoded in the debug metadata, given the dual nature of the arg_arr argument? Is there a mechanism to support it, or is an extension required?

Thanks in advance
Eli

Re: debug metadata incomplete for array arguments to functions?

Renato Golin-5
On 15 July 2011 05:35, Eli Bendersky <[hidden email]> wrote:
> Suppose one would start writing a patch to Clang to rectify this, how would
> this information be encoded in the debug metadata, given the dual nature of
> the arg_arr argument? Is there a mechanism to support it, or is an extension
> required?

Hi Eli,

The first thing is to make sure it actually makes a difference. I'd
take IR that is valid and can be debugged (but prints the wrong
declaration) and start by changing the metadata in the IR by hand
until it does what you want (works AND prints the correct
declaration). Only then would I start changing Clang...

DWARF is too generic and debuggers' support is too ad hoc for one to
say "correct implementation". I wouldn't assume anything before seeing
it work in a number of debuggers. DWARF can be syntactically
correct and still not be understood by some debuggers, but it must
never be syntactically incorrect, which I gather from your comment is
not the problem in your case.

You can also see what other tool-chains generate from your example. It
may be the quickest path to get the right answer.

cheers,
--renato

Re: debug metadata incomplete for array arguments to functions?

eliben
On Fri, Jul 15, 2011 at 11:27, Renato Golin <[hidden email]> wrote:
> On 15 July 2011 05:35, Eli Bendersky <[hidden email]> wrote:
>> Suppose one would start writing a patch to Clang to rectify this, how would
>> this information be encoded in the debug metadata, given the dual nature of
>> the arg_arr argument? Is there a mechanism to support it, or is an extension
>> required?
>
> Hi Eli,
>
> The first thing is to make sure it actually makes a difference. I'd
> take IR that is valid and can be debugged (but prints the wrong
> declaration) and start by changing the metadata in the IR by hand
> until it does what you want (works AND prints the correct
> declaration). Only then would I start changing Clang...
>
> DWARF is too generic and debuggers' support is too ad hoc for one to
> say "correct implementation". I wouldn't assume anything before seeing
> it work in a number of debuggers. DWARF can be syntactically
> correct and still not be understood by some debuggers, but it must
> never be syntactically incorrect, which I gather from your comment is
> not the problem in your case.
>
> You can also see what other tool-chains generate from your example. It
> may be the quickest path to get the right answer.

Renato, I'm not sure I understand what you mean.
IIUC, debug metadata in IR is the source from which DWARF is eventually generated. If the debug metadata format doesn't support such a construct (providing an alternative "type view" for a pointer), there isn't much that can be done to get it code-generated into DWARF. So I asked Devang whether, to his knowledge, such a mechanism exists in the current debug metadata format, or could be implemented with existing fields.

If yes, all I need is to improve debug metadata generation from clang for this case.

If not, then perhaps an extension to the debug metadata format is required to support this.

Do you imply that DWARF doesn't support this construct, so the "cause is lost"?

Thanks,
Eli

Re: debug metadata incomplete for array arguments to functions?

Renato Golin-5
On 15 July 2011 12:18, Eli Bendersky <[hidden email]> wrote:
> Do you imply that DWARF doesn't support this construct, so the "cause is
> lost"?

Absolutely not!

I was just giving you some hints on how to investigate *how* this
construct needs to be represented in IR/DWARF.

Using other tool-chains to produce DWARF that is both correct and
reflects the code as written will help you work back to what the IR
should look like.

I was also suggesting, as Devang said, that you start by changing
the IR by hand before trying more adventurous changes in Clang.

*If* it's not possible to represent that in IR (I have no idea), then
LLVM needs to be changed first.

Only when you're sure that there is a clear representation of your
problem in IR should you go on to change Clang.

cheers,
--renato

Re: debug metadata incomplete for array arguments to functions?

Devang Patel
In reply to this post by eliben


On Jul 14, 2011, at 9:35 PM, Eli Bendersky <[hidden email]> wrote:


>>> This reflects the compiler's view of things correctly, but is problematic for a debugger. The debugger should know that arg_arr refers to a 42-element array and isn't just a pointer into a buffer of unspecified length. This is something the user would expect.
>>> On the other hand, the debugger should also get the information that arg_arr is actually a pointer, to be able to get to its values correctly.
>>>
>>> Is there a way out?
>>
>> The only way out is to figure out how to distinguish between these two in front-end AST nodes while emitting LLVM IR. The part of front end that emits debug info for an argument is seeing arg_arr as a pointer to int.
>>
>> If you manually patch metadata in llvm IR then you'll get debug info as you expect.
>> -
>
> Suppose one would start writing a patch to Clang to rectify this, how would this information be encoded in the debug metadata, given the dual nature of the arg_arr argument? Is there a mechanism to support it, or is an extension required?


Aha.. dual mode. I am not sure there is a straightforward way in DWARF to explicitly say this type is int[42] but treat it as an int*.

One possible approach is to use the array type in the subprogram declaration and int * in the subprogram definition
 and teach debugger to understand what you are trying to say.
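
At the source level the split is already legal, as this small C++ sketch suggests: a declaration written with the array form and a definition written with the pointer form name one and the same function. Whether a producer can keep the array type on the declaration's debug entry, and whether any debugger would read it that way, is not something this sketch shows; it only illustrates the language side, reusing `bar` from the original post.

```cpp
#include <type_traits>

// Declaration spelled the way the user wrote the parameter.
void bar(int arg_arr[42]);

// Definition spelled with the adjusted pointer type. This defines the
// function declared above; it is not a second overload.
void bar(int* arg_arr) { (void)arg_arr; }

static_assert(std::is_same<decltype(bar), void(int*)>::value,
              "both spellings declare void bar(int*)");
```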

-
Devang


Re: debug metadata incomplete for array arguments to functions?

Renato Golin-4
On 15 July 2011 18:42, Devang Patel <[hidden email]> wrote:
>  and teach debugger to understand what you are trying to say.

That's the hard part.

--renato


Re: debug metadata incomplete for array arguments to functions?

Frits van Bommel-2
In reply to this post by eliben
On 15 July 2011 06:35, Eli Bendersky <[hidden email]> wrote:

>
>> > This reflects the compiler's view of things correctly, but is
>> > problematic for a debugger. The debugger should know that arg_arr refers to
>> > a 42-element array and isn't just a pointer into a buffer of unspecified
>> > length. This is something the user would expect.
>> > On the other hand, the debugger should also get the information that
>> > arg_arr is actually a pointer, to be able to get to its values correctly.
>> >
>> > Is there a way out?
>>
>> The only way out is to figure out how to distinguish between these two in
>> front-end AST nodes while emitting LLVM IR. The part of front end that emits
>> debug info for an argument is seeing arg_arr as a pointer to int.
>>
>> If you manually patch metadata in llvm IR then you'll get debug info as
>> you expect.
>> -
>
> Suppose one would start writing a patch to Clang to rectify this, how would
> this information be encoded in the debug metadata, given the dual nature of
> the arg_arr argument? Is there a mechanism to support it, or is an extension
> required?

Can't you just encode it as if it were a reference to a static array?
In other words, generate debug info for
  void bar(int arg_arr[42]) {
  ...
  }
as if it said
  void bar(int (&arg_arr)[42]) {
  ...
  }
instead.

A reference should be equivalent to a pointer for codegen purposes,
and this should tell the debugger that the parameter is actually an
array according to the source code.

This might lead the debugger to show the implicit '&' when printing
the parameter, but I don't think that would be a big problem.

Re: debug metadata incomplete for array arguments to functions?

John McCall-2
On Jul 17, 2011, at 4:43 AM, Frits van Bommel wrote:

> On 15 July 2011 06:35, Eli Bendersky <[hidden email]> wrote:
>>
>>>> This reflects the compiler's view of things correctly, but is
>>>> problematic for a debugger. The debugger should know that arg_arr refers to
>>>> a 42-element array and isn't just a pointer into a buffer of unspecified
>>>> length. This is something the user would expect.
>>>> On the other hand, the debugger should also get the information that
>>>> arg_arr is actually a pointer, to be able to get to its values correctly.
>>>>
>>>> Is there a way out?
>>>
>>> The only way out is to figure out how to distinguish between these two in
>>> front-end AST nodes while emitting LLVM IR. The part of front end that emits
>>> debug info for an argument is seeing arg_arr as a pointer to int.
>>>
>>> If you manually patch metadata in llvm IR then you'll get debug info as
>>> you expect.
>>> -
>>
>> Suppose one would start writing a patch to Clang to rectify this, how would
>> this information be encoded in the debug metadata, given the dual nature of
>> the arg_arr argument? Is there a mechanism to support it, or is an extension
>> required?
>
> Can't you just encode it as if it were a reference to a static array?
> In other words, generate debug info for
>  void bar(int arg_arr[42]) {
>  ...
>  }
> as if it said
>  void bar(int (&arg_arr)[42]) {
>  ...
>  }
> instead.

That would lead to &arg_arr having the wrong type and (furthermore)
would prevent users from calling the function with an array not of size 42.
You don't want to go down this road.
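
Both objections can be checked at compile time. The following C++11 sketch is only an illustration: the aliases `AsWritten` and `AsReference` and the helper templates `param`, `address_type` and `accepts` are hypothetical names introduced here, not part of Clang or any library.

```cpp
#include <type_traits>
#include <utility>

// The signature as the user wrote it, and the proposed stand-in.
using AsWritten   = void(int[42]);      // adjusted by the language to void(int*)
using AsReference = void(int (&)[42]);  // keeps the bound 42 in the type

// Hypothetical helper: parameter type of a one-argument function type.
template <typename F> struct param;
template <typename P> struct param<void(P)> { using type = P; };

// Hypothetical helper: type of `&arg` for a parameter `arg` of type P.
template <typename P>
using address_type = decltype(&std::declval<P&>());

// Hypothetical helper: true if a function of type F accepts an Arg.
template <typename F, typename Arg, typename = void>
struct accepts : std::false_type {};
template <typename F, typename Arg>
struct accepts<F, Arg,
               decltype(std::declval<F*>()(std::declval<Arg>()))>
    : std::true_type {};

// First objection: &arg_arr has a different type under each encoding.
static_assert(std::is_same<address_type<param<AsWritten>::type>,
                           int**>::value,
              "as written, &arg_arr is int**");
static_assert(std::is_same<address_type<param<AsReference>::type>,
                           int (*)[42]>::value,
              "as a reference, &arg_arr is int (*)[42]");

// Second objection: a 10-element array is accepted as written but
// rejected by the reference form.
static_assert(accepts<AsWritten, int (&)[10]>::value,
              "any int array decays to int*");
static_assert(!accepts<AsReference, int (&)[10]>::value,
              "int (&)[42] cannot bind to int[10]");
```

A debugger that evaluates expressions against the reference encoding would presumably inherit both differences.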

The interesting question here is whether there's any way to represent
the "written" type in DWARF.  If not, we could certainly introduce an
extension, and then hack various debuggers to support it, but I'm not
sure this problem really merits that.

John.