RFC: Using zlib to decompress debug info sections.

Alexey Samsonov
Hi!

TL;DR WDYT of adding zlib decompression capabilities to LLVMObject library?

ld.gold from GNU binutils has a --compress-debug-sections=zlib option,
which compresses .debug_xxx sections with zlib and renames them to .zdebug_xxx.
binutils (and GDB) support this properly, while LLVM command-line tools don't:

$ ld --version
GNU gold (GNU Binutils for Ubuntu 2.22) 1.11
$ ./bin/clang++ -g a.cc -Wl,--compress-debug-sections=zlib
$ objdump -h a.out | grep debug
 26 .debug_info   00000066  0000000000000000  0000000000000000  00002010  2**0
 27 .debug_abbrev 00000048  0000000000000000  0000000000000000  00002068  2**0
 28 .debug_aranges 00000000  0000000000000000  0000000000000000  000020bb  2**0
 29 .debug_macinfo 00000000  0000000000000000  0000000000000000  000020cf  2**0
 30 .debug_line   00000053  0000000000000000  0000000000000000  000020e3  2**0
 31 .debug_loc    00000000  0000000000000000  0000000000000000  0000213e  2**0
 32 .debug_pubtypes 00000000  0000000000000000  0000000000000000  00002152  2**0
 33 .debug_str    00000069  0000000000000000  0000000000000000  00002166  2**0
 34 .debug_ranges 00000000  0000000000000000  0000000000000000  000021d9  2**0
$ ./bin/llvm-objdump -h a.out | grep debug
 27 .zdebug_info  00000058 0000000000000000 
 28 .zdebug_abbrev 00000053 0000000000000000 
 29 .zdebug_aranges 00000014 0000000000000000 
 30 .zdebug_macinfo 00000014 0000000000000000 
 31 .zdebug_line  0000005b 0000000000000000 
 32 .zdebug_loc   00000014 0000000000000000 
 33 .zdebug_pubtypes 00000014 0000000000000000 
 34 .zdebug_str   00000073 0000000000000000 
 35 .zdebug_ranges 00000014 0000000000000000

Decompression and proper handling of debug info sections may be needed
in llvm-dwarfdump and llvm-symbolizer tools. We can implement this by:
1) Checking if zlib is present in the system during configuration.
2) Adding zlib decompression to llvm::MemoryBuffer, and section decompression to LLVMObject (this would require optional linking with -lz).
3) Using the methods in LLVM tools where needed.
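
The decompression in step 2 can be sketched as follows. This is a minimal illustration, not the proposed LLVM code. It assumes the GNU .zdebug_* convention of a 4-byte "ZLIB" tag followed by a 64-bit big-endian uncompressed size and then a plain zlib stream; that layout is not spelled out in this thread. The function names are hypothetical.

```python
import struct
import zlib

ZLIB_MAGIC = b"ZLIB"      # assumed 4-byte tag at the start of a .zdebug_* payload
HEADER_SIZE = 4 + 8       # tag + 64-bit big-endian uncompressed size


def compress_debug_section(data: bytes) -> bytes:
    """Build a .zdebug_*-style payload (only used here to exercise the reader)."""
    return ZLIB_MAGIC + struct.pack(">Q", len(data)) + zlib.compress(data)


def decompress_debug_section(payload: bytes) -> bytes:
    """Return the original .debug_* bytes from a .zdebug_*-style payload."""
    if len(payload) < HEADER_SIZE or payload[:4] != ZLIB_MAGIC:
        raise ValueError("not a ZLIB-tagged compressed section")
    (expected_size,) = struct.unpack(">Q", payload[4:HEADER_SIZE])
    data = zlib.decompress(payload[HEADER_SIZE:])
    if len(data) != expected_size:
        raise ValueError("size mismatch after decompression")
    return data
```

Under this assumed layout an empty section takes 12 header bytes plus an 8-byte empty zlib stream, 20 bytes in total, which is consistent with the 0x14 size llvm-objdump reports above for the empty .zdebug_* sections.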

Does this make sense to you?

-- 
Alexey Samsonov, MSK

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: RFC: Using zlib to decompress debug info sections.

Nick Lewycky
On 04/16/2013 02:37 AM, Alexey Samsonov wrote:
> Hi!
>
> TL;DR WDYT of adding zlib decompression capabilities to LLVMObject library?

Yes, I want this.

> Decompression and proper handling of debug info sections may be needed
> in llvm-dwarfdump and llvm-symbolizer tools. We can implement this by:
> 1) Checking if zlib is present in the system during configuration.
> 2) Adding zlib decompression to llvm::MemoryBuffer, and section
> decompression to LLVMObject (this would require optional linking with -lz).
> 3) Using the methods in LLVM tools where needed.
>
> Does this make sense to you?

Yes, exactly. I'm not certain that MemoryBuffer and LLVMObject are the
right places, but it doesn't sound wrong.

Nick

Re: RFC: Using zlib to decompress debug info sections.

Michael Spencer-4
In reply to this post by Alexey Samsonov
I'm not sure MemoryBuffer is the right place to do this either. I'm also not sure if we want debug info decompression to be transparent in LLVMObject or not. I'm leaning towards no since it's not part of the standard yet, unless gold is actually using the SHF_COMPRESSED flag.

I think it should be part of Object, but as an external API that is used when you find a section you know from external factors (the name matches some list) is compressed.
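
A minimal sketch of the name-based check described here, assuming the ".z<foo>" convention discussed in this thread. The helper name is hypothetical, and a prefix test stands in for the "list" of known section names.

```python
from typing import Optional

COMPRESSED_PREFIX = ".zdebug_"   # naming convention for compressed debug sections
PLAIN_PREFIX = ".debug_"


def uncompressed_name(section_name: str) -> Optional[str]:
    """Map '.zdebug_foo' to '.debug_foo'; return None for any other section."""
    if not section_name.startswith(COMPRESSED_PREFIX):
        return None
    return PLAIN_PREFIX + section_name[len(COMPRESSED_PREFIX):]
```

A caller would ask for decompression explicitly, and only for sections where this returns a name, so ordinary section reads are not affected.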

- Michael Spencer
 


Re: RFC: Using zlib to decompress debug info sections.

Alexey Samsonov

On Tue, Apr 16, 2013 at 8:31 PM, Michael Spencer <[hidden email]> wrote:
> I'm not sure MemoryBuffer is the right place to do this either. I'm also not sure if we want debug info decompression to be transparent in LLVMObject or not. I'm leaning towards no since it's not part of the standard yet,

Yeah, I also think that decompression should be explicitly requested by the user of LLVMObject.

> unless gold is actually using the SHF_COMPRESSED flag.
>
> I think it should be part of Object, but as an external API that is used when you find a section you know from external factors (the name matches some list) is compressed.
>
> - Michael Spencer
 


--
Alexey Samsonov, MSK


Re: RFC: Using zlib to decompress debug info sections.

Eric Christopher
On Tue, Apr 16, 2013 at 9:37 AM, Alexey Samsonov <[hidden email]> wrote:

>
> On Tue, Apr 16, 2013 at 8:31 PM, Michael Spencer <[hidden email]>
> wrote:
>>
>> I'm not sure MemoryBuffer is the right place to do this either. I'm also
>> not sure if we want debug info decompression to be transparent in LLVMObject
>> or not. I'm leaning towards no since it's not part of the standard yet,
>
>
> Yeah, I also think that decompression should be explicitly requested by the
> user of LLVMObject.
>
>>
>> unless gold is actually using the SHF_COMPRESSED flag.
>>
>>
>> I think it should be part of Object, but as an external API that is used
>> when you find a section you know from external factors (the name matches
>> some list) is compressed.
>>

Definitely want the feature :)

I don't see SHF_COMPRESSED (unless readelf just isn't showing it to
me), but it wouldn't be too hard to get binutils to mark them as such.
Right now the convention is .z<foo> are compressed, but that's not as
precise as we'd like it to be. There's been some talk on the binutils
list about it, but it hasn't been implemented yet.

-eric

Re: RFC: Using zlib to decompress debug info sections.

Alexey Samsonov
Just in case: do we want to link with the libz.so installed on the system, or be self-contained and copy the zlib sources into the LLVM repo?


--
Alexey Samsonov, MSK


Re: RFC: Using zlib to decompress debug info sections.

Eric Christopher
Historically we've done the former. The latter would require Chris
wanting to do that.

-eric

On Tue, Apr 16, 2013 at 11:52 AM, Alexey Samsonov <[hidden email]> wrote:

> Just in case - do we want to link with libz.so installed in the system, or
> be self-contained and copy sources to LLVM repo?

Re: RFC: Using zlib to decompress debug info sections.

Chris Lattner-2
On Apr 16, 2013, at 11:53 AM, Eric Christopher <[hidden email]> wrote:
> Historically we've done the former. The latter would require Chris
> wanting to do that.

This case isn't so clearcut.  We like to include libraries in the source to make it easy to get up and running without having to install a ton of dependencies.  However, this has license implications and is generally annoying.

Given that zlib is so widely available by default, and that the compiler can generate correct (albeit uncompressed) debug info, I think the best thing is to *not* include a copy in llvm.  Just detect and use it if we can find it, but otherwise generate uncompressed output.
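
The "detect and use it if we can find it" policy can be illustrated with a small sketch. This is only an analogy: in LLVM the check would happen at configure time, whereas here it is an import-time check, and the helper names are hypothetical.

```python
try:
    import zlib
    HAVE_ZLIB = True
except ImportError:  # zlib not available: fall back to uncompressed data
    zlib = None
    HAVE_ZLIB = False


def maybe_compress(data: bytes):
    """Return (payload, is_compressed); compress only when zlib is present."""
    if HAVE_ZLIB:
        return zlib.compress(data), True
    return data, False


def maybe_decompress(payload: bytes, is_compressed: bool) -> bytes:
    """Undo maybe_compress; report an error if zlib is needed but missing."""
    if not is_compressed:
        return payload
    if not HAVE_ZLIB:
        raise RuntimeError("compressed input, but zlib support is not available")
    return zlib.decompress(payload)
```

The point of the design is that a build without zlib still produces correct output and only loses the ability to read or write the compressed form.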

-Chris

Re: RFC: Using zlib to decompress debug info sections.

Eric Christopher
On Tue, Apr 16, 2013 at 1:37 PM, Chris Lattner <[hidden email]> wrote:
> On Apr 16, 2013, at 11:53 AM, Eric Christopher <[hidden email]> wrote:
>> Historically we've done the former. The latter would require Chris
>> wanting to do that.
>
> This case isn't so clearcut.  We like to include libraries in the source to make it easy to get up and running without having to install a ton of dependencies.  However, this has license implications and is generally annoying.
>
> Given that zlib is so widely available by default, and that the compiler can generate correct (albeit uncompressed) debug info, I think the best thing is to *not* include a copy in llvm.  Just detect and use it if we can find it, but otherwise generate uncompressed output.
>

Sounds good to me. Thanks Chris!

-eric


Re: RFC: Using zlib to decompress debug info sections.

Alexey Samsonov
In reply to this post by Chris Lattner-2

On Wed, Apr 17, 2013 at 12:37 AM, Chris Lattner <[hidden email]> wrote:
> On Apr 16, 2013, at 11:53 AM, Eric Christopher <[hidden email]> wrote:
>> Historically we've done the former. The latter would require Chris
>> wanting to do that.
>
> This case isn't so clearcut.  We like to include libraries in the source to make it easy to get up and running without having to install a ton of dependencies.  However, this has license implications and is generally annoying.

Looks like zlib license is good enough to avoid implications, but I can't really judge.

> Given that zlib is so widely available by default, and that the compiler can generate correct (albeit uncompressed) debug info, I think the best thing is to *not* include a copy in llvm.  Just detect and use it if we can find it, but otherwise generate uncompressed output.

Sure, I'll go this way then. Thanks!

-- 
Alexey Samsonov, MSK


Re: RFC: Using zlib to decompress debug info sections.

Joerg Sonnenberger
In reply to this post by Chris Lattner-2
On Tue, Apr 16, 2013 at 01:37:18PM -0700, Chris Lattner wrote:
> On Apr 16, 2013, at 11:53 AM, Eric Christopher <[hidden email]> wrote:
> > Historically we've done the former. The latter would require Chris
> > wanting to do that.
>
> This case isn't so clearcut.  We like to include libraries in the
> source to make it easy to get up and running without having to install
> a ton of dependencies.  However, this has license implications and is
> generally annoying.

From a security perspective, bundling libraries that have had issues in
the past and are likely to have new ones in the future is highly
annoying. As such, I would strongly prefer to keep it optional.

Joerg

Re: RFC: Using zlib to decompress debug info sections.

Evgeniy Stepanov
In reply to this post by Alexey Samsonov
This might be a bit late, but I've got another argument for bundling
zlib source with LLVM.

Sanitizer tools need to symbolize stack traces in their reports. We've
been using a standalone symbolizer binary until now; the sanitizer runtime
spawns a new process as soon as an error is found, and communicates
with it over a pipe. This is very cumbersome to deploy, because we
need to keep another binary around, specify a path to it at runtime,
etc. LLVM lit.cfg already carries some of this burden.

A much better solution would be to statically link symbolization code
into the user application, the same as sanitizer runtime library.
Unfortunately, symbolizer depends on several LLVM libraries, C++
runtime, zlib, etc. Statically linking all that stuff with user code
results in symbol name conflicts.

We've come up with what seems to be a perfect solution (thanks to
Chandler's advice at the recent developer meeting). We build
everything down to (but not including) libc into LLVM bitcode. This
includes LLVMSupport, LLVMObject, LLVMDebugInfo, libc++, libc++abi,
zlib (!). Then we bundle it all together and internalize all
non-interface symbols: llvm-link && opt -internalize. Then compile
down to a single object file.
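
A rough sketch of the three steps as command lines, built as argument lists rather than executed. The file names and the exported symbol are hypothetical. The flags are the legacy llvm-link / opt / llc spellings as I understand them and may differ in other LLVM versions, so treat this as an illustration rather than a build recipe.

```python
from typing import List


def symbolizer_build_commands(bitcode_files: List[str],
                              exported_symbols: List[str],
                              output_object: str = "symbolizer.o") -> List[List[str]]:
    """Return the commands for: link bitcode, internalize, compile to one object."""
    merged = "merged.bc"               # hypothetical intermediate file names
    internalized = "internalized.bc"
    return [
        # 1) bundle all bitcode modules into a single module
        ["llvm-link", *bitcode_files, "-o", merged],
        # 2) internalize everything except the listed interface symbols
        ["opt", "-internalize",
         "-internalize-public-api-list=" + ",".join(exported_symbols),
         merged, "-o", internalized],
        # 3) compile the result down to a single object file
        ["llc", "-filetype=obj", internalized, "-o", output_object],
    ]
```

Each inner list could be passed to subprocess.run once the bitcode inputs exist; only the symbols named in the public API list stay visible to the user application.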

This results in a perfect isolation of symbolizer internals. One
drawback is that this requires source for all the things that I
mentioned - and at the moment we've got everything but zlib.

We'd like this to be a part of the normal LLVM build, but that
requires zlib source available somewhere. We could add a
cmake/configure option to point to an externally available source, but
that sounds like a complication we would like to avoid.

WDYT?

Re: RFC: Using zlib to decompress debug info sections.

Chandler Carruth-2
Lemme chat with Danny off list about the best way to do this, and I'll post an update.


On Tue, May 7, 2013 at 11:24 AM, Evgeniy Stepanov <[hidden email]> wrote:
This might be a bit late, but I've got another argument for bundling
zlib source with LLVM.

Sanitizer tools need to symbolize stack traces in their reports. Until
now we've been using a standalone symbolizer binary: the sanitizer runtime
spawns a new process as soon as an error is found and communicates
with it over a pipe. This is very cumbersome to deploy, because we
need to keep another binary around, specify a path to it at runtime,
etc. LLVM's lit.cfg already carries some of this burden.

A much better solution would be to statically link the symbolization code
into the user application, the same way as the sanitizer runtime library.
Unfortunately, the symbolizer depends on several LLVM libraries, the C++
runtime, zlib, etc. Statically linking all of that with user code
results in symbol name conflicts.

We've come up with what seems to be a perfect solution (thanks to
Chandler's advice at the recent developer meeting). We build
everything down to (but not including) libc into LLVM bitcode. This
includes LLVMSupport, LLVMObject, LLVMDebugInfo, libc++, libc++abi, and
zlib (!). Then we bundle it all together and internalize all
non-interface symbols (llvm-link && opt -internalize), and compile
the result down to a single object file.
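The three steps described above can be sketched as a dry run that only assembles the tool invocations and runs nothing. This is a rough illustration, not the actual build rule: the intermediate file names and the helper `symbolizer_build_commands` are hypothetical, and the `opt` flag spellings (`-internalize`, `-internalize-public-api-list=`) are the ones used by the tools of that era; newer `opt` releases spell the pass differently.

```python
import shlex


def symbolizer_build_commands(bitcode_files, interface_symbols,
                              output="symbolizer.o"):
    """Return the three tool invocations as argument lists (nothing is run).

    File names and this helper are illustrative assumptions.
    """
    merged = "symbolizer.merged.bc"
    internalized = "symbolizer.internal.bc"
    keep = ",".join(interface_symbols)
    return [
        # 1. Bundle every bitcode module into a single module.
        ["llvm-link", *bitcode_files, "-o", merged],
        # 2. Give internal linkage to every symbol not on the interface list.
        ["opt", "-internalize", "-internalize-public-api-list=" + keep,
         merged, "-o", internalized],
        # 3. Compile the merged module down to one object file.
        ["llc", "-filetype=obj", internalized, "-o", output],
    ]


if __name__ == "__main__":
    # Placeholder inputs; a real build would list every library's bitcode.
    commands = symbolizer_build_commands(
        ["LLVMSupport.bc", "LLVMObject.bc", "zlib.bc"],
        ["symbolize_entry_point"])
    for command in commands:
        print(" ".join(shlex.quote(arg) for arg in command))
```

The point of step 2 is that only the listed interface symbols stay externally visible, so the bundled copies of libc++ or zlib cannot clash with the user's own.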

This results in perfect isolation of the symbolizer internals. One
drawback is that it requires source for everything I
mentioned, and at the moment we've got everything but zlib.

We'd like this to be part of the normal LLVM build, but that
requires the zlib source to be available somewhere. We could add a
cmake/configure option to point to an externally available source tree, but
that sounds like a complication we would like to avoid.

WDYT?



Re: RFC: Using zlib to decompress debug info sections.

James Courtier-Dutton-4
In reply to this post by Evgeniy Stepanov



It is possible to do both: include an internal copy, also support linking to an external one, and make the choice between them a compile-time option.



Re: RFC: Using zlib to decompress debug info sections.

Reid Kleckner-2
In reply to this post by Evgeniy Stepanov
You shouldn't need to use bitcode and opt -internalize to hide the
symbols.  You can do it with objcopy --localize-hidden like we did for
DynamoRIO, but I assume you prefer this route because it ports nicely
to Mac.  :)


Re: RFC: Using zlib to decompress debug info sections.

Evgeniy Stepanov
Portability is always good.

But the objcopy method does not seem to work well when there is code we
don't fully control. Hidden visibility is overridable, and there are
enough cases of that in libcxx and libcxxabi to cause problems: the entire
exception interface, for example.


On Tue, May 7, 2013 at 5:06 PM, Reid Kleckner <[hidden email]> wrote:

> You shouldn't need to use bitcode and opt -internalize to hide the
> symbols.  You can do it with objcopy --localize-hidden like we did for
> DynamoRIO, but I assume you prefer this route because it ports nicely
> to Mac.  :)