[llvm-dev] EuroLLVM 2019 - LLVM Binutils BoF notes

Amara Emerson via llvm-dev
Hi All,

Petr Hosek kindly took notes for the LLVM Binutils BoF from the recent Euro LLVM, and I am putting them up here for all to see (see below). I'll separately email around my own notes from the round table that happened the following day.

Thanks for all the contributions!



The LLVM binary utilities were originally used for testing LLVM components, but are now being used more widely as a replacement for GNU binutils.

A lot of bugs are coming in and being resolved (70 open, 81 resolved in the last 2 years). There was a GSoC 2018 project by Paul Semel. There have been 920 commits to the LLVM binary utilities (excluding related libraries).

LLVM tools should be "drop-in replacements" for the GNU binutils tools. These tools are often used in configure-style scripts, so it's important to have identical output and switches (even if that means, e.g., using a special name for the tool or a flag).

This was an important design point for llvm-objcopy: if GNU objcopy's behavior makes sense, we copy that behavior; otherwise we try to support the specific use case. Some use cases may not be supported, e.g. because they're weird. In those cases, having a different name for compatibility reasons might be a reasonable solution, e.g. --strip-all-gnu in llvm-objcopy. GNU objcopy works differently: it basically relinks the file. llvm-objcopy is architected differently, but we could always find a way forward.

lld already uses a similar mode of operation, behaving differently based on the name it is invoked under (e.g. ld.lld vs lld-link). llvm-readelf is equivalent to llvm-readobj --elf-output-style=GNU and changes some other flags. A similar patch for llvm-symbolizer and addr2line is currently under review.
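To make the name-based behavior concrete, here is a minimal sketch in Python (not how the LLVM tools implement it). Only the llvm-readelf / llvm-readobj pairing and the --elf-output-style flag come from the notes above; the function names and the "LLVM" default style label are assumptions made for illustration.

```python
import os

# Hypothetical table: invocation name -> default output style.
# Only the llvm-readelf => GNU pairing is taken from the notes.
DEFAULT_STYLE = {
    "llvm-readobj": "LLVM",
    "llvm-readelf": "GNU",
}


def tool_name(argv0):
    """Reduce a path such as /usr/bin/llvm-readelf to the bare tool name."""
    base = os.path.basename(argv0)
    if base.lower().endswith(".exe"):
        base = base[:-4]
    return base


def pick_output_style(argv):
    """An explicit flag wins; otherwise the invocation name decides."""
    for arg in argv[1:]:
        if arg.startswith("--elf-output-style="):
            return arg.split("=", 1)[1]
    return DEFAULT_STYLE.get(tool_name(argv[0]), "LLVM")
```

With this shape, installing the same binary under a second name (for example via a symlink) is enough to switch the defaults, while scripts that pass the flag explicitly are unaffected.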

The original purpose of the LLVM tools was testing, not human consumption, but based on the number of requests and bugs it seems we should provide byte-for-byte identical output. Configure scripts often break in subtle ways when the output isn't identical: they rely on the human-readable output even though it's consumed by a machine. It might be valuable to support both machine-readable (e.g. JSON) and human-readable output (for legacy vs future scripts).
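As a rough illustration of supporting both kinds of output from the same data, the Python sketch below renders one list of symbol records either as text or as JSON. The record fields, the function name and the nm-like text layout are all assumptions for the example; it does not reproduce the exact format of any existing tool.

```python
import json


def render_symbols(symbols, fmt="text"):
    """Render symbol records as JSON or as one nm-like line per symbol."""
    if fmt == "json":
        # Machine-readable: stable key order so scripts can diff the output.
        return json.dumps(symbols, sort_keys=True)
    # Human-readable: fixed-width hex address, type letter, name.
    lines = [
        "{:016x} {} {}".format(s["value"], s["type"], s["name"])
        for s in symbols
    ]
    return "\n".join(lines)


# Hypothetical input record, for illustration only.
example = [{"name": "main", "type": "T", "value": 0x401000}]
text_output = render_symbols(example)
json_output = render_symbols(example, fmt="json")
```

The point of the split is that legacy scripts keep parsing the text form, which can then be frozen for compatibility, while new scripts consume the JSON form.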

There are three different types of object files (code generated by GCC or Clang, GCC LTO and Clang LTO), and neither the LLVM nor the GNU tools handle these correctly. The GNU binutils tools use a plugin to handle LTO, so someone could write a plugin to handle LLVM's LTO code in binutils and, vice versa, the LLVM binary utilities could support the plugin interface.

What about backward compatibility guarantees? In general, breaking backward compatibility between releases is not that bad as long as there's a path forward. The best time to break compatibility is now, because not very many projects use the LLVM versions of these tools yet; that won't be the case in the future.

Most tools still use LLVM's cl::opt and haven't moved to TableGen. It depends on the tool, but in some cases matching binutils would require a complete re-architecture. Right now we have the best possible opportunity to change the architecture if we want to, because we're still in the ramp-up stage.

--help has a lot of options generated from the default opt, which makes the output unusable. Could we remove all the useless options to make the help output more useful? If you're aware of specific instances of this, please file bugs; people could pick these up as starter bugs since they are usually very easy to fix.

In the case of llvm-objdump -D we're actually disassembling, so it's really hard to match the output byte-for-byte. However, we can also do much better than binutils' objdump output, so trying to produce identical output may not be worth it. What would the improvements be? Support for Thumb and Arm in the same output, constant islands, control-flow visualization, etc.

Is it possible to implement all these tools as a library with a driver that's as thin as possible? We don't have a good solution right now for most of these tools, but we should really aspire to this as a noble goal. Any patches that get us closer to that goal are very welcome. Writing things as a library first is a great goal.

Each tool currently links in a big portion of the LLVM backend, inflating its size. Using dynamic linking introduces a significant performance hit. We could busyboxify the LLVM tools to reduce size without taking the performance hit.
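For readers unfamiliar with the busybox approach, the idea is a single multi-call binary that contains every tool's entry point and picks one at startup, so the shared code is linked in only once. A minimal Python sketch of the dispatch, assuming a hypothetical driver named "llvm" and stand-in entry points (none of this is an existing LLVM interface):

```python
import os


# Stand-in entry points; a real multi-call binary would link in the
# actual tool implementations here.
def objcopy_main(args):
    return "objcopy " + " ".join(args)


def strip_main(args):
    return "strip " + " ".join(args)


TOOLS = {
    "llvm-objcopy": objcopy_main,
    "llvm-strip": strip_main,
}


def multicall_main(argv):
    """Dispatch on the invocation name, else on the first argument."""
    name = os.path.basename(argv[0])
    if name in TOOLS:
        # Invoked through a symlink named after the tool.
        return TOOLS[name](argv[1:])
    if len(argv) > 1 and argv[1] in TOOLS:
        # Invoked as: <driver> <tool> <args...>
        return TOOLS[argv[1]](argv[2:])
    raise SystemExit("unknown tool: " + name)
```

For example, multicall_main(["/usr/bin/llvm-strip", "a.o"]) and multicall_main(["llvm", "llvm-strip", "a.o"]) both end up in strip_main with ["a.o"].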
